I successfully completed the sample2markers.py step, but I cannot run step 3: Identify clades detected in the samples and build reference databases. The command I am using is:
Traceback (most recent call last):
  File "/mnt/nfs/home/30041036/.conda/envs/metaphlan2/bin/strainphlan.py", line 1585, in <module>
    strainphlan()
  File "/mnt/nfs/home/30041036/.conda/envs/metaphlan2/bin/strainphlan.py", line 1581, in strainphlan
    strainer(args)
  File "/mnt/nfs/home/30041036/.conda/envs/metaphlan2/bin/strainphlan.py", line 1365, in strainer
    db = pickle.load(bz2.BZ2File(args['mpa_pkl']))
  File "/mnt/nfs/home/30041036/.conda/envs/metaphlan2/lib/python3.7/bz2.py", line 92, in __init__
    self._fp = _builtin_open(filename, mode)
IsADirectoryError: [Errno 21] Is a directory: '/mnt/nfs/home/30041036/.conda/envs/metaphlan2/bin/metaphlan_databases'
How can I fix this? I can’t see that anyone else has had a similar issue. Also, in step 4, --ifn_markers s__Eubacterium_siraeum.markers.fasta is used in the command. How do I generate this FASTA file for the species I am interested in (e.g. Staphylococcus aureus)?
Hi mradz19,
Try adding the parameter --index v296_CHOCOPhlAn_201901 to the strainphlan execution, like this:
strainphlan.py --ifn_samples p100_bowtie2_aligned.markers --output_dir markers/ --print_clades_only --index v296_CHOCOPhlAn_201901 > clades.txt
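The traceback above is raised because bz2.BZ2File was handed a directory rather than a compressed pickle file. A minimal sketch of how one might inspect such a path before using it, relying only on the Python standard library; the helper name describe_db_path and the file name example_db.pkl are hypothetical and are not part of StrainPhlAn:

```python
import bz2
import os
import pickle
import tempfile


def describe_db_path(path):
    """Classify a path meant for bz2.BZ2File (hypothetical helper).

    Returns ("directory", [names of .pkl files inside]),
    ("file", [basename]) or ("missing", []).
    """
    if os.path.isdir(path):
        pkls = sorted(f for f in os.listdir(path) if f.endswith(".pkl"))
        return "directory", pkls
    if os.path.isfile(path):
        return "file", [os.path.basename(path)]
    return "missing", []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as db_dir:
        # Stand-in for a database folder holding one bz2-compressed pickle.
        pkl_path = os.path.join(db_dir, "example_db.pkl")
        with bz2.BZ2File(pkl_path, "w") as handle:
            pickle.dump({"markers": {}}, handle)

        print(describe_db_path(db_dir))
        print(describe_db_path(pkl_path))

        # Opening the folder itself fails with an OSError subclass,
        # which is the situation shown in the traceback.
        try:
            bz2.BZ2File(db_dir)
        except OSError as err:
            print(type(err).__name__)

        # Opening the file inside the folder works.
        with bz2.BZ2File(pkl_path) as handle:
            print(pickle.load(handle))
```

If the reported path turns out to be a directory, the list of .pkl files it contains shows which database files are actually available there.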
Hi Michael,
When StrainPhlAn is not able to return any clade, there are two main possible reasons:
1. The database version you used to create the SAM file is different from the version you used to run StrainPhlAn. You can check this by looking at the first line of the abundance report file generated together with the SAM file.
2. The sample2markers script was not able to reconstruct enough markers for your sample.
If you could share your markers file, I could take a deeper look at the problem.
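The database version check could be scripted roughly as follows. This is only a sketch: it assumes the first line of the abundance report is a comment holding the database name (for example #mpa_v296_CHOCOPhlAn_201901), and the function names and the file name profile.txt are hypothetical:

```python
import os
import tempfile


def report_db_version(report_path):
    """Return the report's first line without the leading '#' or whitespace."""
    with open(report_path) as handle:
        return handle.readline().strip().lstrip("#").strip()


def uses_index(report_path, index):
    """True if the index name appears in the report's first line."""
    return index in report_db_version(report_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in report; only the first line matters for this check.
        report = os.path.join(tmp, "profile.txt")
        with open(report, "w") as handle:
            handle.write("#mpa_v296_CHOCOPhlAn_201901\n")
            handle.write("# (rest of the report)\n")

        print(report_db_version(report))
        print(uses_index(report, "v296_CHOCOPhlAn_201901"))
```

If the name in the report does not contain the index passed to StrainPhlAn, the SAM file was probably built against a different database version.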
Hi @lzh1982
The StrainPhlAn markers database is the same as the MetaPhlAn markers database.
If you installed MetaPhlAn 3 via conda, StrainPhlAn and the markers database will also be downloaded and installed. Please check the tutorial for more info: https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0#installation
If you have any issues with the conda installation, you can also download the database from the following links:
Hi Aitor,
Thank you very much for your explanation! I have installed biobakery_workflow through Docker. I need to know how to download the reference database, not only the marker database, and how to set up STRAINPHLAN_DB_REFERENCE manually. Could you help me?
Hi,
You should set the environment variables STRAINPHLAN_DB_REFERENCE and STRAINPHLAN_DB_MARKERS inside your Docker instance using export.
STRAINPHLAN_DB_REFERENCE should point to the folder containing the reference genomes used when running StrainPhlAn, and STRAINPHLAN_DB_MARKERS should point to the folder containing the StrainPhlAn marker files.
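Before launching the workflow, it can help to confirm that both variables are visible to the process and point to existing folders. A small sketch using only the standard library; the helper check_db_env is hypothetical, and only the two variable names come from the description above:

```python
import os

# Variable names as given in the post; everything else is illustrative.
DB_VARS = ("STRAINPHLAN_DB_REFERENCE", "STRAINPHLAN_DB_MARKERS")


def check_db_env(env=None):
    """Map each variable to 'unset', 'not a directory' or 'ok'."""
    env = os.environ if env is None else env
    status = {}
    for name in DB_VARS:
        value = env.get(name)
        if not value:
            status[name] = "unset"
        elif not os.path.isdir(value):
            status[name] = "not a directory"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    for name, state in check_db_env().items():
        print(f"{name}: {state}")
```

Running this inside the Docker instance after the export commands shows whether each variable was picked up and whether its folder exists.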
Dear Dr.Francesco.Beghini,
Thank you very much for your explanation! I do not know where I can download the STRAINPHLAN_DB_REFERENCE database. Because I cannot install it directly using the command biobakery_workflows --install wmgx, I want to download the corresponding database and install it manually. Many thanks!