Error running strainphlan

Hi there

I’m trying to follow the strainphlan tutorial.

When I run the following: strainphlan -s consensus/*.pkl -m clade_markers/s__Eubacterium_rectale.fna -d /home/eplummer/zb84/databases/metaphaln_database/mpa_v30_CHOCOPhlAn_201901.pkl -r reference_genomes/*.fna -o output -c s__Eubacterium_rectale

I get the following output and error:

Mon Aug 10 08:54:57 2020: Start StrainPhlAn 3.0 execution
Mon Aug 10 08:54:57 2020: Creating temporary directory…
Mon Aug 10 08:54:57 2020: Done.
Mon Aug 10 08:54:57 2020: Getting markers from main sample files…
Mon Aug 10 08:54:58 2020: Done.
Mon Aug 10 08:54:58 2020: Getting markers from main reference files…Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended

Mon Aug 10 08:54:59 2020: Done.
Mon Aug 10 08:54:59 2020: Removing bad markers / samples…
Mon Aug 10 08:54:59 2020: Done.
Mon Aug 10 08:54:59 2020: Writing samples as markers’ FASTA files…
Mon Aug 10 08:55:00 2020: Done.
Mon Aug 10 08:55:00 2020: Writing filtered clade markers as FASTA file…
Mon Aug 10 08:55:00 2020: Done.
Mon Aug 10 08:55:00 2020: Calculating polymorphic rates…
Mon Aug 10 08:55:00 2020: Done.
Mon Aug 10 08:55:00 2020: Executing PhyloPhlAn 3.0…
Mon Aug 10 08:55:00 2020: Creating PhyloPhlAn 3.0 database…
Traceback (most recent call last):
  File "/usr/local/metaphlan/3.0/bin/strainphlan", line 11, in <module>
    load_entry_point('MetaPhlAn==3.0.0a1', 'console_scripts', 'strainphlan')()
  File "/usr/local/metaphlan/3.0/lib/python3.7/site-packages/metaphlan/strainphlan.py", line 830, in main
    args.mutation_rates, args.print_clades_only, args.nprocs)
  File "/usr/local/metaphlan/3.0/lib/python3.7/site-packages/metaphlan/strainphlan.py", line 784, in strainphlan
    mutation_rates, nprocs)
  File "/usr/local/metaphlan/3.0/lib/python3.7/site-packages/metaphlan/strainphlan.py", line 545, in compute_phylogeny
    create_phylophlan_db(tmp_dir, clade)
  File "/usr/local/metaphlan/3.0/lib/python3.7/site-packages/metaphlan/utils/external_exec.py", line 100, in create_phylophlan_db
    execute(compose_command(params, input_file=markers))
  File "/usr/local/metaphlan/3.0/lib/python3.7/site-packages/metaphlan/utils/external_exec.py", line 30, in execute
    exec_res = sb.run(cmd['command_line'], stdin=inp_f, stdout=out_f)
  File "/usr/local/python/3.7.3-system/lib/python3.7/subprocess.py", line 472, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/python/3.7.3-system/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/usr/local/python/3.7.3-system/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'phylophlan_setup_database': 'phylophlan_setup_database'

Any advice on how to fix this error? Also, what is phylophlan_setup_database?

Thank you
Erica

Hi Erica,
The problem seems to be related to the installation. It looks like StrainPhlAn is not able to find the PhyloPhlAn installation (phylophlan_setup_database is one of the PhyloPhlAn scripts). How did you install MetaPhlAn? Was it via conda?
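
One way to confirm this kind of problem is to check whether the executables are visible on the PATH of the environment that launches StrainPhlAn. The sketch below is only an illustration: it relies on Python's standard `shutil.which`, and the helper name `find_missing_tools` and the list `TOOLS_TO_CHECK` are assumptions made for this example (only the two executable names come from the thread).

```python
import shutil

# Executables mentioned in this thread; extend the list as needed.
TOOLS_TO_CHECK = ["strainphlan", "phylophlan_setup_database"]


def find_missing_tools(tools):
    """Return the names that cannot be resolved on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools(TOOLS_TO_CHECK)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

If phylophlan_setup_database is reported as missing, PhyloPhlAn is probably either not installed or not on the PATH of that environment.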

Best,
Aitor

Thanks Aitor. The admins of our HPC installed it but it was not via conda. I’ll pass this information on to them and be in touch if we need more information.

Thank you
Erica

Hey @aitor.blancomiguez,

I receive a very similar error.
MetaPhlAn version 3.0.2 (23 Jul 2020)
PhyloPhlAn version 3.0.51 (11 May 2020)
Installed via Conda on a local machine.

`$ strainphlan -s consensus_markers/*.pkl -m clade_markers/s__Micrococcus_luteus.fna -r ref_genome/M._luteus/*.fna -o output -c s__Micrococcus_luteus --nproc 12
Thu Aug 20 15:44:14 2020: Start StrainPhlAn 3.0 execution
Thu Aug 20 15:44:14 2020: Creating temporary directory…
Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Getting markers from main sample files…
Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Getting markers from main reference files…Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended

Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Removing bad markers / samples…
Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Writing samples as markers’ FASTA files…
Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Writing filtered clade markers as FASTA file…
Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Calculating polymorphic rates…
Thu Aug 20 15:44:14 2020: Done.
Thu Aug 20 15:44:14 2020: Executing PhyloPhlAn 3.0…
Thu Aug 20 15:44:14 2020: Creating PhyloPhlAn 3.0 database…
Thu Aug 20 15:44:15 2020: Done.
Thu Aug 20 15:44:15 2020: Generating PhyloPhlAn 3.0 configuration file…
Thu Aug 20 15:44:15 2020: Done.
Thu Aug 20 15:44:15 2020: Processing samples…[e] “/home/plicht/anaconda3/envs/metaphlan3_env/lib/python3.7/site-packages/phylophlan/phylophlan_configs/” folder does not exists
[e] “db_dna” database “output/tmpnt47_zbj/s__Micrococcus_luteus” (.nhr, .nin, .nog, .nsd, .nsi, .nsq) has not been created… something went wrong!

[e] An error was ocurred executing a external tool, exiting…
Thu Aug 20 15:44:15 2020: Stop StrainPhlAn 3.0 execution.`

What is the phylophlan_configs folder? Also, output/tmpnt47_zbj/s__Micrococcus_luteus does exist:

`tree output/tmpnt47_zbj

output/tmpnt47_zbj
├── blastn
│ ├── GCA_000023205.1_ASM2320v1_genomic.ndb
│ ├── GCA_000023205.1_ASM2320v1_genomic.nhr
│ ├── GCA_000023205.1_ASM2320v1_genomic.nin
│ ├── GCA_000023205.1_ASM2320v1_genomic.nog
│ ├── GCA_000023205.1_ASM2320v1_genomic.nos
│ ├── GCA_000023205.1_ASM2320v1_genomic.not
│ ├── GCA_000023205.1_ASM2320v1_genomic.nsq
│ ├── GCA_000023205.1_ASM2320v1_genomic.ntf
│ ├── GCA_000023205.1_ASM2320v1_genomic.nto
│ ├── GCA_000023205.blastn
│ ├── GCA_000180435.1_ASM18043v1_genomic.ndb
│ ├── GCA_000180435.1_ASM18043v1_genomic.nhr
│ ├── GCA_000180435.1_ASM18043v1_genomic.nin
│ ├── GCA_000180435.1_ASM18043v1_genomic.nog
│ ├── GCA_000180435.1_ASM18043v1_genomic.nos
│ ├── GCA_000180435.1_ASM18043v1_genomic.not
│ ├── GCA_000180435.1_ASM18043v1_genomic.nsq
│ ├── GCA_000180435.1_ASM18043v1_genomic.ntf
│ ├── GCA_000180435.1_ASM18043v1_genomic.nto
│ ├── GCA_000180435.blastn
│ ├── GCA_007667915.1_ASM766791v1_genomic.ndb
│ ├── GCA_007667915.1_ASM766791v1_genomic.nhr
│ ├── GCA_007667915.1_ASM766791v1_genomic.nin
│ ├── GCA_007667915.1_ASM766791v1_genomic.nog
│ ├── GCA_007667915.1_ASM766791v1_genomic.nos
│ ├── GCA_007667915.1_ASM766791v1_genomic.not
│ ├── GCA_007667915.1_ASM766791v1_genomic.nsq
│ ├── GCA_007667915.1_ASM766791v1_genomic.ntf
│ ├── GCA_007667915.1_ASM766791v1_genomic.nto
│ ├── GCA_007667915.blastn
│ ├── GCA_007677545.1_ASM767754v1_genomic.ndb
│ ├── GCA_007677545.1_ASM767754v1_genomic.nhr
│ ├── GCA_007677545.1_ASM767754v1_genomic.nin
│ ├── GCA_007677545.1_ASM767754v1_genomic.nog
│ ├── GCA_007677545.1_ASM767754v1_genomic.nos
│ ├── GCA_007677545.1_ASM767754v1_genomic.not
│ ├── GCA_007677545.1_ASM767754v1_genomic.nsq
│ ├── GCA_007677545.1_ASM767754v1_genomic.ntf
│ ├── GCA_007677545.1_ASM767754v1_genomic.nto
│ ├── GCA_007677545.blastn
│ ├── GCA_008868275.1_ASM886827v1_genomic.ndb
│ ├── GCA_008868275.1_ASM886827v1_genomic.nhr
│ ├── GCA_008868275.1_ASM886827v1_genomic.nin
│ ├── GCA_008868275.1_ASM886827v1_genomic.nog
│ ├── GCA_008868275.1_ASM886827v1_genomic.nos
│ ├── GCA_008868275.1_ASM886827v1_genomic.not
│ ├── GCA_008868275.1_ASM886827v1_genomic.nsq
│ ├── GCA_008868275.1_ASM886827v1_genomic.ntf
│ ├── GCA_008868275.1_ASM886827v1_genomic.nto
│ ├── GCA_008868275.blastn
│ ├── GCA_900475555.1_44257_B01_genomic.ndb
│ ├── GCA_900475555.1_44257_B01_genomic.nhr
│ ├── GCA_900475555.1_44257_B01_genomic.nin
│ ├── GCA_900475555.1_44257_B01_genomic.nog
│ ├── GCA_900475555.1_44257_B01_genomic.nos
│ ├── GCA_900475555.1_44257_B01_genomic.not
│ ├── GCA_900475555.1_44257_B01_genomic.nsq
│ ├── GCA_900475555.1_44257_B01_genomic.ntf
│ ├── GCA_900475555.1_44257_B01_genomic.nto
│ └── GCA_900475555.blastn
├── phylophlan.cfg
├── s__Micrococcus_luteus
│ ├── 100311807305.fna
│ ├── 117883585172.fna
│ ├── 118925746802.fna
│ ├── 130856499650.fna
│ ├── 140829515937.fna
│ ├── 142499220393.fna
│ ├── 144257367218.fna
│ ├── 1450814.fna
│ ├── 155574897606.fna
│ ├── 159693394707.fna
│ ├── 161879808796.fna
│ ├── 165619095727.fna
│ ├── 166691944550.fna
│ ├── 168997966351.fna
│ ├── 170041957681.fna
│ ├── 172190202432.fna
│ ├── 17281917954.fna
│ ├── 179686009804.fna
│ ├── 184889909743.fna
│ ├── 186038583274.fna
│ ├── 206043501212.fna
│ ├── 215814139248.fna
│ ├── 22792670732.fna
│ ├── 24912802435.fna
│ ├── 253993184868.fna
│ ├── 258455309875.fna
│ ├── 279477456326.fna
│ ├── 283619537870.fna
│ ├── 287641439144.fna
│ ├── 308031623334.fna
│ ├── 317818220331.fna
│ ├── 319965251958.fna
│ ├── 321693081143.fna
│ ├── 322470092431.fna
│ ├── 328710049789.fna
│ ├── 329807411074.fna
│ ├── 332600138613.fna
│ ├── 342386102258.fna
│ ├── 345105603408.fna
│ ├── 362553127155.fna
│ ├── 36274363970.fna
│ ├── 36433278786.fna
│ ├── 365561488618.fna
│ ├── 368721706974.fna
│ ├── 369560601334.fna
│ ├── 380876508364.fna
│ ├── 383684403428.fna
│ ├── 411787581640.fna
│ ├── 413643518992.fna
│ ├── 417127773223.fna
│ ├── 418580967373.fna
│ ├── 425836416165.fna
│ ├── 432726744544.fna
│ ├── 435527635483.fna
│ ├── 440402491456.fna
│ ├── 448637683226.fna
│ ├── 486335361984.fna
│ ├── 488448468085.fna
│ ├── 500694344442.fna
│ ├── 500728455361.fna
│ ├── 505707386726.fna
│ ├── 507283500188.fna
│ ├── 510806579005.fna
│ ├── 516456111318.fna
│ ├── 519804999297.fna
│ ├── 526458829734.fna
│ ├── 526461381005.fna
│ ├── 528251218146.fna
│ ├── 532586227565.fna
│ ├── 539640842148.fna
│ ├── 54022996605.fna
│ ├── 544131290568.fna
│ ├── 546507110265.fna
│ ├── 551356288436.fna
│ ├── 579099309444.fna
│ ├── 581829956937.fna
│ ├── 589659074295.fna
│ ├── 589832204203.fna
│ ├── 590329737684.fna
│ ├── 591861483579.fna
│ ├── 59925594844.fna
│ ├── 619795520299.fna
│ ├── 620981443556.fna
│ ├── 623129288390.fna
│ ├── 646300375546.fna
│ ├── 653569025469.fna
│ ├── 658877203961.fna
│ ├── 674159341488.fna
│ ├── 692500507142.fna
│ ├── 712229431031.fna
│ ├── 712509735998.fna
│ ├── 713288180537.fna
│ ├── 718565030564.fna
│ ├── 739493688037.fna
│ ├── 740319947268.fna
│ ├── 753546184324.fna
│ ├── 764448110202.fna
│ ├── 774083973776.fna
│ ├── 802150176742.fna
│ ├── 803784286652.fna
│ ├── 804686096753.fna
│ ├── 832722354439.fna
│ ├── 833320146036.fna
│ ├── 835702885088.fna
│ ├── 847856156681.fna
│ ├── 84941443206.fna
│ ├── 853010941524.fna
│ ├── 857654927554.fna
│ ├── 865473777867.fna
│ ├── 874689981157.fna
│ ├── 877400250846.fna
│ ├── 880909332869.fna
│ ├── 883533643458.fna
│ ├── 891278300619.fna
│ ├── 895903499819.fna
│ ├── 897488641646.fna
│ ├── 905732796509.fna
│ ├── 909078153548.fna
│ ├── 91007625845.fna
│ ├── 913075979327.fna
│ ├── 917196918466.fna
│ ├── 918888752949.fna
│ ├── 919525556619.fna
│ ├── 921856997489.fna
│ ├── 923093033556.fna
│ ├── 944720522555.fna
│ ├── 948022495235.fna
│ ├── 948244728547.fna
│ ├── 951509169675.fna
│ ├── 95339075396.fna
│ ├── 970558968244.fna
│ ├── 98008026086.fna
│ ├── s__Micrococcus_luteus.fna
│ ├── s__Micrococcus_luteus.ndb
│ ├── s__Micrococcus_luteus.nhr
│ ├── s__Micrococcus_luteus.nin
│ ├── s__Micrococcus_luteus.nog
│ ├── s__Micrococcus_luteus.nos
│ ├── s__Micrococcus_luteus.not
│ ├── s__Micrococcus_luteus.nsq
│ ├── s__Micrococcus_luteus.ntf
│ └── s__Micrococcus_luteus.nto
└── s__Micrococcus_luteus.StrainPhlAn3
    ├── GCA_000023205.1_ASM2320v1_genomic.fna
    ├── GCA_000180435.1_ASM18043v1_genomic.fna
    ├── GCA_007667915.1_ASM766791v1_genomic.fna
    ├── GCA_007677545.1_ASM767754v1_genomic.fna
    ├── GCA_008868275.1_ASM886827v1_genomic.fna
    ├── GCA_900475555.1_44257_B01_genomic.fna
    ├── sample1.fna
    └── sample2.fna`

Hi @plicht ,
The error about the phylophlan_configs folder can be fixed by creating that folder. You could run:
$ mkdir /home/plicht/anaconda3/envs/metaphlan3_env/lib/python3.7/site-packages/phylophlan/phylophlan_configs/
As for the BLAST database, the tree output suggests that the database has been created, but could you check the size of these files: output/tmpnt47_zbj/s__Micrococcus_luteus.*
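
The size check can also be scripted. The sketch below is a hedged example rather than part of StrainPhlAn: the function names `database_file_sizes` and `empty_files` are made up for illustration, and the database prefix is taken from the tree listing earlier in the thread, where the database files sit inside the s__Micrococcus_luteus subdirectory.

```python
from pathlib import Path


def database_file_sizes(db_prefix):
    """Map each file named '<db_prefix>.<ext>' to its size in bytes."""
    prefix = Path(db_prefix)
    if not prefix.parent.is_dir():
        return {}
    matches = sorted(prefix.parent.glob(prefix.name + ".*"))
    return {path.name: path.stat().st_size for path in matches if path.is_file()}


def empty_files(sizes):
    """Return the names of files whose size is zero bytes."""
    return sorted(name for name, size in sizes.items() if size == 0)


# Prefix taken from the tree listing in this thread; adjust to your own run.
sizes = database_file_sizes(
    "output/tmpnt47_zbj/s__Micrococcus_luteus/s__Micrococcus_luteus"
)
for name, size in sizes.items():
    print(f"{name}\t{size} bytes")
print("Empty files:", empty_files(sizes))
```

An empty result means that no files matched the prefix, which usually points to a wrong path rather than a broken database.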

Best,
Aitor

Hey @aitor.blancomiguez,

I reran strainphlan after creating the phylophlan_configs directory, but I still get an error:

$ strainphlan -s consensus_markers/*.pkl -m clade_markers/s__Micrococcus_luteus.fna -r ref_genome/M._luteus/*.fna -o output -c s__Micrococcus_luteus --phylophlan_mode accurate --nproc 12
Tue Aug 25 16:05:55 2020: Start StrainPhlAn 3.0 execution
Tue Aug 25 16:05:55 2020: Creating temporary directory…
Tue Aug 25 16:05:55 2020: Done.
Tue Aug 25 16:05:55 2020: Getting markers from main sample files…
Tue Aug 25 16:05:55 2020: Done.
Tue Aug 25 16:05:55 2020: Getting markers from main reference files…Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended
Warning: [blastn] Examining 5 or more matches is recommended

Tue Aug 25 16:05:59 2020: Done.
Tue Aug 25 16:05:59 2020: Removing bad markers / samples…
Tue Aug 25 16:05:59 2020: Done.
Tue Aug 25 16:05:59 2020: Writing samples as markers’ FASTA files…
Tue Aug 25 16:06:00 2020: Done.
Tue Aug 25 16:06:00 2020: Writing filtered clade markers as FASTA file…
Tue Aug 25 16:06:00 2020: Done.
Tue Aug 25 16:06:00 2020: Calculating polymorphic rates…
Tue Aug 25 16:06:00 2020: Done.
Tue Aug 25 16:06:00 2020: Executing PhyloPhlAn 3.0…
Tue Aug 25 16:06:00 2020: Creating PhyloPhlAn 3.0 database…
Tue Aug 25 16:06:00 2020: Done.
Tue Aug 25 16:06:00 2020: Generating PhyloPhlAn 3.0 configuration file…
Tue Aug 25 16:06:00 2020: Done.
Tue Aug 25 16:06:00 2020: Processing samples…[e] “db_dna” database “output/tmpud9yv0kd/s__Micrococcus_luteus” (.nhr, .nin, .nog, .nsd, .nsi, .nsq) has not been created… something went wrong!

[e] An error was ocurred executing a external tool, exiting…
Tue Aug 25 16:06:01 2020: Stop StrainPhlAn 3.0 execution.

These are the s__Micrococcus_luteus.* files from my last post (output/tmpnt47_zbj/):

s__Micrococcus_luteus.fna 135,9 kb
s__Micrococcus_luteus.ndb 49,2 kb
s__Micrococcus_luteus.nhr 9,9 kb
s__Micrococcus_luteus.nin 1,7 kb
s__Micrococcus_luteus.nog 560 bytes
s__Micrococcus_luteus.nos 5,9 kb
s__Micrococcus_luteus.not 1,6 kb
s__Micrococcus_luteus.nsq 32,3 kb
s__Micrococcus_luteus.ntf 16,4 kb
s__Micrococcus_luteus.nto 532 bytes

In this run (output/tmpud9yv0kd), the files have the same sizes.

Hi @plicht, sorry about that. PhyloPhlAn had a check that assumed makeblastdb would produce only a fixed set of files. In your case, makeblastdb created more files, with different extensions (possibly due to a different BLAST version).
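
To illustrate why a strict extension check fails here, the sketch below compares the extensions named in the error message with the ones in the tree listing from this thread. It is not the actual PhyloPhlAn code or fix: the function name `missing_extensions` is hypothetical, and treating .nhr, .nin and .nsq as a sufficient minimal set for a nucleotide database is an assumption made for this example.

```python
# Extensions the failing check expected, copied from the error message.
STRICT_EXTENSIONS = {".nhr", ".nin", ".nog", ".nsd", ".nsi", ".nsq"}

# Assumed minimal set for a nucleotide database (header, index, sequence).
CORE_EXTENSIONS = {".nhr", ".nin", ".nsq"}

# Extensions of the s__Micrococcus_luteus.* database files in the tree listing.
PRESENT_EXTENSIONS = {
    ".ndb", ".nhr", ".nin", ".nog", ".nos", ".not", ".nsq", ".ntf", ".nto",
}


def missing_extensions(required, present):
    """Return the required extensions that are absent, in sorted order."""
    return sorted(set(required) - set(present))


print(missing_extensions(STRICT_EXTENSIONS, PRESENT_EXTENSIONS))  # ['.nsd', '.nsi']
print(missing_extensions(CORE_EXTENSIONS, PRESENT_EXTENSIONS))    # []
```

Under these assumptions, the strict list reports the database as incomplete even though the core files exist, while a check limited to the core files tolerates the extra files written by a different BLAST version.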

I have updated the PhyloPhlAn code and am in the process of getting a new package onto Bioconda. As soon as that is done, @aitor.blancomiguez will update StrainPhlAn to pull the new PhyloPhlAn, and that should fix your problem.

Many thanks,
Francesco

Hi @plicht, we just uploaded a new MetaPhlAn conda package that should fix your problem. Please make sure that you install MetaPhlAn package version 3.0.3 (which should install PhyloPhlAn version 3.0.54).

Best,
Aitor

Dear authors,

I seem to have the same issue as the poster above when executing the tutorial:

strainphlan -s consensus_markers/*.pkl -m db_markers/s__Bacteroides_caccae.fna -r reference_genomes/G000273725.fna.bz2 -o output -n 8 -c s__Bacteroides_caccae --phylophlan_mode accurate --mutation_rates
Wed Feb 10 10:01:02 2021: Start StrainPhlAn 3.0 execution
Wed Feb 10 10:01:02 2021: Creating temporary directory…
Wed Feb 10 10:01:02 2021: Done.
Wed Feb 10 10:01:02 2021: Getting markers from main sample files…
Wed Feb 10 10:01:02 2021: Done.
Wed Feb 10 10:01:02 2021: Getting markers from main reference files…Warning: [blastn] Examining 5 or more matches is recommended
Wed Feb 10 10:01:05 2021: Done.
Wed Feb 10 10:01:05 2021: Removing bad markers / samples…
Wed Feb 10 10:01:05 2021: Done.
Wed Feb 10 10:01:05 2021: Writing samples as markers’ FASTA files…
Wed Feb 10 10:01:05 2021: Done.
Wed Feb 10 10:01:05 2021: Writing filtered clade markers as FASTA file…
Wed Feb 10 10:01:05 2021: Done.
Wed Feb 10 10:01:05 2021: Calculating polymorphic rates…
Wed Feb 10 10:01:06 2021: Done.
Wed Feb 10 10:01:06 2021: Executing PhyloPhlAn 3.0…
Wed Feb 10 10:01:06 2021: Creating PhyloPhlAn 3.0 database…
Wed Feb 10 10:01:06 2021: Done.
Wed Feb 10 10:01:06 2021: Generating PhyloPhlAn 3.0 configuration file…
Wed Feb 10 10:01:07 2021: Done.
Wed Feb 10 10:01:07 2021: Processing samples…[e] database “db_dna” (output/tmp/s__Bacteroides_caccae) has not been created… something went wrong!
[e] An error was ocurred executing a external tool, exiting…
Wed Feb 10 10:01:12 2021: Stop StrainPhlAn 3.0 execution.

I have the following versions installed (my MetaPhlAn is later than 3.0.3):
python 3.7.9 hffdb5ce_0_cpython conda-forge
phylophlan 3.0.2 py_0 bioconda
metaphlan 3.0.7 pyh7b7c402_0 bioconda
biopython 1.78 py37h5e8e339_1 conda-forge

Do you have any advice on the topic?

Dear authors, I managed to resolve the issue by using pip install within a venv.

(It seems that my version of conda had trouble updating to the newest version of phylophlan, even though conda list showed this was the case.)
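
When a package manager's listing and the actual behaviour disagree, it can help to ask the Python interpreter itself which versions it sees. The sketch below is only an illustration under stated assumptions: it uses `importlib.metadata`, which is in the standard library from Python 3.8 onwards (the environments in this thread use Python 3.7), the distribution name "MetaPhlAn" is taken from the traceback earlier in the thread, "PhyloPhlAn" is an assumed distribution name, and the helper names are made up for this example.

```python
import sys
from importlib import metadata  # standard library in Python 3.8 and later

# "MetaPhlAn" appears in the traceback in this thread;
# "PhyloPhlAn" is an assumed distribution name.
DISTRIBUTIONS = ["MetaPhlAn", "PhyloPhlAn"]


def installed_version(dist_name):
    """Return the version string the running interpreter sees, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_report(dist_names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


print("Interpreter:", sys.executable)
for name, version in version_report(DISTRIBUTIONS).items():
    print(name, "->", version if version is not None else "not installed")
```

Run it with the same interpreter that runs strainphlan, then compare the printed versions and interpreter path with what conda list reports for that environment.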