Problem running StrainPhlan

Dear Help

I am running StrainPhlAn for the first time, following the tutorial and manual. The first steps succeeded, but the run fails when I call strainphlan to build the multiple sequence alignment and the phylogenetic tree, at the point where it executes PhyloPhlAn.

I have reinstalled StrainPhlAn and am now using the latest version. Before that, I tried the version I installed in May 2020. The older version completed this step but couldn’t remove the tmp files, so I decided to update StrainPhlAn and try again, but the newer version doesn’t get through this step.

Here is my command (I haven’t included any reference genome for Cutibacterium acnes yet):

strainphlan -s sample_markers/all/*.pkl -m strains_db_markers/s__Cutibacterium_acnes.fna -o strains_tree_Cacnes -n 4 -c s__Cutibacterium_acnes --marker_in_n_samples 40

See below the error message I got:

Tue Apr 6 16:30:11 2021: Start StrainPhlAn 3.0 execution
Tue Apr 6 16:30:11 2021: Creating temporary directory…
Tue Apr 6 16:30:11 2021: Done.
Tue Apr 6 16:30:11 2021: Getting markers from main sample files…
Tue Apr 6 16:30:14 2021: Done.
Tue Apr 6 16:30:14 2021: Getting markers from main reference files…
Tue Apr 6 16:30:15 2021: Done.
Tue Apr 6 16:30:15 2021: Removing bad markers / samples…
Tue Apr 6 16:30:15 2021: Done.
Tue Apr 6 16:30:15 2021: Writing samples as markers’ FASTA files…
Tue Apr 6 16:30:15 2021: Done.
Tue Apr 6 16:30:15 2021: Writing filtered clade markers as FASTA file…
Tue Apr 6 16:30:15 2021: Done.
Tue Apr 6 16:30:15 2021: Calculating polymorphic rates…
Tue Apr 6 16:30:16 2021: Done.
Tue Apr 6 16:30:16 2021: Executing PhyloPhlAn 3.0…
Tue Apr 6 16:30:16 2021: Creating PhyloPhlAn 3.0 database…
Tue Apr 6 16:30:16 2021: Done.
Tue Apr 6 16:30:16 2021: Generating PhyloPhlAn 3.0 configuration file…
Tue Apr 6 16:30:16 2021: Done.
Tue Apr 6 16:30:16 2021: Processing samples…
[e] expected str, bytes or os.PathLike object, not NoneType

[e] gene_markers_selection crashed

[e] An error was ocurred executing a external tool, exiting…
Tue Apr 6 16:30:50 2021: Stop StrainPhlAn 3.0 execution.

Big thanks if you can help me!
Br, Hanna Sinkko

Hi @sinkko
Thanks for getting in touch. Could you provide some additional info so I can better address the issue?

  • Which installation method did you use? conda?
  • MetaPhlAn version: you can run metaphlan --version
  • PhyloPhlAn version: you can run phylophlan --version
  • The content of the temporary folder created in the output directory
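
The items above could be collected in one go with a small script like the following. This is only a sketch: it assumes the conda environment is already active and that the output directory is strains_tree_Cacnes, as given to -o in the original command. The helper name report_version is made up for illustration.

```shell
# Print a tool's version, or a short note when the tool is not on PATH.
# (report_version is a hypothetical helper, not part of MetaPhlAn.)
report_version() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" --version 2>&1 || echo "$1: --version returned an error"
    else
        echo "$1: not found on PATH"
    fi
}

# Output directory taken from the -o option of the strainphlan command.
out_dir="strains_tree_Cacnes"

report_version metaphlan
report_version phylophlan

# Recursively list the temporary folder, if it exists.
if [ -d "$out_dir/tmp" ]; then
    ls -lR "$out_dir/tmp"
else
    echo "$out_dir/tmp: no such directory"
fi
```

The output can be pasted into a reply as-is.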

Thanks,
Aitor

Hi Aitor, and thanks for the help.

I installed with conda with the following commands:

module load bioconda/2
conda create -n metaphlan
conda activate metaphlan
conda install -c bioconda metaphlan

I’m running MetaPhlAn version 3.0.7 (09 Dec 2020) and it is working just fine.

The PhyloPhlAn version is older: 3.0.51 (11 May 2020)

Maybe that is the problem? I thought I had also updated PhyloPhlAn, but it seems the installed version is older than the newest, 3.0.2.

The output directory includes these:

24K -rw-rw----. 1 sinkko project_2001318 24K Apr 8 17:52 s__Cutibacterium_acnes.polymorphic
4.0K drwxrws---. 5 sinkko project_2001318 4.0K Apr 8 17:52 tmp

And the tmp directory includes these:
4.0K drwxrws---. 2 sinkko project_2001318 4.0K Apr 8 17:52 blastn
12K drwxrws---. 2 sinkko project_2001318 12K Apr 8 17:52 s__Cutibacterium_acnes
4.0K drwxrws---. 2 sinkko project_2001318 4.0K Apr 8 17:52 s__Cutibacterium_acnes.StrainPhlAn3

The blastn folder is empty, but the two s__Cutibacterium_acnes folders contain several .fna files.

Hi @sinkko
Yes, the old PhyloPhlAn version could well be the problem. We have had reports of problems with the BLAST execution in older versions (which seems to be the stage where your run got stuck), and conda sometimes fails to retrieve the latest version when installing metaphlan. I would try installing the latest version of PhyloPhlAn with conda install -c bioconda phylophlan and then running strainphlan again. Let me know if this fixes your issue.
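
As a rough sketch, the update and a follow-up version check could look like this. It assumes the metaphlan conda environment is already active. The run wrapper and the DRY_RUN variable are made up for illustration: by default the script only prints the commands, and DRY_RUN=0 actually executes them.

```shell
# Print each command; execute it only when DRY_RUN=0 (default is a dry run).
# (run and DRY_RUN are hypothetical names used for this sketch.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Update PhyloPhlAn from bioconda, then confirm which version is installed.
run conda install -c bioconda phylophlan
run phylophlan --version
```

If the reported version is still the old one after the update, that would suggest conda did not pick up the newer package.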

Thanks

Hi Aitor,

That helped, thanks! My next question: as far as I understand, reference genomes are used in the MetaPhlAn database. Can I use those for strainphlan? I haven’t added a reference genome yet, but I would like to; I just have a hard time finding them. So where can I find them? Or should I retrieve a reference genome from a public database?

Br, Hanna

Hi @sinkko
To build the MetaPhlAn markers database, we downloaded all reference genomes available through UniProt Proteomes and linked to the public DDBJ, ENA, and GenBank repositories. However, since they are already available in those public repositories, we didn’t make them available for download from our servers. I would suggest downloading them directly from GenBank, e.g.: Cutibacterium acnes - Assembly - NCBI
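
Once the genomes are downloaded, they can be passed to strainphlan with its -r (references) option. The sketch below only prints the command from the first post, adding -r when reference files are present. The reference_genomes folder, the uncompressed .fna extension, and the build_strainphlan_cmd helper are assumptions for illustration; strainphlan --help lists the accepted options and formats.

```shell
# Hypothetical folder holding reference genomes downloaded from GenBank.
ref_dir="reference_genomes"

# Print the strainphlan command from the original post, adding -r only
# when at least one .fna file exists in the given folder.
build_strainphlan_cmd() {
    dir="$1"
    cmd="strainphlan -s sample_markers/all/*.pkl -m strains_db_markers/s__Cutibacterium_acnes.fna -o strains_tree_Cacnes -n 4 -c s__Cutibacterium_acnes --marker_in_n_samples 40"
    for f in "$dir"/*.fna; do
        if [ -e "$f" ]; then
            # The glob stays literal here; the shell expands it when
            # the printed command is actually run.
            cmd="$cmd -r $dir/*.fna"
            break
        fi
    done
    echo "$cmd"
}

build_strainphlan_cmd "$ref_dir"
```

The printed line can be reviewed and then pasted into the terminal.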

Best,
Aitor