Error occurring at PhyloPhlAn execution while running StrainPhlAn 4

Greetings!

I am using StrainPhlAn version 4.0.6 (1 Mar 2023) for strain diversity profiling. I am following the tutorial (StrainPhlAn 4 · biobakery/MetaPhlAn Wiki · GitHub) and have completed it through step 4, but an error occurs at step 5 and I have been stuck on it for a long time. The error is as follows:

(metaphlan) u01@ngsa:~/Preprocess/Taxonomy_2$ bash metaphlan_step4.sh
Sun Jan 7 16:39:15 2024: Start StrainPhlAn 4.0.6 execution
Sun Jan 7 16:39:15 2024: Creating temporary directory…
Sun Jan 7 16:39:15 2024: Done.
Sun Jan 7 16:39:15 2024: Filtering markers and samples…
Sun Jan 7 16:39:15 2024: Getting markers from main samples…
Sun Jan 7 16:39:15 2024: Done.
Sun Jan 7 16:39:15 2024: Getting markers from main references…
Warning: [blastn] Examining 5 or more matches is recommended
Sun Jan 7 16:39:16 2024: Done.
Sun Jan 7 16:39:16 2024: Removing bad markers / samples…
Sun Jan 7 16:39:16 2024: Done.
Sun Jan 7 16:39:16 2024: Getting markers from secondary samples and references…
Sun Jan 7 16:39:16 2024: Done.
Sun Jan 7 16:39:16 2024: Done.
Sun Jan 7 16:39:16 2024: Writing samples as markers’ FASTA files…
Sun Jan 7 16:39:17 2024: Done.
Sun Jan 7 16:39:17 2024: Writing filtered clade markers as FASTA file…
Sun Jan 7 16:39:17 2024: Done.
Sun Jan 7 16:39:17 2024: Calculating polymorphic rates…
Sun Jan 7 16:39:17 2024: Done.
Sun Jan 7 16:39:17 2024: Executing PhyloPhlAn…
Sun Jan 7 16:39:17 2024: Creating PhyloPhlAn database…
Sun Jan 7 16:39:17 2024: Done.
Sun Jan 7 16:39:17 2024: Generating PhyloPhlAn configuration file…
Sun Jan 7 16:39:18 2024: Done.
Sun Jan 7 16:39:18 2024: Processing samples…

[e] Command '['/home/u01/mambaforge/envs/metaphlan/bin/mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/tmpw6i4h70o/markers/649910227576.fna']' returned non-zero exit status 1.

[e] error while aligning
command_line: /home/u01/mambaforge/envs/metaphlan/bin/mafft --quiet --anysymbol --thread 1 --auto output/tmpw6i4h70o/markers/649910227576.fna
stdin: None
stdout: /home/u01/Preprocess/Taxonomy_2/output/tmpw6i4h70o/msas/649910227576.aln
env: {‘LESSOPEN’: ‘| /usr/bin/lesspipe %s’, ‘CONDA_PROMPT_MODIFIER’: '(metaphlan) ', ‘MAIL’: ‘/var/mail/u01’, ‘USER’: ‘u01’, ‘SSH_CLIENT’: ‘10.7.41.3 59826 22’, ‘LC_TIME’: ‘ur_PK’, ‘SHLVL’: ‘2’, ‘CONDA_SHLVL’: ‘2’, ‘OLDPWD’: ‘/home/u01/Preprocess/Taxonomy_2/reference_genomes’, ‘HOME’: ‘/home/u01’, ‘SSH_TTY’: ‘/dev/pts/2’, ‘LC_MONETARY’: ‘ur_PK’, ‘DBUS_SESSION_BUS_ADDRESS’: ‘unix:path=/run/user/1001/bus’, ‘CE_M’: ‘’, ‘LOGNAME’: ‘u01’, '’: ‘/home/u01/mambaforge/envs/metaphlan/bin/strainphlan’, ‘XDG_SESSION_ID’: ‘5978’, ‘TERM’: ‘xterm’, ‘_CE_CONDA’: ‘’, ‘PATH’: ‘/home/u01/mambaforge/envs/metaphlan/bin:/home/u01/mambaforge/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin’, ‘LC_ADDRESS’: ‘ur_PK’, ‘XDG_RUNTIME_DIR’: ‘/run/user/1001’, ‘DISPLAY’: ‘localhost:11.0’, ‘LANG’: ‘en_US.UTF-8’, ‘CONDA_PREFIX_1’: ‘/home/u01/mambaforge’, ‘LC_TELEPHONE’: ‘ur_PK’, ‘LS_COLORS’: ‘rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.zst=01;31:.tzst=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.wim=01;31:.swm=01;31:.dwm=01;31:.esd=01;31:.jpg=01;35:.jpeg=01;35:.mjpg=01;35:.mjpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01
;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:’, ‘CONDA_PYTHON_EXE’: ‘/home/u01/mambaforge/bin/python’, ‘LC_NAME’: ‘ur_PK’, ‘SHELL’: ‘/bin/bash’, ‘LESSCLOSE’: ‘/usr/bin/lesspipe %s %s’, ‘CONDA_DEFAULT_ENV’: ‘metaphlan’, ‘LC_MEASUREMENT’: ‘ur_PK’, ‘LC_IDENTIFICATION’: ‘ur_PK’, ‘PWD’: ‘/home/u01/Preprocess/Taxonomy_2’, ‘CONDA_EXE’: ‘/home/u01/mambaforge/bin/conda’, ‘SSH_CONNECTION’: ‘10.7.41.3 59826 10.19.10.58 22’, ‘XDG_DATA_DIRS’: ‘/usr/local/share:/usr/share:/var/lib/snapd/desktop’, ‘LC_NUMERIC’: ‘ur_PK’, ‘LC_PAPER’: ‘ur_PK’, ‘CONDA_PREFIX’: ‘/home/u01/mambaforge/envs/metaphlan’, ‘TMPDIR’: ‘/tmp’}

[e] Command '['/home/u01/mambaforge/envs/metaphlan/bin/mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/tmpw6i4h70o/markers/649910227576.fna']' returned non-zero exit status 1.

[e] error while aligning
{'program_name': '/home/u01/mambaforge/envs/metaphlan/bin/mafft', 'params': '--quiet --anysymbol --thread 1 --auto', 'version': '--version', 'command_line': '#program_name# #params# #input# > #output#', 'environment': 'TMPDIR=/tmp'}
output/tmpw6i4h70o/markers/649910227576.fna
/home/u01/Preprocess/Taxonomy_2/output/tmpw6i4h70o/msas
649910227576.aln

[e] Command '['/home/u01/mambaforge/envs/metaphlan/bin/mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/tmpw6i4h70o/markers/649910227576.fna']' returned non-zero exit status 1.

[e] msas crashed
Sun Jan 7 16:39:22 2024: [Error] An error was ocurred executing a external tool, exiting…
Sun Jan 7 16:39:22 2024: Stop StrainPhlAn execution.

The script which is being run for this step is:

mkdir -p output
strainphlan \
  -s consensus_markers/*.pkl \
  -m db_markers/t__SGB1934.fna \
  -r reference_genomes/parabacteroides.fna.bz2 \
  -d ~/Databases/mp21 \
  -o output \
  -n 4 \
  -c t__SGB1934 \
  --mutation_rates \
  --sample_with_n_markers 0 \
  --marker_in_n_samples 0
The output I get after running this script includes:

(metaphlan) u01@ngsa:~/Preprocess/Taxonomy_2/output$ ls
tmpw6i4h70o t__SGB1934.polymorphic
(metaphlan) u01@ngsa:~/Preprocess/Taxonomy_2/output$ cd tmpw6i4h70o/
(metaphlan) u01@ngsa:~/Preprocess/Taxonomy_2/output/tmpw6i4h70o$ ls
blastn map_dna markers_dna phylophlan.cfg t__SGB1934.fna
clean_dna markers msas t__SGB1934 t__SGB1934.StrainPhlAn4

I have tried the latest database, vOct22, and the same error occurs there too. I then also tried the vJan21 database, which produced the error shown above. Also, please let me know how to obtain the reference genome for this step: I downloaded the reference genome for Parabacteroides distasonis from NCBI, as this species is abundant in my samples, with clade t__SGB1934.

One more thing: when I ran my samples with the SGB (t__SGB1877.fna) and reference genome (G000273725.fna.bz2) used in the StrainPhlAn 4 tutorial (linked above), it ran smoothly, but when I run it with my own extracted SGBs and their reference genomes, it gives the error.

Kindly help me solve this error as soon as possible so that I can complete my work on time.

Thank you!
Bioinformatics Researcher
M. Faheem R.

Hello Faheem,

Could you try running the command that crashed, with the --quiet argument removed?

/home/u01/mambaforge/envs/metaphlan/bin/mafft --anysymbol --thread 1 --auto output/tmpw6i4h70o/markers/649910227576.fna
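Alongside rerunning the command, it may help to look at the marker file that mafft was given. The following is only a rough sketch, assuming the temporary file from the log still exists; fasta_summary is a hypothetical helper written for this post, not part of StrainPhlAn, PhyloPhlAn or mafft. It counts the FASTA records in a file and how many of them contain no sequence. The cause of the crash is not established here, so this only makes the input easier to inspect.

```shell
# Hypothetical helper (not part of any bioBakery tool):
# print the number of FASTA records in a file and how many have no sequence.
fasta_summary() {
    awk '
        # Header line: close the previous record, then start a new one.
        /^>/ { if (n > 0 && len == 0) empty++; n++; len = 0; next }
        # Sequence line: strip whitespace and add its length to the current record.
             { gsub(/[ \t\r]/, ""); len += length($0) }
        # End of file: close the last record and report.
        END  { if (n > 0 && len == 0) empty++
               printf "records=%d empty=%d\n", n, empty }
    ' "$1"
}
```

For the file from the log above, this would be called as: fasta_summary output/tmpw6i4h70o/markers/649910227576.fna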

Alternatively, you could consider upgrading to StrainPhlAn 4.1, where the mafft step is skipped altogether and any errors should be more transparent.

Best
Michal

Dear @Michal_Puncochar, I was wondering whether StrainPhlAn 4.1 is compatible with the June version of the database?