Processing samples... error during StrainPhlAn 4 run

Question: I am getting the following error when running StrainPhlAn 4. The problem occurs in step 5, "build the multiple sequence alignment and the phylogenetic tree". Where could the problem be? What should I look for?

Database: I’m using sample data(

Command: strainphlan -s consensus_markers/*.pkl -m db_markers/t__SGB1877.fna -r reference_genomes/G000273725.fna.bz2 -o output -n 8 -c t__SGB1877 --sample_with_n_markers 0 --marker_in_n_samples 0

Mon Apr 17 16:04:33 2023: Start StrainPhlAn 4.0.6 execution
Mon Apr 17 16:04:33 2023: Creating temporary directory…
Mon Apr 17 16:04:33 2023: Done.
Mon Apr 17 16:04:33 2023: Filtering markers and samples…
Mon Apr 17 16:04:33 2023: Getting markers from main samples…
Mon Apr 17 16:04:33 2023: Done.
Mon Apr 17 16:04:33 2023: Getting markers from main references…
Warning: [blastn] Examining 5 or more matches is recommended
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Removing bad markers / samples…
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Getting markers from secondary samples and references…
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Writing samples as markers’ FASTA files…
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Writing filtered clade markers as FASTA file…
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Calculating polymorphic rates…
Mon Apr 17 16:04:34 2023: Done.
Mon Apr 17 16:04:34 2023: Executing PhyloPhlAn…
Mon Apr 17 16:04:34 2023: Creating PhyloPhlAn database…
Mon Apr 17 16:04:35 2023: Done.
Mon Apr 17 16:04:35 2023: Generating PhyloPhlAn configuration file…
Mon Apr 17 16:04:35 2023: Done.
Mon Apr 17 16:04:35 2023: Processing samples…

[e] Command '['/bin/mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/tmpom0svapv/markers/848025373357.fna']' returned non-zero exit status 1.

[e] error while aligning
command_line: /bin/mafft --quiet --anysymbol --thread 1 --auto output/tmpom0svapv/markers/848025373357.fna
stdin: None
stdout: /StrainPhlAn_test/output/tmpom0svapv/msas/848025373357.aln
command_line: /bin/mafft --quiet --anysymbol --thread 1 --auto output/tmpom0svapv/markers/848025373357.fna
stdin: None
stdout: /StrainPhlAn_test/output/tmpom0svapv/msas/848025373357.aln

[e] error while aligning

[e] error while aligning
{'program_name': '/bin/mafft', 'params': '--quiet --anysymbol --thread 1 --auto', 'version': '--version', 'command_line': '#program_name# #params# #input# > #output#', 'environment': 'TMPDIR=/tmp'}

[e] msas crashed

[e] msas crashed
Mon Apr 17 16:04:37 2023: [Error] An error was ocurred executing a external tool, exiting…
Mon Apr 17 16:04:37 2023: Stop StrainPhlAn execution.


I see two problems with the analysis:

  1. The tutorial read files were filtered, to speed things up, so that they contain only reads mapping against the Jan21 markers. In your case you are running them against Oct22. Because the markers can change between database versions, this can produce slightly different results or may not work at all.
  2. The --sample_with_n_markers and --marker_in_n_samples parameters are both set to 0, so empty markers and empty samples are added to the MSA. That might be what produces the MAFFT error you are seeing.
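
If you want to check the second point yourself, one option is to look inside the marker FASTA files that MAFFT was given (for example output/tmpom0svapv/markers/848025373357.fna from your log) and see whether any record has no usable sequence. Below is a minimal sketch in Python, not part of StrainPhlAn. It assumes the temporary directory still exists after the failed run, that the files are plain FASTA, and that "N" and "-" are the filler characters to ignore. The function names are made up for illustration.

```python
from pathlib import Path


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def uninformative_records(text, filler="Nn-"):
    """Return headers whose sequence is empty or consists only of filler characters.

    The filler set (N and gap) is an assumption, adjust it if needed.
    """
    return [h for h, seq in parse_fasta(text) if not seq.strip(filler)]


def scan_marker_dir(marker_dir):
    """Map each .fna file in marker_dir to its uninformative record headers."""
    report = {}
    for path in sorted(Path(marker_dir).glob("*.fna")):
        bad = uninformative_records(path.read_text())
        if bad:
            report[path.name] = bad
    return report


if __name__ == "__main__":
    import sys

    # Usage: python check_markers.py output/tmpom0svapv/markers
    for name, headers in scan_marker_dir(sys.argv[1]).items():
        print(f"{name}: {len(headers)} empty or filler-only record(s)")
        for h in headers:
            print(f"  {h}")
```

If this lists records for the marker file named in the error, that would be consistent with empty samples or markers reaching the alignment step. Running the mafft command from your log by hand without --quiet may also show the underlying message.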