This applies to MetaPhlAn version 4.0.6 (1 Mar 2023), installed on Aug 3, 2023 via bioconda:
I have installed MetaPhlAn4 and checked it using the example fasta.gz data provided in your tutorial. It seems that the SRS014459-Stool.fasta.gz sequence sample does not result in profiles as shown in your examples. The other example data result in profiles as expected.
I have also checked the sequence counts of the fasta.gz files, and I assume that they are OK.
I have uploaded the profile and bowtie2out files as a tar.gz package to the bioBakery Google Drive as Example-mpa4-conda.tar.gz.
Can you see what the problem could be with the stool sample?
I also have an issue with the example files, but my case is worse. Could you reproduce your observation using the docker image biobakery/metaphlan:4.0.2? The tutorial site says “This tutorial has been updated to work with version 4.0.2”.
Running MetaPhlAn version 4.0.6 (1 Mar 2023) with mpa_vOct22 reports everything as unclassified for SRS014459-Stool.fasta.gz. The same version with mpa_vJan21 returns the exact profile shown in the tutorial.
Additionally, with the tutorial’s grep command, the sum of all order-level abundances with the mpa_vOct22 database is NOT 100%:
grep o__ SRS014476-Supragingival_plaque_profile.txt | grep -v f__ | cut -f1,3
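For anyone who wants to check the sum directly rather than by eye, here is a minimal Python sketch that mirrors the grep/cut filter above: it keeps rows whose clade name contains o__ but not f__ and adds up column 3. The function name sum_order_abundance and the toy rows are made up for illustration; they are not real MetaPhlAn output, and the column layout (clade name in column 1, relative abundance in column 3) is assumed from the cut -f1,3 call.

```python
def sum_order_abundance(lines):
    """Sum column 3 over order-level rows of a tab-separated profile.

    A row counts as order-level when its clade name (column 1) contains
    'o__' but not 'f__', which is the same filter as
    `grep o__ ... | grep -v f__`.
    """
    total = 0.0
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a full profile row
        clade = fields[0]
        if "o__" in clade and "f__" not in clade:
            total += float(fields[2])
    return total


# Hypothetical toy rows, NOT taken from any real profile.
toy_profile = [
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species",
    "k__A\t1\t100.0\t",
    "k__A|p__B|c__C|o__D\t1|2|3|4\t60.0\t",
    "k__A|p__B|c__C|o__D|f__E\t1|2|3|4|5\t60.0\t",
    "k__A|p__B|c__C|o__F\t1|2|3|6\t25.5\t",
]

print(sum_order_abundance(toy_profile))
```

On a real profile you could pass the open file object instead of the list, e.g. sum_order_abundance(open("SRS014476-Supragingival_plaque_profile.txt")).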
@chx I also find that running on vJan21 produces the same result as the example. I also see the abundance sum deviating from 100% on vOct22, which suggests something is wrong.