We just ran MetaPhlAn 3 for the first time, on a microbiome dataset that we had previously profiled with v2.
Some samples gave completely different results. For instance, one sample was nearly all Betaproteobacteria (Burkholderiales) with v2 but entirely Bacteroidetes (Flavobacteriia) with v3. Any clue?


This could be attributed to the update and expansion of the detectable species. In v20, 122 species in the Burkholderiales order and 62 in the Flavobacteriia class were detectable. Now, in v30, the database includes markers for 520 Burkholderiales and 562 Flavobacteriia.

Thanks, but why does this mean that v2 shows the whole sample is Betaproteobacteria whereas v3 shows everything is Flavobacteriia?
Is there any quality score or P value attached to MetaPhlAn classifications?
Is it possible that MetaPhlAn 2 is better than 3 for lumpers rather than splitters?
I am less concerned with exact species than with genus or family level taxa.

The only quality score available is the score associated with each marker alignment.

That is possible. Unlike MetaPhlAn 3, version 2 also has markers for non-species clades, such as families and orders.

It may be worth having a look at both profiles. Could you upload both the bowtie2out file and the profile here?

For a quick check I'd also run a read classifier like Kraken2 or KrakenUniq.
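
To compare the two profiles at a single rank (class, family or genus) rather than at species level, something like the Python sketch below might help. It is only an illustration and makes some assumptions: that each profile is tab-separated with the pipe-delimited clade name (k__...|p__...) in the first column, and that the relative abundance is in the second column for v2 and the third column for v3. Please check this against the header of your own files. The function names and the toy input lines are made up for the example and are not real results.

```python
def read_rank(lines, rank_prefix, abundance_col):
    """Return {taxon: relative abundance} for clades whose deepest level
    starts with rank_prefix (e.g. "c__", "f__", "g__")."""
    result = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        leaf = fields[0].split("|")[-1]  # deepest level of the clade name
        if leaf.startswith(rank_prefix):
            result[leaf] = float(fields[abundance_col])
    return result


def compare(v2, v3):
    """Return (taxon, v2 abundance, v3 abundance) rows,
    largest absolute difference first."""
    taxa = set(v2) | set(v3)
    rows = [(t, v2.get(t, 0.0), v3.get(t, 0.0)) for t in taxa]
    return sorted(rows, key=lambda r: abs(r[1] - r[2]), reverse=True)


# Toy lines, invented for illustration only.
# Assumed v2 layout: clade name, relative abundance.
v2_lines = [
    "#SampleID\tMetaphlan2_Analysis",
    "k__Bacteria\t100.0",
    "k__Bacteria|p__Proteobacteria\t98.0",
    "k__Bacteria|p__Proteobacteria|c__Betaproteobacteria\t98.0",
]
# Assumed v3 layout: clade name, taxonomy IDs, relative abundance.
v3_lines = [
    "#clade_name\tNCBI_tax_id\trelative_abundance",
    "k__Bacteria\tNA\t100.0",
    "k__Bacteria|p__Bacteroidetes\tNA|NA\t100.0",
    "k__Bacteria|p__Bacteroidetes|c__Flavobacteriia\tNA|NA|NA\t100.0",
]

v2_classes = read_rank(v2_lines, "c__", abundance_col=1)
v3_classes = read_rank(v3_lines, "c__", abundance_col=2)

for taxon, a2, a3 in compare(v2_classes, v3_classes):
    print(f"{taxon}\tv2={a2:.1f}\tv3={a3:.1f}")
```

With real files you would pass the lines of each profile (for example from `open(path)`) instead of the toy lists, and change the prefix to "f__" or "g__" for family or genus level.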

Hi there,
I have a similar question. I understand the updates on classification and detectable species. However, one of my recent test runs showed a big difference in relative abundance (for example, the Proteobacteria phylum is 4% in v2 but 28% in v3). Some highly abundant genera in v2, such as Alistipes and Lachnospiraceae, are no longer in v3. I wonder if you can help with a potential explanation.


