MetaPhlAn4 overestimates the abundance of Veillonella rogosae

Hi MetaPhlAn developers,

I am using MetaPhlAn v4.0.6 (mpa_vOct22_CHOCOPhlAnSGB_202212 database) to perform taxonomic profiling of the mock community “ZymoBIOMICS Gut Microbiome Standard D6331”.

The relative abundance I obtain for Veillonella rogosae is much higher than expected (34% vs. 14%), and I am wondering what the cause might be.

I have attached the MetaPhlAn4 output file:

Zymo_D6331.metaphlan4.txt (18.6 KB)

Let me know if you need further information, and thank you in advance for looking into this.

Best regards.

Hi @sarmol
This is a tricky scenario to assess. The difference from the expected relative abundance can be due to many factors, even just the DNA extraction protocol. Are similar results reported by other profilers? Did you try other MetaPhlAn versions?

Hi Aitor,

Thank you for replying, and sorry for the late response.

Yes, I have tried different versions of MetaPhlAn and also Kraken2/Bracken.
With MetaPhlAn2 I get 33% Veillonella unclassified, and with MetaPhlAn3 I get around 6-9% of other Veillonella species (V. dispar, V. infantium, and V. parvula). It looks like the V. rogosae sequences in NCBI are fairly recent, so perhaps this species was not included in the MetaPhlAn2 and MetaPhlAn3 databases.
With Kraken2/Bracken I get around 22% relative abundance of Veillonella rogosae, which is also more than the expected 14% but still lower than with MetaPhlAn4.
I just ran the programs on two newly sequenced Zymo samples, and for these I get ~23% and 27% V. rogosae with MetaPhlAn4, and ~17% and 19% with Kraken2/Bracken. The results might be skewed by the DNA extraction, since our own results also vary. However, I don’t see any obvious pattern, for instance higher or lower abundances for Gram-positive vs. Gram-negative species, and we get around 35-60% higher relative abundance with MetaPhlAn4 than with Kraken2. We also see some species being underestimated, so perhaps this makes the overestimation look even larger.
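
For reference, the comparison can be tabulated with a short script. This is only a sketch using the rounded V. rogosae percentages quoted in this thread. The sample labels and the pairing of the two new samples (23% with 17%, 27% with 19%) are assumptions made for illustration, since the thread does not state which values belong together.

```python
# Rounded V. rogosae relative abundances (%) as quoted in this thread.
# Sample labels and the pairing of the two new samples are assumptions.
EXPECTED = 14.0  # expected abundance quoted for the Zymo D6331 standard

samples = {
    "original": {"metaphlan4": 34.0, "kraken2_bracken": 22.0},
    "new_1": {"metaphlan4": 23.0, "kraken2_bracken": 17.0},
    "new_2": {"metaphlan4": 27.0, "kraken2_bracken": 19.0},
}


def percent_higher(value: float, reference: float) -> float:
    """Return how much higher `value` is than `reference`, in percent."""
    return (value / reference - 1.0) * 100.0


results = {}
for name, obs in samples.items():
    mpa = obs["metaphlan4"]
    kraken = obs["kraken2_bracken"]
    results[name] = {
        # MetaPhlAn4 estimate relative to the Kraken2/Bracken estimate
        "mpa_vs_kraken_pct": percent_higher(mpa, kraken),
        # Each profiler relative to the expected 14%
        "mpa_vs_expected_pct": percent_higher(mpa, EXPECTED),
        "kraken_vs_expected_pct": percent_higher(kraken, EXPECTED),
    }

for name, r in results.items():
    print(
        f"{name}: MetaPhlAn4 vs Kraken2/Bracken {r['mpa_vs_kraken_pct']:+.0f}%, "
        f"MetaPhlAn4 vs expected {r['mpa_vs_expected_pct']:+.0f}%, "
        f"Kraken2/Bracken vs expected {r['kraken_vs_expected_pct']:+.0f}%"
    )
```

Comparing each profiler against the expected value, and not only against each other, shows whether both tools deviate in the same direction for a given sample. That would point toward the sample preparation instead of a single profiler.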