Over 80% of reads were unclassified in mouse fecal samples using MetaPhlAn3

Fangxi_Xu · January 18, 2023, 6:10pm

Hi, we did shotgun metagenomic sequencing on mouse fecal samples and 2 mock community control samples, each a mixture of 20 known strains. We did quality control using KneadData v0.10.0, and over 85% of reads were retained after QC for each sample (we did trimming and decontamination to remove mouse host sequences).

Then we did taxonomic profiling using MetaPhlAn version 3.0.14 (19 Jan 2022).

We added the --unknown_estimation option when running MetaPhlAn to check the percentage of reads classified. However, all the mouse fecal samples had over 80% unknown reads, and the classified reads were assigned 100% to Bacteria. The mock community control samples had only 2% unknown reads.

Is this normal when profiling the mouse gut metagenome with MetaPhlAn? We sequenced these samples quite deeply, and we were surprised to see that the majority of reads were unclassified.

Thank you so much for your help!

Best,
Fangxi

Scott · January 26, 2023, 8:28pm

I am curious about this as well.

Could it be that the mouse gut is not well represented in the database? Is there any plan to do for the mouse microbiome what was done for the human microbiome in Pasolli et al. (http://segatalab.cibio.unitn.it/data/Pasolli_et_al.html; "Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle", Cell)?
Cheers!

aitor.blancomiguez · January 31, 2023, 8:53am

Hi @Fangxi_Xu and @Scott
Indeed, the mouse gut microbiome is underrepresented in MetaPhlAn 3, as most of the species present have not been isolated so far and thus are not in the reference genome databases. I suggest moving to version 4 (https://www.biorxiv.org/content/10.1101/2022.08.22.504593v1), in which we included information from metagenome-assembled genomes to improve the mappability of poorly characterized environments such as the mouse gut.

Fangxi_Xu · January 31, 2023, 3:50pm

Thank you! @aitor.blancomiguez

We tried MetaPhlAn version 4.0.4 (17 Jan 2023) and it classified reads a lot better. Only 3% to 20% of reads are unclassified now.
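
For anyone wanting to make the same before/after comparison across many samples, here is a minimal Python sketch that pulls the unknown percentage out of a profile. It rests on assumptions not stated in this thread: that the profile is tab-separated with the clade name in the first column and the relative abundance in the third, and that the unknown row is labelled `UNKNOWN` (MetaPhlAn 3) or `UNCLASSIFIED` (MetaPhlAn 4). The function name and the example numbers are made up for illustration, so please check them against your own output files.

```python
from typing import Optional

# Assumed labels for the unknown row; verify against your own profiles.
UNKNOWN_LABELS = {"UNKNOWN", "UNCLASSIFIED"}


def unknown_fraction(profile_text: str) -> Optional[float]:
    """Return the relative abundance (%) of the unknown row, or None if absent.

    Assumes tab-separated columns: clade name, taxid, relative abundance, ...
    """
    for line in profile_text.splitlines():
        # Skip blank lines and '#' header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if fields[0] in UNKNOWN_LABELS and len(fields) >= 3:
            return float(fields[2])
    return None


# Made-up example profile, not real data from this thread.
example_profile = (
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species\n"
    "UNKNOWN\t-1\t82.5\t\n"
    "k__Bacteria\t2\t17.5\t\n"
)

print(unknown_fraction(example_profile))  # prints 82.5
```

A return value of `None` means no unknown row was found, which would be expected if the profile was produced without unknown estimation.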
