Do I have to subsample before running MetaPhlAn3?

Pluto · July 3, 2021, 4:46pm

Hello, so I am new to this and was wondering if I have to subsample before running MetaPhlAn3?

From what I understand, I do not have to subsample before running HUMAnN 3, as it has a built-in utility script (humann_renorm_table) that renormalizes the output to either “relab” (relative abundance) or “cpm” (copies per million).
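To illustrate why renormalizing removes the raw depth effect, here is a minimal Python sketch of the idea. It is not HUMAnN's implementation: the function name `renormalize` and the toy taxa are made up for the example, and it only assumes that "relab" means values summing to 1 and "cpm" means values summing to one million.

```python
def renormalize(counts, units="relab"):
    """Scale raw abundances so they sum to 1 (relab) or 1e6 (cpm).

    Hypothetical helper for illustration only.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize an all-zero sample")
    scale = {"relab": 1.0, "cpm": 1_000_000.0}[units]
    return {name: value / total * scale for name, value in counts.items()}


# Two toy samples with the same composition but a 100x difference in depth.
shallow = {"taxon_A": 70, "taxon_B": 30}
deep = {"taxon_A": 7000, "taxon_B": 3000}

# After normalization the two profiles are the same.
print(renormalize(shallow) == renormalize(deep))
```

Normalizing like this makes totals comparable, but it cannot recover taxa that a shallow sample never sequenced at all.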

I would like to run MetaPhlAn3 and plot a graph comparing the taxonomic profile between different groups.

Many thanks

alexis_saldivar · July 3, 2021, 8:20pm

I always use my sequences with just the minimal quality filters, no subsampling.

Pluto · July 4, 2021, 1:19pm

Thank you for getting back to me.

How do you account for sequencing depth? Won’t the samples with a high depth just have more microbes of interest?

alexis_saldivar · July 5, 2021, 10:41pm

Not exactly. The diversity of your samples is independent of sequencing depth; however, sequencing depth can bias which sequences you recover, especially if you are interested in microdiversity (those microbes that are at <1% relative abundance).

Let’s take the case where you have a sample with “regular” diversity in which the microbes have similar abundances. In this case, your sequencing depth won’t affect the diversity you detect because all markers will be sequenced approximately equally.

In another case, you could have a sample with low diversity in which one microbe has a really high rel. abundance (>70% or similar). If you used a “normal” sequencing depth, you might not be able to detect the very low-abundance microbes, simply because the chances of sequencing their gene markers are low given their low rel. abundance.
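As a rough way to see how depth affects the chance of picking up a rare microbe, here is a small Python sketch. It assumes every read is drawn independently and that a fixed fraction of a taxon's reads land on its marker genes; `marker_fraction` and the numbers used are hypothetical, not values from MetaPhlAn.

```python
import math


def detection_probability(rel_abundance, marker_fraction, depth):
    """Chance that at least one read hits the taxon's marker genes.

    rel_abundance   -- fraction of reads coming from the taxon (0..1)
    marker_fraction -- assumed fraction of that taxon's reads that fall on
                       marker genes (hypothetical value for illustration)
    depth           -- total number of reads sequenced
    """
    per_read_hit = rel_abundance * marker_fraction
    if per_read_hit >= 1.0:
        return 1.0
    # 1 - (1 - p)^depth, computed via logs to stay stable for tiny p.
    return 1.0 - math.exp(depth * math.log1p(-per_read_hit))


# Compare a rare taxon (0.1% of reads) at increasing depths.
for depth in (10_000, 100_000, 1_000_000, 10_000_000):
    p = detection_probability(0.001, 0.01, depth)
    print(f"{depth:>10} reads: P(detect) = {p:.3f}")
```

A real profiler needs more than one marker read before it reports a taxon, so these values are optimistic; the point is only that the probability rises with depth.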

Now, if you increase the sequencing depth of that sample in order to better detect those low-abundance microbes, you might find that some of the gene markers for the high-abundance microbes were sequenced twice, artificially increasing the rel. abundance of that microbe. In this case subsampling your sequences could be of use; however, this is hard to detect beforehand unless you have previous information on your samples (e.g. amplicon sequencing, or high duplication ratios during your quality analysis). This workshop video has some info on this around minute 11.
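If you do decide to subsample, one common approach is to draw the same number of reads at random from every sample. Below is a minimal Python sketch using reservoir sampling on 4-line FASTQ records. The helper names (`read_fastq`, `subsample`) are made up for this example, and it assumes uncompressed, unwrapped single-end FASTQ; for paired-end data you would need to keep mates together, for example by using the same seed on both files.

```python
import random
from itertools import islice


def read_fastq(lines):
    """Yield FASTQ records as 4-tuples of lines (header, seq, '+', qual)."""
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated last record)
            return
        yield record


def subsample(records, n, seed=0):
    """Return n records chosen uniformly at random (reservoir sampling).

    Reads the input once and keeps only n records in memory.
    If the input has fewer than n records, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


# Toy input: 10 fake reads built in memory instead of a real file.
toy = []
for k in range(10):
    toy += [f"@read{k}\n", "ACGT\n", "+\n", "IIII\n"]

picked = subsample(read_fastq(toy), n=3, seed=42)
print(len(picked))
```

With a real file you would pass an open file handle to `read_fastq` instead of the toy list and write the selected records back out.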

Valter · July 21, 2021, 12:58am

Hi Alexis and Pluto,

I have downloaded some SRR files with high depth (from this work: Impact of sequencing depth on the characterization of the microbiome and resistome | Scientific Reports) in order to understand the differences between high-depth sequencing files and my “normal” files. I used the same workflow and command lines, and it turns out that MetaPhlAn 3, for some reason, didn’t work with the high-depth SRR files. I get neither errors nor results.

Alexis, do you think applying the minimal quality filters could make the SRR files processable by MetaPhlAn 3? If yes, any idea how I can do it?

Kind regards,
Valter
