Hello, so I am new to this and was wondering if I have to subsample before running MetaPhlAn3?
From what I understand, I do not have to subsample before running HUMAnN3, as it has a built-in utility script (humann_renorm_table) that normalizes the output to either “relab” (relative abundance) or “cpm” (copies per million).
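For anyone wondering what those two units mean in practice, here is a minimal sketch of the arithmetic. It is not the code of humann_renorm_table (the real script works on HUMAnN output tables and has more options); the function name `renorm` and the toy gene names are made up for illustration.

```python
def renorm(abundances, units="cpm"):
    """Rescale raw abundances so they sum to 1 ("relab") or 1,000,000 ("cpm")."""
    scale = {"relab": 1.0, "cpm": 1_000_000.0}[units]
    total = sum(abundances.values())
    if total == 0:
        raise ValueError("cannot normalize a table whose values sum to zero")
    return {feature: value / total * scale for feature, value in abundances.items()}


# Hypothetical raw abundances for a single sample.
sample = {"geneA": 30.0, "geneB": 10.0, "geneC": 60.0}
relab = renorm(sample, "relab")
cpm = renorm(sample, "cpm")
```

Either way, each value is divided by the sample total, which is why samples with different depths become comparable without subsampling.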
I would like to run MetaPhlAn3 and plot a graph comparing the taxonomic profile between different groups.
Not exactly. The diversity of your samples is independent of sequencing depth; however, sequencing depth can bias which sequences you recover, especially if you are interested in microdiversity (those microbes at <1% relative abundance).
Let’s take the case where you have a sample with “regular” diversity in which the microbes have similar abundance. In this case, your sequencing depth won’t affect the diversity you detect because all markers will be sequenced approximately equally.
In another case, you could have a sample with low diversity in which one microbe has a really high rel. abundance (>70% or similar). If you used a “normal” sequencing depth, you might not be able to detect the very low abundance microbes, simply because the chances of sequencing their gene markers are low given their low r.a.
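To put rough numbers on that intuition, here is a small sketch. It assumes a deliberately simplified model in which every read is an independent draw and `per_read_prob` is the chance that a single read lands on a marker gene of the rare microbe. The value used below is a made-up placeholder, not a measured one, and this says nothing about MetaPhlAn's own detection thresholds.

```python
import math


def p_detect(per_read_prob, n_reads):
    """Probability that at least one of n_reads hits the marker: 1 - (1 - p)^n."""
    # log1p/expm1 keep the result accurate when p is tiny and n is huge.
    return -math.expm1(n_reads * math.log1p(-per_read_prob))


# Hypothetical per-read probability for a rare microbe's markers.
per_read_prob = 1e-7
depths = [1_000_000, 10_000_000, 100_000_000]
detection = {depth: p_detect(per_read_prob, depth) for depth in depths}
```

Under this model the chance of seeing the rare microbe at all climbs steeply with depth, which is the reason deeper sequencing helps with microdiversity.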
Now, if you increase the sequencing depth of that sample in order to better detect those low abundance microbes, you might find that some of the gene markers for the high abundance microbes were sequenced twice, artificially increasing the rel. abundance of that microbe. In this case subsampling your sequences could be of use; however, this is hard to detect beforehand unless you have previous information on your samples (e.g. amplicon sequencing, or high duplication ratios during your quality analysis). This workshop video has some info on this around minute 11.
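If you do decide to subsample, the idea can be sketched with reservoir sampling, which picks a fixed number of reads uniformly at random in a single pass without loading the whole file. This is only an illustration: the helper names `read_fastq` and `subsample` are made up, the parser assumes plain 4-line FASTQ records, and the toy reads are invented. In practice a dedicated tool such as seqtk is the more usual choice.

```python
import io
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-tuples of lines (assumes unwrapped 4-line records)."""
    while True:
        record = tuple(line.rstrip("\n") for line in islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def subsample(records, n, seed=42):
    """Return n records chosen uniformly at random (reservoir sampling, Algorithm R)."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record  # replace with probability n / (i + 1)
    return reservoir


# Toy in-memory FASTQ with 10 invented reads.
toy = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(10))
all_reads = list(read_fastq(io.StringIO(toy)))
picked = subsample(read_fastq(io.StringIO(toy)), 3)
```

For paired-end data, running this with the same seed on both files should keep mates together, assuming the two files list the same reads in the same order.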
I have downloaded some SRR files with high depth (from this work: Impact of sequencing depth on the characterization of the microbiome and resistome | Scientific Reports) in order to understand the differences between high depth sequencing files and my “normal” files. I used the same workflow and command lines, and it turns out that MetaPhlAn 3, for some reason, didn’t work with the high depth SRR files. I get neither errors nor results.
Alexis, do you think using minimal quality filters would allow MetaPhlAn 3 to process the SRR files? If so, any idea how I can do it?