Rarefy metagenome sequence data before MetaPhlAn3 analysis

Hi @DEEPCHANDA7, a common rarefaction procedure is to rarefy each sample to the 10th percentile of the sequencing depths in the dataset, meaning that 10% of the samples will fall below the threshold and will be excluded.
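
To make the threshold concrete, here is a minimal sketch of how the 10th-percentile depth could be picked from per-sample read counts. The function names and the `read_counts` dictionary are hypothetical, and the nearest-rank percentile is my assumption (other percentile definitions give slightly different cut-offs on small datasets).

```python
def rarefaction_depth(read_counts, percentile=10):
    """Return the read depth at the given percentile (nearest-rank method).

    read_counts: dict mapping sample name -> number of reads (hypothetical input).
    percentile: integer between 1 and 100.
    """
    if not read_counts:
        raise ValueError("no samples given")
    depths = sorted(read_counts.values())
    # Ceiling division in integer arithmetic to avoid floating-point surprises.
    rank = max(1, (percentile * len(depths) + 99) // 100)
    return depths[rank - 1]


def split_samples(read_counts, depth):
    """Split samples into those deep enough to rarefy and those to exclude."""
    keep = sorted(s for s, n in read_counts.items() if n >= depth)
    drop = sorted(s for s, n in read_counts.items() if n < depth)
    return keep, drop
```

Samples in `keep` would then be subsampled down to `depth` reads each, while samples in `drop` are the ones lost to rarefaction.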

You can subsample either the raw reads or the Bowtie2 alignments down to the 10th-percentile depth (and then pass the rarefied Bowtie2 output as input to MetaPhlAn). I can't think of any procedure capable of rarefying the taxonomic profiles directly.
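
As an illustration of the raw-reads route, the sketch below draws a fixed number of reads from a FASTQ stream with reservoir sampling. It is not part of MetaPhlAn; the function names are hypothetical, and it assumes unwrapped four-line FASTQ records.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped 4-line records)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_reads(records, depth, seed=0):
    """Reservoir-sample up to `depth` records without replacement in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < depth:
            # Fill the reservoir with the first `depth` records.
            reservoir.append(rec)
        else:
            # Replace an existing entry with probability depth / (i + 1).
            j = rng.randint(0, i)
            if j < depth:
                reservoir[j] = rec
    return reservoir
```

In practice a dedicated tool such as `seqtk sample` will likely be faster on real metagenomes. For paired-end data, the two mate files need to be sampled with the same seed so that pairs stay in sync.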

Since with rarefaction you will not only lose 10% of the samples (if you apply this type of rarefaction) but also a large part of the diversity within each sample, I would personally use the rarefied profiles only for alpha-diversity. Since 2.2 GB is a normal size and 600 MB is a medium-small size for a metagenome, in your case I would go with rarefaction, but only for the alpha-diversity computation.
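
For the alpha-diversity step, a minimal sketch using the Shannon index is shown below. Shannon is just one common choice of metric and is my assumption here; `shannon_index` is a hypothetical helper that takes the species-level abundances of one rarefied profile.

```python
import math


def shannon_index(abundances):
    """Shannon index (natural log) from species abundances of one sample.

    abundances: iterable of non-negative numbers, e.g. relative abundances
    in percent; they are renormalised to sum to 1 before use.
    """
    values = [a for a in abundances if a > 0]
    total = sum(values)
    if total <= 0:
        raise ValueError("profile has no positive abundances")
    return -sum((a / total) * math.log(a / total) for a in values)
```

Because the abundances are renormalised, it does not matter whether the profile is expressed in percentages or fractions.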

Thanks,
Paolo
