Rarefying raw file for taxonomic classification MetaPhlAn3

saras22 · May 17, 2021, 8:14am

Hi @fbeghini sir
I have done taxonomic profiling using MetaPhlAn3 but my question is that my input files or raw files (.fastq.gz format) are of varying sizes ranging from 200 MB to 18GB.

rarefy raw files?
so do you think I need to rarefy my data before doing the taxonomic profiling? if yes then what is the method to do so?
I am asking this question because after doing the analysis I got different results for different sizes of files. For e.g If the file was having 641 M Bases of sequencing yield then it gave me 62 species and when the file was having 4473 M bases sequencing yield I got 181 species for that particular file/sample.
get the datasheet for top 100 or top 25 species
I apologise for asking question related to other topic but I need to know when we use the flag --ftop 100 or --ftop n in merged_profile.txt file we get n number of species or whatever level(phylum/class…etc) while making the heat map using #hclust2, how is it done what is the method? Do you take average of all the samples? Can I get the datasheet or a table for the top 25 features/clades?

Thanks a ton in Advance
Saraswati

saras22 · July 2, 2021, 7:32am

Hi @fbeghini
Please reply to the 1st question if you find any time because I found the answer to the 2nd question. Thanks.

Saraswati

alexis_saldivar · July 3, 2021, 8:28pm

If your samples come from different environments, then differences in diversity are to be expected. If they do not and you expect them to have a similar number of species, then this is probably caused by biases introduced by different sequencing depths, therefore it is unlikely that rarefying could improve your results.

Topic		Replies	Views
Rarefy metagenome sequence data before MetaPhlAn3 analysis MetaPhlAn	7	3079	September 26, 2022
Clear guidance needed for comparing across samples with varying sequencing depth MetaPhlAn	14	1578	December 1, 2022
Normalising the input reads of the samples MetaPhlAn	1	366	May 17, 2021
Difference in sequencing depth MetaPhlAn	0	16	February 5, 2026
Softwares for statistitical analysis of MetaPhlAn output MetaPhlAn	1	618	December 30, 2021

Rarefying raw file for taxonomic classification MetaPhlAn3

Related topics