Hi @fbeghini sir
I have done taxonomic profiling using MetaPhlAn3, but my input files (raw reads in .fastq.gz format) vary widely in size, from 200 MB to 18 GB.
Rarefy raw files?
Do you think I need to rarefy my data before doing the taxonomic profiling? If so, what method should I use?
I am asking because I got different results for files of different sizes. For example, a sample with a sequencing yield of 641 Mbases gave me 62 species, while a sample with a yield of 4473 Mbases gave me 181 species.
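To make clear what I mean by rarefying, here is a minimal sketch of subsampling every sample to the same number of reads before profiling. It is only an illustration, not a recommended method: the function name `subsample_fastq`, the target depth, and the seed are my own placeholders, and it assumes standard 4-line FASTQ records.

```python
import gzip
import random


def subsample_fastq(in_path, out_path, n_reads, seed=42):
    """Write a uniform random sample of up to n_reads records to out_path.

    Assumes a gzipped FASTQ with exactly 4 lines per record (hypothetical helper).
    Uses reservoir sampling, so the input is read only once.
    """
    rng = random.Random(seed)
    reservoir = []
    with gzip.open(in_path, "rt") as fh:
        i = 0
        while True:
            record = [fh.readline() for _ in range(4)]
            if not record[0]:  # end of file
                break
            if i < n_reads:
                reservoir.append(record)
            else:
                # Replace an existing entry with probability n_reads / (i + 1).
                j = rng.randint(0, i)
                if j < n_reads:
                    reservoir[j] = record
            i += 1
    with gzip.open(out_path, "wt") as out:
        for record in reservoir:
            out.writelines(record)
    return len(reservoir)
```

For example, `subsample_fastq("sample.fastq.gz", "sample.sub.fastq.gz", 1_000_000)` would keep one million reads (the file names and depth are placeholders). Note that the sampled reads are held in memory, so a very large target depth would need a different approach.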
Get the data sheet for the top 100 or top 25 species
I apologise for asking about a different topic, but I would like to understand how --ftop works. When we use the flag --ftop 100 (or --ftop n) on the merged_profile.txt file while making the heatmap with #hclust2, we get n species (or clades at whichever level is used: phylum, class, etc.). How are these n features selected? Do you take the average across all samples? Also, can I get the data sheet or a table for the top 25 features/clades?
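To show the kind of table I am hoping to get, here is a minimal sketch that ranks clades by their mean abundance across samples and keeps the top n. Ranking by the mean is only my guess, not necessarily what hclust2 does. The sketch also assumes a simplified merged table: tab-separated, `#` comment lines skipped, clade name in the first column, and one abundance column per sample (no extra ID column). The function name `top_features` is hypothetical.

```python
import csv
import io
from statistics import mean


def top_features(tsv_text, n, level_prefix="s__"):
    """Return (header, rows) for the n clades with the highest mean abundance.

    Only clades whose deepest rank starts with level_prefix (e.g. "s__" for
    species) are considered. Ranking by mean is an assumption for illustration.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    lines = [row for row in reader if row and not row[0].startswith("#")]
    header, body = lines[0], lines[1:]
    ranked = []
    for row in body:
        deepest = row[0].split("|")[-1]
        if not deepest.startswith(level_prefix):
            continue
        values = [float(v) for v in row[1:]]
        ranked.append((mean(values), row[0], values))
    # Highest mean abundance first; ties keep their original order.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return header, [(name, values) for _, name, values in ranked[:n]]
```

With the table text read from a file, `top_features(text, 25)` would give the 25 species rows I would like to export, and `level_prefix="p__"` would do the same at phylum level.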
Thanks a ton in advance,
Saraswati