Output differences in number of species

Hi,

I am running MetaPhlAn3 on my dataset (from bowtie2 files) and I was hoping you could help me with something I don’t understand.

When I look at the outputs for a given individual, I do not get the same number of microbial species across the different output types.
For example, I obtained 97 species in the “rel_ab_w_read_stats” file and 457 species in the “clade_profiles” file. Why could that be?

I thought this could be due to some filtering, but I noticed that some species with a high MetaPhlAn score in the “clade_profiles” file are absent from the “rel_ab_w_read_stats” file and, conversely, some species with a low score are present in it.

If you have any explanation for that, I would appreciate it.
Thank you in advance for your answer,
Anthony

Hi Anthony,
Could you upload the two files here so I can have a look at them?

Hi Francesco,
Sorry for the late reply, here are the files I used:
BAB01__clade_profiles.txt (1.6 MB)
BAB01__relab_rstats.txt (32.4 KB)

Anthony

Hi Anthony,
Thanks for the files. What do you mean by ‘high score’?

The main reason you see fewer species in the rel_ab_w_read_stats file is that clade_profiles lists every species with at least one marker identified, whereas rel_ab_w_read_stats reports a species only (i) if there is enough marker support and (ii), for species profiled with non-unique markers, if there are enough markers to determine the presence of one species.
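
If it helps to see exactly which species are dropped between the two outputs, here is a minimal Python sketch that lists the species present in one file but not the other. It is not part of MetaPhlAn. It assumes that both files are tab-separated, that comment lines start with "#", and that the clade name (containing an "s__" token) sits in the first column. The helper names and the example lines are made up for illustration, so please check them against your actual files.

```python
import re

# Matches a species-level token such as "s__Species_A" (assumed naming scheme).
SPECIES_RE = re.compile(r"s__[^\s|]+")

def species_in_lines(lines):
    """Collect species tokens from the first tab-separated column of each line."""
    found = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        first_column = line.split("\t", 1)[0]
        found.update(SPECIES_RE.findall(first_column))
    return found

def compare(clade_profile_lines, relab_lines):
    """Return the species unique to each output and the ones they share."""
    in_profiles = species_in_lines(clade_profile_lines)
    in_relab = species_in_lines(relab_lines)
    return {
        "only_in_clade_profiles": in_profiles - in_relab,
        "only_in_relab": in_relab - in_profiles,
        "shared": in_profiles & in_relab,
    }

# Made-up lines standing in for the two files (layout is an assumption).
clade_profiles_example = [
    "#SampleID\tMetaphlan_Analysis",
    "s__Species_A\tmarker_1\tmarker_2",
    "s__Species_B\tmarker_3",
]
relab_example = [
    "#clade_name\tclade_taxid\trelative_abundance",
    "k__Bacteria|p__Phylum_X|s__Species_A\t2|10|100\t100.0",
]

result = compare(clade_profiles_example, relab_example)
print(sorted(result["only_in_clade_profiles"]))
```

To run it on real data, pass open file handles instead of the example lists, for instance the two BAB01 files attached above. The species that end up in "only_in_clade_profiles" are the ones that had at least one marker hit but did not pass the marker-support criteria described above.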