Hi, I was wondering whether the clade-specific marker genes are defined for only one taxonomic level (species) or whether there are also marker genes for higher taxonomic levels. I remember that the articles state that the latter is true:
“Such marker genes are chosen so that essentially all of the strains in a clade (species or otherwise) possess such genes, and at the same time no other clade contains homologs close enough to incorrectly map metagenomic reads.”
“we identified more than 2 million potential markers from which we selected a subset of 400,141 genes most representative of each taxonomic unit (Online Methods). The resulting catalog spans 1,221 species with 231 (s.d. 107) markers per species and >115,000 markers at higher taxonomic levels”
However, in the MetaPhlAn database marker info file 'mpa_v30_CHOCOPhlAn_201901_marker_info' I could not find any marker genes for higher taxonomic levels. (I searched for ['clade': 'g_])
Most likely I am just overlooking them, so I would appreciate any clarification on this.
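In case it helps to reproduce the search, here is a minimal sketch of how the clade levels in the marker info file could be tallied. It assumes each line holds a marker name, a tab, and a dict-like text containing an entry such as 'clade': 's__Some_species'; that layout, the function name count_clade_levels, and the sample lines are my assumptions for illustration, not something taken from the MetaPhlAn documentation.

```python
import re
from collections import Counter

# Assumed pattern: the info text contains 'clade': '<level letter>__<name>'.
CLADE_LEVEL = re.compile(r"'clade':\s*'([a-z])__")


def count_clade_levels(lines):
    """Count markers per taxonomic level prefix (s, g, f, ...).

    Lines without a recognisable 'clade' entry are counted as 'unknown'.
    Blank lines are skipped.
    """
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue
        match = CLADE_LEVEL.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Hypothetical sample lines, made up to show the assumed layout.
example_lines = [
    "marker_1\t{'clade': 's__Example_species', 'len': 801}",
    "marker_2\t{'clade': 's__Other_species', 'len': 450}",
    "marker_3\t{'clade': 'g__Example_genus', 'len': 300}",
]

for level, n in sorted(count_clade_levels(example_lines).items()):
    print(level, n)
```

To run it on the real file, the lines could be passed in with something like count_clade_levels(open(path)), where path points to the marker info file. If only the 's' level shows up, that would confirm what I saw with the manual search.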
Kind regards,
Moelong Yu