How suitable is MetaPhlAn for metatranscriptomics analysis? The marker genes of each species are derived from analysis of the uniqueness of genomic sequences. If a particular species has five marker genes but only two of them are transcribed, will it be reported as detected? More generally, has MetaPhlAn been evaluated using a commercially available mock community? What is the algorithm's sensitivity? I am mainly wondering about the false negative rate.
By default, a species is reported as detected only if more than 20% of its available markers are present (this threshold can be changed with the --stat_q parameter). We did not perform any evaluation using a commercial mock community, but we did report a sensitivity analysis in the latest manuscript, "Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4" (Nature Biotechnology).
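To make the rule concrete for the five-marker example in the question, here is a minimal sketch of the threshold as described in this reply. It is not MetaPhlAn's actual code: the function name `is_detected` and its arguments are hypothetical, and it only assumes that detection requires strictly more than the given fraction of markers to be present.

```python
def is_detected(n_present: int, n_total: int, min_fraction: float = 0.2) -> bool:
    """Return True if strictly more than `min_fraction` of a species' markers are present.

    Hypothetical helper that mirrors the rule stated in this thread;
    it is a simplification, not MetaPhlAn's implementation.
    """
    if n_total <= 0:
        raise ValueError("n_total must be a positive number of markers")
    if not 0 <= n_present <= n_total:
        raise ValueError("n_present must be between 0 and n_total")
    return n_present / n_total > min_fraction


# Example from the question: five markers, two of them transcribed (2/5 = 40%).
print(is_detected(2, 5))

# Only one of five markers present (1/5 = 20%, which is not more than 20%).
print(is_detected(1, 5))
```

Under this reading, the species with two of five markers transcribed would pass the default threshold, while a species with only one of five would not unless the threshold is lowered via --stat_q.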
That seems like a good default value, and I imagine it would work well for metatranscriptomics.