Unexpected features in the metatranscriptome in combined MTX/MGX analysis

Dear HUMAnN Team,

I am trying to perform a combined analysis of metatranscriptomes and metagenomes. As suggested in the user manual, I am using the taxonomic profiles (bugs_list.tsv) from the metagenome as the basis for mapping the metatranscriptome using the following command:
conda run -n biobakery humann --input $sample"_merge.fq" --output output_humann --taxonomic-profile /path/to/metatranscriptomics/tax_profiles/${sample/R/D}"merge_metaphlan_bugs_list.tsv" --threads 16

When looking at the results table, however, around 6% of the features (gene families) detected in the RNA data are not detected in the metagenome. Is this the expected behaviour? Thank you for your input.

I am using HUMAnN v3.7 and MetaPhlAn v4.0.6 with the database version vOct22.

Yes, this is definitely possible for species of intermediate coverage. Since reads are sampled randomly, such a species can be detected in your DNA reads, and so included in the mapping database, without all of its genes being detected. A gene that was present but not detected in the DNA might still be detected in the corresponding RNA, either by chance (e.g. a few RNA reads were sampled from the gene’s transcript while no DNA reads were sampled from the gene) or because the gene is highly expressed (meaning it was easier to sample RNA reads than DNA reads from the gene, relative to its source species’ abundance).
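
As a rough illustration of the sampling argument (this is not something HUMAnN computes): a minimal sketch, assuming the number of reads landing on a gene follows independent Poisson distributions in the DNA and RNA libraries. The function name and the expected read counts are made up for the example.

```python
import math


def prob_rna_only(expected_dna_reads, expected_rna_reads):
    """Probability that a gene gets zero DNA reads but at least one RNA read.

    Assumption: read counts per gene are independent Poisson variables
    with the given expected values (a simplification of real sequencing).
    """
    p_dna_zero = math.exp(-expected_dna_reads)       # P(no DNA read hits the gene)
    p_rna_hit = 1.0 - math.exp(-expected_rna_reads)  # P(at least one RNA read hits it)
    return p_dna_zero * p_rna_hit


# Hypothetical expected read counts per gene: (DNA, RNA)
for dna, rna in [(0.5, 0.5), (1.0, 5.0), (5.0, 5.0)]:
    p = prob_rna_only(dna, rna)
    print(f"expected DNA reads={dna}, expected RNA reads={rna}: P(RNA-only)={p:.3f}")
```

Under this simplified model, the chance of an RNA-only detection is largest when the expected DNA coverage of the gene is low and its expression is high, which matches the two cases described above.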

In these cases we would infer that the gene was present on the basis of the RNA evidence.
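
If you want to see which features these are, one possible check is a set difference between the two gene family tables. This is a minimal sketch, assuming tab-separated tables with a `#` header line, the feature ID in the first column, and the abundance in the second; the helper names and the toy feature IDs are made up for the example.

```python
def nonzero_features(lines):
    """Collect feature IDs with abundance > 0 from HUMAnN-style TSV lines."""
    features = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        feature, value = line.split("\t")[:2]
        if float(value) > 0:
            features.add(feature)
    return features


def rna_only_features(dna_lines, rna_lines):
    """Return (features seen only in RNA, their fraction of all RNA features)."""
    dna = nonzero_features(dna_lines)
    rna = nonzero_features(rna_lines)
    only = rna - dna
    fraction = (len(only) / len(rna)) if rna else 0.0
    return only, fraction


# Toy data standing in for the DNA and RNA gene family tables (made-up IDs and values)
dna_table = ["# Gene Family\tDNA_Abundance-RPKs", "UniRef90_A\t10.0", "UniRef90_B\t0.0"]
rna_table = ["# Gene Family\tRNA_Abundance-RPKs", "UniRef90_A\t3.0",
             "UniRef90_B\t2.0", "UniRef90_C\t1.0"]

only, fraction = rna_only_features(dna_table, rna_table)
print(sorted(only), f"{fraction:.1%} of RNA features")
```

With real output, the same functions can be given open file handles instead of lists, e.g. `rna_only_features(open(dna_path), open(rna_path))`, where `dna_path` and `rna_path` are placeholders for your own gene family files.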

Thank you for the detailed and fast answer! It is clear to me now. I just wanted to be sure that there is nothing wrong on my end.