I hope you are doing well amid this pandemic. I was recently introduced to your Galaxy website and the tools there, which have been really fantastic. I used MaAsLin2 to analyze my metagenomics data, but it only gives results as notched box plots. I have both my metagenomics and metatranscriptomics data in table form (Excel and others). My challenge is finding a tool that can merge these two datasets so I can link the microbes to their functions. Which of the tools from your lab’s website (not the Galaxy one, but the Harvard one) would work best for that?
Thank you in advance for your help.
HUMAnN can analyze both metagenomes and metatranscriptomes. We offer some suggestions for doing that here:
Since your samples are paired, you can look for changes in expression (RNA abundance) while controlling for gene copy number (DNA abundance). You can do this either by computing RNA/DNA ratios and modeling those, or by including a gene’s DNA abundance as a covariate in a model of the gene’s expression. (I say “gene” in the examples above, but the same logic applies to any function you’re interested in.)
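As a rough illustration of the ratio option, here is a minimal Python sketch for a single feature. The function name and the dict-based input layout are made up for this example, and it assumes the RNA and DNA tables are already in comparable units. Skipping samples where either value is zero is also an assumption of the sketch, not a HUMAnN rule.

```python
import math

def log2_rna_dna_ratios(rna, dna):
    """Return {sample: log2(RNA/DNA)} for one feature across paired samples.

    rna, dna: dicts mapping sample ID -> abundance (hypothetical layout).
    Assumption: samples where either value is missing or zero are skipped,
    because the ratio (or its log) is undefined there.
    """
    ratios = {}
    for sample, rna_value in rna.items():
        dna_value = dna.get(sample, 0.0)
        if rna_value > 0 and dna_value > 0:
            ratios[sample] = math.log2(rna_value / dna_value)
    return ratios

# Toy values, not real data.
example_rna = {"s1": 8.0, "s2": 0.0, "s3": 3.0}
example_dna = {"s1": 2.0, "s2": 5.0, "s3": 0.0}
print(log2_rna_dna_ratios(example_rna, example_dna))
```

The resulting per-sample log ratios could then be used as the response variable in whatever downstream model you prefer.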
I have a question related to this topic. I have a set of paired metagenome/metatranscriptome datasets and have already run HUMAnN 3 with the settings suggested in the HUMAnN 3 user manual: [image]
I have two questions. First, when I look at the SAMPLENAME_genefamilies.tsv file, I can see the identifiers of the detected genes, but I do not find any annotation or name for them. Any idea why that is?
My second question is about this part of the manual,
HUMAnN 3.0 RNA-level outputs (e.g. transcript family abundance) can then be normalized by corresponding DNA-level outputs to quantify microbial expression independent of gene copy number.
Is there a script or command in HUMAnN 3 to do this, or should it be done in R during the analysis?
Thanks for your support.
Mehdi
[1] By default we don’t attach names to features in HUMAnN, since it can be more space-efficient to merge profiles and then do the naming once on the merged table. In either case, the humann_rename_table script can carry out that task.
[2] We don’t bundle a script for RNA/DNA normalization in HUMAnN 3 since we’re actively researching new best practices for that kind of analysis. So far, my preferred approach is to do the normalization during modeling, with the DNA as a covariate, as in:
log(RNA) ~ log(DNA) + covariates
for samples with RNA > 0
Ignoring RNA zeroes is the safest way to avoid confounding species/gene loss with down-regulation, and also seems to help generally with the tricky interactions between MTX dynamic range and sequencing depth. If you need a pseudocount for the log( )s, I would use half the smallest non-zero measurement on a per-feature basis (avoid a universal small pseudocount).
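To make the recipe above concrete, here is a minimal Python sketch for one feature. It is a simplification under stated assumptions: the function names are hypothetical, the extra covariates are left out so that only log(DNA) is a predictor, and zeros in the DNA values are replaced by the pseudocount (adding it to every value is another option). For a real analysis you would fit the full model in a proper statistics package.

```python
import math

def half_min_pseudocount(values):
    """Half the smallest non-zero measurement for one feature."""
    nonzero = [v for v in values if v > 0]
    if not nonzero:
        raise ValueError("feature has no non-zero measurements")
    return min(nonzero) / 2.0

def fit_log_rna_on_log_dna(rna, dna):
    """Least-squares fit of log(RNA) = intercept + slope * log(DNA).

    rna, dna: per-sample abundances for one feature, in the same sample order.
    Samples with RNA == 0 are dropped, as suggested above.
    Assumption: DNA zeros are replaced by the per-feature pseudocount.
    """
    dna_pseudocount = half_min_pseudocount(dna)
    xs, ys = [], []
    for rna_value, dna_value in zip(rna, dna):
        if rna_value <= 0:
            continue  # ignore RNA zeros rather than treating them as down-regulation
        xs.append(math.log(dna_value if dna_value > 0 else dna_pseudocount))
        ys.append(math.log(rna_value))
    if len(xs) < 2:
        raise ValueError("need at least two samples with RNA > 0")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("log(DNA) does not vary across the retained samples")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy values, not real data; the last sample has RNA == 0 and is dropped.
intercept, slope = fit_log_rna_on_log_dna([1.0, 4.0, 16.0, 0.0], [1.0, 2.0, 4.0, 5.0])
print(intercept, slope)
```

Computing the pseudocount separately for each feature keeps it on the same scale as that feature's own measurements, which is the point of avoiding a universal small value.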