I’m currently trying to use the MMUPHin R package for a meta-analysis of gut microbiome 16S sequencing data, and have some questions regarding the lm_meta() function.
1. Regarding the input feature table of lm_meta():
According to the nice tutorial (Performing meta-analyses of microbiome studies with MMUPHin), lm_meta() appears to take a feature count table that has not been adjusted by the MMUPHin::adjust_batch() function. (In the tutorial, the batch-adjusted feature count table is named “CRC_abd_adj” and the unadjusted (naive) table is named “CRC_abd”; lm_meta() is given “CRC_abd” as input.) Is lm_meta() designed to take the unadjusted feature count table?
If so, does it perform batch effect adjustment internally?
Further, if I use the batch-adjusted table from adjust_batch() as the input to lm_meta(), are there any possible problems (e.g. violation of the statistical model's assumptions, or biased results)?
2. Regarding the normalization method of lm_meta():
Thanks to the MaAsLin2.0 tutorial (MaAsLin2 · biobakery/biobakery Wiki · GitHub), I understand that the normalization method should be chosen according to the model being used.
I’m planning to use the LM model, but I’m not sure which normalization method is the most appropriate.
As far as I know, log-ratio transformations (e.g. CLR, ALR) are currently getting a lot of attention in the microbiome field because of the compositional nature of sequencing data. For the LM model in the MaAsLin2 package, which normalization method is the most appropriate: TSS or CLR?
Regarding question 1, if you look at the source code (below), you’ll see that lm_meta() first runs MaAsLin2 on the data from each batch, then aggregates the fits with a meta-analysis method. So neither adjusted nor unadjusted input will give you “wrong” results, but the full analysis pipeline is a bit simpler with unadjusted input.
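To make the "fit per batch, then aggregate" idea concrete, here is a toy sketch in base R. It is not the MMUPHin source code: the data are simulated, the helper `simulate_batch()` and all variable names are made up for illustration, plain `lm()` stands in for MaAsLin2, and simple fixed-effect inverse-variance pooling is swapped in as the meta-analysis step, which is not necessarily what lm_meta() does by default.

```r
# Toy illustration only: simulated data and base R, not MMUPHin's actual code.
set.seed(1)

# Hypothetical helper: one feature measured in a single batch (study), with a
# shared exposure effect of 0.5 and a batch-specific intercept shift.
simulate_batch <- function(n, shift) {
  exposure <- rbinom(n, 1, 0.5)
  abundance <- shift + 0.5 * exposure + rnorm(n, sd = 1)
  data.frame(exposure = exposure, abundance = abundance)
}

batches <- list(
  study_a = simulate_batch(60, shift = 0),
  study_b = simulate_batch(80, shift = 2),
  study_c = simulate_batch(50, shift = -1)
)

# Step 1: fit a separate linear model within each batch. Each fit has its own
# intercept, so a constant shift between batches does not enter the
# exposure coefficient.
per_batch <- t(sapply(batches, function(d) {
  fit <- summary(lm(abundance ~ exposure, data = d))
  c(coef = fit$coefficients["exposure", "Estimate"],
    se   = fit$coefficients["exposure", "Std. Error"])
}))

# Step 2: aggregate the per-batch estimates. Fixed-effect inverse-variance
# weighting is used here for simplicity (an assumption of this sketch).
w <- 1 / per_batch[, "se"]^2
pooled_coef <- sum(w * per_batch[, "coef"]) / sum(w)
pooled_se <- sqrt(1 / sum(w))

print(per_batch)
print(c(pooled_coef = pooled_coef, pooled_se = pooled_se))
```

Because the exposure effect is estimated within each batch before pooling, a simple location shift between batches is already absorbed at step 1, which is one way to see why unadjusted input is not "wrong" for this kind of pipeline.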
Regarding question 2, absent any reason to use alternative methods, we suggest sticking with the default TSS normalization.
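For readers unsure what the two options compute, the following base R sketch shows TSS next to CLR on a made-up count matrix. The matrix, the taxon and sample names, and the pseudocount of 0.5 used to handle zeros before taking logs are all assumptions of this example, not MaAsLin2 or MMUPHin defaults, and the code does not call either package.

```r
# Hypothetical count matrix: rows are features (taxa), columns are samples.
counts <- matrix(
  c(10,  0, 30,
    40,  5, 20,
    50, 95, 50),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_1", "taxon_2", "taxon_3"),
                  c("sample_1", "sample_2", "sample_3"))
)

# TSS (total sum scaling): divide each sample (column) by its total count,
# so every sample becomes a vector of relative abundances summing to 1.
tss <- sweep(counts, 2, colSums(counts), "/")

# CLR (centered log-ratio), for comparison: log-transform, then subtract the
# per-sample mean of the logs. Zeros need a pseudocount first; 0.5 is an
# arbitrary choice made for this example.
log_counts <- log(counts + 0.5)
clr <- sweep(log_counts, 2, colMeans(log_counts), "-")

print(tss)
print(round(clr, 3))
```

Note that CLR requires a choice of how to handle zeros, whereas TSS does not, which is one practical difference to keep in mind when deciding whether there is a reason to move away from the default.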