MMUPHin lm_meta: Questions regarding input file and normalization method

1112 · October 16, 2022, 1:25pm

Hello,

I’m currently trying to use the MMUPHin R package for a meta-analysis of gut microbiome 16S sequencing data, and I have some questions regarding the lm_meta() function.

1. Regarding the input feature table of lm_meta():
According to the nice tutorial (Performing meta-analyses of microbiome studies with MMUPHin), it seems that lm_meta() takes as input a feature count table that has not been adjusted by MMUPHin::adjust_batch(). (In the tutorial, the batch-adjusted feature count table is named “CRC_abd_adj” and the unadjusted (naive) table is named “CRC_abd”; lm_meta() is given “CRC_abd” as input.)
Is lm_meta() designed to receive the unadjusted feature count table?
If so, does it have an internal batch effect adjustment step?
Further, if I use the batch-adjusted table from adjust_batch() as the input to lm_meta(), are there any possible problems (e.g. violation of any assumptions of the statistical models, or biased results)?

2. Regarding the normalization method of lm_meta():
Thanks to the MaAsLin2 tutorial (MaAsLin2 · biobakery/biobakery Wiki · GitHub), I understand that the normalization method should be chosen according to the model selected.
I’m planning to use the LM model, but I’m not sure which normalization method is the most appropriate.
As far as I know, log-ratio transformations (e.g. CLR, ALR) are currently in the spotlight in the microbiome field because of the compositional nature of sequencing data.
For the LM model in the MaAsLin2 package, which normalization method is the most appropriate (TSS or CLR)?

Thanks for your support!
Best,

andrewGhazi · October 17, 2022, 4:11pm

Regarding question 1, if you look at the source code (below), you’ll see that lm_meta() first runs MaAsLin2 on the data from each batch separately, then aggregates the per-batch fits with a meta-analysis method. So neither adjusted nor unadjusted input will give you “wrong” results, but the full analysis pipeline is a bit simpler with unadjusted input.
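As a rough illustration of that two-step idea (fit within each batch, then pool), here is a toy sketch in base R. It is not the MMUPHin implementation: the simulated data, the variable names, and the use of plain lm() are all assumptions for illustration, and it pools with simple fixed-effect inverse-variance weighting rather than the random-effects model (method = "REML") that rma_wrapper() defaults to in the source below.

```r
# Toy sketch: fit one model per batch, then pool the per-batch estimates.
# All names and the simulated data are hypothetical, not MMUPHin code.
set.seed(1)
n_per_batch <- 40
batches <- c("study_A", "study_B", "study_C")

# Simulate one transformed feature with a shared exposure effect of 0.5
# and a different baseline (a crude "batch effect") in each study.
sim <- do.call(rbind, lapply(seq_along(batches), function(i) {
  exposure <- rep(c(0, 1), each = n_per_batch / 2)
  data.frame(
    batch = batches[i],
    exposure = exposure,
    abundance = i + 0.5 * exposure + rnorm(n_per_batch, sd = 0.5)
  )
}))

# Step 1: a separate linear model within each batch. Each model has its
# own intercept, so a batch-specific baseline shift stays inside that fit.
per_batch <- do.call(rbind, lapply(split(sim, sim$batch), function(d) {
  co <- summary(lm(abundance ~ exposure, data = d))$coefficients
  data.frame(batch = d$batch[1],
             beta = co["exposure", "Estimate"],
             se = co["exposure", "Std. Error"])
}))

# Step 2: inverse-variance (fixed-effect) pooling of the per-batch estimates.
w <- 1 / per_batch$se^2
pooled_beta <- sum(w * per_batch$beta) / sum(w)
pooled_se <- sqrt(1 / sum(w))

print(per_batch)
cat("pooled beta:", round(pooled_beta, 3), "+/-", round(pooled_se, 3), "\n")
```

Because each batch gets its own model, a constant offset between studies never enters the comparison of exposed and unexposed samples, which is one way to see why unadjusted input is not a problem in this kind of pipeline.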

Regarding question 2, absent any reason to use alternative methods, we suggest sticking with the default TSS normalization.
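For reference, here is a minimal base-R sketch of what the two normalizations mentioned in the question do to a count table. The count matrix and the pseudocount of 1 are made-up assumptions for illustration; this is not how MaAsLin2 computes them internally.

```r
# Hypothetical count table: features in rows, samples in columns.
counts <- matrix(c(10,  0, 30,
                   20,  5, 25,
                   70, 95, 45),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("taxon", 1:3),
                                 paste0("sample", 1:3)))

# TSS (total sum scaling): divide each sample (column) by its total,
# so every column becomes relative abundances summing to 1.
tss <- sweep(counts, 2, colSums(counts), "/")

# CLR (centered log-ratio): log of each value minus the mean log value
# of its sample. The pseudocount of 1 is an arbitrary choice to avoid log(0).
log_counts <- log(counts + 1)
clr <- sweep(log_counts, 2, colMeans(log_counts), "-")

print(round(tss, 3))
print(round(clr, 3))
```

TSS keeps values on the proportion scale, while CLR values are centered within each sample and need a choice of zero handling, which is one practical reason to stay with the default unless you have a specific need.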

github.com

biobakery/MMUPHin/blob/master/R/lm_meta.R#L188


      
            paste(lvl_batch[ind_exposure & 
                              ind_covariates_random[, covariate]],
                  collapse = ", "))
}

# Create temporary output for Maaslin output files
dir.create(control$output, recursive = TRUE, showWarnings = FALSE)

# Fit individual models
maaslin_fits <- list()
for(i in seq_len(n_batch)) {
  i_batch <- lvl_batch[i]
  if(!ind_exposure[i_batch]) next
  if(verbose) message("Fitting Maaslin2 on batch ", i_batch, "...")
  i_feature_abd <- feature_abd[, var_batch == i_batch, drop = FALSE]
  i_data <- df_meta[var_batch == i_batch, , drop = FALSE]
  i_covariates <- covariates[ind_covariates[i_batch, , drop = TRUE]]
  i_covariates_random <- covariates_random[
    ind_covariates_random[i_batch, , drop = TRUE]]
  i_output <- paste0(control$output, "/", i_batch)
  dir.create(i_output, showWarnings = FALSE)

github.com

biobakery/MMUPHin/blob/master/R/helpers_lm_meta.R#L246


      
          #' @param output directory for the output forest plots.
          #' @param rma_conv rma threshold control.
          #' @param rma_maxit rma maximum iteration control.
          #' @param verbose should verbose information be printed.
          #'
          #' @return a data frame recording per-feature meta-analysis association results.
          #' (coefficients, p-values, etc.)
          #' @keywords internal
          #' @importFrom grDevices dev.off pdf
          #' @importFrom stats p.adjust
          rma_wrapper <- function(maaslin_fits, 
                                  method = "REML",
                                  output = tempdir(),
                                  forest_plot = NULL, 
                                  rma_conv = 1e-6,
                                  rma_maxit = 1000,
                                  verbose = TRUE) {
            lvl_batch <- names(maaslin_fits)
            n_batch <- length(lvl_batch)
            exposure <- unique(maaslin_fits[[1]]$metadata)
            values_exposure <- unique(maaslin_fits[[1]]$value)

1112 · October 18, 2022, 8:14am

Thanks a lot! All my questions are answered.

Best,
