MaAsLin 2 statistical underpinnings


Thank you for this great tool. I am a little confused about the tool’s statistical underpinnings. I see it referred to as both a multivariable and a multivariate regression, which are different things. Based on the code, it seems to run as a multivariate regression, with one predictor variable (i.e., an individual taxon) at a time against multiple outcome variables (metadata). If this is the case, is there a way to flip the code to run a multivariable regression, where each individual taxon is the outcome/dependent variable against multiple predictor variables?

Thank you!

Hi @nutribiomes - thanks for the comments. The difference between multivariable and multivariate modeling has been a longstanding source of confusion in the field, and we have attempted to clarify it in our review paper (see Box 4). To answer your specific question: MaAsLin 2 is actually a multivariable framework. Each feature (a taxon, function, or chemical abundance) is modeled as a univariate outcome Y against the potentially multiple metadata variables X, not the other way around. I hope that makes sense.
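To make the "univariate outcome, multiple predictors" setup concrete, here is a minimal sketch in Python. It is not MaAsLin 2's code: the data are synthetic, the covariates `group` and `age` and the helper `fit_per_feature` are hypothetical names, and a plain least-squares fit on log-transformed abundances stands in for whichever model you would actually choose.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 50, 4

# Hypothetical metadata X: a binary group and a continuous age (assumed names).
group = rng.integers(0, 2, size=n_samples)
age = rng.normal(40, 10, size=n_samples)
X = np.column_stack([np.ones(n_samples), group, age])  # intercept + 2 predictors

# Synthetic feature table: rows are samples, columns are features (e.g. taxa).
abundance = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, n_features))


def fit_per_feature(X, abundance):
    """Fit one multivariable linear model per feature.

    Returns an array of shape (n_features, n_predictors) of coefficients.
    """
    coefs = []
    for j in range(abundance.shape[1]):
        y = np.log(abundance[:, j])  # univariate outcome Y for feature j
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # all predictors at once
        coefs.append(beta)
    return np.vstack(coefs)


coefs = fit_per_feature(X, abundance)
print(coefs.shape)  # (4, 3): one row per feature, one column per predictor
```

Each pass through the loop is a separate regression with a single outcome and several predictors, which is what "multivariable" means here. A multivariate model would instead treat several outcomes jointly in one fit.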

Hi @himel.mallick - thank you for the prompt response. I see now. So you’re treating taxa as the dependent variable against a range of predictors. If I apply a ZINB model to count data in MaAsLin 2, is this different from running a ZINB using standard R packages, or would it yield similar results under the same conditions? For example, would there be a difference between ZINB used with MaAsLin 2 and the R packages ZINB and NBZIMM?

In principle, they should not be different provided the other parameter settings are exactly the same.

Thank you @himel.mallick - you’ve been very helpful!
