The bioBakery help forum

Number of metadata variables impact stats?


I am using Maaslin2 to analyze the association between pathway abundance (copies per million) and sample metadata using a negative binomial model. I recently added some variables to my metadata table and re-ran Maaslin2. When I compared the significant results from the first run to those from the second, I noticed that adding more metadata variables changed the number of significant features. Additionally, some metadata variables that previously had significant features no longer do. Is this specific to the model I’m choosing to use or to how Maaslin2 works?

Thank you!

I think this is to be expected when running regression models in general, rather than being specific to Maaslin2 or to the negative binomial model. Depending on which metadata covariates are included, the model can give different estimates and significance. One way to understand the difference is that the interpretation of the model changes with additional covariates: y ~ x1 studies the association between y and x1, but y ~ x1 + x2 studies the association between y and x1 after adjusting for x2. There is no guarantee that the two will give the same results.
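To make the y ~ x1 versus y ~ x1 + x2 point concrete, here is a small simulated sketch. It is not Maaslin2 and not a negative binomial model: it uses plain least squares in NumPy on made-up data, and the variable names, effect sizes, and the `ols` helper are all assumptions chosen for illustration. The data are built so that y depends only on x2, while x1 is merely correlated with x2.

```python
import numpy as np

# Simulated data (illustrative assumption, not real pathway abundances).
rng = np.random.default_rng(0)
n = 2000
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + 0.6 * rng.normal(size=n)  # x1 is correlated with x2
y = 2.0 * x2 + rng.normal(size=n)         # y depends on x2 only


def ols(response, *covariates):
    """Hypothetical helper: least-squares fit with an intercept.

    Returns the coefficient vector [intercept, coef_1, coef_2, ...].
    """
    design = np.column_stack([np.ones(len(response)), *covariates])
    beta, *_ = np.linalg.lstsq(design, response, rcond=None)
    return beta


# y ~ x1: x1 picks up the effect of the omitted x2.
b_alone = ols(y, x1)[1]

# y ~ x1 + x2: the x1 coefficient is now adjusted for x2.
b_adj = ols(y, x1, x2)[1]

print(f"x1 coefficient in y ~ x1:      {b_alone:.3f}")
print(f"x1 coefficient in y ~ x1 + x2: {b_adj:.3f}")
```

In this setup the x1 coefficient is clearly non-zero when x1 is the only covariate and shrinks to roughly zero once x2 is added, which is the same kind of change as a metadata variable losing its significant features after more variables are included.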