How to interpret MaAsLin output?

DEEPCHANDA7 · May 22, 2021, 2:34pm

Hi @sma @himel.mallick @Kelsey_Thompson, I want to find differentially abundant bacteria associated with disease samples. My metadata looks something like the following:

ID	Type	Age	Comorbidity	Gender
SRP090252	disease	71.65	T2D	Female
SRP090268	control	70.05	T2D	Female
SRP090265	control	70.85	NGT	Female
SRP090264	control	70.21	IGT	Female
SRP090263	control	70.06	NGT	male
SRP090258	disease	69.53	NGT	Female
SRP090250	disease	70.02	IGT	Female
SRP090244	disease	70.07	IGT	male
SRP090242	control	69.78	NGT	Female
SRP090227	control	69.74	NGT	male
SRP090226	control	69.7	NGT	Female
SRP090224	disease	69.67	NGT	male
SRP090222	disease	71.4	IGT	Female
SRP090217	control	71.53	NGT	Female

Now, I want to get the differentially abundant bacteria associated with disease samples after adjusting for confounders Age, Comorbidity, and Gender. In this context, how should I specify the Random effect and Fixed effect options?

From what I have gathered from other posts in this forum, I think I should provide --fixed_effects "Type" and --random_effects "ID". Is this correct? Will this give me output adjusted for all the confounders I mentioned?
Or should I provide --fixed_effects "Type,Age,Comorbidity,Gender" and --random_effects "ID"?

If neither of the above is correct, please suggest what I should do.

Many thanks,
DC7

himel.mallick · May 22, 2021, 2:49pm

Hi @DEEPCHANDA7 - it’s the latter one, based on your description. If you do not have repeated IDs (which seems to be the case in your example), you don’t need a random effect.
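
A minimal base R sketch of that check, for illustration only: if no sample ID repeats, there is no within-subject grouping for a random effect to capture, so the variable of interest and all confounders go into the fixed effects. The small `metadata` data frame copies four rows from the question. The variable names `fixed_effects` and `random_effects` mirror the arguments of the `Maaslin2()` R function, and the commented-out call at the end is an assumption about how they would be passed on.

```r
# A few rows of the metadata from the question (illustrative subset).
metadata <- data.frame(
  ID = c("SRP090252", "SRP090268", "SRP090265", "SRP090263"),
  Type = c("disease", "control", "control", "control"),
  Age = c(71.65, 70.05, 70.85, 70.06),
  Comorbidity = c("T2D", "T2D", "NGT", "NGT"),
  Gender = c("Female", "Female", "Female", "male"),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when every ID occurs exactly once.
has_repeated_ids <- anyDuplicated(metadata$ID) > 0

# Variable of interest plus the confounders to adjust for.
fixed_effects <- c("Type", "Age", "Comorbidity", "Gender")

# Only use ID as a random effect when subjects are measured more than once.
random_effects <- if (has_repeated_ids) "ID" else NULL

print(has_repeated_ids)
print(fixed_effects)

# These would then be passed on, e.g.:
# Maaslin2(input_data, input_metadata, "my_output",
#          fixed_effects = fixed_effects, random_effects = random_effects)
```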

All the best,
Himel

DEEPCHANDA7 · May 22, 2021, 7:12pm

Thanks a lot, sir @himel.mallick, for your fast response. I have two follow-up queries:

Yes, I don’t have repeated IDs. Do you mean that I should use ID as neither a random effect nor a fixed effect?

I am having a hard time interpreting my output. Here’s my code, metadata, and output:

Code:
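library(Maaslin2)
library(readr)
setwd("~/my_output")
getwd()

# Read the species abundance table and the sample metadata
input_data <- as.data.frame(read_tsv("merged_abundance_species_5.tsv"))
input_metadata <- as.data.frame(read_csv("metadata.csv"))

# Fit one model per feature on untransformed, unnormalized abundances
fit_data <- Maaslin2(
  input_data, input_metadata, 'my_output', transform = "NONE",
  fixed_effects = c('type', 'comorbidity', 'age'),  # variable of interest plus confounders
  reference = c('comorbidity,NGT'),                 # NGT is the baseline comorbidity level
  normalization = 'NONE',
  min_abundance = 0.0,                              # no abundance filtering
  max_significance = 0.25,                          # q-value cutoff for significant_results.tsv
  min_prevalence = 0.0,                             # no prevalence filtering
  standardize = FALSE)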

Metadata:

ID	type	age	comorbidity
SR090136	disease	70.15	IGT
SR090140	disease	70.15	T2D
SR090147	control	71.39	NGT
SR090150	disease	71.58	IGT
SR090152	disease	71.24	T2D
SR090154	disease	71.04	IGT
SR090157	disease	70.96	IGT
SR090161	disease	70.39	T2D
SR090163	disease	70.14	NGT
SR090167	disease	70.97	T2D
SR090169	disease	68.96	T2D
SR090170	control	71.11	NGT
SR090171	control	70.55	NGT
SR090173	disease	70.11	T2D
SR090175	disease	71.02	NGT
SR090177	disease	70.2	IGT
SR090179	disease	70.16	T2D
SR090193	control	70.25	NGT
SR090205	control	70.63	NGT

Output:

significant_results.tsv

feature	metadata	value	coef	stderr	N	N.not.0	pval	qval
Faecalitalea_cylindroides	comorbidity	IGT	-0.377764650707978	0.043253216559351	19	5	4.86003286684536E-07	0.000209953419848
Faecalitalea_cylindroides	comorbidity	T2D	-0.368187348493346	0.041715679009967	19	5	4.28479422114103E-07	0.000209953419848
Faecalitalea_cylindroides	type	disease	0.350656650707978	0.043253216559351	19	5	1.17270795116608E-06	0.000337739889936
Collinsella_stercoris	comorbidity	IGT	-0.109236524922356	0.025087649856961	19	18	0.00066062321169	0.142694613724944
Roseburia_inulinivorans	age	age	-1.58382545890012	0.385112459302547	19	18	0.001055772797088	0.15082184993556
Ruminococcus_bromii	type	disease	18.7382134402279	4.63755246673051	19	11	0.001215556053757	0.15082184993556
Ruminococcus_bromii	comorbidity	IGT	-18.7258134402279	4.63755246673051	19	11	0.0012219362842	0.15082184993556
Olsenella_scatoligenes	comorbidity	IGT	-0.014838219463708	0.003842430774736	19	8	0.001727334716805	0.178134499426023
Oscillibacter_sp_57_20	age	age	-0.310432279358586	0.081151146011419	19	11	0.001855567702354	0.178134499426023
Bifidobacterium_bifidum	type	disease	3.61449351479	1.13571553919717	19	8	0.006646997243247	0.205410860127932
Olsenella_scatoligenes	type	disease	0.012930219463708	0.003842430774737	19	8	0.004622838288644	0.205410860127932
Olsenella_scatoligenes	comorbidity	T2D	-0.01242486374167	0.003705842514556	19	8	0.004737715789044	0.205410860127932
Collinsella_intestinalis	comorbidity	IGT	-0.020799960994458	0.006581295428193	19	16	0.006945621531625	0.205410860127932
Collinsella_stercoris	type	disease	0.080588524922356	0.025087649856961	19	18	0.006265644670361	0.205410860127932
Collinsella_stercoris	comorbidity	T2D	-0.079481101360263	0.024195850200212	19	18	0.005422718327775	0.205410860127932
Alistipes_onderdonkii	comorbidity	T2D	-0.096692174486469	0.030877949038327	19	3	0.007358245481338	0.205410860127932
Clostridium_sp_CAG_411	type	disease	0.015964789965694	0.005060519673996	19	1	0.007024676313767	0.205410860127932
Clostridium_sp_CAG_411	comorbidity	IGT	-0.015964789965694	0.005060519673996	19	1	0.007024676313768	0.205410860127932
Clostridium_sp_CAG_411	comorbidity	T2D	-0.017198735694422	0.004880631572321	19	1	0.003371414639798	0.205410860127932
Lawsonibacter_asaccharolyticus	type	disease	0.477862895042796	0.151766856842129	19	18	0.007110491107549	0.205410860127932
Lawsonibacter_asaccharolyticus	comorbidity	IGT	-0.475124895042795	0.151766856842129	19	18	0.007370065583294	0.205410860127932
Eubacterium_eligens	age	age	-1.2467673088303	0.391416017038458	19	17	0.006611343727095	0.205410860127932
Oscillibacter_sp_CAG_241	comorbidity	IGT	-2.64857972770573	0.763998086959495	19	17	0.003776853512783	0.205410860127932
Oscillibacter_sp_CAG_241	comorbidity	T2D	-2.61370728597232	0.736839973880289	19	17	0.003218875403587	0.205410860127932
Ruminococcaceae_bacterium_D16	comorbidity	T2D	-0.106504807911724	0.031556086714996	19	4	0.004531946745981	0.205410860127932
Ruminococcaceae_bacterium_D5	comorbidity	IGT	-0.158426557912691	0.050505479793602	19	4	0.007279877600031	0.205410860127932
Ruminococcaceae_bacterium_D5	comorbidity	T2D	-0.170497579432457	0.048710143450794	19	4	0.003533466593531	0.205410860127932
Ruminococcus_bromii	comorbidity	T2D	-14.530958484276	4.4726997315575	19	11	0.005826462285798	0.205410860127932
Firmicutes_bacterium_CAG_534	type	disease	0.001257422965504	0.000374853309185	19	1	0.004722065633165	0.205410860127932
Firmicutes_bacterium_CAG_534	comorbidity	IGT	-0.001257422965504	0.000374853309185	19	1	0.004722065633165	0.205410860127932
Firmicutes_bacterium_CAG_534	comorbidity	T2D	-0.001166019578191	0.000361528264616	19	1	0.006106036469878	0.205410860127932
Firmicutes_bacterium_CAG_95	age	age	-0.058657135094159	0.018943043595487	19	7	0.007887029446352	0.206496770959043
Phascolarctobacterium_succinatutens	age	age	-0.282982879719054	0.091233077220782	19	2	0.00780509739456	0.206496770959043
Bifidobacterium_bifidum	comorbidity	IGT	-3.49699551479	1.13571553919717	19	8	0.008164122157298	0.207464751291347
Bacteroides_ovatus	age	age	-0.790359006180556	0.265922716314961	19	16	0.010093623789579	0.249168312977036

Here, I just want to get the bacteria that are significantly associated with the control and disease groups after adjusting for confounders. Which ones are those? Are they all of the rows in this significant_results.tsv output, or only those with disease in the value column?

Thanks,
DC7

himel.mallick · May 23, 2021, 7:09am

Hi @DEEPCHANDA7 - that’s correct. You don’t need ID in the model. To find the features associated with your main variable of interest, filter this table on the metadata column (i.e. keep the rows where metadata is type).

Since your metadata variable (type) is binary, MaAsLin 2 internally models this as a dummy variable with type = control as the reference level (as a rule of thumb, the reference level usually does not appear in the value column).
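
As an illustration of what dummy coding with a reference level looks like, here is a small base R sketch. It relies on R's default treatment of factors (levels sorted alphabetically, first level used as the baseline) and is not taken from MaAsLin 2's internals; the five `type` values are a made-up subset in the style of the metadata above.

```r
# The 'type' column for five samples (illustrative subset).
type <- factor(c("disease", "disease", "control", "disease", "control"))

# R orders factor levels alphabetically, so "control" comes first
# and becomes the reference level.
print(levels(type))

# The design matrix has an intercept and one 0/1 column for "disease";
# there is no separate column for the reference level "control".
design <- model.matrix(~ type)
print(colnames(design))
print(design[, "typedisease"])
```

The single `typedisease` column is why only `disease` shows up in the value column of the results: its coefficient is the difference between disease and the control baseline.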

In your case, the coefficients (and their signs) should be interpreted with type = control as the reference, e.g., feature X is more abundant in type = disease than in type = control if the corresponding coefficient is positive, and vice versa. I hope that makes sense.
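
The filtering and sign reading described here can be sketched in base R as follows. The `results` data frame holds four rows copied from the significant_results.tsv output above, with values rounded; the `direction` column is a hypothetical helper added for readability, not something MaAsLin 2 produces.

```r
# Four rows copied from the significant_results.tsv output above (rounded).
results <- data.frame(
  feature = c("Faecalitalea_cylindroides", "Faecalitalea_cylindroides",
              "Ruminococcus_bromii", "Roseburia_inulinivorans"),
  metadata = c("comorbidity", "type", "type", "age"),
  value = c("IGT", "disease", "disease", "age"),
  coef = c(-0.3778, 0.3507, 18.7382, -1.5838),
  qval = c(0.00021, 0.00034, 0.15082, 0.15082),
  stringsAsFactors = FALSE
)

# Keep only the associations with the variable of interest.
type_hits <- results[results$metadata == "type", ]

# Positive coefficient: higher in disease than in control (the reference).
type_hits$direction <- ifelse(type_hits$coef > 0,
                              "higher in disease", "higher in control")

print(type_hits[, c("feature", "coef", "qval", "direction")])
```

With the real output, the same filter could be applied after loading the file with something like `read.delim("my_output/significant_results.tsv")`, assuming that is where the run wrote its results.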

For more information on how to interpret binary categorical variables in a regression setting, feel free to check out the tutorial here. Also, the visualization plots should provide sufficient information to connect the dots.

All the best,
Himel
