Maaslin2 vs MMUPHin

rahel31 · December 1, 2022, 8:04pm

Hi, I am a new user of the Maaslin2 and MMUPHin packages, and I’m not sure which one to use. I’m also confused about why their results differ.
I have 16S amplicon sequencing data from humans: more than 300 samples, sequenced in 4 different runs (a few months apart and of varying quality). I would also like to reduce the confounding effect of the person’s age.
I am interested in a variable (“var”) that has 3 levels, but since MMUPHin seems unable to handle more than two, I have started with 2 levels for now.
My count data has been filtered to keep only taxa that are present in at least 5% of the samples.
For Maaslin2 I run:

fit_data_TSS_AST = Maaslin2(
    input_data = df_input_data,
    input_metadata = df_input_metadata,
    min_prevalence = 0.05,
    normalization = "TSS",
    transform = "AST",
    output = "Maaslin2_TSS_AST_var_Age",
    random_effects = "Plate",
    fixed_effects = c("var", "Age"),
    reference = "var,e3")

For MMUPHin:

fit_lm_meta <- lm_meta(feature_abd = otu_table,
                       batch = "Plate",
                       exposure = "var",
                       covariates = "Age",
                       data = metadata,
                       control = list(verbose = FALSE))

With MMUPHin, variance explained by run (Plate) before:

           Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
Plate       3     4.347 1.44898  4.3819 0.05193  0.001 ***
Residuals 240    79.361 0.33067         0.94807
Total     243    83.708                 1.00000

after:

           Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
Plate       3     3.338 1.11259  3.3757 0.04049  0.001 ***
Residuals 240    79.102 0.32959         0.95951
Total     243    82.439                 1.00000

The results from Maaslin2 (significant results only, q-value under 0.05):

feature	metadata	value	coef	stderr	N	N.not.0	pval	qval
X64f3a773c250e16004e1546f7a504082	Age	Age	-0.0021747113575296	0.000428073175218923	244	38	7.57669310129663E-07	0.00140168822373988
X943b6182fabfe1d061b8b5a78dd03ba2	var	e2	0.0032969691011505	0.000719395197448784	244	14	7.41129764561043E-06	0.00685545032218964
X94b51e4d3cd9c11f4295019c56c54fff	Age	Age	-0.00441241349809916	0.00102088071466314	244	36	2.26589633545076E-05	0.0137368573040787
d485d060694e34bea416f2718b78916d	var	e2	0.00251710609189474	0.000591242182044747	244	16	0.0000297013130899	0.0137368573040787
X414ea75fb1462d10153c7bed625b4b99	Age	Age	-0.00549980227032218	0.00135891392207732	244	85	6.99837060011561E-05	0.0161837320127673
X4d6fe682a4dd9aad8decfab830a193e0	var	e2	0.0305958073812541	0.0074457626693776	244	172	5.4686780910475E-05	0.0161837320127673
dfa833b266bd2993b86feab3617b34c3	var	e2	0.0856153563649985	0.020827311368041	244	113	5.43566241897518E-05	0.0161837320127673
X85e5f4133e5ea47da3e24828bdf463f5	Age	Age	-0.0202145237636325	0.00495487202855318	244	145	6.14798474737681E-05	0.0161837320127673
X6630583dd78eb092ece5e17515eb301d	Age	Age	-0.00287220450181886	0.000738457915721947	244	88	0.00013031433581666	0.0267868356956468
bf37c87f8841a9f1a2cd20db4fa18b74	Age	Age	-0.0123296167389853	0.00320981656251117	244	165	0.000157323495297528	0.0271737761445602
d6b10cc94394ef8ebb3cde7e168bad0b	var	e2	0.00517067322095915	0.00134861733482618	244	13	0.00016157380410279	0.0271737761445602
X6251bd9ebf43fae466939ab366f6e547	Age	Age	-0.0134484525431645	0.00356032198500978	244	103	0.000199970386710597	0.0308287679512171
X5e044f34a0a5ffd168ee1f5855fb99ad	Age	Age	-0.00358554288115779	0.000956653862164803	244	13	0.000223645512868537	0.0318264768312918
X6343c15ef6a0b28bb8d019ebbcd0a55a	var	e2	0.014802943061795	0.0039906998959427	244	24	0.00025819175795474	0.0341181965868764
e15b6ef1cd643dff3f0649b7baba06e8	Age	Age	-0.00767302431211445	0.00209517285329667	244	130	0.000307588745151869	0.0379359452353972
X31168bbdcf24a70a8e892927acedee65	var	e2	0.00503120048542711	0.00139140009405133	244	13	0.000365582859122471	0.0422705180860357

Results from MMUPHin (meta_fits table: the first 3 rows, followed by 4d6fe682a4dd9aad8decfab830a193e0, which is at position 115 when sorted by q-value):

feature	exposure	coef	stderr	pval	k	tau2	stderr.tau2	pval.tau2	I2	H2	weight_PL1	weight_PL2	weight_PL3	weight_PL4	pval.bonf	qval.fdr
dfa833b266bd2993b86feab3617b34c3	e2	0.0854174143263161	0.0189210818739646	6.34949707183327E-06	4	0.000194254244816285	0.00115934198844055	0.378899260384403	13.3664774631044	1.15428759066575	28.4064769389096	29.4447803088867	13.243601959127	28.9051407930767	0.0441671016316722	0.0441671016316722
4e8b08e013947a5b90af66139033012c	e2	0.00479073978786018	0.00117065042118702	4.2697856742547E-05	2	0	4.41470455066627E-06	0.485339051662415	0	1	NA	67.4637738719965	32.5362261280035	NA	0.297006291501157	0.0644543394153818
da96578ff1ca89aed029675b4c825780	e2	0.00897873545336833	0.00211717311298607	2.22617816563329E-05	2	0	1.2990452355072E-05	0.840594251974998	0	1	NA	57.7518968829216	NA	42.2481031170784	0.154852953201452	0.0644543394153818
4d6fe682a4dd9aad8decfab830a193e0	e2	0.0313240749329559	0.0115236982582148	0.0065631980311945	4	0.00036263338927189	0.000433668319768228	0.026427971376799	68.6693789864841	3.19176565178393	26.9692047069443	24.998413065039	22.0729197113666	25.9594625166501	1	0.399936086505716

Each pipeline gives very few significant results, out of which 1 is the same (dfa833b266bd2993b86feab3617b34c3).
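
A quick check on the numbers posted above, as a sketch rather than anything taken from either package's documentation. For 4d6fe682a4dd9aad8decfab830a193e0 the two effect sizes are nearly identical (about 0.031), but the MMUPHin standard error is noticeably larger, and that row also shows I2 of about 69% with pval.tau2 of about 0.026. The snippet below assumes the MMUPHin p-value is a two-sided Wald z-test on coef / stderr; that is an assumption, and the helper name `wald_p` is made up for illustration.

```r
# Values copied from the two result tables above for feature
# 4d6fe682a4dd9aad8decfab830a193e0 (exposure level e2).
maaslin2 <- c(coef = 0.0305958073812541, stderr = 0.0074457626693776)
mmuphin  <- c(coef = 0.0313240749329559, stderr = 0.0115236982582148)

# Assumption: two-sided Wald z-test on coef / stderr (hypothetical helper).
wald_p <- function(coef, stderr) 2 * pnorm(-abs(coef / stderr))

p_mmuphin <- wald_p(mmuphin[["coef"]], mmuphin[["stderr"]])
se_ratio  <- mmuphin[["stderr"]] / maaslin2[["stderr"]]

print(p_mmuphin)  # compare with the pval reported in the MMUPHin table
print(se_ratio)   # how much wider the MMUPHin standard error is
```

If the assumption holds, the gap in significance for this feature would come mostly from the wider standard error, not from a different effect estimate.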

Questions:

Are the commands correct for removing the confounding effects of Age and Plate?
Why is 4d6fe682a4dd9aad8decfab830a193e0 significant in Maaslin2 but not in MMUPHin? Is it somehow related to the weight of each run? And does MMUPHin deal better with the differences between runs?
The MMUPHin default normalization is “TSS” and the default transformation is “AST”. I used the same for Maaslin2 so that the results would be comparable. However, Maaslin2 seems to advise using TMM or CSS for count data. MMUPHin should be fine with count data, so are these normalization and transformation methods OK to use here?

Thank you very much in advance!!

andrewGhazi · December 1, 2022, 9:27pm

MMUPHin is intended for meta-analyses where the batch variable is usually indicating something like the identifier for different studies conducted by different research groups. Unless your experimental conditions changed dramatically across the 4 runs/plates, you probably want to use MaAsLin2 here.

From a modeling perspective, when using a variable as a random effect in MaAsLin2 vs as a batch variable in MMUPHin the data aren’t handled in exactly the same way, so it makes sense to me that you’re getting consistent but not exactly equal results.
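
To make the batch-wise view concrete, here is a minimal base-R sketch of random-effects pooling of per-run coefficients. It uses the DerSimonian-Laird estimator for brevity; this is a swapped-in textbook method, and MMUPHin's actual estimator and settings may differ. The function name `pool_random_effects` and the per-run numbers are hypothetical. The output fields mirror columns such as tau2, I2 and the per-batch weights in the meta_fits table.

```r
# Hypothetical helper: DerSimonian-Laird random-effects pooling of
# per-batch coefficients (needs at least two batches).
pool_random_effects <- function(coef, stderr) {
  stopifnot(length(coef) >= 2, length(coef) == length(stderr))
  k <- length(coef)
  v <- stderr^2
  w <- 1 / v                                   # inverse-variance weights
  fixed <- sum(w * coef) / sum(w)              # fixed-effect pooled estimate
  Q <- sum(w * (coef - fixed)^2)               # Cochran's Q
  C <- sum(w) - sum(w^2) / sum(w)
  tau2 <- max(0, (Q - (k - 1)) / C)            # between-batch variance
  w_re <- 1 / (v + tau2)                       # random-effects weights
  est <- sum(w_re * coef) / sum(w_re)
  se <- sqrt(1 / sum(w_re))
  list(coef = est,
       stderr = se,
       pval = 2 * pnorm(-abs(est / se)),
       tau2 = tau2,
       I2 = if (Q > 0) 100 * max(0, (Q - (k - 1)) / Q) else 0,
       weights = 100 * w_re / sum(w_re))       # percent weight per batch
}

# Made-up per-run estimates for a single feature, four runs.
fit <- pool_random_effects(coef   = c(0.05, 0.01, 0.06, 0.02),
                           stderr = c(0.010, 0.011, 0.013, 0.010))
print(fit)
```

When the runs disagree, tau2 grows and the pooled standard error widens. A single mixed model with a random intercept for Plate does not estimate a separate exposure effect per run, which is one reason the two tools can rank the same feature differently.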

MaAsLin2’s default TSS + LOG normalization and transform is probably the best starting point for 16S data.
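
For intuition on what the two transforms do to relative abundances, a small base-R sketch follows. The function names and input values are made up, and the zero handling in `log_transform` (half the smallest non-zero value as a pseudo-count) is an assumption for illustration, not a statement of what MaAsLin2 does internally.

```r
# Hypothetical relative abundances of one feature across six samples.
rel_abund <- c(0, 0.001, 0.01, 0.05, 0.2, 0.5)

# AST: arcsine square-root transform, defined on proportions in [0, 1].
ast <- function(x) asin(sqrt(x))

# LOG: log2 after replacing zeros with a pseudo-count.
# Assumption: the pseudo-count is half the smallest non-zero value.
log_transform <- function(x) {
  x[x == 0] <- min(x[x > 0]) / 2
  log2(x)
}

print(data.frame(rel_abund,
                 AST = ast(rel_abund),
                 LOG = log_transform(rel_abund)))
```

The log scale spreads out the low-abundance values much more than AST does, so coefficients from the two transforms are not on the same scale and should not be compared directly.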

rahel31 · December 4, 2022, 4:10pm

Thank you Andrew for your clear reply.
One more question -
In the thread “Choosing analysis method for maaslin2” it’s mentioned:

Among the normalization approaches implemented in MaAsLin 2, TMM and CSS only work on counts and they also return normalized counts unlike TSS and CLR. Therefore, if your input is count, you can use the above two normalizations (i.e., TMM, CSS, or NONE (in case the data is already normalized)) without a further transformation (i.e. transform = 'NONE').

I understand from this that with counts it would be better to use TMM or CSS with no transformation? However, you suggest that TSS with LOG should be OK for 16S count data as well?
Thank you for helping out!

andrewGhazi · December 5, 2022, 10:32pm

It’s not that you “should” use TMM or CSS, but more that you “can”. The effect of the different choices with alternative methods is overviewed in the results section of the paper (ctrl-f “TMM”, figs S2-S5) as well as the tutorial. But generally we recommend converting counts to relative abundance without rarefaction before running MaAsLin2. That’s generally the simplest and most effective approach.
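
Converting counts to relative abundance (total sum scaling) can be sketched in base R as below. The count table and the function name `tss` are hypothetical; rows are samples and columns are taxa.

```r
# Hypothetical count table: rows are samples, columns are taxa.
counts <- matrix(c( 10,  30,  60,
                   100, 300, 600,
                     5,   0,  45),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("S1", "S2", "S3"),
                                 c("taxonA", "taxonB", "taxonC")))

# Total sum scaling: divide each sample (row) by its sequencing depth.
tss <- function(counts) {
  depth <- rowSums(counts)
  stopifnot(all(depth > 0))   # a sample with zero reads cannot be scaled
  sweep(counts, 1, depth, "/")
}

rel_abund <- tss(counts)
print(rel_abund)
```

S1 and S2 differ tenfold in depth but end up with identical proportions, so no reads are discarded the way they would be with rarefaction.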

rahel31 · December 7, 2022, 10:29am

Thanks for the explanation!
Have a nice day
