Hi, I am a new user of the Maaslin2 and MMUPHin packages and I'm not sure which one to use. I'm also confused about why their results differ.
I have 16S amplicon sequencing data from humans: more than 300 samples, sequenced in 4 different runs (a few months apart and of varying quality). I would also like to reduce the confounding effect of the person's age.
I am interested in a variable (“var”) that has 3 levels, but since MMUPHin does not seem to handle more than two, I have started with 2 levels for now.
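For context, a minimal sketch of how the metadata could be restricted to two levels. The toy values and the third level name "e1" are made up for illustration; only the column names and the levels e2/e3 come from my real data, and this is not necessarily the exact code I used.

```r
# Toy metadata: column names mirror the post, values are invented.
# "e1" is a hypothetical name for the third level of var.
metadata <- data.frame(
  var   = factor(c("e1", "e2", "e3", "e2", "e3", "e1")),
  Age   = c(34, 51, 47, 29, 62, 40),
  Plate = c("PL1", "PL2", "PL3", "PL4", "PL1", "PL2"),
  row.names = paste0("S", 1:6)
)

# Keep only the samples belonging to the two levels being compared
keep <- metadata$var %in% c("e2", "e3")
metadata_2lvl <- droplevels(metadata[keep, ])  # drop the now-unused level

# Put e3 first, mirroring reference = "var,e3" in the Maaslin2 call
metadata_2lvl$var <- relevel(metadata_2lvl$var, ref = "e3")

print(table(metadata_2lvl$var))
```

The feature table would need to be subset to the same samples (for example by matching on the sample IDs) before either tool is run.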
My count data has been filtered to keep only taxa that are present in at least 5% of the samples.
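A minimal sketch of what such a prevalence filter can look like in base R, assuming taxa are in rows and samples in columns, and that "present" means a count greater than zero. The toy matrix and object names are made up for illustration.

```r
# Toy count matrix: taxa in rows, samples in columns (values are invented)
counts <- matrix(
  c(10, 0, 3, 7,   # taxon_a: present in 3 of 4 samples
     0, 0, 0, 5,   # taxon_b: present in 1 of 4 samples
     0, 0, 0, 0),  # taxon_c: absent from every sample
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_a", "taxon_b", "taxon_c"), paste0("S", 1:4))
)

min_prevalence <- 0.05

# Prevalence = fraction of samples in which the taxon has a non-zero count
prevalence <- rowMeans(counts > 0)

# Keep taxa that reach the prevalence threshold
counts_filtered <- counts[prevalence >= min_prevalence, , drop = FALSE]

print(prevalence)
print(rownames(counts_filtered))
```

Because the data were already filtered this way, `min_prevalence = 0.05` in the Maaslin2 call should not remove much more.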
For Maaslin2 I ran:

    library(Maaslin2)

    fit_data_TSS_AST <- Maaslin2(
        input_data = df_input_data,
        input_metadata = df_input_metadata,
        min_prevalence = 0.05,
        normalization = "TSS",
        transform = "AST",
        output = "Maaslin2_TSS_AST_var_Age",
        random_effects = "Plate",         # sequencing run as a random effect
        fixed_effects = c("var", "Age"),  # variable of interest plus Age
        reference = "var,e3")             # e3 is the reference level of var
For MMUPHin:

    library(MMUPHin)

    fit_lm_meta <- lm_meta(feature_abd = otu_table,
                           batch = "Plate",     # sequencing run as the batch
                           exposure = "var",
                           covariates = "Age",
                           data = metadata,
                           control = list(verbose = FALSE))
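With MMUPHin, the variance explained by run (Plate) before batch correction:

term | Df | SumsOfSqs | MeanSqs | F.Model | R2 | Pr(>F) |
---|---|---|---|---|---|---|
Plate | 3 | 4.347 | 1.44898 | 4.3819 | 0.05193 | 0.001 *** |
Residuals | 240 | 79.361 | 0.33067 | | 0.94807 | |
Total | 243 | 83.708 | | | 1.00000 | |

and after:

term | Df | SumsOfSqs | MeanSqs | F.Model | R2 | Pr(>F) |
---|---|---|---|---|---|---|
Plate | 3 | 3.338 | 1.11259 | 3.3757 | 0.04049 | 0.001 *** |
Residuals | 240 | 79.102 | 0.32959 | | 0.95951 | |
Total | 243 | 82.439 | | | 1.00000 | |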
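As a sanity check on how I read these tables, here is a small sketch that recomputes R2 and F from the sums of squares above. It uses only the numbers already shown; R2 is the Plate sum of squares divided by the total, and F is the ratio of the Plate and residual mean squares. Small differences from the printed values come from rounding in the tables.

```r
# Sums of squares copied from the two tables above
ss <- data.frame(
  stage    = c("before", "after"),
  plate    = c(4.347, 3.338),
  residual = c(79.361, 79.102),
  total    = c(83.708, 82.439)
)

df_plate <- 3    # degrees of freedom for Plate (4 runs - 1)
df_resid <- 240  # residual degrees of freedom

# R2: share of the total variation attributed to Plate
ss$R2 <- ss$plate / ss$total

# F: mean square for Plate divided by the residual mean square
ss$F <- (ss$plate / df_plate) / (ss$residual / df_resid)

print(ss)
```

So Plate explains roughly 5.2% of the variation before and 4.0% after, and it is still significant in both cases.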
The results from Maaslin2 (significant results only, q-value below 0.05):
feature | metadata | value | coef | stderr | N | N.not.0 | pval | qval |
---|---|---|---|---|---|---|---|---|
X64f3a773c250e16004e1546f7a504082 | Age | Age | -0.0021747113575296 | 0.000428073175218923 | 244 | 38 | 7.57669310129663E-07 | 0.00140168822373988 |
X943b6182fabfe1d061b8b5a78dd03ba2 | var | e2 | 0.0032969691011505 | 0.000719395197448784 | 244 | 14 | 7.41129764561043E-06 | 0.00685545032218964 |
X94b51e4d3cd9c11f4295019c56c54fff | Age | Age | -0.00441241349809916 | 0.00102088071466314 | 244 | 36 | 2.26589633545076E-05 | 0.0137368573040787 |
d485d060694e34bea416f2718b78916d | var | e2 | 0.00251710609189474 | 0.000591242182044747 | 244 | 16 | 0.0000297013130899 | 0.0137368573040787 |
X414ea75fb1462d10153c7bed625b4b99 | Age | Age | -0.00549980227032218 | 0.00135891392207732 | 244 | 85 | 6.99837060011561E-05 | 0.0161837320127673 |
X4d6fe682a4dd9aad8decfab830a193e0 | var | e2 | 0.0305958073812541 | 0.0074457626693776 | 244 | 172 | 5.4686780910475E-05 | 0.0161837320127673 |
dfa833b266bd2993b86feab3617b34c3 | var | e2 | 0.0856153563649985 | 0.020827311368041 | 244 | 113 | 5.43566241897518E-05 | 0.0161837320127673 |
X85e5f4133e5ea47da3e24828bdf463f5 | Age | Age | -0.0202145237636325 | 0.00495487202855318 | 244 | 145 | 6.14798474737681E-05 | 0.0161837320127673 |
X6630583dd78eb092ece5e17515eb301d | Age | Age | -0.00287220450181886 | 0.000738457915721947 | 244 | 88 | 0.00013031433581666 | 0.0267868356956468 |
bf37c87f8841a9f1a2cd20db4fa18b74 | Age | Age | -0.0123296167389853 | 0.00320981656251117 | 244 | 165 | 0.000157323495297528 | 0.0271737761445602 |
d6b10cc94394ef8ebb3cde7e168bad0b | var | e2 | 0.00517067322095915 | 0.00134861733482618 | 244 | 13 | 0.00016157380410279 | 0.0271737761445602 |
X6251bd9ebf43fae466939ab366f6e547 | Age | Age | -0.0134484525431645 | 0.00356032198500978 | 244 | 103 | 0.000199970386710597 | 0.0308287679512171 |
X5e044f34a0a5ffd168ee1f5855fb99ad | Age | Age | -0.00358554288115779 | 0.000956653862164803 | 244 | 13 | 0.000223645512868537 | 0.0318264768312918 |
X6343c15ef6a0b28bb8d019ebbcd0a55a | var | e2 | 0.014802943061795 | 0.0039906998959427 | 244 | 24 | 0.00025819175795474 | 0.0341181965868764 |
e15b6ef1cd643dff3f0649b7baba06e8 | Age | Age | -0.00767302431211445 | 0.00209517285329667 | 244 | 130 | 0.000307588745151869 | 0.0379359452353972 |
X31168bbdcf24a70a8e892927acedee65 | var | e2 | 0.00503120048542711 | 0.00139140009405133 | 244 | 13 | 0.000365582859122471 | 0.0422705180860357 |
Results from MMUPHin (meta_fits table; the first 3 rows, followed by 4d6fe682a4dd9aad8decfab830a193e0, which is at position 115 when sorted by q-value):
feature | exposure | coef | stderr | pval | k | tau2 | stderr.tau2 | pval.tau2 | I2 | H2 | weight_PL1 | weight_PL2 | weight_PL3 | weight_PL4 | pval.bonf | qval.fdr |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
dfa833b266bd2993b86feab3617b34c3 | e2 | 0.0854174143263161 | 0.0189210818739646 | 6.34949707183327E-06 | 4 | 0.000194254244816285 | 0.00115934198844055 | 0.378899260384403 | 13.3664774631044 | 1.15428759066575 | 28.4064769389096 | 29.4447803088867 | 13.243601959127 | 28.9051407930767 | 0.0441671016316722 | 0.0441671016316722 |
4e8b08e013947a5b90af66139033012c | e2 | 0.00479073978786018 | 0.00117065042118702 | 4.2697856742547E-05 | 2 | 0 | 4.41470455066627E-06 | 0.485339051662415 | 0 | 1 | NA | 67.4637738719965 | 32.5362261280035 | NA | 0.297006291501157 | 0.0644543394153818 |
da96578ff1ca89aed029675b4c825780 | e2 | 0.00897873545336833 | 0.00211717311298607 | 2.22617816563329E-05 | 2 | 0 | 1.2990452355072E-05 | 0.840594251974998 | 0 | 1 | NA | 57.7518968829216 | NA | 42.2481031170784 | 0.154852953201452 | 0.0644543394153818 |
4d6fe682a4dd9aad8decfab830a193e0 | e2 | 0.0313240749329559 | 0.0115236982582148 | 0.0065631980311945 | 4 | 0.00036263338927189 | 0.000433668319768228 | 0.026427971376799 | 68.6693789864841 | 3.19176565178393 | 26.9692047069443 | 24.998413065039 | 22.0729197113666 | 25.9594625166501 | 1 | 0.399936086505716 |
Each pipeline gives very few significant results, and only one of them is shared (dfa833b266bd2993b86feab3617b34c3).
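To make the discrepancy for 4d6fe682a4dd9aad8decfab830a193e0 concrete, here is a small sketch using only the values from the two tables above. It assumes the MMUPHin p-value is a two-sided Wald test under a normal approximation (my assumption, not something I have confirmed in the package); for Maaslin2 the normal approximation is only a rough stand-in for the mixed-model test. The relation H2 = 100 / (100 - I2) is the standard link between the two heterogeneity statistics.

```r
# Values copied from the result tables above for feature 4d6fe682...
maaslin2 <- c(coef = 0.0305958073812541, stderr = 0.0074457626693776)
mmuphin  <- c(coef = 0.0313240749329559, stderr = 0.0115236982582148)

# Wald statistic = coefficient / standard error
z_maaslin2 <- maaslin2[["coef"]] / maaslin2[["stderr"]]
z_mmuphin  <- mmuphin[["coef"]] / mmuphin[["stderr"]]

# Two-sided p-value under a normal approximation (assumption, see lead-in)
p_norm <- function(z) 2 * pnorm(-abs(z))

# How much larger the MMUPHin standard error is for the same feature
stderr_ratio <- mmuphin[["stderr"]] / maaslin2[["stderr"]]

# Between-run heterogeneity reported by MMUPHin for this feature
I2 <- 68.6693789864841
H2 <- 100 / (100 - I2)

cat(sprintf("Maaslin2: z = %.3f, approx. p = %.2e\n",
            z_maaslin2, p_norm(z_maaslin2)))
cat(sprintf("MMUPHin:  z = %.3f, approx. p = %.2e\n",
            z_mmuphin, p_norm(z_mmuphin)))
cat(sprintf("stderr ratio = %.2f, H2 = %.3f\n", stderr_ratio, H2))
```

The coefficients are almost identical in the two tools; what differs is the standard error, which is about 1.5 times larger in MMUPHin, together with an I2 of about 69% for this feature. That is what makes me suspect the between-run heterogeneity is behind the difference.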
Questions:
- Are the commands correct for removing the confounding effects of Age and Plate?
- Why is 4d6fe682a4dd9aad8decfab830a193e0 significant in Maaslin2 but not in MMUPHin? Is it somehow related to the weight of each run, and does MMUPHin deal better with the differences between runs?
- The MMUPHin default normalization is “TSS” and the default transformation is “AST”, so I used the same for Maaslin2 to keep the results comparable. However, Maaslin2 advises using TMM or CSS for count data. MMUPHin should be fine to use with count data, so are these normalization and transformation methods OK to use here?
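For reference, this is my understanding of what TSS followed by AST does to a count table, as a base R sketch on a toy matrix (taxa in rows, samples in columns; the values and names are made up). It is meant to illustrate the two steps and is not the packages' internal code.

```r
# Toy count matrix: taxa in rows, samples in columns (values are invented)
counts <- matrix(
  c(10, 0, 30,
    90, 5,  0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("taxon_a", "taxon_b"), paste0("S", 1:3))
)

# TSS (total sum scaling): divide each sample by its total count,
# so every column becomes relative abundances that sum to 1
rel_abund <- sweep(counts, 2, colSums(counts), "/")

# AST (arcsine square-root): variance-stabilising transform for proportions
ast <- asin(sqrt(rel_abund))

print(rel_abund)
print(ast)
```

After these two steps the coefficients are on the arcsine square-root scale of relative abundance, which is why the coef columns from both tools look so small.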
Thank you very much in advance!!