Running MetaPhlAn3 output with MaAsLin2 and Lefse: Highly different result

DEEPCHANDA7 · November 15, 2020, 8:00am

Hi there!!!
My metadata has a single variable, as shown below. I want to find taxa associated with the control and disease groups:

SampleID	type
SRR45656	control
SRR98989	disease
SRR78787	control
SRR45679	disease

I have analyzed the MetaPhlAn output with both LEfSe (on Galaxy) and MaAsLin2. I used this command:
$ Maaslin2.R --transform=NONE --fixed_effects="type,subject" --normalization=NONE --standardize=FALSE cleaned_file_trans.tsv metadata.tsv /media/deep/New\ Volume/obesity/done/prjeb_7854_wgs_analysed/maaslin2/
Is my command correct?
With LEfSe I got 35 significant results in total (P<0.05; 6 disease-enriched, 29 control-enriched), but with MaAsLin2 and the above command I got no significant results. When I used --transform=LOG, I got only 5 significant results. All of them are also in the LEfSe output, and all are from the control-enriched group. Why am I seeing such a discrepancy? Should I stop using MaAsLin2 and go with LEfSe only, or am I making a mistake in my command?
Please help.

Thanks,
DC7

himel.mallick · November 17, 2020, 11:11pm

Hi @DEEPCHANDA7 - it looks like you are supplying subject as a fixed effect in the MaAsLin 2 call, which is not quite the same as LEfSe’s univariate approach. In addition, the results are expected to differ (even for the same comparison) because the modeling paradigms are drastically different (e.g. nonparametric univariate in LEfSe vs. parametric multivariable in MaAsLin 2). Having said that, you should expect to see some consistent results (e.g. features with large enough effect sizes), but for a fair comparison you need to make sure the p-values are comparable across models and correspond to the same contrast (e.g. control vs. disease).

In your specific case, I would drop subject from the fixed effects and compare the p-values corresponding to the control/disease contrast across models. Additionally, I might include subject as a random effect if there are repeated measures, which I cannot tell from your description.
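Whether a subject random effect makes sense depends on whether any subject contributes more than one sample. A minimal sketch of that check follows, assuming a tab-separated metadata table with a hypothetical `subject` column; the metadata shown in this thread has no such column, and the sample IDs and subject labels below are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical metadata: the `subject` column and all values are assumptions.
METADATA_TSV = (
    "SampleID\ttype\tsubject\n"
    "S1\tcontrol\tP01\n"
    "S2\tdisease\tP02\n"
    "S3\tcontrol\tP03\n"
    "S4\tdisease\tP02\n"
)

def repeated_subjects(tsv_text, subject_col="subject"):
    """Return {subject: n_samples} for subjects that appear more than once."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter(row[subject_col] for row in rows)
    return {subj: n for subj, n in counts.items() if n > 1}

repeats = repeated_subjects(METADATA_TSV)
if repeats:
    print("Repeated measures found:", repeats)
else:
    print("Every subject has exactly one sample.")
```

An empty result means there are no repeated measures to model. A column with one unique value per row, such as a sample ID, always gives an empty result.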

To answer your question about LEfSe vs. MaAsLin 2, please use your judgment based on the findings (e.g. the biological relevance of the detected features), not on the number of significant features, which may not always reflect the superiority of one tool over another.

DEEPCHANDA7 · November 17, 2020, 11:40pm

Sir, thanks a lot for your reply. I think I lack proper insight into the two tools and have to work on that. Could you please suggest any articles or other resources that would make it easier for a student from a non-statistical background to understand the nitty-gritty of the tools?
Anyway, I have prepared my metadata in this way:

ID	Control	Obese
SRS12345	YES	NO
SRS23456	YES	NO
SRS34567	YES	NO
SRS45678	YES	NO
SRS56789	NO	YES
SRS67890	NO	YES
SRS98765	NO	YES
SRS87654	NO	YES
SRS76543	NO	YES

When I used --random_effects="ID" --fixed_effects="control,obese", the significantly associated taxa (in the "significant_results.tsv" output file) were consistent with the LEfSe output: LEfSe reported 37 taxa and MaAsLin2 reported 27, all of which are also present in the LEfSe output. Do you think this approach is correct?

himel.mallick · November 18, 2020, 12:56am

Hi @DEEPCHANDA7 - you should create a single variable with two classes (similar to what you had before) and supply that to the fixed_effects option; two separate columns are redundant. For introductory statistics, Modern Statistics for Modern Biology is a good start.
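One way to build that single variable from the two YES/NO columns is sketched below. It assumes the tab-separated layout posted above (a subset of those rows is used here); the output column name `type` and the labels `control` and `obese` are illustrative choices, not something MaAsLin 2 requires.

```python
import csv
import io

# A subset of the two-column metadata posted earlier in the thread.
WIDE_TSV = (
    "ID\tControl\tObese\n"
    "SRS12345\tYES\tNO\n"
    "SRS23456\tYES\tNO\n"
    "SRS56789\tNO\tYES\n"
    "SRS67890\tNO\tYES\n"
)

def collapse_to_single_variable(tsv_text, new_col="type"):
    """Turn the complementary Control/Obese columns into one two-class column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID", new_col])
    for row in reader:
        flags = (row["Control"], row["Obese"])
        if flags == ("YES", "NO"):
            label = "control"
        elif flags == ("NO", "YES"):
            label = "obese"
        else:
            # A sample flagged as both or neither cannot be assigned a class.
            raise ValueError(f"Ambiguous group for {row['ID']}: {flags}")
        writer.writerow([row["ID"], label])
    return out.getvalue()

print(collapse_to_single_variable(WIDE_TSV), end="")
```

The resulting column could then be passed on its own, e.g. as --fixed_effects="type". Because every sample is YES in exactly one of the two original columns, each column fully determines the other, which is why supplying both adds no information.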

DEEPCHANDA7 · November 19, 2020, 7:17pm

Thanks a lot, sir, for suggesting the book. I noticed one thing when I analyzed HUMAnN output data with MaAsLin2 and LEfSe. With LEfSe, I got around 30 significant features (P<0.05, without FDR correction). In the MaAsLin2 output, however, I get no significant features, because the lowest q-value among all the features is 0.4 (the default threshold is q<0.25). But if I filter the MaAsLin2 output for features with P-value <0.05, I see that most of them are the same as in the LEfSe output.
In this context, I am totally confused about whether I should accept the LEfSe output and report it or not. One post on the bioBakery forum states that adjustment is not necessary for LEfSe (although @sma emphasised “personal preferences”). I also wonder whether the P-value adjustment is discarding truly significant features.
Please suggest what I should do.
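For readers following the p-value versus q-value point: MaAsLin2 applies Benjamini-Hochberg (BH) correction by default, and the sketch below shows how features with P<0.05 can all end up with q = 0.4. It is a plain re-implementation of the BH procedure, not MaAsLin2's own code, and the p-values are entirely hypothetical (100 tested features is an assumption, not a number from this thread).

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value
    # is capped by the q-values of all larger p-values (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical input: 10 features with p < 0.05 among 100 tested features.
pvals = [0.004 * r for r in range(1, 11)] + [0.1 + 0.01 * i for i in range(90)]
qvals = benjamini_hochberg(pvals)

n_p = sum(p < 0.05 for p in pvals)
n_q = sum(q < 0.25 for q in qvals)
print(f"features with p < 0.05: {n_p}")
print(f"features with q < 0.25: {n_q}")
print(f"smallest q-value: {min(qvals):.2f}")
```

In this made-up example ten features pass P<0.05 but none passes q<0.25, because each q-value scales the p-value by the number of tests divided by its rank. The adjustment does not change the ranking of features; it changes how many of the top-ranked ones are expected to be false positives.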

lephuong07 · May 19, 2022, 2:02am

Hi expert,

Could you please answer this question and explain more about the difference between the LEfSe output and the MaAsLin2 output?
Thank you
