How to define the transformation / the normalization to use in Maaslin2

NicolasB · May 21, 2021, 3:54pm

Hi,

Thank you very much for this great tool, which makes multivariate analyses in microbiome studies easy to run.

I still do not fully understand which characteristics of my dataset should guide the choice of normalization or transformation method.

I used a rarefied dataset of relative abundances with about 150 different samples (unrelated, one time point). I want to compare healthy subjects (n=67) vs. patients (n=125). In another analysis, I also want to compare different metadata within patients (with some values present in 5 to 100 patients).

From the paper by Weiss et al. (Microbiome, 2017), I assume that with rarefied data I may not need further normalization. Do you think I am wrong?

I tried several transformation methods, and the results varied from one to another in ways that puzzled me. Compared with a more linear analysis with LEfSe, the closest match came from the AST transformation with no other normalization applied.

Do you have any insights on how to properly choose the settings for my analysis, and which parameters I should take into account when doing so?

Thank you very much,

Nicolas

himel.mallick · May 21, 2021, 5:46pm

Hi @NicolasB - although we did not include rarefaction in our own evaluation, a recent preprint concluded that MaAsLin 2 (particularly with rarefied data) could also be a reasonable choice for users looking for increased statistical power at the potential cost of more false positives.

Coming back to your question, you are right that rarefied data can be considered normalized data, so you don’t need additional normalization before statistical modeling.
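To illustrate why rarefied data already behave like normalized data, here is a minimal Python sketch. The sample names and counts are made up for illustration, and `total_sum_scale` is a hypothetical helper, not a MaAsLin 2 function. Because rarefaction subsamples every sample to the same depth, converting to relative abundance divides every sample by the same constant.

```python
# Hypothetical rarefied count table: every sample was subsampled to the same depth.
rarefied_counts = {
    "sample_A": [500, 300, 200, 0],
    "sample_B": [100, 100, 0, 800],
}


def total_sum_scale(counts):
    """Convert counts to relative abundances (proportions that sum to 1)."""
    depth = sum(counts)
    return [c / depth for c in counts]


# Sequencing depth per sample; identical across samples after rarefaction.
depths = {name: sum(counts) for name, counts in rarefied_counts.items()}

# Relative abundances; with equal depths this is one constant rescaling.
relative = {name: total_sum_scale(counts) for name, counts in rarefied_counts.items()}

for name in rarefied_counts:
    print(name, depths[name], relative[name])
```

With unequal depths the divisor would differ per sample, which is the situation an explicit normalization step is meant to correct.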

As for alternative models/transformations, we usually do not recommend a particular combination over another as the choice is usually problem- and data-specific. Apart from trying out various transformations with the LM models, you can also consider other non-LM models without normalization/transformation and see if that supports your hypothesis.

Check out the following discussions for some more insights:

Best,
Himel

NicolasB · May 21, 2021, 9:44pm

Thank you very much, Himel! This helps a lot!

It looks like the CPLM analysis with the AST transformation fits my hypothesis best, but I have a few remaining naive questions:

What exactly does the AST transformation do?
Are there specific conditions in which it should be used or avoided?

Same questions for LOG and LOGIT…

Thanks a lot,

Nicolas

himel.mallick · May 22, 2021, 8:02am

Hi @NicolasB - most of these variance-stabilizing transformations (LOGIT and AST for proportions and LOG for any positive values) are applied to approximate homoscedasticity when applying linear models. For the non-LM models, you do not need this transformation as most of these GLM-based models (CPLM, NEGBIN, and ZINB) intrinsically apply a log link function by default. In other words, to maintain interpretation, transform should be set to 'NONE' for all the non-LM models.
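For readers who want to see what these transformations do numerically, here is a minimal Python sketch using the textbook definitions. The function names and the pseudocount argument are assumptions for illustration; how MaAsLin 2 itself handles zeros and boundary values may differ.

```python
import math


def ast(p):
    """Arcsine square-root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def logit(p):
    """Log-odds of a proportion; undefined at exactly 0 or 1."""
    return math.log(p / (1.0 - p))


def log2_with_pseudocount(x, pseudocount):
    """Base-2 log of a non-negative value; the pseudocount (an assumption
    here) keeps zeros finite."""
    return math.log2(x + pseudocount)


proportions = [0.001, 0.01, 0.1, 0.5]
for p in proportions:
    print(p, ast(p), logit(p), log2_with_pseudocount(p, 1e-6))
```

AST is defined over the whole interval [0, 1], including zeros, whereas LOGIT and LOG need some adjustment at the boundaries, which is one practical reason AST is often convenient for sparse relative-abundance data.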

Best,
Himel

NicolasB · May 26, 2021, 2:52pm

Thank you very much, Himel! This is very clear to me now.

Best,

Nicolas
