The bioBakery help forum

Choosing analysis method for maaslin2

Hello and thank you for this great analysis tool!
I am running MaAsLin 2 in R and would like to try analysis methods other than LM.
What are the considerations for the proper normalization/transformation that goes with the various methods?
For instance, when I try to run the NEGBIN model, I receive an error message that the transformation is not appropriate.
Could you please point me to an explanation of the various methods?

Thank you.
Best,
Lena

Hi @Lena_Lapidot - apologies that we have not documented this part of the functionality well in our current MaAsLin 2 tutorial. I hope the following is helpful when choosing the right combination of statistical model, normalization, and transformation.

  • For statistical models, if your input is count data, you can use NEGBIN and ZINB, whereas for non-count input you can use LM and CPLM.

  • Apart from the statistical models, you need to pay close attention to whether the selected normalization and transformation options are valid with respect to the input requirement above.

  • Among the normalization approaches implemented in MaAsLin 2, TMM and CSS work only on counts and, unlike TSS and CLR, they also return normalized counts. Therefore, if your input is count data, you can use one of three normalization options (TMM, CSS, or NONE if the data is already normalized) without a further transformation (i.e., transform = 'NONE').

  • Among the non-count models, CPLM requires the data to be positive. Therefore, any transformation that produces negative values will typically NOT work for CPLM.

  • All the non-LM models use an intrinsic log link because of their close connection to GLMs, so they are recommended to be run with transform = 'NONE'.

  • Apart from that, LM is the only model that works on both positive and negative values (following normalization/transformation), so you have more wiggle room to vary the corresponding parameters, which are typically limited for non-LM models.
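The rules in the list above can be written down as a small pre-flight check. This is only a sketch of the guidance in this post, not MaAsLin 2's own validation code: `check_maaslin2_combo` and its `input_is_count` argument are hypothetical names, and the CLR rule for CPLM is an inference from the "no negative values" point (CLR output contains negative values).

```r
# Hypothetical helper (not part of MaAsLin 2): returns a character vector
# of problems found in a model/normalization/transform combination,
# or an empty vector if the combination follows the rules listed above.
check_maaslin2_combo <- function(analysis_method, normalization, transform,
                                 input_is_count) {
  problems <- character(0)
  count_models <- c("NEGBIN", "ZINB")

  if (analysis_method %in% count_models) {
    # Count models need count input ...
    if (!input_is_count) {
      problems <- c(problems, "NEGBIN/ZINB expect count input")
    }
    # ... and a normalization that keeps the data as counts.
    if (!normalization %in% c("TMM", "CSS", "NONE")) {
      problems <- c(problems, "use normalization TMM, CSS, or NONE with count models")
    }
  }

  # All non-LM models already use a log link, so no extra transform.
  if (analysis_method != "LM" && transform != "NONE") {
    problems <- c(problems, "use transform = 'NONE' with non-LM models")
  }

  # Assumption: CLR yields negative values, which CPLM cannot handle.
  if (analysis_method == "CPLM" && normalization == "CLR") {
    problems <- c(problems, "CLR produces negative values; avoid with CPLM")
  }

  problems
}

# Example: the default-style settings (TSS + LOG) combined with NEGBIN
# are flagged for both the normalization and the transform.
print(check_maaslin2_combo("NEGBIN", "TSS", "LOG", input_is_count = TRUE))
```

An empty result only means the combination is consistent with the points above; MaAsLin 2 itself may apply further checks.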

I know it’s a lot of information but I hope this helps. Please let us know if you have any follow-up questions or if you encounter any issues with the alternative non-default models.

All the best,
Himel

Thank you for the great explanation, Himel!
I have the raw abundances of 16S bacterial sequencing, obtained from fecal samples. I want to analyze them collapsed at the genus level. My fixed effects all have positive values (treated as continuous variables).

What is the best approach in this case? LM runs smoothly; however, when I look at the scatterplots, LM does not look like the best fit for some of the taxa.
On the other hand, when I try the NEGBIN model, I get errors regarding the proper normalization and transformation needed.

Thank you,
Best,
Lena

Hi @Lena_Lapidot - as described above, for NEGBIN, you need to apply one of the three normalizations (i.e., TMM, CSS, or NONE) without a transformation in your MaAsLin 2 call, i.e., transform = 'NONE' together with normalization = 'CSS', normalization = 'TMM', or normalization = 'NONE'. Note that normalization = 'NONE' assumes that the data is already normalized.
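For illustration, such a call might look roughly like the sketch below. The toy `genus_counts` and `metadata` tables, the covariate names `age` and `bmi`, and the output folder name are all made-up placeholders standing in for the real genus-level counts and metadata; only the `Maaslin2()` arguments themselves (`input_data`, `input_metadata`, `output`, `fixed_effects`, `analysis_method`, `normalization`, `transform`) come from the package.

```r
# Toy stand-ins for the real data (placeholders): samples in rows,
# genus-level raw counts in columns, continuous positive covariates.
set.seed(1)
genus_counts <- as.data.frame(matrix(
  rnbinom(20 * 5, mu = 50, size = 1),
  nrow = 20,
  dimnames = list(paste0("S", 1:20), paste0("Genus", 1:5))
))
metadata <- data.frame(
  age = runif(20, 20, 60),
  bmi = runif(20, 18, 35),
  row.names = rownames(genus_counts)
)

# The combination described above: count model, count-preserving
# normalization, and no additional transformation.
negbin_options <- list(
  analysis_method = "NEGBIN",
  normalization   = "TMM",   # or "CSS", or "NONE" if already normalized
  transform       = "NONE"
)

# Run only if the Maaslin2 package is installed.
if (requireNamespace("Maaslin2", quietly = TRUE)) {
  fit <- do.call(
    Maaslin2::Maaslin2,
    c(
      list(
        input_data     = genus_counts,
        input_metadata = metadata,
        output         = "maaslin2_negbin_output",  # placeholder folder
        fixed_effects  = c("age", "bmi")
      ),
      negbin_options
    )
  )
}
```

Swapping `analysis_method` to "ZINB" should follow the same pattern, since the same normalization and transform restrictions apply to both count models.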

Hi Himel,
I got it, it works!

Thank you :pray: :blush: