I am using Maaslin2 to analyze 16S count data and metadata. The input consists only of absolute counts, all of them integer values.
Therefore, I want to use “NEGBIN” or “ZINB” as the analysis method and “TMM” or “CSS” for normalization.
However, this leads to the following warning:
simpleWarning in glmmTMB::glmmTMB(formula, data = data, family = glmmTMB::nbinom2(link = "log"), : non-integer counts in a nbinom2 model
I don’t receive this warning when I use “None” for normalization. So TMM must produce non-integer normalized counts? I am confused as to why, and how to fix the issue.
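TMM normalization rescales each sample’s counts by a sample-specific scaling factor, so the normalized values are generally no longer whole numbers. (Normalizing each sample so that its values lie between 0 and 1 and sum to 1, which can be thought of as “percentages”, is what TSS does.) Either way, when you apply a normalization you are introducing non-integer counts into your data.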
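To make the cause of the warning concrete, here is a minimal base-R sketch with made-up counts. The matrix, the scaling factors, and the helper `is_whole` are all illustrative assumptions, not MaAsLin2 internals. It only shows that dividing integer counts by a per-sample total or scaling factor leaves non-integer values, which is what `nbinom2` complains about.

```r
# Made-up integer counts: 2 samples (rows) x 3 taxa (columns)
counts <- matrix(c(10L, 0L, 5L,
                   3L, 7L, 20L),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("s1", "s2"), c("taxA", "taxB", "taxC")))

# TSS-style proportions: divide each sample by its total, so rows sum to 1
tss <- sweep(counts, 1, rowSums(counts), "/")

# Scaling-factor style: divide by a per-sample factor, then rescale to a
# common reference size (the factors here are arbitrary placeholders)
scale_factors <- c(s1 = 13.5, s2 = 33)
scaled <- sweep(counts, 1, scale_factors, "/") * mean(scale_factors)

# TRUE only if every value is a whole number
is_whole <- function(x) all(x == round(x))

is_whole(counts)  # raw counts
is_whole(tss)     # proportions
is_whole(scaled)  # rescaled counts
```

Only the raw counts pass the whole-number check. Both normalized versions contain fractional values.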
many thanks!
Although that is also what I thought, I am still confused to some extent. As stated in the Maaslin2 wiki page under 3.2 in the table, it is recommended to utilize the ZINB model with either the normalization TMM or CSS. But given the ZINB model needs integers and the normalization produces non-integers, it can’t really work? What would you recommend?
Hi @theresa - you are correct that you need integer counts as input to be able to use these count models. This is a confusion well-addressed in the literature. One workaround would be to compute the normalization/size factors from TMM or CSS externally and add them as a column in the metadata. MaAsLin2 can then be run on the original 16S counts, and you will be able to use one of the count models while also adjusting for the library size.
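A rough sketch of that workaround in R, under several assumptions: `counts` is a samples-by-features data frame of raw integer counts, `metadata` is a data frame with matching row names, and `group` is a placeholder for your variable of interest. The helper names `add_tmm_size_factor` and `run_negbin_with_tmm` and the column `log_eff_lib_size` are hypothetical. The TMM factors come from `edgeR::calcNormFactors`, and the log effective library size is passed to MaAsLin2 as an extra fixed effect.

```r
# Sketch only: helper names, "group", and "log_eff_lib_size" are hypothetical.
# counts:   samples x features data frame of raw integer 16S counts
# metadata: samples x variables data frame with the same row names as counts

add_tmm_size_factor <- function(counts, metadata) {
  # edgeR expects features in rows and samples in columns, hence t()
  count_matrix <- t(as.matrix(counts))
  norm_factors <- edgeR::calcNormFactors(count_matrix, method = "TMM")
  # Effective library size = raw library size * TMM normalization factor
  eff_lib_size <- colSums(count_matrix) * norm_factors
  # Log scale, to match the log link used by the count models
  metadata$log_eff_lib_size <- log(eff_lib_size[rownames(metadata)])
  metadata
}

run_negbin_with_tmm <- function(counts, metadata, output_dir) {
  metadata <- add_tmm_size_factor(counts, metadata)
  Maaslin2::Maaslin2(
    input_data      = counts,       # untouched integer counts
    input_metadata  = metadata,
    output          = output_dir,
    normalization   = "NONE",       # no normalization inside MaAsLin2
    transform       = "NONE",       # count models work on untransformed data
    analysis_method = "NEGBIN",
    fixed_effects   = c("group", "log_eff_lib_size")
  )
}
```

Note that this enters the size factor as an ordinary covariate with an estimated coefficient rather than as a fixed offset, so it is an approximation of a library-size adjustment and worth checking against your own data.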
Yes, it makes sense, so basically using “normalized counts”, confusing indeed.
In my case, the utilization of the CPLM model and TSS normalization appears to be effective as well.