I am using maaslin3 v0.99.1 in RStudio and am running into an issue where a feature appears in the significant_results.tsv file with no model errors but does not show up on the summary plot. I have loaded the saved ggplot back into R and can see that the feature is not present in the plot's $data, so it isn't a matter of it falling outside the x-limits or anything like that. I suspect it might be filtered out due to the high stderr; I've included the values below. There are three treatment groups, and the feature's prevalence in each is 4/12, 4/11, and 11/11, so it is clearly more prevalent in the last group. I'm not clear on why the stderr would be so high for the prevalence test. If you could provide any insight into whether you would expect this feature to be filtered from the summary plot, I would appreciate it.
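For reference, here is roughly how I checked the plot object (I had saved it as an RDS; "Feature_X" stands in for the actual feature name, and the column name may differ):

```r
# Load the saved summary plot and check which features ggplot is drawing.
p <- readRDS("summary_plot.RDS")   # placeholder path
class(p)                           # "gg" "ggplot"
"Feature_X" %in% p$data$feature    # FALSE: the feature isn't in the plot data
unique(p$data$feature)
```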
The summary plot only reports the taxa with the most significant associations (25 taxa by default), so it's possible that the p-value and q-value for this association are simply not significant enough relative to the rest of your results.
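As a quick way to see where your association ranks, you can sort the results file yourself. A minimal sketch, assuming the output columns are named feature and qval_individual (adjust to whatever your file's header actually says):

```r
# Rank features by their best q-value; the summary plot keeps roughly
# the top 25 of this ranking by default.
res <- read.delim("significant_results.tsv")
best_q <- aggregate(qval_individual ~ feature, data = res, FUN = min)
best_q <- best_q[order(best_q$qval_individual), ]
head(best_q, 25)  # is your feature in here?
```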
The fact that your coefficient and standard error are both very large suggests linear separability in your data (i.e., the feature is all present or all absent in one group). Indeed, a feature that is present in every sample of one group is a classic example of linear separability. The data augmentation scheme is designed to deal with this (see the MaAsLin 3 preprint or user manual for an explanation of how). Have you changed the augment parameter from its default (TRUE)? Below is an example where a default logistic regression breaks on this kind of data but an augmented model gives a finite, significant effect as intended; the augmentation shown is a simplified illustration of the idea, not MaAsLin 3's exact scheme.
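Here is the sketch (simulated data, with group sizes matching your 12/11/11):

```r
# Toy data with linear separability: the feature is present in every
# sample of group C, as in your 11/11 group.
set.seed(1)
group   <- factor(rep(c("A", "B", "C"), times = c(12, 11, 11)))
present <- c(rbinom(12, 1, 4/12), rbinom(11, 1, 4/11), rep(1, 11))

# Default logistic regression: quasi-complete separation drives the
# group C coefficient and its standard error toward infinity (R may warn
# that fitted probabilities of 0 or 1 occurred).
fit_plain <- glm(present ~ group, family = binomial)
summary(fit_plain)$coefficients

# Simplified augmentation: add one low-weight "present" and one
# low-weight "absent" pseudo-observation per group so that no group is
# perfectly separated.
aug <- data.frame(
  present = c(present, rep(c(0, 1), times = 3)),
  group   = factor(c(as.character(group), rep(c("A", "B", "C"), each = 2))),
  w       = c(rep(1, length(present)), rep(0.5, 6))
)
fit_aug <- glm(present ~ group, family = binomial, data = aug, weights = w)
summary(fit_aug)$coefficients  # finite estimate with a usable std. error
# (R warns about non-integer #successes because of the fractional
# weights; that is expected for this kind of augmentation.)
```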
What is the full model formula you're using? Is it just ~ groups, where groups is the variable with the 3 levels you've described?
I don’t believe it is outside the 25 most significant, as there are only 10 significant associations in total.
I have tried two different models: 1) ~ group and 2) ~ group + (1|CageID). This is from a rat study in which the rats are housed 2-3 per cage, so I included CageID as a random effect - does that seem reasonable? I notice now that the issue only occurs with the second model. I haven't changed the augment parameter.
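For reference, the two calls look roughly like this (table and output directory names are placeholders; everything else was left at the defaults):

```r
library(maaslin3)

# Model 1: treatment group only.
fit1 <- maaslin3(
  input_data     = taxa_table,       # feature table (placeholder name)
  input_metadata = metadata,         # contains 'group' and 'CageID'
  output         = "maaslin3_group",
  formula        = "~ group"
)

# Model 2: adds a random intercept per cage (2-3 rats per cage).
fit2 <- maaslin3(
  input_data     = taxa_table,
  input_metadata = metadata,
  output         = "maaslin3_group_cage",
  formula        = "~ group + (1|CageID)"
)
```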
In our experience, random effects with only 2-3 observations per grouping level can cause issues for logistic mixed-effects regression (see the bottom of the random effects section of the user manual). If you use fixed effects instead, does that work, or does it increase the variance so much as to make everything insignificant? Are the rats in each cage all from the same group? If not, there are more advanced analysis routes you might be able to pursue.
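To illustrate the small-cluster problem outside of MaAsLin 3, here is a sketch with lme4 on simulated data (not your study, but with cage sizes of 2 as in your design):

```r
library(lme4)

# 12 cages of 2 rats each, cages nested within 3 treatment groups.
set.seed(2)
cage    <- factor(rep(1:12, each = 2))
group   <- factor(rep(c("A", "B", "C"), each = 8))
present <- rbinom(24, 1, ifelse(group == "C", 0.9, 0.3))

# Random intercept per cage: with only 2 observations per cage, the cage
# variance is poorly identified and often collapses to zero (a singular
# fit) or triggers convergence warnings.
fit_re <- glmer(present ~ group + (1 | cage), family = binomial)
isSingular(fit_re)

# Fixed-effect alternative. Caveat: if every cage sits entirely within one
# group (as simulated here), the cage dummies are collinear with group, so
# glm() reports NA for the aliased terms and the group effect can't be
# cleanly separated from cage effects; with 2 rats per cage, many cage
# dummies are also perfectly separating. This is why it matters whether
# cages cross treatment groups.
fit_fe <- glm(present ~ group + cage, family = binomial)
```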