I’m analyzing piglet microbiotas from a study with 3 experimental groups, each with ~25 piglets from ~6 dams. I’m trying to use dam parity (6 levels) and room (2 levels) as additional effects (as they probably affect the microbiotas, although we aren’t specifically interested in these effects). If I use them as fixed effects, all differential abundances between groups become MORE significant. But if I use them as random effects (which is probably the right thing to do in the context), most of them become LESS significant. Could this be because I have too many levels for such a small N? How could I evaluate which result is statistically more solid?
Hi,
Using random effects should almost always be more powerful than using fixed effects per group since the fixed effects per group are a more general model with fewer assumptions. However, it’s probably not impossible to find scenarios in which this isn’t the case. I would check both sets of outputs for errors and make sure that all your fixed effect levels actually showed up in your results. You can also look through the diagnostic plots for significant associations and check that things seem visually correct. However, if that’s all fine, I’d tend towards the fixed effects per group since that model makes fewer assumptions.
If you want to post a few lines of each results table here to sanity check things, I could take a look.
Will
Thanks!
Please find a screenshot attached. Top table using fixed effects and bottom table using random effects. Sorry for cropping the feature names, it’s unpublished data (but you can still see which is which in the 2nd table). As you see, the 1st taxon is more significant with random effects but others are less significant.
Maaslin3 script below. I didn’t see any errors in the log.
Our statisticians (who know nothing about Maaslin) say these variables MUST definitely be used as random effects, as they are clustering the animals (in the case of room, in very concrete terms :)). But you seem to imply it’s not quite that black and white? And I think they could be thought as fixed effects as it’s kind of reasonable to expect that they directly affect the microbiota compositions.
I didn’t fully understand though what you mean by fixed effects per group?
fit_out ← maaslin3(input_data = data.frame(otu_table(ps_analy)),
input_metadata = meta_data,
output = output_folder,
formula = ‘~exp_group + reads + (1|parity)+(1|room)’,
normalization = ‘TSS’,
transform = ‘LOG’,
augment = TRUE,
standardize = TRUE,
max_significance = 0.1,
min_abundance = 0.001,
min_prevalence = 0.05,
median_comparison_abundance = FALSE,
median_comparison_prevalence = FALSE,
max_pngs = 100,
cores = 8,
save_models = TRUE)
These tables both seem reasonable. You might check that you’re indeed expecting no feature zeros in your dataset (this is very uncommon without some sort of imputation), but otherwise thing seem fine.
Your statisticians are right that the grouping variables must be included as some sort of effects, but whether they’re random effects or fixed effects is up to you. Usually, grouping variables we don’t inherently care about are treated as random effects since this provides some more power while making pretty light assumptions about the distribution of the per-group intercepts. However, using fixed effects for your grouping variable instead still gives you a per-group intercept, just without the assumption about the distribution of those intercepts. Also, just to be clear, that model would be ~exp_group + reads + parity + room
, not ~exp_group + reads
, so you’re still accounting for the grouping like in the random intercepts case, just with fixed intercepts.
Will
Many thanks for the comment! Yes, I was using them as fixed effect just as you instruct here.