How to create a nested random effect in maaslin3

young_doktor · February 9, 2025, 6:35pm

I am trying to create a nested random effect in maaslin3. I have looked up how to do this in lme4. I have tried the following in my formula:

(1|donor:patient)
which yields: Error in [.data.frame(metadata, , names(1)) :
undefined columns selected
(1|donor/patient)
which yields: Error in donor/patient : non-numeric argument to binary operator

I am putting these in my formula. donor and patient are columns in my metadata table.

I am able to get the model to run when I put it as (1|donor) + (1|patient) + and other fixed effects. Per lme4 this is technically a way to write a crossed random effect (if each patient may have not gotten the same donor). In my study each patient got the same donor.

Is there a way to make this nested in maaslin3?

Thank you for your help! Maaslin3 is awesome and I love the graphical output!

WillNickols · February 9, 2025, 8:12pm

Hi,

There are a number of pre-processing steps that happen internally in order to check the metadata and make sure the results table displays the right information, so that’s probably what’s throwing the error with your first two formulas.

The final formula (crossed effects) is what we used in the MaAsLin 2 paper even though that was actually a nested case (subjects within a recruitment center), so if you want to use that, there is some precedent.

I’ll note that, unless you have an enormous amount of data, adding in multiple random effects (especially if they’re nested) is probably going to cause quite a few poor model fits and not improve your validity much. If you look at the math under the hood with random effects, they’re the same as per-individual fixed effects with a little bit of shrinkage towards a normal distribution. You might think that adding in a second layer would control for differences between groups of individuals, but if the individuals are completely nested, those effects would be naturally reabsorbed in the individual effect without the second layer. (Imagine you were using fixed effects rather than random effects throughout; these second layer intercepts would be unidentifiable/linearly dependent with the individual effects.) To me, assuming that (1) each individual has a specific effect and a group-specific effect such that both of these follow a normal distribution seems no more likely than (2) each individual has a specific effect (that already sums both the individual and supposed group effect) and that these effects follow a normal distribution.

In short, I’d use the most nested level (probably patient in your case) as the only random intercept.

Will

young_doktor · February 9, 2025, 10:01pm

Dear @WillNickols ,

Thank you for the very good explanation and quick response. I understand your explanation, and I will give it a try with just the ‘patient’ as a random intercept.

One more quick question, in the figure maaslin3 generates (summary.plot), how does it decide which fixed effects are the two to choose to display? Are they the two fixed effects that are the most significantly associated? Thank you for your continued help with troubleshooting.

WillNickols · February 10, 2025, 1:30am

Yes - they are chosen based on which variables have the most significant associations. You can change them with coef_plot_vars.

Will

young_doktor · February 10, 2025, 10:44pm

Oh, amazing! Thank you for that trick!

young_doktor · February 11, 2025, 3:43pm

Hi @WillNickols

I am trying to use coef_plot_vars to specify which two fixed effects I want to include in my plot (with abundance and prevalence). This is my structure for the command: coef_plot_vars = c(“variable1 yes”, “variable2 yes”)

Does this go in the code with the forumul under “fit_out = …” or is this run only under this command: maaslin_plot_results_from_output(“foldername”,
coef_plot_vars = c(“variable1 yes”, “variable2 yes”),
metadata = metadata)?

I’ve tried getting it to work both ways, but when I get the maaslin_plot_results_from_output to run, I see a bunch of graphs generated in R, but no summary.plot. Any thoughts on how to structure this correctly? I’d also like to specify which fixed effects end up under the heatmap for heatmap_vars in the summary.plot figure. If you have any code available that you’ve used before so I can see how to structure my code, that would be extremely helpful and valuable! Thank you!

WillNickols · February 11, 2025, 4:46pm

Does the example in the Replotting section of the vignette have what you’re looking for? The coef_plot_vars and heatmap_vars can go in either maaslin3 or maaslin_plot_results_from_output.

Will

young_doktor · February 12, 2025, 8:16pm

Dear @WillNickols

Thank you for pointing me towards that resource. I tried to model my code similarly, but I am unable to get it to plot the summary_plot when I specify heatmap_vars and coef_plot vars. When I specify these, I do get the association_plots but not the summary_plot.

Here is my code:

fit_out ← maaslin3(
input_data = species,
input_metadata = metadata,
output = ‘new_output’,
formula = ‘~ variable1 + variable2 + variable3 + variable4 + variable5 + variable6 + variable7 + variable8 + reads’,
plot_summary_plot = TRUE,
plot_associations = TRUE,
heatmap_vars = c(‘variable1’,‘variable2’,‘variable4’,‘variable6’,‘variable7’,‘variable8’),
coef_plot_vars = c(‘variable1 yes’,‘variable2 yes’),
max_pngs = 30)

I’ve tried blocking out either the heatmap_vars and coef_plot_vars one at a time. The above only generates the summary_plot when both the heatmap_vars and coef_plot_vars are simultaneously blocked out.

Any suggestions?

I’ve also tried running it like this:
scatter_plots ← maaslin_plot_results_from_output(
…

WillNickols · February 12, 2025, 9:40pm

Are all your variables continuous/binary variables? Based on the coef_plot_vars, it looks like they are categorial with levels no and yes, so you’d need to make heatmap_vars match the coef_plot_vars style you have.

Are at least some of your associations significant? If not, it might be that there are no significant results to plot.

If neither of these solve it, could you post a minimal example (taxonomic table, metadata, and maaslin3 command) or send me an email at willnickols@g.harvard.edu? You can change the metadata and taxa names if you’d prefer to anonymize it.

Will

Topic		Replies	Views
Nested Random Effects in MaAsLin MaAsLin	5	407	August 10, 2024
Controlling for nested matched controls MaAsLin	1	40	September 3, 2024
Nested factors in MaAsLin2 MaAsLin	1	150	February 5, 2024
Random effects of cohort/round MaAsLin	1	397	January 16, 2023
Fitting problem only when using random effects MaAsLin	3	760	May 2, 2022

How to create a nested random effect in maaslin3

Related topics