MaAsLin3 summary plot interpretation

Hi!

I am comparing communities from two different timepoints: baseline and week 2 - I have 40 samples pr timepoint. The MaAsLin3 analysis ran well and I added a p-value cut-off = 0.001, to only get very significant results.

Now, I have some questions on how to interpret my summary plot:

I have also attached the corresponding significant_results.tsv file:

significant_results.tsv (7.8 KB)

My questions are:

  1. Why do I get so many “NA” abundances? - in my figure there are rows that only have prevalence.
  2. In the “error” column in the significant_results.tsv file, many of them say “<4 average observations per random effect group often inflates coefficients and deflates p-values: consider setting small_random_effects=TRUE and see tutorial” - would you suggest I rerun the code with the small_random_effects=TRUE option?

I am sorry if this is a stupid question - I am a novice in bioinformatics and this is my first time using MaAsLin3. :slight_smile:

Thank you!

Caroline

Hi Caroline,

  1. The NA abundances are because many of your species are present in a few samples (see the N_not_zero column). The abundance part of the model only operates on the non-zero portion of the data, and if there aren’t enough observations to fit the linear model, it gives NA.
  2. If you’re getting the warning, you should probably re-run with small_random_effects=TRUE. What’s the formula/data structure you’re using? Are you using a random effect as the ID with one observation at baseline and the other at week 2?

Will

Hi Will,

Thank you for your quick reply and help!

I will try to answer what formula/data structure I am using - it is a longitudinal study with 40 subjects, where I have taken one sample at baseline and one sample after 2 weeks with treatment, so I have two samples pr subject. I am using: formula = “~ Time + (1|Subject)” :slight_smile:

I tried to re-run MaAsLin3 with small_random_effects=TRUE, but I see no difference in the output picture (I don’t know if there should be?):

When I look at the new significant_results.tsv file, there are still a lot of rows that gets the error “<4 average observations per random effect group often inflates coefficients and deflates p-values: consider setting small_random_effects=TRUE and see tutorial”. I only see small changes in the qval_individual. Does the small_random_effects=TRUE work or do I need to do something else?

significant_results_new.tsv (7.8 KB)

Caroline

Hmm, would you mind sending the entire maaslin3 command you’re running? It seems like small_random_effects is not working for some reason.

Not at all! Yesterday I did not implement small_random_effects correctly, so that’s why I did not see a difference.

I am using maaslin3 through an R function with other DA tools - these are screenshots with the maaslin3 parts:

And this is my command where I use the function:

ps_up %>%
  subset_samples(Sample_Type == "Plaque" &
                 Time_new %in% c("Baseline", "Week 2")) %>%
  phyloseq_diff_abundance(ps_tmp = .,
                          approach = "maaslin3",
                          glom = "Species",
                          taxa_rank = "Species",
                          comp_group = "Time", 
                          palette = time_pal, 
                          masslin3_output_dir = "/Volumes/lnp524/PhD/Puplications/2025/Metagenomic_characterization/04_DA/Taxonomy/Plaque/Baseline_vs_week2/New",
                          maaslin3_lefse_plot = TRUE,
                          pvalue_cutoff = 0.001,
                          #maaslin_summary_plot_first_n = 14, # Run first time without this command, then count how many significant species are there, and then set this number.
                          small_random_effects = TRUE,
                          ) -> maaslin_plaque_tax_tp12

I am getting this error, which I do not understand:

Error in phyloseq_maaslin3(., formula = maaslin3_formula, fixed_effects = maaslin3_fixed_effects,  : 
  unused argument (small_random_effects = small_random_effects)
Error during wrapup: not that many frames on the stack
Error: no more error handlers available (recursive errors?); invoking 'abort' restart

Thank you so much!

Caroline

I think I see the problem now. I am using the phyloseq_maaslin3() wrapper as my input is a phyloseq object, but small_random_effects is not part of that wrapper - I will add it.

I fixed it and changed the p-value cut-off to 0.05, and now I get this figure:

And this significant_results.tsv file.

significant_results.tsv (6.9 KB)

I am doing more pairwise comparisons (6 in total), always with 40 samples in each group. Would you suggest that I add small_random_effects = TRUE to all the comparisons, or does it depend on the error messages for each comparison when small_random_effects = FALSE?

Thank you again for your help! :slight_smile:

Glad that worked out - those look like reasonable effect sizes and significance values for a moderate abundance effect based on your dataset size.

I’d add small_random_effects=TRUE to all of them as long as you’re still including a random effect. Also, I’m not sure exactly how everything maps from phyloseq_maaslin3 to actual maaslin3, but it’s worth checking that your strata effects are what you think they are. We have a strata option in MaAsLin 3, but understanding what it’s doing is pretty advanced and I would rarely recommend it.