MaAsLin 3: Forcing X-axis variable for Interaction Plots (Diagnosis * Time)

Hi MaAsLin Team,

I am running a longitudinal analysis using MaAsLin 3 on relative abundances obtained via MetaPhlAn. My goal is to look for differential changes in the microbiome leading up to disease conversion, while adjusting for age.

My Model: I am using the following formula to test the interaction between Diagnosis and Relative_timepoint (Time-to-conversion), while adjusting for timepoint_numeric (Subject Age):

formula = '~ Diagnosis * Relative_timepoint + timepoint_numeric'

  • Diagnosis: Factor (Case/Control)

  • Relative_timepoint: Numeric (Negative months counting up to conversion at 0)

  • timepoint_numeric: Numeric (Subject age in months)

The Issue: I have significant q-values (< 0.1) for the interaction term (Diagnosis:Relative_timepoint) in my all_results.tsv. However, the automatic output plots are not generating figures for this interaction. For example, when I navigate to the figures/assciation_plots/ directory, there are only timepoint_numeric plots.

It seems MaAsLin is prioritizing timepoint_numeric (Age) for the X-axis or skipping the interaction plot entirely, maybe because there are two continuous time variables in the model?

My Question: Is there a parameter in maaslin3 or maaslin_plot_results_from_output that allows me to force the plotter to use Relative_timepoint as the X-axis variable or to generate plots for both timepoint_numeric & Relative_timepoint?

I include timepoint_numeric because it is necessary to control for the effects of development on the microbiome in this disease context, but our scientific interest lies in the differences approaching conversion along the Relative_timepoint timeline. Ideally, there would be a way to do this and avoid manually recreating with ggplot, however I’m fine to do that if necessary. If that is what it takes, where might I start with this process?

Thanks!

Reference code:

msln3_output_res <- maaslin3(
    input_data = species_mpa_relab_filt, 
    input_metadata = maasalin_metadata, 
    output = 'mpa_msln3_res',
    formula = '~ Diagnosis * Relative_timepoint + timepoint_numeric', 
    normalization = 'NONE', 
    transform = 'LOG',      
    # Set to 0 because I have already filtered the data.
    min_abundance = 0,      
    min_prevalence = 0,     
    standardize = FALSE,    
    augment = TRUE,         
    cores = 60
)

Hi,

We only plot the base variables (not interaction terms) because the interaction terms can get arbitrarily complicated and plotting those in generality would require extra case checking for something we didn’t think people would use that often.

With that said, there should be plots produced for both timepoint_numeric and Relative_timepoint (individually) if those variables actually had significant (individual) effects. Just to check, are there actually significant individual (not interactive) effects with both of those variables, and are they within the number of individual association plots produced (by default, the top 30 associations)? If not, but that’s what you need, you might try bumping up max_pngs to 10000 to make sure you’re getting plots for all the significant individual associations.

If you need to plot the interaction, I think you want a plot where you have abundance or prevalence for each feature on the y axis, Relative_timepoint (or timepoint_numeric) on the x axis, and the other one as the color of your points. Feel free to use any of the code in the MaAsLin 3 visualization file, and it’s probably easiest to just dump it into an LLM and ask it to make an interaction version.

Will