Association between microbiome feature and next timepoint outcome

I am working with longitudinal microbiome data spanning eight visits. For the samples collected at each visit, I have both microbiome profiles (taxon relative abundances) and an Outcome variable (Yes/No).

I want to test a simple lagged association: is the prevalence or relative abundance of taxa at the preceding visit (t) associated with the outcome at the next visit (t+1)?

If I create a new variable from my outcome (next_visit_Outcome), can MaAsLin3 appropriately model this lagged relationship, or is there any other statistical nuance that should be considered? The formula I used to test for association between taxa and outcome for samples at the same time point is below. Is it appropriate to change 'Outcome' in the formula to 'next_visit_Outcome'?

fit_out <- maaslin3(input_data = abundance_data_df,
                    input_metadata = metadata_maaslin,
                    output = 'MAASLIN_3',
                    formula = '~ Outcome + sn + Age + read_depth + (1|q3_record_id)',
                    normalization = 'CLR',
                    transform = 'NONE',
                    warn_prevalence = FALSE,
                    augment = TRUE,
                    standardize = TRUE,
                    plot_summary_plot = TRUE,
                    median_comparison_abundance = TRUE,
                    median_comparison_prevalence = FALSE,
                    summary_plot_first = 50
)
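
For reference, here is a minimal sketch of how next_visit_Outcome could be built in base R. It assumes the metadata has a numeric visit column (the name `visit` and the toy values are hypothetical; only q3_record_id and Outcome come from my real data). Matching on participant and visit + 1, rather than just taking the next row, leaves the value as NA when the following visit is missing.

```r
# Toy metadata: one row per sample. 'visit' is an assumed column name.
metadata_maaslin <- data.frame(
  q3_record_id = rep(c("P1", "P2"), each = 3),
  visit        = c(1, 2, 3, 1, 2, 4),   # P2 has no visit 3
  Outcome      = c("No", "Yes", "No", "No", "No", "Yes"),
  stringsAsFactors = FALSE
)
rownames(metadata_maaslin) <- paste0("S", seq_len(nrow(metadata_maaslin)))

# Key for each sample, and the key of the sample expected one visit later
key_now  <- paste(metadata_maaslin$q3_record_id, metadata_maaslin$visit)
key_next <- paste(metadata_maaslin$q3_record_id, metadata_maaslin$visit + 1)

# Look up the outcome at visit t+1; NA where that visit does not exist
metadata_maaslin$next_visit_Outcome <-
  metadata_maaslin$Outcome[match(key_next, key_now)]

# Keep only samples that actually have a next-visit outcome
metadata_lagged <- metadata_maaslin[!is.na(metadata_maaslin$next_visit_Outcome), ]
print(metadata_lagged)
```

The last visit of every participant (and any visit followed by a gap) drops out, so the abundance table would presumably need to be restricted to the same samples, e.g. by the row names of metadata_lagged.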


Hi @Gascoigne,

If I understand correctly, you essentially want to take the microbiome profile at one visit and associate it with a future outcome in the study. If that is the case, I think your setup sounds fine. I would make sure that you control for the repeated measures, which it looks like you did (assuming q3_record_id is the participant ID), and also consider controlling for timepoint if you think it is relevant to the outcome of your study.
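
As a rough sketch of what that could look like, the formula string below swaps in the lagged outcome and adds a timepoint term. The column name `visit` is an assumption on my part (use whatever your metadata calls the visit number); the other terms are taken from your original formula.

```r
# 'visit' is a hypothetical metadata column holding the visit number;
# the remaining covariates are copied from the original formula.
covariates <- c("next_visit_Outcome", "visit", "sn", "Age", "read_depth")

# Fixed effects joined with '+', plus the per-participant random intercept
lagged_formula <- paste0(
  "~ ", paste(covariates, collapse = " + "), " + (1|q3_record_id)"
)
print(lagged_formula)
```

The resulting string would then replace the `formula` argument in your existing maaslin3 call, with the rest of the call left as it is.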

Cheers,
Jacob Nearing
