I am working with longitudinal microbiome data spanning eight visits. For the samples collected at each visit, I have both microbiome profiles (taxon relative abundances) and a binary Outcome variable (Yes/No).
I want to test a simple lagged association: is the prevalence or relative abundance of taxa at a given visit (t) associated with the outcome at the next visit (t+1)?
If I create a new outcome variable (next_visit_Outcome), can MaAsLin3 appropriately model this lagged relationship, or are there other statistical nuances that should be considered? The formula I used to test for association between taxa and outcome in samples from the same time point is below. Is it appropriate to change 'Outcome' in the formula to 'next_visit_Outcome'?
fit_out <- maaslin3(input_data = abundance_data_df,
input_metadata = metadata_maaslin,
output = 'MAASLIN_3',
formula = '~ Outcome + sn + Age + read_depth + (1|q3_record_id)',
normalization = 'CLR',
transform = 'NONE',
warn_prevalence = FALSE,
augment = TRUE,
standardize = TRUE,
plot_summary_plot = TRUE,
median_comparison_abundance = TRUE,
median_comparison_prevalence = FALSE,
summary_plot_first = 50
)
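
For reference, this is roughly how I would build next_visit_Outcome, shown as a minimal base R sketch on toy data. The toy_meta table, the visit column, and the lead_one helper are made up for illustration (only q3_record_id and Outcome come from my real metadata), and it assumes visits are numbered consecutively so that t+1 is a visit number one higher than t.

```r
# Toy metadata: 2 subjects x 3 visits (hypothetical values, not real data).
toy_meta <- data.frame(
  q3_record_id = rep(c("A", "B"), each = 3),
  visit        = rep(1:3, times = 2),   # assumed visit-number column
  Outcome      = c("No", "Yes", "No", "No", "No", "Yes"),
  stringsAsFactors = FALSE
)
rownames(toy_meta) <- paste0(toy_meta$q3_record_id, "_v", toy_meta$visit)

# Sort by subject and visit so that "next row" means "next recorded visit".
toy_meta <- toy_meta[order(toy_meta$q3_record_id, toy_meta$visit), ]

# Hypothetical helper: shift a vector up by one position, padding with NA.
lead_one <- function(x) c(x[-1], NA)

# Within each subject, take the Outcome and visit number of the following row.
toy_meta$next_visit_Outcome <- ave(toy_meta$Outcome, toy_meta$q3_record_id,
                                   FUN = lead_one)
next_visit <- ave(toy_meta$visit, toy_meta$q3_record_id, FUN = lead_one)

# If a visit was skipped, the following row is not t+1, so blank it out.
gap_ok <- !is.na(next_visit) & (next_visit - toy_meta$visit == 1)
toy_meta$next_visit_Outcome[!gap_ok] <- NA

# The last visit of each subject has no t+1 outcome; drop those rows.
lagged_meta <- toy_meta[!is.na(toy_meta$next_visit_Outcome), ]
```

The idea would then be to pass the filtered metadata (with the matching rows of the abundance table) to maaslin3 and use '~ next_visit_Outcome + sn + Age + read_depth + (1|q3_record_id)' as the formula. Each subject contributes at most seven rows instead of eight, since the final visit has no following outcome.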