Hi
Several previous topics discuss longitudinal analysis, but the discussion there focuses on data with sparse time points.
So I'm wondering whether dense longitudinal time-series data can be analyzed with MaAsLin3.
I have a dataset from three bioreactors that were operated stably for 364 days (54 weeks) under the same conditions, with samples collected weekly (3 reactors * 54 weeks = 162 samples).
In the 24th week, I started a treatment of the system, which continued until the end of the experiment.
I want to know the effect of the treatment on the microbial composition of the system.
With time-series data, time itself is an issue: samples from adjacent time points are correlated with each other, and the abundance of a taxon can vary even without any treatment (e.g., the abundance of a taxon might increase in the first 10 weeks, decrease in weeks 11-24, and then increase again after week 24). We call this nonlinear behavior.
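As a sketch of how such a nonlinear time trend could be represented: the day variable can be expanded into natural cubic spline basis columns with `ns()` from R's bundled `splines` package, and those columns could then be added to the metadata as ordinary continuous covariates. This is only an assumption on my side, not something I have seen documented for MaAsLin3. The sampling days (one sample at the end of each week), the choice of 3 degrees of freedom, and the column names `Time_s1` to `Time_s3` are all hypothetical.

```r
library(splines)  # ships with base R

# Assumed sampling days: one sample at the end of each of the 54 weeks
operation_day <- seq(7, 378, by = 7)

# Natural cubic spline basis with 3 degrees of freedom (an arbitrary choice)
ns_basis <- ns(operation_day, df = 3)

# Drop the spline attributes so the basis becomes a plain numeric matrix
basis <- matrix(as.numeric(ns_basis), nrow = length(operation_day))
colnames(basis) <- paste0("Time_s", 1:3)  # hypothetical column names

# One row per sampling day, with the original day and the spline columns
time_terms <- cbind(data.frame(Operation_day = operation_day),
                    as.data.frame(basis))
head(time_terms)
```

The spline columns would replace the single linear day term, so that a smooth rise and fall over time is not forced into a straight line.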
Is it feasible to include time as a factor in the model when assessing the effect of the treatment, for example to find which taxa increase in abundance because of the treatment?
Here is the call I propose to run:
test = maaslin3(
    input_data = df_input_data,
    input_metadata = df_input_metadata,
    output = 'maaslin3.test', # Output directory
    # Random intercept per reactor, since samples from the same reactor are not independent.
    # Assumes the day column is named Operation_day (a name with a space is not a valid formula term).
    formula = '~ Treatment + Operation_day + (1 | Reactor)',
    normalization = 'NONE', # Input is already in abundance units; no additional normalization is applied
    transform = 'LOG', # Log2 transform the feature values
    reference = c('Treatment,before'), # Set the reference level of the factor 'Treatment' to 'before'
    augment = TRUE, # Add low-weighted 0s/1s to reduce issues from (quasi-)separation in prevalence models
    standardize = TRUE, # Z-score standardize continuous metadata variables (e.g., Operation_day) for comparable scales
    max_significance = 0.1, # Significance threshold (q-value/FDR cutoff) for reporting results
    median_comparison_abundance = TRUE, # Use median comparison for abundance model coefficients (recommended for relative abundance)
    median_comparison_prevalence = FALSE, # Do not use median comparison for prevalence model coefficients
    max_pngs = 250, # Plot the top max_pngs significant associations
    cores = 1 # Number of CPU cores to use
)
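To make the design concrete, here is a minimal sketch of how the metadata could be laid out, using base R only. The reactor labels, the `Week` column, the column name `Operation_day`, the level name `after`, sampling at the end of each week, and the treatment applying from week 24 onward are all assumptions for illustration; only the level `before` and the 3 x 54 layout come from the description above.

```r
# Hypothetical metadata layout: 3 reactors x 54 weekly samples = 162 rows
df_meta <- expand.grid(Week = 1:54,
                       Reactor = c("R1", "R2", "R3"),  # placeholder labels
                       stringsAsFactors = FALSE)

# Assumes one sample at the end of each week; replace with the real sampling days
df_meta$Operation_day <- df_meta$Week * 7

# Assumes samples from week 24 onward count as treated; "after" is a made-up level name
df_meta$Treatment <- factor(ifelse(df_meta$Week >= 24, "after", "before"),
                            levels = c("before", "after"))

# Sample IDs as row names, e.g. "R1_W1"; these would have to match the abundance table
rownames(df_meta) <- paste(df_meta$Reactor, df_meta$Week, sep = "_W")

# Number of samples per reactor and treatment state
table(df_meta$Reactor, df_meta$Treatment)

# How strongly the treatment indicator tracks operation time in this design
cor(df_meta$Operation_day, as.numeric(df_meta$Treatment == "after"))
```

Since all three reactors switch to the treatment at the same time and there is no untreated control reactor, Treatment and operation day are strongly correlated in this layout, which is part of why I'm unsure the model can separate the two.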
Thank you in advance
Chao