MaAsLin2 and microbiota development over time

I am a post-doc working on respiratory (nasopharyngeal) microbiota development. I was highly interested when reading up on this new version of MaAsLin, given it provides options to appropriately model repeated measures, which is extremely important when studying microbial succession over time.
Until recently, I used the metagenomeSeq::fitTimeSeries-function for these analyses. By prefiltering features, I hope I was able to control false-positive detection to a certain degree. I would be very interested in (re)running my models using MaAsLin2, yet I need advice on two problems I encounter.

  1. I find that some microbiota show unimodal/non-linear, rather than linear patterns over time (for example: Staphylococcus typically sharply increases just after birth, peaking at ~1 month, after which it decreases in abundance). How would you model these non-linear trends in MaAsLin (is that even possible? Should I include splines, use a different model than “LM”)?
  2. The advantage of the metagenomeSeq::fitTimeSeries function (which is spline-based and focussed on the (covariate-adjusted) differential abundance between 2 groups over time) is that it seems to accommodate non-linearity over time AND gives some indication of the time period within which microbial abundance differs. Is that also possible with MaAsLin?

Any pointers/ideas/advice is very welcome.

PS: I really enjoyed reading the preprint on bioRxiv and am highly impressed by the amount of benchmarking done (also using other packages). This is a very minor detail, but do I understand correctly that the term ‘univariate’ is used to refer to ‘univariable’ models (it is slightly unclear combined with the use of the term ‘multivariable’)?

Hi @wsteenhu - thanks for sharing your feedback on the preprint, very much appreciated. As mentioned in the manuscript, MaAsLin 2 is primarily designed for finding associations in non-longitudinal studies, in studies with a small number of repeated measures (e.g. multiple tissues or families), or in sparse and irregular longitudinal data from many subjects (e.g. with an unequal number of repeated measurements per subject), as commonly encountered in population-scale epidemiology studies such as the iHMP.

Having said that, we agree that temporal trend modeling, especially in the context of highly dense time-series data from a few subjects, would be a useful addition to MaAsLin 2’s capabilities, which we may consider in a future release. These types of analyses are better handled by more specialized models, such as MDSINE, MITRE, or others (reviewed in arXiv:1805.04591), and due to their significant deviation from typical parametric models, we did not include those in the manuscript or the software.

Regarding your specific analysis questions, one can supply externally constructed splines or some variant of the time variable in the metadata for detecting time-varying/non-linear effects. However, we don’t yet have functionality for other types of longitudinal analyses, such as the detection of a suitable time period or longitudinal trajectories.
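To make the "externally constructed splines" idea a bit more concrete, here is a minimal sketch in plain Python of one possible spline basis (a truncated power basis) built from a time variable. It is only an illustration under assumptions: the function name, the column names (`time_p1`, `time_k1`, ...), the example ages and the knot at 1 month are all made up for this post, and this is not something MaAsLin 2 does for you.

```python
def truncated_power_basis(times, knots, degree=1):
    """Build spline columns for a time variable (illustrative helper).

    For each time point t, returns a dict with
      time_p1 .. time_p<degree> : polynomial terms t, t^2, ...
      time_k<j>                 : max(0, t - knot_j) ** degree per knot
    With degree=1 this is a piecewise-linear ("hinge") spline whose slope
    is allowed to change at each knot.
    """
    rows = []
    for t in times:
        t = float(t)
        row = {f"time_p{p}": t ** p for p in range(1, degree + 1)}
        for j, knot in enumerate(knots, start=1):
            row[f"time_k{j}"] = max(0.0, t - knot) ** degree
        rows.append(row)
    return rows


# Hypothetical sampling ages (months) for one subject. The knot at 1 month
# is an assumption that mirrors the Staphylococcus peak described above.
ages = [0.0, 0.5, 1.0, 2.0, 6.0]
basis = truncated_power_basis(ages, knots=[1.0], degree=1)

for age, row in zip(ages, basis):
    print(age, row)
```

The resulting columns would be appended to the metadata table, one row per sample, and then listed among the fixed effects in place of (or alongside) the raw time variable, with the subject identifier as a random effect. Knot placement and degree are modeling choices that should be checked against the data rather than taken from this sketch.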

As an aside, MaAsLin 2 outputs the predicted values of the random effects (along with the fitted values and the residuals) from the longitudinal mixed-effects model. These represent individual-level trajectories and can be handy for specific prediction tasks and follow-up analyses (e.g. detection of covariate-adjusted and de-trended associations as done here). Just wanted to point these out, as they are generally hidden behind the significance table in the output folder :)

Apologies - missed your last clarification question. We previously attempted to explain this in a review paper. Check out Box 4 and let me know if that clarifies your question.

Dear Dr. Mallick, thank you very much for your extensive answer and great pointers. I did consider MITRE in the past, yet I will give it a closer look once more. It would be great if MaAsLin2 at some point included more options for longitudinal data, yet I believe it is already an important addition to the toolset we had available. Thanks again.

Hi,

I wanted to know whether the MaAsLin 2 package can be run on longitudinal data with three time points. In my case, the participants are pregnant women and the sampling times are uneven: visit 1 falls at 10-28 gestational weeks, visit 2 at 29-39 gestational weeks, and visit 3 at 3-16 weeks postpartum. However, three measurements are obtained for each subject.

Hi @akanksha_30 - yes, you should be able to use MaAsLin 2 for your analysis. Thanks!