Test for Sequencing Depth in interaction with other variables

Dear MaAsLin2 developpers and users,

Is there any way to control or at least test the effect of sequencing depth to make sure observed differences are not resulting from underlining differences in sequencing depth between groups/samples.

I typically do that using anova in order to check if sequencing depth explain differences in alpha diversity e.g., alpha_metric ~ Group * LibrairySize and i can confirm if Librairy Size has an effect alone or depending ou the Group I am testing.

I have added the number of reads I subbmitted to humann2 in my metadata and when running MaAsLin2 I detect and see strong significant correlations, but is there any syntax to check for Sequencing Depth as an interaction factor to other factor tested through MaAsLin2?

Thanks a ton.


Hi @fconstancias - this is doable in MaAsLin2 if you can create a metadata table with the desired sequencing depth variable as one of the columns. Unfortunately, MaAsLin2 does not support specifying an interaction directly but you can always modify your metadata table using the model.matrix() function (https://genomicsclass.github.io/book/pages/expressing_design_formula.html) and add an interaction variable before supplying it to MaAsLin2 as input. Does it make sense?

Hi @himel.mallick,

Thanks for your answer.

Then I am a bit confused how MaAsLin2 actually works, if it does not take interaction into account how this differs from univariate tests?

I am working on Oral microbiome dataset and I have two main factor I am interested in : health_status (Health vs Disease) and Oral Site (Plaque, Tongue, Saliva).

Wich strategy/syntax should I use with MaAsLin2 in order to know:
i) which patways differ in term of abundance between Health and Disease at every Site
ii) which patways differ in term of abundance between Sites and then as there are three levels is any post-hoc test performed to test which site differs from the others?
iii) which patways differ in term of abundance between Health and Disease at the different Site.

Thanks a ton.


Hi @fconstancias - as I mentioned in my last reply, you need to create an interaction variable (health_status*Oral Site) before supplying the data to MaAsLin2 and test for the effects of health_status, Oral Site, and your newly created interaction variable (i.e. health_status*Oral Site). You can specify the names of the variables you want to test for in the fixed_effects part of the function. This is unlike some other R packages where you can specify something like y ~ health_status + Oral Site + health_status*Oral Site in the form of a formula.

Thanks a lot.
A an output I only obtained significant features (Patways) for the interaction variable.
This means interaction is always significant, is that correct?

Then I run twice MaAsLin2 testing Oral Site as a fixed effect, and health_status to see the overall Oral Site , health_status effect and got some differentially abundant patways.

  • is there any easy way to see pairwise comparaisons? Since I am working on three Oral Site I would like to know which sites are different/ similar.

  • In order to see differences explained by health_status in the different oral sites, I have to subset the data from ech site and run MaAsLin2 with health_status as a fixed effect.

Does it make sense?

Thanks again.