Paired Samples with Two Groups?

Hello, I’m trying to get the formula correct for my study design and need some help! I have a dataset with two paired samples for each subject (baseline and after treatment). Some of my subjects were treated with a drug while the others received placebo. I would like to test the effect of drug treatment on the microbiome, controlling for subject. Essentially I want to see if the “deltas” for each individual microbiome within subject between baseline and after treatment are different for subjects that received placebo or the drug.

I have incorporated subject as a random effect and have time point and treatment group as fixed effects. My significance results report significant taxa for both time point and treatment group. It’s a bit difficult for me to understand what MaAsLin is doing under the hood, but the significant taxa reported for treatment group should be “taxa that significantly differ by treatment group, comparing baseline to after treatment, controlling for subject”, correct?

Hi @afvrbanac,

If I am understanding everything correctly - yes your interpretation is correct.


Hi @Kelsey_Thompson,

I’m a bit confused as when I try to test on the deltas (I subtract baseline values for each subject and then only use “Treatment” as a fixed effect in the model) I get different results. Though perhaps that is due to other reasons.

If you look at the image attached, I’m confused as to how I get coefficients with the same directionality for opposite effects. For example, in the first plot, C. jejuni is higher at baseline in the treatment group and decreases with treatment, while V. atypica has the opposite slope and also has a positive coefficient. Since these “deltas” go in opposite directions, I don’t see how they could have the same directionality of significance with relation to placebo. Do you have any insight as to what could be going on here?

Hi @afvrbanac - based on these plots, you have a significant treatment-by-time interaction effect (i.e., the change from baseline differs between the treatment and placebo groups in your specific example). To find these significant associations using MaAsLin2, you therefore need to supply the interaction term as an additional fixed effect and extract its coefficients to match these plots. I hope this helps!


@himel.mallick thanks for the help! It is very much appreciated. Sorry I’m new to this so just want to make sure I get it right. My reference levels are baseline and placebo. I see the example in the tutorial for interactions, but dysbiosis is continuous so I’m not sure how to mirror that with my categorical values.

Would I make a single extra column that is just TRUE or FALSE values with all baseline values set to FALSE (since that is the reference), placebo end-of-treatment set to FALSE (since placebo is the reference), and only treated end-of-treatment set to TRUE? Or is there another way I would set this up?

Hi @afvrbanac - happy to help. Although MaAsLin2 is not optimized for interactions, in this case, you can simply create three binary variables as follows:

  1. Fixed main effect 1: 1 if EOT and 0 otherwise
  2. Fixed main effect 2: 1 if Treatment and 0 otherwise
  3. Fixed interaction effect: 1 if EOT_Treatment and 0 otherwise.

Note that the interaction term is a binary variable by definition, since it’s a product of two binary variables and therefore can only take two values: 0 and 1.

With this setup, you don’t need to specify a reference, as by default 0 will be the reference group in each case, and the interaction term is exactly what you want to estimate (i.e., the difference in change from baseline to EOT between treatment and placebo). You will still need the subject as a random effect.
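
As a minimal sketch of the coding described above, assuming the metadata has a `Timepoint` column with values "Baseline"/"EOT" and a `Group` column with values "Placebo"/"Treatment" (the column names, labels, and the toy table itself are hypothetical), the three binary variables could be built in base R like this:

```r
# Toy metadata; column names and level labels are illustrative assumptions.
metadata <- data.frame(
  subject   = rep(c("S1", "S2", "S3", "S4"), each = 2),
  Timepoint = rep(c("Baseline", "EOT"), times = 4),
  Group     = rep(c("Placebo", "Treatment"), each = 4),
  stringsAsFactors = FALSE
)

# Fixed main effect 1: 1 if EOT and 0 otherwise
metadata$EOT <- as.integer(metadata$Timepoint == "EOT")
# Fixed main effect 2: 1 if Treatment and 0 otherwise
metadata$Treatment <- as.integer(metadata$Group == "Treatment")
# Fixed interaction effect: product of the two binary main effects
metadata$EOT_Treatment <- metadata$EOT * metadata$Treatment

print(metadata)
```

The three new columns would then be passed as `fixed_effects = c("EOT", "Treatment", "EOT_Treatment")`, with `random_effects = c("subject")`.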

Note that you need to subset your results table to the variable of interest (the interaction term) and re-calculate the q-values.
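
A rough sketch of that subsetting step, assuming the results table has `metadata` and `pval` columns (as in the `all_results.tsv` file MaAsLin 2 writes) and that Benjamini-Hochberg is the correction you want. The toy table and its p-values are made up for illustration, and `qval_interaction` is a hypothetical column name:

```r
# Toy stand-in for a MaAsLin 2 results table; the values are made up.
results <- data.frame(
  feature  = rep(c("taxonA", "taxonB", "taxonC"), times = 3),
  metadata = rep(c("EOT", "Treatment", "EOT_Treatment"), each = 3),
  pval     = c(0.20, 0.03, 0.60, 0.50, 0.04, 0.70, 0.01, 0.04, 0.30),
  stringsAsFactors = FALSE
)

# Keep only the associations with the interaction term
interaction_results <- results[results$metadata == "EOT_Treatment", ]

# Re-calculate q-values within this subset (Benjamini-Hochberg)
interaction_results$qval_interaction <- p.adjust(interaction_results$pval,
                                                 method = "BH")

print(interaction_results)
```

With real output, `results` would instead be read from the results file, e.g. with `read.delim()`.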

Good luck with the modeling,

Hi @himel.mallick, I am also desperately interested in finding a way to deal with the interaction of different treatment groups in paired samples.

I would like to ask if you could give some more details about the method you described.
Can you clarify how you create the fixed interaction effect? What do you set as 0?
How many times do you run the model in MaAsLin?
How do you construct the model?

I hope my questions are not too generic; otherwise I can be more specific and give an example of my case.

Thanks a lot!


@Alessandro_Atzeni Not sure what you mean by more details but for the above example, the associated MaAsLin 2 run translates to the following per-feature linear mixed-effects model (in R notation):

feature ∼ (intercept) + Fixed main effect 1 + Fixed main effect 2 + Fixed interaction effect + (1|subject)

With the reference coding described above, the interaction term is interpreted as the difference in change from baseline to EOT between treatment and placebo.
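
To illustrate that interpretation, here is a small base R sketch on made-up numbers for a single feature. It uses an ordinary `lm()` fit and deliberately leaves out the `(1|subject)` random effect, so it is not the model MaAsLin 2 fits; it only shows that, with 0/1 coding, the interaction coefficient equals the difference in change from baseline to EOT between the two groups. The helper `cell_mean` and all values are hypothetical:

```r
# Toy values for one feature; 2 subjects per group, numbers are made up.
d <- data.frame(
  EOT       = c(0, 1, 0, 1, 0, 1, 0, 1),
  Treatment = c(0, 0, 0, 0, 1, 1, 1, 1),
  y         = c(5, 6, 7, 8, 4, 9, 6, 11)
)
d$EOT_Treatment <- d$EOT * d$Treatment

# Ordinary linear model; the subject random effect is omitted for simplicity
fit <- lm(y ~ EOT + Treatment + EOT_Treatment, data = d)

# Difference in change from baseline to EOT between treatment and placebo
cell_mean <- function(eot, trt) mean(d$y[d$EOT == eot & d$Treatment == trt])
change_treatment <- cell_mean(1, 1) - cell_mean(0, 1)
change_placebo   <- cell_mean(1, 0) - cell_mean(0, 0)
diff_in_change   <- change_treatment - change_placebo

print(coef(fit))
print(diff_in_change)
```

In this toy data the placebo group changes by 1 and the treatment group by 5, so the `EOT_Treatment` coefficient is 4, while the `Treatment` coefficient only reflects the group difference at baseline.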

Please check out Section 9.5.4 in the limma tutorial for more details on the above interaction model, which I hope you can adapt for your case.

Many thanks,

Many thanks for your answer @himel.mallick.

I’ll try to give a specific example from my dataset. I have patients exposed to a 1-year treatment with a control diet (CD) or an intervention diet (ID), with data collected at baseline and at the 1-year timepoint. For example, I would like to assess the longitudinal association between a variable (var1) and the microbiota in patients exposed to ID after 1 year of follow-up.

I set up the MaAsLin model as follows:

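library(Maaslin2)

# input_data and input_metadata are placeholder names for the feature
# table and the metadata table; the original post did not show them.
fit_data <- Maaslin2(input_data = input_data,
                     input_metadata = input_metadata,
                     output = "output",
                     fixed_effects = c("var1", "Timepoint", "treatment"),
                     random_effects = c("subject"),
                     max_significance = 0.25,
                     min_prevalence = 0,
                     normalization = "NONE",
                     transform = "NONE",
                     standardize = FALSE)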

but I am not sure that this is the correct way, as I think I am not considering the interaction effect of the treatment, since I have two treatments, CD and ID.

How would you set the model for a correct implementation in this case?

Hope that my example was clear.



Hi @Alessandro_Atzeni - as a general rule, we refrain from giving project-specific guidance, for which we suggest consulting a statistician. Based on the dataset you have, MaAsLin 2 may or may not be appropriate for the particular question you are interested in.

This is because MaAsLin 2 is not appropriate for all types of longitudinal analysis beyond the pre-post model I specified above, which works when you have two variables of interest (group and pre-post time) along with the interaction between them. Unfortunately, beyond these exact settings, MaAsLin 2 is not the right tool.

Best regards,

Thanks @himel.mallick. So, going back to the first example, if the aim of my study is to test the effect of Treatment at EOT, as MaAsLin does not permit to control for interactions, the suggested option is to create a fixed effect variable set to 1 if EOT_Treatment and 0 otherwise, and add it as a fixed effect after Timepoint (1 if EOT, 0 otherwise) and intervention (1 if Treatment, 0 otherwise). But now I am not sure about the interpretation of the results. Which taxa are the ones enriched or decreased: those associated with the new interaction variable or those associated with the Treatment variable? I hope my question is clear.

Hi @Alessandro_Atzeni - not sure what you mean by ‘MaAsLin does not permit to control for interactions’ when EOT_Treatment is indeed the interaction variable in the above example (the product of EOT and Treatment). In fact, EOT_Treatment is a binary variable that takes the values 1 and 0 because it is the direct product of two binary variables. As mentioned before, if the goal is to find a significant interaction effect (i.e., the change from baseline differs between the treatment and placebo groups in the above example), you should be looking at the associations with respect to the interaction variable.

Hope this clarifies.

Thanks a bunch,

Thanks a lot @himel.mallick
When you say that the results table has to be subset to the interaction term and the q-values re-calculated, I suppose that is the same process discussed in the thread “Maaslin2 handling of covariates”. So I suppose it wouldn’t be correct to show the q-values calculated without following these steps, correct?

That is correct.