Pairwise comparisons with multi-level categorical variable in MaAsLin3

I am new to MaAsLin3 and would like to ask about the recommended approach for pairwise comparisons. In my study, the variable status has three levels: H. pylori (+), post-eradicated, and H. pylori (–).

After running MaAsLin3, I attempted to perform contrast tests, but I encountered the following error: `Predictors not in the model had non-zero contrast values`.

Could you please advise which analysis method I should use to identify differential features between:

- post-eradicated vs. H. pylori (+)
- post-eradicated vs. H. pylori (–)
- H. pylori (+) vs. H. pylori (–)

Any guidance or examples would be greatly appreciated.

Hi,

We added the contrast test for cases where people have many levels to compare against each other, but in your case it’s probably most straightforward to run two MaAsLin 3 models: one with post-eradicated as the baseline (this gives you coefficients for post-eradicated vs. H. pylori (+) and post-eradicated vs. H. pylori (–)) and one with H. pylori (–) as the baseline (this gives you the coefficient for H. pylori (–) vs. H. pylori (+)). Then you can use p.adjust to FDR-correct over those three sets of p-values together, which is equivalent to what the contrast test would have done.
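Concretely, the two-model approach might look something like this in R. This is a rough sketch, not a tested recipe: `features` and `metadata` are placeholder objects, the exact `maaslin3()` arguments and output file/column names may differ across versions (check the MaAsLin 3 manual), and `relevel()` is base R.

```r
library(maaslin3)

# Model 1: post-eradicated as the baseline
metadata$status <- relevel(factor(metadata$status), ref = "post-eradicated")
fit1 <- maaslin3(input_data = features, input_metadata = metadata,
                 output = "status_post_ref", formula = "~ status")

# Model 2: H. pylori (-) as the baseline
metadata$status <- relevel(metadata$status, ref = "H. pylori (-)")
fit2 <- maaslin3(input_data = features, input_metadata = metadata,
                 output = "status_neg_ref", formula = "~ status")

# Pool the three comparisons' p-values and FDR-correct them once, together
# (file and column names here are assumptions; check your output folder)
res1 <- read.delim("status_post_ref/all_results.tsv")
res2 <- read.delim("status_neg_ref/all_results.tsv")
res2 <- res2[res2$value == "H. pylori (+)", ]  # keep only the (-) vs. (+) contrast
pooled <- rbind(res1, res2)
pooled$qval_pooled <- p.adjust(pooled$pval, method = "BH")
```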

Let me know if that makes sense.

Will

I will try it.

Thanks very much!

Hi Will,
I feel like my question is along similar lines. I have a metadata variable with four levels, which is the only effect I am applying to my data, and I want to figure out which features are strongly affected by the different levels. To do this, I ran MaAsLin 2 four times, once with each of the 4 levels as the reference, and then filtered for significant associations (q-value < 0.2). I compared this output to pairwise comparisons between each pair of levels and found that the latter produced a drastically lower number of significant associations. One reason could be that q-values tend to increase with a smaller sample size, but I wanted to know whether there is some other reason, and whether there is a way to run the model with no reference so that all levels are compared to each other?

Thank you so much!

I think there are a few factors that could be causing issues here.

First, regardless of whether you’re evaluating 4 different references or taking all pairwise combinations (or anything else), you should always gather the full set of p-values for whatever you’re testing and then run p.adjust on them all at once. Doing anything else, such as adjusting the p-values per reference or per pair and then combining them, will not properly control the FDR.
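As a toy illustration of why the correction has to happen over the pooled set (the p-values here are made up, not from any real analysis):

```r
# Hypothetical p-values for 3 features from three pairwise comparisons
p_ab <- c(0.001, 0.04, 0.20)
p_ac <- c(0.002, 0.03, 0.50)
p_bc <- c(0.01,  0.25, 0.90)

# Correct: pool all 9 p-values, then adjust once
q_joint <- p.adjust(c(p_ab, p_ac, p_bc), method = "BH")

# Incorrect: adjust within each comparison, then combine
q_split <- c(p.adjust(p_ab, "BH"), p.adjust(p_ac, "BH"), p.adjust(p_bc, "BH"))

# The split version is anti-conservative for the top hits: each adjustment
# only accounts for 3 tests instead of the 9 that were actually run, so it
# lets more features through at a given q-value cutoff
```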

Second, if you’re evaluating 4 different references, you’re using the full sample size each time, whereas with pairwise combinations you’re only using the samples in that pair of conditions. If your only covariate is the group, you’ll get the same coefficient whether you take the pairwise or the reference approach, but you’ll (probably) get smaller standard errors with the larger sample size of the reference approach.
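The point about identical coefficients but different standard errors can be checked with a plain `lm()` toy example (simulated data, not MaAsLin output):

```r
# Toy data: one 4-level group, no other covariates
set.seed(1)
group <- factor(rep(c("A", "B", "C", "D"), each = 10))
y <- rnorm(40, mean = rep(c(0, 1, 2, 3), each = 10))
d <- data.frame(y = y, group = group)

# Reference approach: fit on all 40 samples with A as the reference
full <- lm(y ~ group, data = d)

# Pairwise approach: fit on only the 20 A and B samples
pair <- lm(y ~ group, data = droplevels(subset(d, group %in% c("A", "B"))))

# "groupB" is the B vs. A coefficient in both fits: the estimates are
# identical (both are mean(B) - mean(A)), but the full model pools the
# residual variance across all four groups and has more degrees of
# freedom, which usually gives a smaller standard error
coef(summary(full))["groupB", c("Estimate", "Std. Error")]
coef(summary(pair))["groupB", c("Estimate", "Std. Error")]
```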

I see, thank you! That makes sense :smiley: