MaAsLin3 p/qval_joint

Hello!

I’m using your MaAsLin3 package and have read your tutorial as well as the article. Maybe it’s a “me” problem, but could you explain how the joint p- and q-values are calculated?

I ask because when I run two models with differing covariates (one model includes BMI and the other doesn’t) on shotgun sequencing data from stool samples, the effect size for abundance stays almost exactly the same but qval_joint changes.

Thank you in advance 🙂

Nicole

Hi Nicole,

The joint p-value for each association is calculated as pbeta(min(prevalence_p_value, abundance_p_value), 1, 2). The joint q-value is then calculated with p.adjust(all_non_error_p_values, method = "BH").

Under the null that there is no abundance and no prevalence relationship for an association, the two p-values will each be uniformly distributed and independent, so the minimum of the two will have a Beta(1,2) distribution. Therefore, under the null, plugging the minimum of the two p-values into the Beta(1,2) CDF yields a Uniform(0,1) random variable, which is exactly the distribution a valid p-value should have (i.e., it falls below 0.05 only 5% of the time). However, if the null is false and there is either an abundance or a prevalence association, one or both of the p-values will be small, so the minimum will be small, and the result of plugging it into pbeta will still be small, ultimately producing a small joint p-value.
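As a quick sanity check (just a simulation sketch, not code from the package), you can verify in R that this transform of the minimum is uniform under the null:

```r
# Simulate two independent Uniform(0,1) p-values under the null and check
# that pbeta(min(p1, p2), 1, 2) is itself Uniform(0,1).
set.seed(1)
n <- 100000
p_prev <- runif(n)  # prevalence p-values under the null
p_abun <- runif(n)  # abundance p-values under the null

p_joint <- pbeta(pmin(p_prev, p_abun), 1, 2)

mean(p_joint < 0.05)  # ~0.05, as expected for a valid p-value
hist(p_joint)         # approximately flat
```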

The q-values are created by applying the Benjamini-Hochberg false discovery rate correction over all the non-errored prevalence and abundance p-values.
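For concreteness, here’s a minimal sketch of that correction step (the real pipeline pools all non-errored prevalence and abundance p-values across associations; the vector here is made up):

```r
# Hypothetical pooled vector of non-errored prevalence and abundance p-values
p_values <- c(0.001, 0.02, 0.04, 0.30, 0.75)

# Benjamini-Hochberg FDR correction over the whole pool
q_values <- p.adjust(p_values, method = "BH")
q_values
```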

Regarding your case: if the coefficient is the same but the significance changes, the standard error must be changing. If adding BMI to your model makes the standard error decrease, BMI is helping explain some of the extra variance in abundance/prevalence. If adding BMI makes the standard error increase, BMI might be collinear with the other variable you care about, in which case you’ll have to think about whether it makes sense to control for it. Also, because the q-values are corrected over all the p-values from a model, adding BMI adds its associations to the FDR pool, so qval_joint can shift even when the corresponding p-value barely moves. I’ll also flag that if you’re using random effects with <4 observations per group and seeing this phenomenon in the prevalence modeling, it might be due to poor model fitting, in which case you should install the latest maaslin3 and set small_random_effects=TRUE.
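To illustrate the standard-error point outside of MaAsLin3, here’s a toy lm() sketch with simulated data (variable names and effect sizes are made up, not your model): when the added covariate explains residual variance, the standard error of the coefficient of interest shrinks while its estimate stays about the same.

```r
# Toy illustration: adding a covariate that explains residual variance
# shrinks the standard error of the coefficient of interest.
set.seed(2)
n <- 200
group <- rbinom(n, 1, 0.5)  # variable of interest
bmi <- rnorm(n)             # extra covariate, independent of group
abundance <- 0.5 * group + 1.0 * bmi + rnorm(n)

fit_without <- lm(abundance ~ group)
fit_with <- lm(abundance ~ group + bmi)

summary(fit_without)$coefficients["group", ]  # larger std. error
summary(fit_with)$coefficients["group", ]     # smaller std. error, similar estimate
```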

Will
