I am running Maaslin2 with a multivariate linear model to investigate the effect of variable A (variable of interest) on microbiome features, while correcting for other covariates. I used the fixed effects model.
Taxa ~ LM (variable A + covariate1 + covariate2+…+ covariate3)
When I got the results, I first extracted only the rows belonging to variable A. This gave me a smaller subset of results, each with a pval and qval.
Then I re-calculated the q-values based on this smaller subset, getting a new qval with the code: qval = p.adjust(results_of_interest$pval, method = "BH").
However, the new qval is even larger than the original qval. In theory, the new qval should be smaller because the number of tests is smaller, so this seems strange.
Is something wrong?
Hi @Rowling ,
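While in general you have more power when you run fewer tests, this does not always lead to lower FDR-corrected values. This is because FDR correction tries to control the false discovery rate at a specific threshold (i.e. q = 0.05 means that about 5% of the significant results are expected to be false positives). The Benjamini-Hochberg adjustment scales each p-value by the number of tests divided by that p-value's rank among them. Removing tests shrinks both numbers, so if many of the removed tests had small p-values, the ranks of the remaining p-values can drop proportionally more than the test count, and their q-values go up. You can check out how the Benjamini-Hochberg procedure works in the Wikipedia article on the false discovery rate.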
That would be my best guess as to why your q-values are higher in the second case.
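To make this concrete, here is a minimal R sketch with made-up numbers. The data frame, its values, and the column names metadata and pval are assumptions chosen to resemble a Maaslin2 results table; they are not taken from your run. It compares BH q-values computed over all tests with BH q-values computed over the variable A rows only.

```r
# Toy results table: 3 tests for the variable of interest and 5 tests for a
# covariate with very small p-values. All values are invented for illustration.
results <- data.frame(
  metadata = c(rep("variableA", 3), rep("covariate1", 5)),
  pval     = c(0.01, 0.04, 0.5, 0.001, 0.002, 0.003, 0.004, 0.005)
)

# BH correction over all 8 tests together.
results$qval_all <- p.adjust(results$pval, method = "BH")

# Keep only the variable A rows, then redo BH over those 3 tests alone.
results_of_interest <- results[results$metadata == "variableA", ]
results_of_interest$qval_subset <- p.adjust(results_of_interest$pval, method = "BH")

# Side-by-side comparison of the two corrections for variable A.
print(results_of_interest)
```

With these numbers, the two smallest variable A p-values get q-values of about 0.013 and 0.046 when corrected with all 8 tests, but 0.03 and 0.06 when corrected within the subset of 3. The p-value of 0.01 is ranked 6th of 8 in the full set (0.01 * 8 / 6) and 1st of 3 in the subset (0.01 * 3 / 1), so its multiplier grows even though the number of tests shrinks.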
Thanks for the answer.
However, I am also confused about whether a second correction based on a subset of tests is necessary at all. For example, when we run a multivariate linear model, we get p-values for all covariates in a single run, right? That is to say, although we can extract only the p-values for the variable of interest, that doesn’t mean we ran fewer tests. The total number of tests didn’t change, so the multiple testing correction shouldn’t change either.
Does this reasoning make sense?