Comparing results after recalculating q-values in MaAsLin2

Hello! MaAsLin2 is a great tool, but I’ve run into a question about q-values:

I have several covariates and a continuous variable A, and I want to find the relationship between variable A and microbiome features using a linear model of the form Taxa ~ variable A + covariate1 + covariate2 + covariate3… After running the model and filtering at a maximum significance threshold of q ≤ 0.25, I found 136 features significantly associated with variable A (from all_results.tsv). Following some advice, I then subset the final MaAsLin2 results table to the rows for variable A only and re-computed the q-values on that subset. After this re-computation, only 20 features met the q ≤ 0.25 threshold. Here’s my code:
library(Maaslin2)

maas.result = Maaslin2(
  input_data = taxonomy,
  input_metadata = sample_metadata,
  output = 'output',
  min_abundance = 0,
  min_prevalence = 0,
  max_significance = 0.25,
  normalization = 'TSS',
  transform = 'LOG',  # alternative: 'AST'
  fixed_effects = c("Breed", "Strain", "Age", "Batch", "A"),
  reference = c("Breed,Y"),
  standardize = TRUE,
  plot_heatmap = FALSE,
  plot_scatter = FALSE)

Question 1: Which results should I use?
Question 2: Is it necessary to re-compute the q-values?
Question 3: If I have other variables, should I analyze them together in the same model or construct separate models?
fixed_effects = c("variable A", "variable B", "variable C", "covariate")
or
fixed_effects = c("variable A", "covariate")
fixed_effects = c("variable B", "covariate")
fixed_effects = c("variable C", "covariate")

Hi!

  1. Which set of results to use is a somewhat philosophical question. When you correct over everything, you’re saying that fewer than 25% of all significant associations (with any covariate) are expected to be false discoveries. When you correct over only the covariate of interest, you’re saying that fewer than 25% of the significant associations with that covariate are expected to be false discoveries. The fact that the two approaches give such different counts in the first place suggests that some of your other covariates explain much of the variance even though they aren’t of primary interest.
  2. Whether to recompute depends on your answer to the philosophical question in (1).
  3. If you think all the variables could matter at the same time, they should be included in the same model. If you have Breed, Strain, Age, Batch, and A in the same model, your results for “A” can be interpreted as “Holding Breed, Strain, Age, and Batch constant, ### is the association between A and the microbial abundance.” If you include the covariates only one at a time (for example, only Breed), you’ll only be able to make statements like: “Holding Breed constant, ### is the association between A and the microbial abundance (but it might still be confounded by Strain, Age, and Batch).”
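
To make the two options in (1) concrete, here is a minimal sketch in base R. It does not call MaAsLin2 at all: it builds a toy table with `feature`, `metadata`, and `pval` columns (the same column names found in all_results.tsv) and fills it with made-up p-values, with the assumption that Batch has many strong effects. It then applies Benjamini-Hochberg correction with `p.adjust` once over every row and once over only the rows for A. The toy data, the variable names `results` and `res_A`, and the choice of BH are all assumptions for illustration.

```r
# Toy stand-in for MaAsLin2's all_results.tsv. Every value here is made up.
set.seed(1)
n_features <- 50
covariates <- c("Breed", "Strain", "Age", "Batch", "A")

# One row per feature-covariate pair, like the MaAsLin2 results table.
results <- expand.grid(
  feature  = paste0("taxon_", seq_len(n_features)),
  metadata = covariates,
  stringsAsFactors = FALSE
)

# Assumption: Batch has many small p-values; the other covariates are
# mostly null (uniform p-values).
results$pval <- ifelse(
  results$metadata == "Batch",
  rbeta(nrow(results), 0.1, 5),
  runif(nrow(results))
)

# (a) Correct over every feature-covariate pair at once.
results$qval_all <- p.adjust(results$pval, method = "BH")

# (b) Keep only the rows for A and correct within that subset.
res_A <- results[results$metadata == "A", ]
res_A$qval_A <- p.adjust(res_A$pval, method = "BH")

# Count features passing q <= 0.25 for A under each scheme.
n_hits_all <- sum(res_A$qval_all <= 0.25)
n_hits_A   <- sum(res_A$qval_A <= 0.25)
print(c(corrected_over_all = n_hits_all, corrected_over_A = n_hits_A))
```

To try this on real output, one could replace the toy table with something like `read.delim("output/all_results.tsv")` and keep steps (a) and (b) unchanged. Because BH thresholds depend on the whole set of p-values being corrected, the two counts can differ substantially when other covariates contribute many small p-values.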

Will