MaAsLin2: question about gFC calculation and interpretation

Hi, I’m using MaAsLin2 to perform association analysis on microbial functional abundance data (KEGG Orthologs), and I have some questions about the gFC (geometric fold change) output and its interpretation.

library(Maaslin2)
fitdata <- Maaslin2(
  input_data = ko_otu_df,           # KO functional abundance, relative abundance (RA)
  input_metadata = pd,              # Clinical metadata
  output = "maaslin2_results",
  fixed_effects = c("Group", "AGE", "SEX"),
  reference = c("Group,HC"),
  min_abundance = 0.0005,
  min_prevalence = 0.1,
  normalization = "CLR",            # Using CLR normalization
  transform = "NONE",               # No additional transformation
  analysis_method = "LM",
  cores = 10
)

My questions:

  1. With normalization="CLR" and transform="NONE", can the gFC output by Maaslin2 be interpreted as the geometric fold change of abundance between groups?
  2. If gFC = exp(coef) does not represent the traditional fold change under this setting, would you recommend switching to normalization="TSS" + transform="LOG" to obtain more interpretable fold change values?
  3. Alternatively, is there a recommended approach to calculate the fold change between groups separately (e.g., using mean or median of raw relative abundances) while keeping CLR normalization for the regression model?

Thank you very much for your help!

Hi @Ray4

The coefficients that MaAsLin2 outputs are the estimates from a general linear model. In a simple model with a single binary covariate, the coefficient represents the difference in means between the two groups after any normalization or data transformation.

In your case it looks like you are applying CLR normalization, so the coefficient represents the difference in mean CLR abundance between the groups of interest.
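
A minimal base-R sketch of what that coefficient measures, using simulated data. The `clr` helper, the group labels, and all numbers are illustrative assumptions rather than MaAsLin2 internals, and MaAsLin2's own CLR step may handle zeros and pseudocounts differently:

```r
set.seed(1)

# Simulated relative abundances: 20 samples x 5 features, rows sum to 1.
n <- 20
counts <- matrix(rexp(n * 5, rate = 1) + 0.1, nrow = n)
group <- factor(rep(c("HC", "Case"), each = n / 2), levels = c("HC", "Case"))
counts[group == "Case", 1] <- counts[group == "Case", 1] * 3  # shift feature 1
ra <- counts / rowSums(counts)

# Hypothetical CLR helper: log abundance minus the per-sample mean log abundance.
clr <- function(x) log(x) - rowMeans(log(x))
clr_ra <- clr(ra)

# Linear model on the CLR values of feature 1.
fit <- lm(clr_ra[, 1] ~ group)
coef_clr <- unname(coef(fit)["groupCase"])

# The coefficient equals the difference in group means of the CLR values.
diff_means <- mean(clr_ra[group == "Case", 1]) - mean(clr_ra[group == "HC", 1])

# Log fold change of the geometric-mean relative abundance, for comparison.
log_fc_ra <- mean(log(ra[group == "Case", 1])) - mean(log(ra[group == "HC", 1]))

# Shift in the per-sample mean log abundance (log geometric mean) between groups.
log_gm_shift <- mean(rowMeans(log(ra))[group == "Case"]) -
  mean(rowMeans(log(ra))[group == "HC"])

print(c(coef_clr = coef_clr, diff_means = diff_means,
        log_fc_ra = log_fc_ra, log_gm_shift = log_gm_shift))
```

In this sketch `coef_clr` matches `log_fc_ra - log_gm_shift`, so `exp(coef_clr)` is a fold change relative to each sample's geometric mean across features. It matches the plain fold change in relative abundance only when that geometric mean does not differ between groups.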

To get a fold change, you would need to log-transform your data and then back-calculate from the coefficient, as shown here: Trying to understand coef column (and how to convert it to fold change) - #4 by mrgambero. Logging the data makes the fold change easier to compute because log(a/b) = log(a) - log(b), so the difference in mean log abundance between groups is the log of the ratio of their geometric means.
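
As a rough illustration of that back-calculation, here is a base-R sketch on simulated data for a single feature. It assumes a log2 transform and strictly positive abundances. If your MaAsLin2 version uses a different log base, invert with that base instead (for example `exp(coef)` for a natural log). The variable names and the `geo_mean` helper are hypothetical:

```r
set.seed(2)

# Simulated relative abundances for one feature, 10 samples per group.
n <- 20
group <- factor(rep(c("HC", "Case"), each = n / 2), levels = c("HC", "Case"))
ra <- rlnorm(n, meanlog = log(0.01), sdlog = 0.5)
ra[group == "Case"] <- ra[group == "Case"] * 2  # true fold change of 2

# Linear model on log2 abundance; the coefficient is a log2 fold change.
fit <- lm(log2(ra) ~ group)
coef_log2 <- unname(coef(fit)["groupCase"])

# Back-calculate the fold change by inverting the log2 transform.
fold_change <- 2^coef_log2

# Direct check: ratio of the geometric means of the two groups.
geo_mean <- function(x) exp(mean(log(x)))
ratio_gm <- geo_mean(ra[group == "Case"]) / geo_mean(ra[group == "HC"])

print(c(fold_change = fold_change, ratio_gm = ratio_gm))
```

The two printed values agree, which is the log(a/b) identity at work. With additional covariates such as AGE and SEX in the model, the coefficient becomes an adjusted log fold change and no longer equals the raw ratio of geometric means exactly.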

Hope that’s helpful.

Thanks,
Jacob
