Hello,
I would like to get your feedback on a Maaslin2 approach we are using to remove the effect of a specific variable (e.g. donor) in our microbiome data analysis.
In our experiments, donor stool samples are fermented in the presence of different ingredients. We assess how each ingredient impacts taxonomic and functional microbiome profiles. However, we observe a strong donor effect in the data, where samples from the same donor cluster together irrespective of the ingredient used. To address this, in our Maaslin2 analyses, we include donor information as a random effect and the ingredient as a fixed effect.
Additionally, we are interested in examining microbial abundance data with the donor effect removed. For this, we run a separate Maaslin2 analysis with donor as the only fixed effect and no random effects. The input data consists of raw counts with TSS normalization and LOG transformation applied (no standardization). From this analysis, we extract the residuals, back-transform them, and calculate relative abundances as follows:
```r
# Extract residuals from Maaslin2
log_residuals <- fit_data_donor$residuals
# Step 1: Back-transform the residuals
exp_residuals <- 2^log_residuals
# Step 2: Calculate adjusted relative abundances
adjusted_relative_abundances <- sweep(exp_residuals, 2, colSums(exp_residuals), FUN = "/")
```
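To make the question more concrete, here is a minimal base-R sketch of the same idea on simulated data. It does not call Maaslin2: a per-feature `lm()` with donor as the only predictor stands in for the Maaslin2 fit, and the design (`meta`), the simulated `counts`, and the `+ 1` pseudo-count are all assumptions made for illustration, not our real data or Maaslin2's exact zero handling.

```r
set.seed(1)

# Hypothetical design (not our real data): 3 donors x 4 ingredients
meta <- expand.grid(donor = paste0("D", 1:3), ingredient = paste0("I", 1:4))
n_features <- 5

# Simulated counts with a donor-specific baseline per feature (features x samples)
donor_base <- matrix(runif(n_features * 3, min = 50, max = 2000), nrow = n_features)
counts <- sapply(seq_len(nrow(meta)), function(j) {
  rpois(n_features, lambda = donor_base[, as.integer(meta$donor[j])]) + 1  # +1 avoids zeros
})

# TSS normalization followed by log2 transformation
rel <- sweep(counts, 2, colSums(counts), FUN = "/")
log_rel <- log2(rel)

# Per-feature linear model with donor as the only fixed effect (stand-in for Maaslin2)
log_residuals <- t(apply(log_rel, 1, function(y) residuals(lm(y ~ meta$donor))))

# Back-transform and renormalize, as in the snippet above
exp_residuals <- 2^log_residuals
adjusted_relative_abundances <- sweep(exp_residuals, 2, colSums(exp_residuals), FUN = "/")

# Check: each back-transformed residual equals the sample's relative abundance
# divided by that donor's geometric mean for the feature
donor_mean_log <- t(apply(log_rel, 1, function(y) ave(y, meta$donor)))
ratio_to_donor_mean <- rel / 2^donor_mean_log
stopifnot(isTRUE(all.equal(unname(exp_residuals), unname(ratio_to_donor_mean))))
```

If I read this correctly, under these assumptions each back-transformed residual is a ratio to the donor's geometric mean for that feature. After renormalization the columns sum to 1, but the values look more like rescaled fold-changes than relative abundances on the original scale, which is the part I am unsure about.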
Does this approach seem valid to you? Specifically, I am uncertain about the appropriateness of back-transforming the residuals to calculate relative abundances.
I would appreciate your insights or suggestions - thank you for taking the time to review this!
Ali
Edit: a similar question was apparently asked here: "Using Maaslin2 function to regress out the effect of subject"