Hi,
I am running MaAsLin3 on my data and receive the error in my significant results that <4 average observations per random effect group influence… If I use the small sample correction, the significant results disappear. Can I still conclude something from my significant results, or are they not trustworthy at all?
My design consists of participants divided into two groups (group), sampled at 4 different timepoints (week). I therefore included group and week as fixed effects and participant as a random effect.
fit_genus <- maaslin3(
  input_data = feature_genus_mat,
  input_metadata = meta_sub,
  output = "maaslin3_genus_pretreated",
  fixed_effects = c("group", "week"),
  random_effects = c("participant"),
  normalization = "TSS",
  transform = "LOG",
  small_random_effects = TRUE,
  augment = TRUE,
  standardize = TRUE,
  max_significance = 0.1,
  correction = "BH",
  median_comparison_abundance = TRUE,
  median_comparison_prevalence = FALSE
)
I was also wondering whether I should include read depth as a factor?
Kind regards,
Silke
Hi Silke,
Regarding your second question, I would always include read depth to ensure that it isn’t a confounder. Regarding your first question, how many samples per person do you have, and how many groups, time points, and total samples? Depending on the scale of your data, different options might make sense.
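As a rough illustration of adding read depth, here is a minimal sketch. It assumes the feature table holds raw counts with samples in rows, and that its row names match the metadata row names. The toy values and the column name read_depth are made up for the example; only feature_genus_mat, meta_sub, group, week and participant come from the post above.

```r
# Toy stand-ins for feature_genus_mat and meta_sub (samples in rows); values are made up.
set.seed(1)
feature_genus_mat <- matrix(
  rpois(12, lambda = 50), nrow = 4,
  dimnames = list(paste0("s", 1:4), paste0("genus", 1:3))
)
meta_sub <- data.frame(
  group = c("A", "A", "B", "B"),
  week = factor(c(1, 2, 1, 2)),
  participant = c("p1", "p1", "p2", "p2"),
  row.names = rownames(feature_genus_mat)
)

# Read depth = total counts per sample. This only makes sense on raw counts,
# i.e. before any TSS normalization has been applied.
# "read_depth" is a hypothetical column name chosen for this sketch.
meta_sub$read_depth <- rowSums(feature_genus_mat)[rownames(meta_sub)]

# Fixed effects to pass to maaslin3(), now including read depth.
fixed_effects <- c("group", "week", "read_depth")
```

The fixed_effects vector would then replace c("group", "week") in the maaslin3() call. If the table has already been converted to relative abundances, the depth has to come from the original counts instead.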
Will
Hi Will,
I have 4 samples per participant (one sample at each time point), 2 groups, 4 timepoints and 62 participants. That gives 4*62 = 248 samples in total.
There is no treatment, we just compare people living in two different regions and we take samples at 4 timepoints to also look at the difference over time.
Kind regards,
Silke
Ah, if you have 2 regions and people don’t move between the regions, the subject IDs as fixed intercepts with small_random_effects=TRUE would be collinear with the region and therefore would probably have no statistical significance (if the variable is retained at all).
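The collinearity can be seen in a small toy design. This is only a sketch with made-up participant and region labels (4 participants instead of 62), using base R's model.matrix rather than anything from MaAsLin itself:

```r
# Toy design: 4 participants, 2 per region, each sampled at 4 weeks (labels are made up).
design <- expand.grid(week = factor(1:4), participant = paste0("p", 1:4))
design$group <- ifelse(design$participant %in% c("p1", "p2"), "region1", "region2")

# Design matrix with a fixed intercept per participant, plus group and week.
X <- model.matrix(~ group + week + participant, data = design)

# Each participant belongs to exactly one region, so the group column is a sum of
# participant indicator columns and the matrix is rank deficient.
n_coefficients <- ncol(X)
n_estimable <- qr(X)$rank
n_coefficients > n_estimable  # TRUE: at least one coefficient cannot be estimated separately
```

Because the region effect cannot be separated from the participant intercepts, a model fit on such a design either drops one of the terms or gives it an uninformative estimate.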
We recently (today) made an update to allow bypassing the small random effects warning. I’d run remotes::install_github("biobakery/maaslin3") to pull the new version and run with bypass_small_group_warning=TRUE. If you check the results and there aren’t any associations with coefficients over 10 in absolute value with p-values under 10^-10, it’s probably fine. The small random effects system is mainly to prevent cases where you have associations that are very large in magnitude but have tiny p-values due to highly constrained fits.
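The check described above could look roughly like this. It is a sketch on a toy table: the column names coef and pval_individual are assumptions about how the results table is laid out, so adjust them to whatever your output actually contains.

```r
# Toy stand-in for a results table; values are made up, and the column names
# coef and pval_individual are assumed rather than taken from this thread.
results <- data.frame(
  feature = c("genusA", "genusB", "genusC"),
  coef = c(0.8, -12.5, 15.0),
  pval_individual = c(0.003, 1e-14, 0.2),
  stringsAsFactors = FALSE
)

# Flag associations that are very large in magnitude AND have a tiny p-value.
suspicious <- subset(results, abs(coef) > 10 & pval_individual < 1e-10)
suspicious$feature
```

In this toy table only genusB is flagged: genusC has a large coefficient but an unremarkable p-value, so it does not match the pattern described. If the real results return no rows here, the bypassed warning is probably not a concern.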
Will
Hi Will,
Thank you very much for your answer. I will use bypass_small_group_warning=TRUE.
I still have some small questions:
- If I run MaAsLin on genus- or family-level data: is it best to first change all NA values to "unassigned"? Is it best to first remove OTUs present in only one sample, which could be seen as noise, or should I just give MaAsLin the raw data?
- I now have read depth, week, group and participant as effects in my model. How many more effects can I add? I would like to add information such as gender, age, smoking, health status, … Should I run separate MaAsLin analyses, adding one extra effect each time, and check whether my significant results change? I suspect that adding them all at once would put too many effects in my model.
- MaAsLin currently compares each timepoint with the first timepoint. Is there a way to compare the other timepoints with each other as well?
- When is it useful to use the group-wise differences?
Thank you very much
Kind regards,
Silke