I am comparing communities from two different timepoints, baseline and week 2, with 40 samples per timepoint. The MaAsLin3 analysis ran well; I added a p-value cut-off of 0.05 and the small_random_effects = TRUE parameter.
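For reference, this is roughly the call I ran (a sketch; taxa_table, metadata, and the output folder name are placeholders, and metadata holds my timepoint and Subject columns):

```r
library(maaslin3)

# Sketch of the call described above; object names and the output path are
# placeholders. 'metadata' contains a 'timepoint' column (baseline / week 2)
# and a 'Subject' column identifying the participants.
fit <- maaslin3(
    input_data = taxa_table,
    input_metadata = metadata,
    output = "maaslin3_baseline_vs_week2",
    formula = "~ timepoint + (1 | Subject)",
    max_significance = 0.05,        # the 0.05 cut-off mentioned above
    small_random_effects = TRUE
)
```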
I have inspected the output significant_results.tsv file, and I am unsure whether some error messages are benign or whether I should adjust parameters to remove them. I can't find explanations in the maaslin3 tutorials, so I am asking here.
The error messages are:
All logistic values are the same
Model may not have converged with 1 eigenvalue close to zero: 1.3e-10
No data points have the baseline factor level
Prevalence association possibly induced by stronger abundance association
Number of levels of each grouping factor must be < number of observations (problems: Subject)
We should probably add these to the tutorial; thanks for highlighting that. In the meantime:
"All logistic values are the same": This usually indicates that the prevalence outcome being fit is the same across all samples (i.e., the microbe is present in everyone or absent in everyone).
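You can check this directly. A minimal sketch, assuming 'taxa_table' from the call above is a features-by-samples matrix:

```r
# Features detected in every sample (or in none) give a constant
# presence/absence outcome, which produces this message.
prev <- rowMeans(taxa_table > 0)
names(prev)[prev == 1]   # present in every sample
names(prev)[prev == 0]   # absent from every sample
```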
"Model may not have converged with 1 eigenvalue close to zero": This indicates that the software had trouble converging on the solution with the highest likelihood. These warnings can sometimes be ignored depending on what the data and the model diagnostic plots look like, but the affected results should be inspected with caution.
"No data points have the baseline factor level": This means that, after filtering, no data points fall in the reference level of your model, so a model cannot be fit. This can occur for a variety of reasons, including during the splitting of the data into abundance and prevalence components.
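For a flagged feature, you can tabulate which metadata levels remain among its non-zero samples. A sketch, assuming the same placeholder objects as above, with 'Feature_X' as a placeholder feature name and metadata rows in the same order as the table's columns:

```r
# The abundance model only sees non-zero samples; if none of them fall in
# the reference level, this message appears.
nonzero <- taxa_table["Feature_X", ] > 0
table(metadata$timepoint[nonzero])
```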
"Prevalence association possibly induced by stronger abundance association": This indicates that a significant prevalence association might actually be a strong abundance association that was misidentified as a prevalence association because of limit-of-detection issues.
"Number of levels of each grouping factor must be < number of observations (problems: Subject)": This indicates that, after filtering, some random-effect grouping levels are left with only one observation. This can occur when the data are split into abundance and prevalence components.
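A quick check along the same lines: compare the number of observations entering the fit to the number of grouping levels among them (the levels must be fewer than the observations for the random effect to be estimable):

```r
# If every remaining subject contributes exactly one observation, the number
# of grouping levels equals the number of observations and this message appears.
nonzero <- taxa_table["Feature_X", ] > 0
sum(nonzero)                                 # observations entering the fit
length(unique(metadata$Subject[nonzero]))    # grouping levels among them
```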
We included the error columns in the all_results output to flag these potential issues for users (many of which are bound to happen at least a handful of times when fitting hundreds to thousands of models). If the majority of your features do not show these flags, then you're likely in an okay situation.
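A quick tally of that column gives a sense of how widespread the flags are. A sketch, assuming the output folder used above and that the column is named 'error':

```r
# Blank entries are models that completed without a flagged issue.
all_res <- read.delim("maaslin3_baseline_vs_week2/all_results.tsv")
table(all_res$error, useNA = "ifany")
# Fraction of models with any flag:
mean(!is.na(all_res$error) & nzchar(all_res$error))
```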