Hi,
I work with the respiratory microbiome, which differs from many simulated datasets. My samples often have extremely low alpha diversity, dominated by one or a few ASVs. These dominant ASVs have a very high prevalence.
A small proportion of some positive controls - and most negative controls - contain ASVs from biological samples, suggesting cross-contamination. Cross-contamination is primarily from high-abundance ASVs. It may therefore be sensible for prevalence cutoffs to reflect this dynamic rather than being uniform.
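To make the idea of a non-uniform prevalence cutoff concrete, here is a rough sketch of one possible scheme. It is purely hypothetical and not something the package does: the function name, the `carryover` fraction, and the toy numbers are all my own assumptions. The detection threshold for each ASV scales with that ASV's maximum abundance, so trace amounts of a dominant ASV are not counted as presence.

```python
def abundance_aware_prevalence(abundances, carryover=0.001, floor=0.0):
    """Prevalence per ASV with a detection threshold that scales with abundance.

    abundances: dict mapping ASV name -> list of relative abundances per sample.
    carryover:  assumed (hypothetical) fraction of an ASV's maximum abundance
                that could plausibly arrive by cross-contamination.
    floor:      minimum detection threshold applied to every ASV.
    """
    prevalence = {}
    for asv, values in abundances.items():
        threshold = max(floor, carryover * max(values))
        n_present = sum(1 for v in values if v > threshold)
        prevalence[asv] = n_present / len(values)
    return prevalence


# Toy data (made up): a dominant ASV leaking into two samples at trace levels,
# and a genuinely rare ASV.
toy = {
    "dominant_asv": [0.9, 0.8, 0.0005, 0.0003],
    "rare_asv": [0.002, 0.001, 0.0, 0.003],
}
result = abundance_aware_prevalence(toy)
print(result)
```

With a uniform "greater than zero" cutoff the dominant ASV would count as present in all four toy samples; with the scaled threshold only the two samples where it is truly abundant count.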
In low-diversity, dominant-taxa settings, log transformations for variance stabilization may not be optimal:
- High relative abundances of dominant ASVs are common rather than outliers, and may not need to be stabilized.
- Biological associations for high-abundance ASVs (for example, with local immune markers) appear to be on an additive rather than a multiplicative scale.
- In my experience, log transformation amplifies signals in low-abundance taxa while dampening those from dominant taxa.
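A small numeric illustration of the last point, using made-up numbers: the same absolute increase in relative abundance is a large shift on the log scale for a rare ASV and an almost invisible one for a dominant ASV. The baseline values and the size of the increase are assumptions chosen only for illustration.

```python
import math


def log2_shift(baseline, additive_change):
    """Change on the log2 scale caused by adding `additive_change` to `baseline`."""
    return math.log2(baseline + additive_change) - math.log2(baseline)


delta = 0.01  # the same absolute increase in relative abundance (hypothetical)

rare = log2_shift(0.001, delta)     # low-abundance ASV: 0.1% -> 1.1%
dominant = log2_shift(0.60, delta)  # dominant ASV: 60% -> 61%

print(f"rare ASV:     {rare:.3f} log2 units")
print(f"dominant ASV: {dominant:.3f} log2 units")
```

The rare ASV moves by roughly 3.46 log2 units and the dominant one by roughly 0.02, even though the additive change is identical.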
It would be nice to have the option to avoid the log transform entirely. A quasi-Poisson approach could be an alternative: relative abundances could be multiplied by a large constant and rounded to integers to obtain pseudo-counts.
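Here is a minimal sketch of what I have in mind, written in plain Python so it does not depend on any particular package. The function names, the scale factor of 1,000,000, the single covariate, and the toy data are all my own assumptions. The log link is simply the conventional default; it still models the mean multiplicatively, so an identity link might match the additive associations described above more closely.

```python
import math


def to_pseudo_counts(rel_abundances, scale=1_000_000):
    """Scale relative abundances by a large constant and round to integers."""
    return [round(p * scale) for p in rel_abundances]


def fit_quasi_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1*x by Newton/IRLS, then estimate the dispersion.

    Returns (b0, b1, dispersion, se_b1). Assumes all fitted means are > 0
    and that x takes at least two distinct values.
    """
    n = len(y)
    b0, b1 = math.log(sum(y) / n), 0.0  # start from the intercept-only fit

    def information(mu):
        # Entries and determinant of the 2x2 Fisher information matrix.
        i00 = sum(mu)
        i01 = sum(m * xi for xi, m in zip(x, mu))
        i11 = sum(m * xi * xi for xi, m in zip(x, mu))
        return i00, i01, i11, i00 * i11 - i01 * i01

    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        s0 = sum(yi - m for yi, m in zip(y, mu))                 # score for b0
        s1 = sum((yi - m) * xi for xi, yi, m in zip(x, y, mu))   # score for b1
        i00, i01, i11, det = information(mu)
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break

    mu = [math.exp(b0 + b1 * xi) for xi in x]
    # Quasi-Poisson dispersion: Pearson chi-square over residual degrees of freedom.
    pearson = sum((yi - m) ** 2 / m for yi, m in zip(y, mu))
    dispersion = pearson / (n - 2)
    i00, _, _, det = information(mu)
    se_b1 = math.sqrt(dispersion * i00 / det)  # dispersion-inflated standard error
    return b0, b1, dispersion, se_b1


# Toy data (made up): a dominant ASV in two groups of four samples.
x = [0, 0, 0, 0, 1, 1, 1, 1]
rel = [0.50, 0.55, 0.45, 0.50, 0.70, 0.75, 0.65, 0.70]

counts = to_pseudo_counts(rel)
b0, b1, dispersion, se_b1 = fit_quasi_poisson(x, counts)
print(f"b1 = {b1:.4f}, dispersion = {dispersion:.1f}, se(b1) = {se_b1:.4f}")
```

One thing worth noting: the estimated dispersion grows in proportion to the arbitrary scale factor, but the dispersion-inflated standard errors do not (apart from rounding), so the choice of "large number" should not drive the inference.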
I’m considering forking the package and implementing this. Perhaps some of these reflections are of interest. I would greatly appreciate any input.
Best regards,
Anton