CLR normalization and min_abundance in MaAsLin3

WillNickols · March 13, 2025, 11:51am

Hi,

Thanks for using the tool!

First, I’d push back on your assertion that “TSS may not be appropriate for normalization and CLR will be a better option.” CLR is primarily used to deal with compositionality in microbiome analysis: the fact that testing for differences in relative abundance is not the same as testing for differences in absolute abundance. However, (1) MaAsLin 3 uses a median comparison that also handles relative vs. absolute tests but in a way that makes coefficients more interpretable (see here and the last comment here) and (2) sometimes people do actually care about differences in relative abundance, and then using TSS is clearly the right thing to do. In our benchmarking, we compare MaAsLin 3 against other tools such as ALDEx2 that use CLR, and we maintain better performance, even when high degrees of compositionality would cause tools that don’t specifically correct for this (e.g., MaAsLin 2) to have inflated false positives. As stated in that forum post, I’d highly encourage MaAsLin 3 users to use the TSS option with median comparison since the results are more interpretable and that’s what we’ve benchmarked.
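To make the compositionality point concrete, here is a minimal base-R sketch with made-up numbers (it is not MaAsLin 3 code, and the matrix `abs_counts` is purely hypothetical). Only taxon A changes in absolute abundance between the two samples, yet the relative abundance of B and C shifts as well once the data are expressed as proportions (TSS) or as centered log-ratios (CLR):

```r
# Hypothetical absolute abundances: only taxon "A" differs between s1 and s2.
abs_counts <- matrix(c(100, 100, 100,
                       400, 100, 100),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("s1", "s2"), c("A", "B", "C")))

# Total sum scaling (TSS): divide each row (sample) by its total so it sums to 1.
tss <- abs_counts / rowSums(abs_counts)

# Centered log-ratio (CLR): log abundance minus the per-sample mean log abundance.
clr <- log(abs_counts) - rowMeans(log(abs_counts))

print(tss)  # B and C fall from 1/3 to 1/6 although their absolute abundance is unchanged
print(clr)  # B and C also move under CLR; each row sums to 0
```

Neither scale recovers absolute changes on its own, which is why the choice of what the test compares against matters more than the normalization label.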

To answer your actual questions:
A feature is kept only if more than a min_prevalence fraction of the samples have it above min_abundance after CLR; otherwise the entire feature is dropped. This doesn’t filter out individual values that fall below min_abundance, only whole features that are sufficiently rare. With min_prevalence=0 and min_abundance=0, a feature is kept as long as at least 1 sample has a CLR-transformed abundance above 0. The idea here is that some features are rare enough you just don’t care about them, but once you decide you care about a feature, you want to look at all samples that had it. However, if all CLR values are negative for a feature, it should be dropped with a min_abundance threshold of 0. In this toy example, the first column is all negative and is dropped:

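# matrix() fills column by column, so column 1 is (-1, -2, -3), i.e. all negative
mat_in <- matrix(c(-1, -2, -3, 4, -5, -6, 7, -8, -9), nrow = 3, ncol = 3)
rownames(mat_in) <- c("a", "b", "c")
# With min_abundance = 0, only columns with at least one positive value are kept
maaslin3::maaslin_filter(normalized_data = mat_in, 'tmp_out', min_abundance = 0)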
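If you don't have the package at hand, the rule described above can be approximated in base R. This is only a sketch of the logic as I've described it, not the package's actual implementation: `sketch_filter` and the column names `f1` to `f3` are hypothetical, and the strict `>` comparisons are an assumption.

```r
mat_in <- matrix(c(-1, -2, -3, 4, -5, -6, 7, -8, -9), nrow = 3, ncol = 3)
rownames(mat_in) <- c("a", "b", "c")
colnames(mat_in) <- c("f1", "f2", "f3")  # hypothetical feature names

# Hypothetical helper (not part of maaslin3): keep a feature (column) only if
# the fraction of samples above min_abundance exceeds min_prevalence.
sketch_filter <- function(x, min_abundance = 0, min_prevalence = 0) {
  frac_above <- colMeans(x > min_abundance)
  x[, frac_above > min_prevalence, drop = FALSE]
}

kept <- sketch_filter(mat_in)
print(colnames(kept))  # f1 is all negative, so only f2 and f3 remain
print(kept)            # negative entries inside kept features are left as-is
```

Note that the negative values in `f2` and `f3` stay in the output: the filter removes whole features, never individual entries.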

The zero_threshold is applied before CLR on the raw data.

If you set min_abundance to -Inf, nothing should be filtered out. If you set zero_threshold to -Inf, nothing should be turned into a zero (though you’ll then have to figure out what to do with zeros yourself and maybe call the maaslin_transform and maaslin_fit functions manually). Again, I’d still recommend using the defaults since the median comparison test accounts for compositionality like CLR would, and zeros are handled directly in the prevalence model (as opposed to handling them with rclr).
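A rough base-R illustration of why -Inf switches off the zero thresholding. The helper `sketch_zero` is hypothetical, and treating values less than or equal to zero_threshold as zeros is my assumption about the comparison; the package may do more than this internally.

```r
# Hypothetical helper (not part of maaslin3): raw abundances at or below
# zero_threshold are set to 0 before any normalization is applied.
sketch_zero <- function(x, zero_threshold = 0) {
  x[x <= zero_threshold] <- 0
  x
}

raw <- c(0, 0.5, 2, 10)  # made-up raw abundances for one feature

thresholded <- sketch_zero(raw, zero_threshold = 1)     # 0.5 becomes 0
untouched   <- sketch_zero(raw, zero_threshold = -Inf)  # no finite value is <= -Inf

print(thresholded)
print(untouched)
```

The same reasoning applies to min_abundance = -Inf in the filtering step: every finite value is above -Inf, so every feature passes.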

Will
