Hello,
Most shotgun metagenomic sequencing pipelines output a relative abundance table, so TSS may not be appropriate for normalization and CLR may be a better option.
I have a question on the “min_abundance” argument when “CLR” normalization is used.
In the manual it says: “Features with abundances more than min_abundance in more than min_prevalence of the samples will be included for analysis. The threshold is applied after normalization and before transformation.”
With CLR, some low relative abundance species will have negative values because of the log transformation. What happens to those species? Based on the manual, they should be excluded, since the min_abundance threshold is applied after normalization. However, when I looked at the filtered data in the output folder, the species with negative CLR-normalized abundances had not been filtered out (they were not set to NA).
This makes me question the language in the manual. I believe the statement is correct for TSS normalization, but perhaps not for CLR. Is that right?
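To make the concern concrete, here is a minimal Python sketch of the arithmetic. It is not the package's code: the `clr` helper, the example abundances, and the threshold value of 0 are all assumptions chosen for illustration, and the filter shown is just a literal reading of the manual's "more than min_abundance" wording.

```python
import math

def clr(values):
    """Centered log-ratio: log of each value minus the mean log over all features."""
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)
    return [log_v - mean_log for log_v in logs]

# Hypothetical relative abundances for one sample (sum to 1, no zeros).
rel_abund = [0.60, 0.30, 0.09, 0.01]
clr_values = clr(rel_abund)

# Hypothetical threshold. A literal "value > min_abundance" filter applied
# to CLR values would drop every feature below the geometric mean.
min_abundance = 0.0
passes_filter = [v > min_abundance for v in clr_values]

for ra, cv, keep in zip(rel_abund, clr_values, passes_filter):
    print(f"rel_abund={ra:.2f}  clr={cv:+.3f}  kept={keep}")
```

Under this reading, the two least abundant species get negative CLR values and would fail the threshold, which is not what I see in the filtered output.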
Also, when “zero_threshold” is applied, is it applied to the normalized, transformed, or filtered data? This can also affect CLR-transformed data.
Another question: does the package support a preprocessed and pre-transformed relative abundance table (such as rclr, clr, etc.) as input? We can set normalization to “NONE”, but then “min_abundance” and “zero_threshold” seem to filter out some negative CLR-transformed relative abundances. Is there a way to disable those arguments when using normalization = “NONE”?
Thank you for this great package!