How does Maaslin deal with zeros?

carmennns2 · December 15, 2022, 2:32pm

Hi,

I’m still relatively new to Maaslin and struggling to understand all the different arguments.

As suggested here, I chose NEGBIN, CSS, and no transformation, and no standardization for my count input. My input data contains many zeros.

fit_data = Maaslin2(
    input_data = input_data,
    input_metadata = input_metadata,
    normalization = "CSS",
    standardize = FALSE,
    transform = "NONE",
    analysis_method = "NEGBIN",
    max_significance = 0.05,
    correction = "BH")

However, I have noticed in the significant results that Maaslin ignored all the taxa whose counts are all 0.

For example, if I were to run the following example with green as my reference, it would list red as significantly different from green, but not blue (because it is all 0s).
         ASV1  ASV2  ASV3
blue        0     0     0
red         2     1     0
green     100    80    99

Am I using the incorrect arguments?

Thank you. Any input will be greatly appreciated

Carmen

tkuntz-hsph · December 19, 2022, 8:30pm

Hi Carmen,

Maaslin by default filters out features (taxa here) that do not reach a minimum prevalence and abundance threshold. This reduces the number of comparisons and so improves power after FDR adjustment, since such features are rarely informative. This behavior can be changed by setting the min_abundance and/or min_prevalence parameters. The log file notes which features have been filtered, if any.
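
As a rough illustration of this kind of filter, here is a sketch in base R. It is not the package's actual code: `filter_features` is a hypothetical helper, the toy matrix is made up, and the strict `>` comparisons are an assumption about how the thresholds are applied.

```r
# Hypothetical helper: keep features (columns) whose abundance exceeds
# min_abundance in more than min_prevalence * n_samples samples.
# Assumes samples are rows and features are columns.
filter_features <- function(counts, min_abundance = 0, min_prevalence = 0.1) {
  min_samples <- nrow(counts) * min_prevalence
  n_above <- colSums(counts > min_abundance)
  counts[, n_above > min_samples, drop = FALSE]
}

# Toy count matrix (filled column by column): ASV_a is zero in every sample.
counts <- matrix(
  c(0, 0, 0,
    2, 1, 0,
    100, 80, 99),
  nrow = 3,
  dimnames = list(c("s1", "s2", "s3"), c("ASV_a", "ASV_b", "ASV_c"))
)

kept <- filter_features(counts, min_abundance = 0, min_prevalence = 0)
print(colnames(kept))
```

Under that assumption, an all-zero feature is dropped even with both thresholds set to 0, because no sample exceeds the abundance cutoff. The log file is the place to confirm what was actually filtered in a given run.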

If a sample has zero counts across all features (including after features have been filtered), however, the software does nothing to account for this. In some cases these are biologically meaningful zeros and in others they aren’t, but the software has no way to tell, so it leaves the data as is. If a sample is in the metadata but not the data (or vice versa), based on the row/column names of the inputs, it will be removed, however, and this can also be seen in the log file.
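
Both situations can be checked by hand before running the model. The following is a minimal sketch in base R, assuming samples are rows in both inputs; the toy tables and the variable names `zero_samples`, `only_in_data` and `only_in_metadata` are made up for illustration.

```r
# Toy inputs (hypothetical): samples are rows in both tables.
input_data <- data.frame(
  ASV1 = c(0, 2, 100),
  ASV2 = c(0, 1, 80),
  row.names = c("S1", "S2", "S3")
)
input_metadata <- data.frame(
  group = c("red", "green", "green"),
  row.names = c("S2", "S3", "S4")
)

# Samples whose counts are zero across all features.
zero_samples <- rownames(input_data)[rowSums(input_data) == 0]

# Samples that appear in only one of the two inputs.
only_in_data <- setdiff(rownames(input_data), rownames(input_metadata))
only_in_metadata <- setdiff(rownames(input_metadata), rownames(input_data))

print(zero_samples)
print(only_in_data)
print(only_in_metadata)
```

Whether an all-zero sample should be kept is a judgment about the biology, so this only lists the candidates.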

Hope that helps, thanks.

– Tom

carmennns2 · December 22, 2022, 3:57pm

Hi Tom,

Thanks for responding! My full code is below; I do actually set min_abundance and min_prevalence to 0, as I have already filtered out singletons/doubletons. In this case, would it be better for me to replace all the zeros with a dummy value, for example 0.00001, just so that the software does not see them as zeros?

Thank you and Happy Holidays!(:
Carmen

fit_data = Maaslin2(
    input_data = input_data,
    input_metadata = input_metadata,
    normalization = "CSS",
    standardize = FALSE,
    transform = "NONE",
    analysis_method = "NEGBIN",
    max_significance = 0.05,
    output = "./Maaslin2_Species",
    fixed_effects = c("Sample_3"),
    correction = "BH",
    reference = c("Sample_3,ALGFMT"),
    min_abundance = 0,
    min_prevalence = 0,
    plot_heatmap = TRUE,
    plot_scatter = TRUE)

tkuntz-hsph · January 5, 2023, 3:46pm

You can try to add a dummy value (probably 1 for count data is better), but I don’t see any reason you should need to.
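
For count data, that could be as simple as the sketch below. It is base R with a made-up toy table, and `counts_pc` is just an illustrative name. Note that it shifts every count, not only the zeros.

```r
# Toy count table (hypothetical): samples are rows, features are columns.
counts <- data.frame(
  ASV1 = c(0, 2, 100),
  ASV2 = c(0, 1, 80),
  row.names = c("S1", "S2", "S3")
)

# Add a pseudocount of 1 to every cell so that no entry is zero.
pseudocount <- 1
counts_pc <- counts + pseudocount

print(counts_pc)
```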

Can you post the plots of significant results that were generated, if any?

Also, I’d recommend checking that your installation of the software is up to date, and perhaps running through the tutorial to make sure that you get the expected results there. Thanks.

carmennns2 · January 24, 2023, 3:59pm

Hi,

I didn’t think of it, but adding a dummy value is a great solution (also for other differential abundance tools).

Thanks!
