How does Maaslin deal with zeros?

Hi,

I’m still relatively new to Maaslin and struggling to understand all the different arguments.

As suggested here, I chose NEGBIN, CSS, no transformation, and no standardization for my count input. My input data contains many zeros.

fit_data = Maaslin2(
    input_data = input_data,
    input_metadata = input_metadata,
    normalization = "CSS",
    standardize = FALSE,
    transform = "NONE",
    analysis_method = "NEGBIN",
    max_significance = 0.05,
    correction = "BH")

However, I have noticed in the significant output that Maaslin ignores every case where all the counts are 0.

For example, if I were to run the following example with green as my reference, it would list red as significantly different from green, but not blue (because it's all 0's).
        ASV1  ASV2  ASV3
blue       0     0     0
red        2     1     0
green    100    80    99

Am I using the incorrect arguments?

Thank you. Any input will be greatly appreciated :)

Carmen

Hi Carmen,

By default, Maaslin filters out features (taxa here) that do not reach minimum prevalence and abundance thresholds. Such features are rarely informative, so removing them reduces the number of comparisons and improves power after FDR adjustment. You can change this behavior by setting the min_abundance and/or min_prevalence parameters. The log file notes which features were filtered, if any.
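To make the idea of a prevalence/abundance filter concrete, here is a minimal base R sketch. It is only an approximation of the concept, not Maaslin2's internal code: the toy table, the threshold values, and the use of `>=` for the prevalence comparison are all assumptions made for illustration, so check the log file for what the software actually removed.

```r
# Toy count table: rows are samples, columns are features (ASVs).
counts <- data.frame(
  ASV1 = c(0, 2, 100, 0),
  ASV2 = c(0, 0, 0, 0),
  ASV3 = c(5, 0, 99, 7),
  row.names = c("s1", "s2", "s3", "s4")
)

# Hypothetical thresholds; they mirror the idea behind min_abundance and
# min_prevalence but are not taken from Maaslin2 itself.
min_abundance  <- 0
min_prevalence <- 0.5

# Prevalence: fraction of samples in which a feature exceeds min_abundance.
prevalence <- colMeans(counts > min_abundance)

# Assumption: a feature is kept when its prevalence reaches the threshold.
kept    <- names(prevalence)[prevalence >= min_prevalence]
dropped <- setdiff(names(prevalence), kept)

print(prevalence)
print(dropped)
```

In this toy table ASV2 is zero in every sample, so its prevalence is 0 and it fails any positive prevalence threshold.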

If a sample has zero counts across all features (including after features have been filtered), however, the software does nothing to account for this. In some cases these are biologically meaningful zeros and in others they aren't, but the software has no way to tell, so it leaves the data as is. If a sample is in the metadata but not the data (or vice versa), based on the row/column names of the inputs, it will be removed, and this is also recorded in the log file.
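Both situations can be checked by hand before running the model. The sketch below uses base R only and reuses the toy table from the question; the `metadata` table and its "yellow" row are made up for illustration, and the sketch assumes that rows are matched by name in both inputs.

```r
# Toy count table from the question: rows are samples, columns are ASVs.
counts <- data.frame(
  ASV1 = c(0, 2, 100),
  ASV2 = c(0, 1, 80),
  ASV3 = c(0, 0, 99),
  row.names = c("blue", "red", "green")
)

# Hypothetical metadata table; "yellow" has no matching row in counts.
metadata <- data.frame(
  group = c("red", "green", "yellow"),
  row.names = c("red", "green", "yellow")
)

# Rows whose counts are zero for every feature.
all_zero_samples <- rownames(counts)[rowSums(counts) == 0]

# Row names shared by both tables, and those found in only one of them.
shared        <- intersect(rownames(counts), rownames(metadata))
only_counts   <- setdiff(rownames(counts), rownames(metadata))
only_metadata <- setdiff(rownames(metadata), rownames(counts))

print(all_zero_samples)
print(only_counts)
print(only_metadata)
```

Anything that shows up in `only_counts` or `only_metadata` is a candidate for removal during input matching, so it is worth comparing these lists against the log file.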

Hope that helps, thanks.

– Tom

Hi Tom,

Thanks for responding! My full code is below, where I do actually set min_abundance and min_prevalence to 0, as I have already filtered out singletons and doubletons. In this case, would it be better for me to replace all the zeros with a small dummy value, for example 0.00001, just so that the software does not treat them as zeros?

Thank you and Happy Holidays!(:
Carmen

fit_data = Maaslin2(
    input_data = input_data,
    input_metadata = input_metadata,
    normalization = "CSS",
    standardize = FALSE,
    transform = "NONE",
    analysis_method = "NEGBIN",
    max_significance = 0.05,
    output = "./Maaslin2_Species",
    fixed_effects = c("Sample_3"),
    correction = "BH",
    reference = c("Sample_3,ALGFMT"),
    min_abundance = 0,
    min_prevalence = 0,
    heatmap = TRUE,
    plot_scatter = TRUE)

You can try to add a dummy value (probably 1 for count data is better), but I don’t see any reason you should need to.
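If you do want to try it, a minimal base R sketch of adding a dummy value of 1 looks like this. It reuses the toy table from the question, and it assumes the value is added to every entry rather than only to the zeros, which keeps the differences between counts unchanged; the variable names are made up for illustration.

```r
# Toy count table from the question: rows are samples, columns are ASVs.
counts <- data.frame(
  ASV1 = c(0, 2, 100),
  ASV2 = c(0, 1, 80),
  ASV3 = c(0, 0, 99),
  row.names = c("blue", "red", "green")
)

# Dummy value (pseudocount) of 1, which keeps count data as whole numbers.
pseudocount <- 1

# Adding a single number to a numeric data frame shifts every entry.
counts_shifted <- counts + pseudocount

print(counts_shifted)
```

The shifted table can then be passed as input_data in place of the original one.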

Can you post the plots of significant results that were generated, if any?

Also, I’d recommend checking that your installation of the software is up to date, and perhaps running through the tutorial to make sure that you get the expected results there. Thanks.

Hi,

I hadn’t thought of it, but adding a dummy value is a great solution (for other differential abundance tools as well).

Thanks!