Question regarding how min_prevalence and min_abundance work in Maaslin2


I have some questions about min_abundance and min_prevalence filtering before I run Maaslin2 (in R) on my dataset.

Some details about my dataset first: for input_data I plan to use relative abundance values for genera, calculated from normalized read counts, rather than the read counts themselves.

Here is the code chunk:

library(Maaslin2)

# min_abundance and min_prevalence are not set here, so the package defaults apply
Hrare_noFL_fit = Maaslin2(input_data = genus_rel_abundH.rarefied_noFL_4maaslin,
                          input_metadata = H.rarefied_noFL_sampledata,
                          analysis_method = "CPLM",
                          normalization = "NONE",
                          output = "Hrare_noFL_fit_output",
                          fixed_effects = "Loc",
                          reference = c("Loc,CA"))

  1. If one feature (in my case, genus) is below the min_abundance level in one sample, is that feature excluded for all other samples where it otherwise does exceed the min_abundance level?

  2. If both min_prevalence and min_abundance are specified in the Maaslin2 function, is a feature excluded from the analysis when it fails to meet (i) both the min_prevalence and min_abundance levels, or (ii) either one of them?

Thanks and please let me know if you need any more info,
Tammy

An additional question: If I am using relative abundance for my input data, does Maaslin2 expect the values to be in percentage form (10, as in 10%) or in fractional form (0.1, as in 10/100 = 0.1)?

Hello @tamardigrade

The min_abundance filter drops features (bugs) that are not above the specified value in enough samples of the unnormalized input data. It is evaluated per feature across all samples, not sample by sample.

The min_abundance and min_prevalence checks occur before normalization, so they are applied to whatever type of data you supply as the input. Set min_abundance on the same scale as your input values (fractions or percentages).
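
To make this concrete, here is a minimal base R sketch of a joint filter of this kind. It is an illustration only, not the Maaslin2 source: the toy table, the thresholds, and the helper name keep_features are made up for the example, and the strict ">" comparisons are an assumption about how "enough samples" is counted.

```r
# Toy relative-abundance table: rows are samples, columns are genera (fractions).
abund <- data.frame(
  GenusA = c(0.50, 0.40, 0.30, 0.20),
  GenusB = c(0.00, 0.00, 0.00, 0.05),
  GenusC = c(0.02, 0.00, 0.03, 0.00),
  row.names = paste0("S", 1:4)
)

# Hypothetical helper (not a Maaslin2 function): keep a feature only if it is
# above min_abundance in more than min_prevalence * n_samples samples.
keep_features <- function(data, min_abundance, min_prevalence) {
  min_samples <- nrow(data) * min_prevalence   # number of samples required
  n_above <- colSums(data > min_abundance)     # per-feature count of samples above threshold
  names(n_above)[n_above > min_samples]
}

# Input as fractions, threshold as a fraction.
kept_fraction <- keep_features(abund, min_abundance = 0.01, min_prevalence = 0.25)

# Same data as percentages, threshold rescaled to percent.
kept_percent <- keep_features(abund * 100, min_abundance = 1, min_prevalence = 0.25)

print(kept_fraction)
print(kept_percent)
```

In this toy case GenusC is kept even though it falls below the threshold in two samples, because it is above the threshold in two of four samples. GenusB is dropped because it is above the threshold in only one sample. The two arguments act as a single combined test per feature, and the result is the same for fractions and percentages as long as min_abundance is expressed on the same scale as the input.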

Jacob Nearing