Question about a strange result

Question: Why is Microascus differentially abundant?

The attached PNG titled ‘time_from_first_planting_1’ was generated by Maaslin2 and shows the model fit for Microascus. The fit doesn’t look like something that would have a significant FDR/q-value or a coefficient of 2.95.

I also split my relative abundance data into the first and second halves of the season and calculated the average relative abundance for each. If you look at the attached Table 14, you’ll see that the relative abundance of Microascus is the same in both halves (0.02), whereas the relative abundance of Davidiellomyces does change between the first and second half. The plot for Davidiellomyces is also uploaded, as time_from_first_planting_2, for reference.
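
For anyone wanting to reproduce this kind of split-half summary, here is a minimal base R sketch. The data frame, its values, and the midpoint split are hypothetical and only illustrate the calculation; they are not my actual data or the values in Table 14.

```r
# Hypothetical toy data: one row per sample, with the time covariate
# and the relative abundance of a single genus.
samples <- data.frame(
  time_from_first_planting = c(5, 20, 40, 60, 80, 100, 120, 140),
  rel_abund = c(0.00, 0.01, 0.05, 0.02, 0.02, 0.00, 0.03, 0.03)
)

# Assumption: the season is split at the midpoint of the observed time range.
midpoint <- mean(range(samples$time_from_first_planting))
samples$half <- ifelse(samples$time_from_first_planting <= midpoint,
                       "first", "second")

# Mean relative abundance within each half of the season.
half_means <- aggregate(rel_abund ~ half, data = samples, FUN = mean)
print(half_means)
```

A summary like this collapses each half to a single number, so it is a coarser view than a model that treats time_from_first_planting as a continuous covariate.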

(Just an FYI: I calculated the relative abundances for Microascus and Davidiellomyces between locations myself; they are not Maaslin2 output.)

Here is my code:

library(Maaslin2)

# CPLM model on relative abundances, with no extra normalization,
# transformation, or prevalence/abundance filtering.
H_genus_dfsos_CAref = Maaslin2(
  input_data = genus_rel_abundH_noFL_4maaslin,
  input_metadata = Hsampledata_dfsos,
  analysis_method = "CPLM",
  transform = "NONE",
  min_prevalence = 0,
  min_abundance = 0,
  normalization = "NONE",
  random_effects = c("Monthlot"),
  output = "/local1/workdir1/tw488/CIDA/ITS_Maaslin2_Output/H_GENUS_ITS_dfsos_output_08142024",
  fixed_effects = c("Location", "time_from_first_planting"),
  reference = c("Location,CA"),  # CA is the reference level for Location
  max_pngs = 200,
  save_models = TRUE,
  save_scatter = TRUE,
  cores = 2)

Thanks so much!



Hi @tamardigrade,

That association does look fairly odd. It’s also a good idea to check the diagnostic plots!

In terms of what might be happening… I noticed you’re using the CPLM analysis method. Do you have a particular reason for this? In general, we would recommend using the default analysis method when possible, especially when working with microbial sequencing data.

It also looks like your samples might have particularly low read counts, given the increments of 0.0025 in the Microascus plot?
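
To spell out that reasoning, here is a rough R sketch. It assumes each 0.0025 step corresponds to exactly one read, which may not hold (for example, if the plotted values were rounded or otherwise rescaled). The names `step`, `implied_depth`, and `reads` are illustrative only and are not Maaslin2 objects.

```r
# Smallest step between distinct relative-abundance values in the plot.
step <- 0.0025

# Assumption: one step equals one read, so the implied depth is 1 / step.
implied_depth <- round(1 / step)
print(implied_depth)

# Relative abundances that 0 to 5 reads would give at that depth.
reads <- 0:5
rel_abund <- reads / implied_depth
print(rel_abund)
```

If the real per-sample depth is much larger than this implied value, then the steps must come from something other than single reads.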

Thanks,
Jacob

Hi,

By diagnostic plots, do you mean the output in the ‘fits’ folder?

We decided to go with CPLM based on this info from section 3.2 of the tutorial: “…CPLM or a zero-inflated alternative should perform better in the presence of zeroes…” (even though just after that it gives a caveat: “…but based on our benchmarking, we do not have evidence that CPLM is significantly better than LM in practice.”)

The samples don’t have low read counts overall, but the read counts for this genus are low.

Thanks!