I’m looking to use MaAsLin2 to identify OTUs that are significantly associated with disease state (Y/N). I have multiple samples per healthy individual, but only one sample from each diseased individual. Therefore, I’d like to account for the non-independence of the paired samples in the healthy/non-diseased group.
My initial thought was to run a MaAsLin2 model with disease state (Y/N) as the fixed effect and subject ID as the random effect, to account for the multiple samples per individual (which occur only in the healthy/non-diseased group).
I don’t get any errors, but I’m concerned about the output. Of the 412 OTUs, roughly 60 are reported as significant, and all of them are associated with the diseased group (disease=Y). I find this a bit odd, especially since this was not the case when I identified differentially abundant OTUs with DESeq2. Further, no heatmap was generated. I just want to be sure that my methods are appropriate before I consider including this in a manuscript.
I am pretty sure that my input data are formatted correctly. I used a conventional OTU table with taxa as rows. The metadata has individual samples as rows, and the variables of interest are factors. Am I possibly doing something wrong? I am also concerned about the sparseness of my OTU table (the samples are low biomass). I assume a zero-inflated model would be more appropriate, but then I could not incorporate fixed/random effects.
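To put a number on the sparsity concern, here is a minimal Python sketch of how the fraction of zeros and the per-OTU prevalence could be computed for a taxa-by-sample table. The toy counts, the helper names (`zero_fraction`, `prevalence`, `passing`) and the 0.2 cutoff are all made up for illustration. They are not my actual data or anything MaAsLin2 itself runs.

```python
# Toy OTU table: rows are taxa, columns are samples.
# These counts are invented for illustration only.
otu_table = {
    "OTU_1": [0, 0, 5, 0, 12, 0],
    "OTU_2": [3, 0, 0, 0, 0, 0],
    "OTU_3": [7, 9, 4, 0, 6, 2],
    "OTU_4": [0, 0, 0, 0, 0, 0],
}


def zero_fraction(table):
    """Fraction of all cells in the table that are zero."""
    counts = [c for row in table.values() for c in row]
    return sum(1 for c in counts if c == 0) / len(counts)


def prevalence(table):
    """For each OTU, the fraction of samples with a non-zero count."""
    return {
        otu: sum(1 for c in row if c > 0) / len(row)
        for otu, row in table.items()
    }


def passing(table, min_prev):
    """OTUs whose prevalence is at least min_prev (hypothetical cutoff)."""
    return [otu for otu, p in prevalence(table).items() if p >= min_prev]


if __name__ == "__main__":
    print("zero fraction:", round(zero_fraction(otu_table), 3))
    for otu, p in prevalence(otu_table).items():
        print(otu, "prevalence:", round(p, 3))
    print("kept at 0.2 cutoff:", passing(otu_table, 0.2))
```

Running something like this on the real table would show how many of the 412 OTUs are present in only a handful of samples, which seems relevant when deciding on a prevalence filter before modelling.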
Thank you in advance for your time and hard work on this package!