High rates of significant results using NEGBIN

Hi folks,

I am analyzing a metagenomic dataset with abundances calculated by Kraken2. I can use either read counts or read proportions. At the moment I have one metadata variable with one reference category and three test categories. If I run this setup through Maaslin2 using read proportions with LM/TSS/LOG, I get no significant results. However, if I use raw read counts with NEGBIN/CSS/NONE, I get plenty of significant results that seem somewhat reasonable.
I read through the Maaslin2 manuscript and noticed that NEGBIN had a very high false discovery rate (Fig. 2). Are the results I am seeing due to NEGBIN’s inflated FDR, or are they real?

It’s not possible to answer your question without data, but you’re right to be skeptical given the disagreement. I suggest you plot the raw data (both as proportions and counts) for some of the features with the biggest discrepancy and judge which model would be more appropriate.
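
Before plotting, a quick numeric summary of the same feature on both scales can help pick which features to look at. Below is a minimal sketch in plain Python, not part of Maaslin2: the function names (`tss`, `group_summary`), the sample and taxon names, and the numbers are all made up for illustration. The toy data is built so that the two groups differ in sequencing depth but not in composition, which is one situation where counts and proportions tell different stories.

```python
from statistics import median


def tss(counts):
    """Total-sum scaling: divide each feature count by the sample total."""
    total = sum(counts.values())
    return {feat: (c / total if total else 0.0) for feat, c in counts.items()}


def group_summary(samples, groups, feature):
    """Per-group median count, median proportion, and zero fraction for one feature."""
    by_group = {}
    for sample_id, counts in samples.items():
        entry = by_group.setdefault(groups[sample_id], {"counts": [], "props": []})
        entry["counts"].append(counts.get(feature, 0))
        entry["props"].append(tss(counts).get(feature, 0.0))

    summary = {}
    for group, vals in by_group.items():
        n = len(vals["counts"])
        summary[group] = {
            "n": n,
            "median_count": median(vals["counts"]),
            "median_prop": median(vals["props"]),
            "zero_frac": sum(c == 0 for c in vals["counts"]) / n,
        }
    return summary


# Hypothetical toy data (NOT real results): same composition, 10x deeper sequencing in "test1".
samples = {
    "s1": {"taxA": 10, "taxB": 90},
    "s2": {"taxA": 20, "taxB": 80},
    "s3": {"taxA": 100, "taxB": 900},
    "s4": {"taxA": 200, "taxB": 800},
}
groups = {"s1": "ref", "s2": "ref", "s3": "test1", "s4": "test1"}

summary = group_summary(samples, groups, "taxA")
for group, row in summary.items():
    print(group, row)
```

If the count medians differ a lot between groups while the proportion medians do not (as in this toy case), or if one group is mostly zeros, those are the features worth plotting first.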

Thanks for the idea!
Just to confirm: is the negative binomial model shown in the Maaslin2 paper the same method as NEGBIN in the Maaslin2 program?

Yes. You might find the tutorial helpful as well.