High rates of significant results using NEGBIN

joshkirsch · April 26, 2023, 9:13pm

Hi folks,

I am analyzing a metagenomic dataset with abundances calculated by kraken2, so I can use either read counts or read proportions. At the moment I have one metadata variable with one reference category and 3 test categories. If I run this setup on read proportions through Maaslin2 with LM/TSS/LOG, I get no significant results. However, if I use raw read counts with NEGBIN/CSS/NONE, I get plenty of significant results that seem somewhat reasonable.
I read through the Maaslin2 manuscript and noticed that NEGBIN had a very high false discovery rate (Fig. 2). Are the results I am seeing due to NEGBIN’s inflated FDR, or are they real?

andrewGhazi · April 26, 2023, 10:27pm

It’s not possible to answer your question without data, but you’re right to be skeptical given the disagreement. I suggest you plot the raw data (both as proportions and counts) for some of the features with the biggest discrepancy and judge which model would be more appropriate.
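
As a rough illustration of that check, here is a minimal Python sketch. It is not part of Maaslin2: the sample names, taxon names, group labels, and helper functions are all made up for the example, and it assumes the counts have already been read into a plain dictionary. It converts counts to proportions by total-sum scaling, summarizes one feature per group on both scales, and optionally draws the two views side by side (the plotting helper assumes matplotlib is installed).

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy data: counts[sample][feature] = read count.
counts = {
    "s1": {"taxonA": 0, "taxonB": 100},
    "s2": {"taxonA": 10, "taxonB": 90},
    "s3": {"taxonA": 500, "taxonB": 500},
    "s4": {"taxonA": 2, "taxonB": 198},
}
# Hypothetical metadata: one variable with a reference level and a test level.
groups = {"s1": "reference", "s2": "reference", "s3": "test1", "s4": "test1"}


def to_proportions(counts):
    """Total-sum scaling: divide each sample's counts by that sample's total."""
    props = {}
    for sample, feats in counts.items():
        total = sum(feats.values())
        props[sample] = {f: (c / total if total else 0.0) for f, c in feats.items()}
    return props


def summarize_by_group(values, groups, feature):
    """Per-group sample size, median, and fraction of zeros for one feature."""
    by_group = defaultdict(list)
    for sample, feats in values.items():
        by_group[groups[sample]].append(feats[feature])
    return {
        g: {
            "n": len(v),
            "median": median(v),
            "frac_zero": sum(x == 0 for x in v) / len(v),
        }
        for g, v in by_group.items()
    }


def plot_feature(counts, groups, feature):
    """Scatter one feature by group, as raw counts and as proportions."""
    import matplotlib.pyplot as plt  # imported here so the summaries work without it

    levels = sorted(set(groups.values()))
    views = [("raw counts", counts), ("proportions", to_proportions(counts))]
    fig, axes = plt.subplots(1, 2, figsize=(8, 3))
    for ax, (label, data) in zip(axes, views):
        xs = [levels.index(groups[s]) for s in data]
        ys = [data[s][feature] for s in data]
        ax.scatter(xs, ys)
        ax.set_xticks(range(len(levels)))
        ax.set_xticklabels(levels)
        ax.set_ylabel(label)
        ax.set_title(feature)
    fig.tight_layout()
    return fig


props = to_proportions(counts)
count_summary = summarize_by_group(counts, groups, "taxonA")
prop_summary = summarize_by_group(props, groups, "taxonA")
```

In this toy example a single sample with a large count (s3) dominates the test group on the count scale. That is the kind of pattern to look for when a count model flags a feature and the log-transformed linear model does not.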

joshkirsch · April 26, 2023, 10:57pm

Thanks for the idea!
Just to confirm: is the negative binomial model shown in the Maaslin2 manuscript the same method as NEGBIN in the Maaslin2 program?

andrewGhazi · April 27, 2023, 1:14pm

Yes. You might find the tutorial helpful as well.
