Parameters in the lefse analysis

rmathur · September 23, 2021, 7:24pm

I am trying to better understand the parameters for the LEfSe analysis. First, the LDA score threshold: I see 2.0 used a lot. Is there any significance to that value? Second, for the normalization value, I see 1,000,000 used a lot. Is there a recommended value based on our data? Also, is there a rationale for why this normalization is needed when relative abundance is the input?

mishort · September 27, 2021, 5:05pm

Hello,
Thanks for your questions. The default LDA score threshold of 2 is what the LEfSe paper used in testing/demonstrating LEfSe, which is likely why it is widely used. However, it can be adjusted by the user. For instance, if many features are differentially abundant with LDA score >2, it can be useful to use a more stringent threshold. I don’t think there is significance to the value of 2.0 per se, only that it was a sufficiently large difference in abundance to be potentially biologically meaningful.
Likewise, my understanding is that the option to normalize per-sample read counts to 1M is meant to improve the calculation of LDA scores for features with low read counts; the exact number is not meaningful.
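To make the normalization concrete, here is a minimal Python sketch of scaling one sample so its values sum to 1,000,000. It is not LEfSe's own code. The function names and the example taxa are hypothetical, and the `log_scaled` helper assumes the reported score is on a log10 scale of the form log10(1 + |x|), which is an assumption for illustration rather than something stated above.

```python
import math


def normalize_sample(abundances, target_sum=1_000_000):
    """Scale one sample's feature values so that they sum to target_sum."""
    total = sum(abundances.values())
    if total == 0:
        # An all-zero sample cannot be rescaled; return zeros unchanged.
        return {feature: 0.0 for feature in abundances}
    return {
        feature: value * target_sum / total
        for feature, value in abundances.items()
    }


def log_scaled(x):
    """Signed log10(1 + |x|); ASSUMED form of a log-scaled effect size."""
    return math.copysign(math.log10(1.0 + abs(x)), x)


# Hypothetical sample given as relative abundances (fractions summing to 1).
sample = {"Bacteroides": 0.60, "Prevotella": 0.3999, "Akkermansia": 0.0001}
scaled = normalize_sample(sample)

for feature in sample:
    print(feature, sample[feature], round(scaled[feature], 2),
          round(log_scaled(sample[feature]), 4),
          round(log_scaled(scaled[feature]), 4))
```

Under that assumption, values left as fractions between 0 and 1 all map to log-scaled numbers well below 1, so a cutoff of 2 could never be reached, while after scaling to 1M a low-abundance feature lands on a range where a cutoff of 2 is usable. The same reasoning would hold for other large targets, which fits the point that the exact number is not meaningful.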
I hope that helps!
Meg
