A similar but not identical question already exists on the forum; however, it wasn't explained in detail, so I am posting my question here.
I am trying to use LEfSe to find a group of taxa that can successfully distinguish two disease groups from each other. As far as I understand, LEfSe uses the factorial Kruskal-Wallis (KW) rank-sum test to select features that are differentially abundant between the two classes (say disease 1 and disease 2), and it then uses the Wilcoxon test to further screen for features that are consistently differentially abundant between the subclasses.
However, in my dataset there are no subclasses. In that case, does LEfSe skip the Wilcoxon test? And if it does, are the biomarkers selected at the end of LEfSe still considered successful biomarkers or not?
Thank you for the help.
Sorry for my late reply. I understand what you mean, and, as you stated, I could not find an answer in the thread I specified. From my current understanding:
1. In the absence of subclasses, LEfSe will perform the per-feature KW test and will NOT skip the Wilcoxon test unless you set the --wilc option to 0. Instead, LEfSe seems to perform both KW and Wilcoxon between the classes. I noticed this while examining format_input.py. I started another thread asking whether it is valid for LEfSe to perform both KW and Wilcoxon between classes: "Question about LEfSe input_format.py when specifing no subclass".
2. I think yes, considering that the features passed both KW and LDA. However, in this case (--wilc 0), the KW p-value, which is supposed to be reported whether or not Wilcoxon is performed, is not reported. I also asked about this problem in the thread above.
I’d agree with darmecian here. For 1, LEfSe will still perform the Wilcoxon test unless otherwise specified. For 2, I’d say so. Even in the absence of subclass tests, LEfSe still filters for features that a) pass the KW test and b) have strong LDA score support. The two together should provide enough evidence for biomarkers.
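For anyone wondering what running both KW and Wilcoxon between the same two classes amounts to, here is a minimal sketch in plain Python. It is not LEfSe's code: the abundance values are made up, the helper names (`ranks`, `kw_two_groups`, `rank_sum_z`) are hypothetical, ties are assumed absent, and the Wilcoxon rank-sum p-value uses the normal approximation without continuity correction. LEfSe's own implementation may use an exact test or different corrections, so its p-values need not match these.

```python
import math

def ranks(values):
    """Rank values 1..N (sketch only: assumes there are no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = float(rank)
    return r

def kw_two_groups(a, b):
    """Kruskal-Wallis H statistic and chi-square (1 df) p-value for two groups."""
    r = ranks(a + b)
    n1, n2 = len(a), len(b)
    n = n1 + n2
    r1, r2 = sum(r[:n1]), sum(r[n1:])
    h = 12.0 / (n * (n + 1)) * (r1 ** 2 / n1 + r2 ** 2 / n2) - 3 * (n + 1)
    h = max(h, 0.0)  # guard against tiny negative values from rounding
    # Survival function of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2.0))
    return h, p

def rank_sum_z(a, b):
    """Wilcoxon rank-sum z score and two-sided normal-approximation p-value."""
    r = ranks(a + b)
    n1, n2 = len(a), len(b)
    n = n1 + n2
    r1 = sum(r[:n1])
    z = (r1 - n1 * (n + 1) / 2.0) / math.sqrt(n1 * n2 * (n + 1) / 12.0)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Made-up relative abundances of one taxon in the two classes (no subclasses).
disease1 = [0.12, 0.30, 0.25, 0.41, 0.38, 0.29]
disease2 = [0.05, 0.09, 0.11, 0.02, 0.15, 0.07]

h, p_kw = kw_two_groups(disease1, disease2)
z, p_w = rank_sum_z(disease1, disease2)
print("KW:       H = %.4f, p = %.4f" % (h, p_kw))
print("Wilcoxon: z = %.4f, p = %.4f" % (z, p_w))
```

With exactly two groups, H equals z squared, so under these assumptions the two tests give the same p-value. That suggests the between-class Wilcoxon step adds little beyond KW when there are only two classes and no subclasses, which fits the point above that KW plus the LDA score is what effectively does the filtering in that setting.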