I am running LEfSe analyses on the Galaxy web server. All analyses run correctly, but I have a question about subclasses. One of my datasets has no subclass, yet the LEfSe analysis reports the following:
Number of significantly discriminative features: 340 ( 340 ) before internal wilcoxon
Number of discriminative features with abs LDA score > 2.0 : 219
I thought the Wilcoxon test would not be performed if I didn’t provide subclasses. Can you explain what happened between the 340 features that were significant before the Wilcoxon test and the final 219?
I am also having trouble understanding whether the parameter “Set the strategy for multi-class analysis” in step B refers to classes or to subclasses. As I read in your paper “Metagenomic biomarker discovery and explanation”, this multi-class strategy is applied to the classes, but then the Wilcoxon test is mentioned. I think this misunderstanding is also connected to my next question.
Additionally, I have another matter that I would like to clarify:
I ran a LEfSe analysis on a dataset with 2 classes and obtained the LDA score plot (from step C in the web server), which has one horizontal bar per significantly discriminative feature. I understand that the bar colour corresponds to the class in which the feature was significantly more abundant. I then ran an analysis on a dataset with 4 classes and obtained the LDA score plot (from step C), which again has one horizontal bar per significant feature. Does this mean that the feature is significantly more abundant in the respective class than in all 3 other classes? What does this bar say about the other 3 classes?