Input file size limit to Galaxy server for LEfSe analysis

I am working on a dataset with 202 samples and 10,032 predicted KO-IDs, which I obtained as output from the q2-PICRUSt2 plugin. I am trying to feed the .txt version of this file to the LEfSe module on Galaxy to understand the enrichment of the predicted KO-IDs across the 3 groups in my project. I keep getting the following error when I upload the data to the Galaxy server:

I have tried the “one against all” option instead of “all against all” in the LDA Effect Size step. I have also tried less stringent p-value thresholds for the Kruskal-Wallis and Wilcoxon tests, both together and separately. (Note: there are no subclasses in my dataset — just 3 groups, which I treat as classes — but I still relaxed the Wilcoxon threshold in the hope of getting some result.)

In a follow-up attempt to figure out what might be happening, I used a smaller dataset (the same 202 samples, but only 36 predicted KO-IDs). I had extracted these specific KO-IDs of interest for another analysis in the same project. When I uploaded this significantly smaller dataset, LEfSe on Galaxy ran smoothly and returned results.
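In case it helps anyone reproduce what I did, here is roughly how I extracted that subset with pandas. This is a toy sketch — the file names, KO-IDs, and the “Class” metadata row are hypothetical placeholders, not my actual data:

```python
import pandas as pd

# Toy stand-in for the full KO table (the real one has 202 samples and 10,032 KO rows).
# The first row is the class-label row that LEfSe expects; the rest are KO features.
df = pd.DataFrame(
    {
        "Sample1": ["GroupA", 5, 0, 3],
        "Sample2": ["GroupB", 2, 7, 1],
    },
    index=["Class", "K00001", "K00002", "K00003"],
)

# Hypothetical KO-IDs of interest
kos_of_interest = ["K00001", "K00003"]

# Keep the class-label row plus only the KO rows of interest
subset = df.loc[["Class"] + [k for k in kos_of_interest if k in df.index]]

# Write a tab-delimited file that can be uploaded to Galaxy for LEfSe
subset.to_csv("ko_subset.txt", sep="\t")
```

With the real table, I just read the full .txt with `pd.read_csv(..., sep="\t", index_col=0)` and wrote out the filtered rows the same way.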

This leads me to speculate that there is a limit on the size of the dataset that can be uploaded to the Galaxy server to run LEfSe. Is this the case? If so, what can be done to upload large datasets for enrichment analysis using LEfSe? Any leads/suggestions/ideas would be extremely helpful.

Thank you all so much,