I hope you are doing well. I am reaching out about what I think is a serious problem with LEfSe. We have just realized that modifying the names of features (e.g. genes or species) changes the results of the analysis. For example, in one dataset I am currently working on, replacing “sp._oral_taxon” with “HOT” to shorten the names results in 21 differentially abundant species instead of 22, and the LDA scores also change. When I replaced the species names with just numbers, I got 16 differentially abundant taxa instead. I have put the submission of a manuscript on hold until this is resolved.
I see the same problem with both the Galaxy and command-line versions. Please find attached the 3 versions of the input file I have used recently:
1- With full species names
2- With abbreviated species names
3- With species names replaced by numbers
I obtain 22, 21, and 16 differentially abundant taxa with these files, respectively, at an LDA score threshold of >= 2.5.

JIA_species_level_names_changed.txt (163.6 KB) JIA_species_level_names_replaced_by_numbers.txt (159.4 KB) JIA_species_level.txt (164.3 KB)
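For anyone who wants to reproduce the comparison on their own data, the two renaming steps can be sketched roughly as follows. This is only an illustrative sketch, not the exact procedure used to produce the attached files. It assumes a tab-delimited table with the feature name in the first column and a single metadata row at the top. The function names (`rename_features`, `abbreviate`, `make_numberer`) and the two example rows are made up for the illustration.

```python
def rename_features(text, rename, n_header_rows=1):
    """Apply `rename` to the first column of every non-header row.

    Assumption: `text` is tab-delimited and its first `n_header_rows`
    lines are metadata rows (e.g. class labels) that must stay unchanged.
    """
    out = []
    for i, line in enumerate(text.splitlines()):
        if i >= n_header_rows and line:
            name, sep, rest = line.partition("\t")
            line = rename(name) + sep + rest
        out.append(line)
    return "\n".join(out) + "\n"


def abbreviate(name):
    # Shorten the names as described above.
    return name.replace("sp._oral_taxon", "HOT")


def make_numberer():
    # Map each distinct feature name to "1", "2", ... in order of appearance.
    seen = {}

    def number(name):
        return str(seen.setdefault(name, len(seen) + 1))

    return number


# Hypothetical two-feature table; the names and values are invented.
table = (
    "class\tJIA\tcontrol\n"
    "Streptococcus_sp._oral_taxon_056\t0.1\t0.2\n"
    "Rothia_mucilaginosa\t0.3\t0.4\n"
)

abbreviated = rename_features(table, abbreviate)
numbered = rename_features(table, make_numberer())
```

Only the first column differs between `table`, `abbreviated`, and `numbered`; the abundance values and the metadata row are identical, so any difference in the LEfSe output would come from the names alone.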