The bioBakery help forum

LEfSe feature names dependency

I noticed that the results of LEfSe seem to depend on the feature
names. Is this known? When I rerun the same input several times, the
*.res file has the same md5 checksum every time. But when I change the
names of e.g. the OTUs (from OTU_1, OTU_2, OTU_100 to OTU_001, OTU_002,
OTU_100) and leave the input file otherwise identical, the LDA scores
can be quite different (by up to 15%), yet reruns of the renamed input
are again identical to each other. Does some sorting take place in
format_input.py?
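
To show what I mean by sorting, here is a minimal sketch in plain Python. It is not taken from the LEfSe code, and whether format_input.py or any later step actually sorts features by name is only my assumption. The helper name sorted_positions is made up for this example. It just shows that ordinary string sorting puts the two naming schemes in a different order, which would be one way the names could influence a deterministic result.

```python
# Hypothetical illustration only: this is NOT LEfSe code.
# It shows how plain string sorting orders the two naming schemes.

names_plain = ["OTU_1", "OTU_2", "OTU_100"]
names_padded = ["OTU_001", "OTU_002", "OTU_100"]


def sorted_positions(names):
    """For each name (in input order), return its index after string sorting."""
    order = sorted(names)  # lexicographic comparison, character by character
    return [order.index(name) for name in names]


# With unpadded names, OTU_100 sorts before OTU_2 because "1" < "2".
print(sorted(names_plain), sorted_positions(names_plain))
# With zero-padded names, string order matches numeric order.
print(sorted(names_padded), sorted_positions(names_padded))
```

If the feature order is changed this way, anything downstream that depends on row order could give different but still reproducible output, which would match what I see.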

Hi -
This seems to be related to this thread:


Could you check if the discussion there clarifies the issue?
Thanks!
Siyuan