Hello,
Thanks for your question. The problem is in the values you supplied after the options -c, -s, and -u when you ran format_input.py. To see the documentation on these options, run the following:
$ format_input.py -h
Whichever row in your data corresponds to the class variable should be supplied after -c, and the ID variable row should be supplied after -u. If there is no subclass, you do not need to supply the option “-s” at all.
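If it helps to confirm which row number holds which variable before choosing the values for -c and -u, here is a minimal Python sketch that lists the first few rows of a tab-delimited table with their 1-based row numbers. It is not part of LEfSe: the function name `describe_metadata_rows`, the row labels, and the miniature table are made up for illustration, and it assumes the features are on rows with the metadata rows at the top of the file.

```python
import csv
import io


def describe_metadata_rows(handle, n_rows=3, max_values=5):
    """Return (1-based row number, row label, first distinct values) for the top rows."""
    reader = csv.reader(handle, delimiter="\t")
    summary = []
    for row_number, row in enumerate(reader, start=1):
        if row_number > n_rows:
            break
        if not row:
            continue  # skip blank lines but keep counting row positions
        label, values = row[0], row[1:]
        # dict.fromkeys removes duplicates while preserving first-seen order
        distinct = list(dict.fromkeys(values))[:max_values]
        summary.append((row_number, label, distinct))
    return summary


# Hypothetical miniature table; the row labels and abundances are invented.
sample = (
    "class_label\tHigh_O2\tMid_O2\tLow_O2\n"
    "subclass_label\tsite_a\tsite_b\tsite_c\n"
    "subject_label\t158398106\t158742018\t158984779\n"
    "Bacteria\t0.5\t0.3\t0.2\n"
)

rows = describe_metadata_rows(io.StringIO(sample))
for row_number, label, distinct in rows:
    print(row_number, label, distinct)
```

To inspect a real file, pass an open file handle instead of the `io.StringIO` sample. The row number printed next to the class values is the one to give after -c, and the row number next to the subject IDs is the one to give after -u.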
The exact problem is that, although I set the -c and -u options appropriately, LEfSe compared differences among “subject”, not “class”.
In the tutorial data that I analyzed as a test (https://github.com/biobakery/biobakery/raw/master/demos/biobakery_demos/data/lefse/input/hmp_small_aerobiosis.txt; I uploaded the resulting plot previously), the classes are in the first row and the subjects are in the third row. Based on those row positions in the input file, I passed the options -c 1 and -u 3 to set “row 1” as “class” and “row 3” as “subject”.
*Classes in the tutorial data are ‘High_O2’, ‘Mid_O2’, and ‘Low_O2’. Subjects are 158398106, 158742018, 158984779 etc…
As I understand it, lefse-plot_res.py is supposed to plot differences among classes (High_O2, Mid_O2, Low_O2).
However, as you can see in the resulting plot I uploaded previously, it plotted differences among subjects (158398106, 158742018, 158984779 etc…).
Is there anything I am missing, or could it be a problem with my computer settings?
Hello,
I ran through the tutorial and it works as intended for me (the grouping variable is correct). I wanted to check that you’re using the latest versions of lefse and the tutorial. Here is the link to the tutorial I was following:
I ask because the commands you pasted above are slightly different from the ones used in the tutorial (“lefse-format_input.py” vs. “format_input.py”, for instance).
Could you try running “conda update lefse”, then following the tutorial linked above again, and let me know if it works?
Thanks,
Meg