Hello, I have a question about the script format_input.py and the subsequent run_lefse.py.
I ran format_input.py with no subclass specified (-s -1). However, when I examined the output file, a “subclass” row was present in the table, whose values are the class names with “_subcl” appended (e.g. if the class was “disease”, the subclass was “disease_subcl”). Is this behaviour expected?
If so, when I run run_lefse.py with the Wilcoxon test option turned on, does the script perform both the K-W and Wilcoxon tests on the same grouping? I noticed that when I run it with the Wilcoxon test turned off, the script reports no p-value, even though the p-value in the results table is expected to be the one from the K-W test. So I am asking whether it is okay to turn on the Wilcoxon test when the subclasses are identical to the classes.
Thanks in advance.
EDITED:
I tried to attach files as a minimal reproducible example, using hmp_small_aerobiosis.txt from the LEfSe wiki; however, because of the restrictions on new users, I could not.
The commands are:
format_input.py hmp_small_aerobiosis_sub.txt sample.in -f r -c 1 -u 2 -s -1 -o 1000000 --output_table sample.input
run_lefse.py -r lda -l 2 -b 100 --wilc 0 --verbose 0 sample.in sample.res
The files are:
hmp_small_aerobiosis_sub.txt: raw input file; I removed the body_site row from the original file.
sample.input: output file of format_input.py, which has a subclass row of the form “[class name]_subcl”
sample.res: output file of run_lefse.py, which contains no p-values
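Since I could not attach the files, here is a minimal sketch of how the subclass row can be checked. It is not part of LEfSe. It assumes the table written by --output_table is tab-separated, with the row label in the first column and rows labelled "class" and "subclass". The function name subclass_mirrors_class and the example values are made up for illustration.

```python
import csv
import io


def subclass_mirrors_class(table_text, suffix="_subcl"):
    """Return True if every subclass value is the matching class value plus suffix.

    Assumes a tab-separated table whose first column holds the row label,
    with rows labelled "class" and "subclass" (an assumption about the
    --output_table layout, not something taken from the LEfSe docs).
    Returns None if either row is missing.
    """
    rows = {}
    for fields in csv.reader(io.StringIO(table_text), delimiter="\t"):
        if fields and fields[0] in ("class", "subclass"):
            rows[fields[0]] = fields[1:]
    if "class" not in rows or "subclass" not in rows:
        return None
    return rows["subclass"] == [value + suffix for value in rows["class"]]


# Made-up example table, not the real sample.input
example = (
    "class\tA\tA\tB\n"
    "subclass\tA_subcl\tA_subcl\tB_subcl\n"
    "subject\ts1\ts2\ts3\n"
    "feat\t1\t2\t3\n"
)
print(subclass_mirrors_class(example))
```

For the real file, the contents of sample.input could be read with open("sample.input").read() and passed to the same function.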