The bioBakery help forum

Question about LEfSe format_input.py when specifying no subclass

Hello, I have a question about the script format_input.py and the subsequent run_lefse.py step.

I ran format_input.py specifying no subclass (-s -1). However, when I examined the output file, a “subclass” row was present in the table, with the same name as the “class” plus “_subcl” (e.g. if the class was “disease”, the subclass was “disease_subcl”). Is this behaviour expected?

If so, when I run run_lefse.py with the Wilcoxon test option turned on, does the script perform both the K-W and Wilcoxon tests on the same grouping? I noticed that when I run it with the Wilcoxon test turned off, the script reports no p-value, even though I expected the p-value in the table to come from the K-W test. So my question is whether it is okay to turn on the Wilcoxon test when the subclasses are identical to the classes.

Thanks in advance.

EDITED:
I tried to attach files as a minimal reproducible example, using hmp_small_aerobiosis.txt from the LEfSe wiki; however, because of the restrictions on new users, I could not.
The code is:

format_input.py hmp_small_aerobiosis_sub.txt sample.in -f r -c 1 -u 2 -s -1 -o 1000000 --output_table sample.input
run_lefse.py -r lda -l 2 -b 100 --wilc 0 --verbose 0 sample.in sample.res

The files are:

hmp_small_aerobiosis_sub.txt: raw input file; I removed the body_site row from the original file.
sample.input: output file of format_input.py, which has a subclass row named “[class name]_subcl”
sample.res: output file of run_lefse.py, which contains no p-values
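
For anyone who wants to confirm that the subclass row in a table like sample.input is only a placeholder derived from the class row, here is a minimal Python sketch. It is not part of LEfSe. It assumes a tab-separated table with the row name in the first column, and the row names "class" and "subclass" in the miniature example are hypothetical; adjust them to whatever your file actually contains.

```python
def subclass_is_placeholder(class_labels, subclass_labels, suffix="_subcl"):
    """Return True if every subclass label is its class label plus `suffix`."""
    if len(class_labels) != len(subclass_labels):
        return False
    return all(sub == cls + suffix
               for cls, sub in zip(class_labels, subclass_labels))


def read_rows(table_text):
    """Parse tab-separated text into {row_name: [values, ...]}.

    Assumed layout (not confirmed by the LEfSe documentation):
    one row per line, with the row name in the first column.
    """
    rows = {}
    for line in table_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, *values = line.split("\t")
        rows[name] = values
    return rows


# Made-up miniature table, for illustration only.
example = (
    "class\tHigh_O2\tLow_O2\n"
    "subclass\tHigh_O2_subcl\tLow_O2_subcl\n"
)
rows = read_rows(example)
print(subclass_is_placeholder(rows["class"], rows["subclass"]))
```

If the check returns True for your table, the subclass labels carry no information beyond the class labels.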

Hi,
Many apologies for the long delay in replying! If you haven’t found a solution, here’s a workaround to output p-values while “not running” the Wilcoxon filtering:


Thanks!
Siyuan

Hello Siyuan,
Thank you very much for clarifying!