Error in Format Data for LEfSe

Hello,
I am not able to format the data for LEfSe and am getting the following error message:
Traceback (most recent call last):
  File "/usr/bin/lefse_format_input.py", line 8, in
    sys.exit(format_input())
  File "/usr/lib/python3.6/site-packages/lefse/lefse_format_input.py", line 437, in format_input
    class_sl,subclass_sl,class_h

Please help me out.
Thanks in advance,
Ankita MAddheshiya

Hi @Ankita,
Can you upload the input file, please? I can check from my end.

Regards,
Sagun

I am also having a similar issue.

Hello @sagunmaharjann,

Attached the file here.

Regards,
Ankita
L2.txt (13.5 KB)

Did you get a solution? I’m having the same issue even when I use input files that have worked in the past.

Has a solution been found for this problem? I am receiving the same message even when trying to use input files that have worked with Lefse in the past.
Thanks

The same thing is happening to me. I haven’t found a solution yet.

Also having the same issue (again with files that have previously run ok).

Managed to get it to run by downloading an updated “lefse_format_input.py” file from: https://github.com/SegataLab/lefse/tree/master/lefse

I tried that and put the file in the following directory:
/Users/nate/miniconda3/envs/lefse/bin

I received an error saying that I did not have permission, so I ran chmod +x on the file.
Now I am running into the same problem I had with the original lefse_format_input.py:
“variables 1 2 3 4 5 6 7 8 appear to be constant within groups”

I read somewhere that this is due to an outdated lefse_format_input.py.

I have tried with and without the subject ID option (-u).
With the server down, I am not sure how to proceed.

Could it have anything to do with the sample size per treatment? I have at least one treatment with only two samples, and I am wondering if that is to blame.

Any suggestions would be greatly appreciated! TIA

Actually, it wasn’t the sample size, at least not directly. I needed to filter the taxa further, because some of them were present in only 2 samples. Once I kept only taxa present in at least 3 samples, I no longer got the constant variable error.
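
For anyone who wants to try the same filtering step, here is a minimal sketch in Python. It assumes a LEfSe-style table with features in rows and samples in columns, where the first row holds the class labels and the first column holds the row names. The function name filter_by_prevalence and the example values are made up for illustration, and "present" is taken to mean an abundance greater than zero, which may not match how your own filtering was done.

```python
def filter_by_prevalence(rows, n_meta_rows=1, min_samples=3):
    """Keep metadata rows plus feature rows present in at least min_samples samples.

    rows        : list of rows, each a list of strings; column 0 is the row name.
    n_meta_rows : number of leading rows (class, subclass, subject) to keep untouched.
    min_samples : minimum number of samples with abundance > 0 (assumed definition).
    """
    kept = [list(r) for r in rows[:n_meta_rows]]
    for row in rows[n_meta_rows:]:
        values = [float(v) for v in row[1:]]
        present = sum(1 for v in values if v > 0)
        if present >= min_samples:
            kept.append(list(row))
    return kept


# Hypothetical toy table: one class row followed by two taxa.
table = [
    ["treatment", "A", "A", "A", "B", "B", "B"],
    ["Bacteroides", "0.4", "0.3", "0.5", "0.1", "0.2", "0.1"],
    ["Rare_taxon", "0.0", "0.0", "0.0", "0.1", "0.2", "0.0"],
]

filtered = filter_by_prevalence(table, n_meta_rows=1, min_samples=3)
print([row[0] for row in filtered])
```

If your file also has subclass or subject rows, raise n_meta_rows so those rows are passed through unchanged.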

I am wondering why exactly the error occurs. Having compared data where the error occurs (retaining all taxa found in at least two samples, or s2) versus where it doesn’t occur (retaining all taxa found in at least three samples, or s3), it is hard to find any obvious rule. Six taxa in the s2 dataset are found in only one treatment and not others - I thought that may be the rule but there are eight taxa triggering the error.

There are a couple other taxa that are found almost exclusively in one treatment, with only one more occurrence; they could account for the other two variables.

This brings me to another question: do I have my variables coded incorrectly, or does the script simply not report the taxa by name, only as “variables 1 2 3 4 5 6 7 8”? It would be nice to know which ones they are. Are they the first eight variables? If so, I am barking up the wrong tree with respect to the rules that trigger the error.
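
One way to narrow this down outside of LEfSe is to check, taxon by taxon, which features show no variation inside any treatment. The sketch below does that by name. It is not LEfSe's own check: the function name constant_within_groups, the tolerance and the toy data are assumptions for illustration. As far as I can tell, the wording of the message matches an error raised by the lda function in R's MASS package, which reports column positions of the matrix it was given rather than names, so the numbers need not correspond to the first eight rows of the input file. LEfSe may also run the analysis on resampled subsets of the samples, in which case a rare taxon could look constant in a subset even though it varies in the full table; that is a guess about the cause, not something confirmed in this thread.

```python
from collections import defaultdict


def constant_within_groups(class_labels, features, tol=1e-4):
    """Return names of features whose values do not vary inside any class.

    class_labels : list of class labels, one per sample.
    features     : dict mapping feature name -> list of abundances (same order).
    tol          : spread below this is treated as "constant" (assumed value).
    """
    flagged = []
    for name, values in features.items():
        by_class = defaultdict(list)
        for label, value in zip(class_labels, values):
            by_class[label].append(value)
        # A feature is flagged when max - min is below tol in every class,
        # i.e. there is no within-group variation at all.
        if all(max(v) - min(v) < tol for v in by_class.values()):
            flagged.append(name)
    return flagged


# Hypothetical toy data: six samples in two treatments.
labels = ["A", "A", "A", "B", "B", "B"]
taxa = {
    "Bacteroides": [0.4, 0.3, 0.5, 0.1, 0.2, 0.1],
    "Only_in_B_flat": [0.0, 0.0, 0.0, 0.2, 0.2, 0.2],
    "Absent": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}

flagged = constant_within_groups(labels, taxa)
print(flagged)
```

Running this on the s2 and s3 tables and comparing the flagged names might show whether the eight variables are the taxa restricted to one treatment or something else.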

Any help in figuring this out would be greatly appreciated.