The bioBakery help forum

format_input.py parameters: how do they affect the pipeline and the plots?

Hello everyone,

I am trying to analyse some data with LEfSe in Anaconda (Python).
I successfully executed the scripts with the example data.

Is the cladogram supposed to look like this? If so, how can I change the colors for the biomarkers? If not, I believe it is because no biomarker clade was found; I also could not generate the other picture (step 3, see below).

The code I run is the following:

bin/format_input.py tmp/sample.txt tmp/merged_abundance_table.lefse
bin/run_lefse.py tmp/merged_abundance_table.lefse tmp/merged_abundance_table.lefse.out -l 4
bin/plot_res.py --dpi 300 tmp/merged_abundance_table.lefse.out output_images/lefse_biomarkers.png
bin/plot_res.py --dpi 300 tmp/merged_abundance_table.lefse.out tmp/lefse_biomarkers.png

I am not sure about the format_input.py parameters (-c, -s, -u, -o). What do they do? I could not find any information online, and the code where they are used is not commented, so I thought I would ask before trying to reverse-engineer everything.

Cheers,

Pietro

Hi Pietro,

Thanks for the question! The cladogram you produced does look correct if there were no significant features to plot on it.
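
If you want to check this yourself, here is a rough sketch that lists the features flagged as biomarkers in the .out file written by run_lefse.py. It assumes the file is tab-separated with the columns feature name, log of the highest class mean, class, LDA score and p-value, and that the class column is left empty for features that are not discriminative. The function name find_biomarkers and the example rows are made up for illustration and are not part of LEfSe.

```python
def find_biomarkers(lines):
    """Return (feature, class, lda_score) for rows that carry a class label."""
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: feature, log max mean, class, LDA score, p-value.
        if len(fields) < 4:
            continue
        feature, cls, lda = fields[0], fields[2].strip(), fields[3].strip()
        if cls:  # an empty class column is taken to mean "not a biomarker"
            hits.append((feature, cls, lda))
    return hits


# Made-up rows for illustration only, not real LEfSe output.
example = [
    "Bacteria.Firmicutes\t5.1\tgroupA\t4.3\t0.001\n",
    "Bacteria.Proteobacteria\t4.2\t\t\t-\n",
]
print(find_biomarkers(example))

# With a real file (path taken from the commands in the question):
# with open("tmp/merged_abundance_table.lefse.out") as fh:
#     print(find_biomarkers(fh))
```

If the list comes back empty, no feature passed the tests, which would explain an uncoloured cladogram.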

As for the parameters:
-c = the row of the data to use as the class
-s = the row that contains the subclass information
-u = the row that contains the subject information
-o = the normalization value (for LEfSe the default is -1.0, meaning no normalization)
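
To make the row numbers concrete, here is a small sketch of what these options select. It is not LEfSe's code. It assumes rows are counted from 1, that each row starts with its name followed by one value per sample, and that normalization rescales each sample so its feature values sum to the -o value (a simplification that ignores the taxonomic hierarchy). The function split_rows and the toy table are hypothetical.

```python
def split_rows(rows, class_row=1, subclass_row=None, subject_row=None, norm=-1.0):
    """Separate metadata rows from feature rows and optionally rescale samples.

    rows: list of lists, each [row_name, value_sample1, value_sample2, ...].
    Row numbers are 1-based, mirroring -c, -s and -u; norm mirrors -o.
    """
    meta_idx = {"class": class_row, "subclass": subclass_row, "subject": subject_row}
    meta = {k: rows[i - 1][1:] for k, i in meta_idx.items() if i is not None}
    used = {i - 1 for i in meta_idx.values() if i is not None}
    # Every row not used as metadata is treated as a feature (abundance) row.
    features = {
        r[0]: [float(v) for v in r[1:]]
        for n, r in enumerate(rows)
        if n not in used
    }
    if norm > 0:  # a non-positive value is taken to mean "no normalization"
        n_samples = len(rows[0]) - 1
        for s in range(n_samples):
            total = sum(vals[s] for vals in features.values())
            if total > 0:
                for vals in features.values():
                    vals[s] = vals[s] * norm / total
    return meta, features


# Toy table: three metadata rows followed by two feature rows.
table = [
    ["condition", "A", "A", "B"],
    ["site", "x", "y", "x"],
    ["subject_id", "s1", "s2", "s3"],
    ["Bacteria.Firmicutes", "30", "10", "5"],
    ["Bacteria.Proteobacteria", "10", "30", "15"],
]
meta, features = split_rows(table, class_row=1, subclass_row=2, subject_row=3, norm=100.0)
print(meta["class"])
print(features["Bacteria.Firmicutes"])
```

In this toy table the equivalent options would be -c 1 -s 2 -u 3 -o 100.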

I have attached the help page for the LEfSe input-formatting step (format_input.py). I hope this helps. Let us know if we can do anything else.

Best,
Kelsey

Screen Shot 2020-05-14 at 7.01.42 AM