I’m using LEfSe to analyze a dataset comparing two groups, and the Kingdom, Bacteria, was returned as significant. I think this is an error because the only Kingdom in the dataset is “Bacteria.” The feature plot (below) for that hit shows 8/10 samples with an abundance of 1, but when I add up the values for the columns in the table manually I get 1 for all 10 samples (as expected for a relative abundance plot). I do have one taxon that couldn’t be assigned past Bacteria (the rest are Bacteria|Phylum|…|family/genus), but the abundance of that one assignment is not 1 in any of my samples (though it is 0 in two control samples), so I’m not sure if that assignment is the issue.
Can you help me understand why this hit is coming back as significant and why the plotted abundance is not 1 for every sample? And if the lone Bacteria assignment is the cause, could you explain why it’s not being included in LEfSe’s overall Kingdom calculation?
Here is my relative abundance table: rel_freq_col_LF100_SI_min2_L6_lefse.txt (12.4 KB)
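The manual column-sum check described above can be sketched roughly as follows. This is only an illustration, assuming the table is tab-delimited with a single class row on top (matching `-c 1`) and feature names in the first column. The function name `column_sums` and the toy values are hypothetical and are not taken from the attached file. It reproduces the manual sum only and does not replicate how LEfSe handles clades internally.

```python
import csv
import io


def column_sums(table_text, n_header_rows=1):
    """Sum feature rows per sample column.

    Returns two lists, one value per sample:
      total  - sum over every feature row
      deeper - sum over rows assigned past the first level
               (feature name contains '|')
    """
    rows = [r for r in csv.reader(io.StringIO(table_text), delimiter="\t") if r]
    n_samples = len(rows[0]) - 1
    total = [0.0] * n_samples
    deeper = [0.0] * n_samples
    # Skip the class row(s); everything after is treated as a feature row.
    for row in rows[n_header_rows:]:
        name = row[0]
        for i, raw in enumerate(row[1:]):
            value = float(raw)
            total[i] += value
            if "|" in name:
                deeper[i] += value
    return total, deeper


# Hypothetical toy table: one class row, then three feature rows.
example = "\n".join([
    "group\tcontrol\ttreated",
    "Bacteria\t0.0\t0.25",
    "Bacteria|Firmicutes\t0.5\t0.5",
    "Bacteria|Proteobacteria\t0.5\t0.25",
])

total, deeper = column_sums(example)
print("all rows:", total)
print("rows assigned past Kingdom:", deeper)
```

Comparing the two lists per sample shows whether the samples that fall short of 1 in the feature plot are the same ones where the bare Bacteria row is nonzero.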
These are the commands I ran to get to this figure (I’m using the conda install of LEfSe):
format_input.py rel_freq_col_LF100_SI_min2_L6_lefse.txt formatted_LF100-SI_min2_L6.in -c 1 -o 1000000
run_lefse.py formatted_LF100-SI_min2_L6.in lefse_out_LF100-SI_min2_L6.res -w 1
plot_res.py --dpi 300 --format png lefse_out_LF100-SI_min2_L6.res lda_out_LF100-SI_min2_L6.png --width 10
plot_features.py --dpi 300 --format png -f diff formatted_LF100-SI_min2_L6.in lefse_out_LF100-SI_min2_L6.res sig_features_SI-LF100/