Hi community!!! @sma @franzosa @sagunmaharjann
I am using MetaPhlAn output as input for LEfSe in Galaxy, but a single output graph shows several taxonomic levels at once. This is my output. I want only a single taxonomic level (e.g., only species) to be shown in a single plot. How can I get that? Am I doing anything wrong?
Hi dc -
LEfSe automatically goes up the taxonomy to test for features at all levels. Given that your input is from MetaPhlAn (so it should presumably have features at all taxonomic levels to begin with), here is what I’d suggest:
- First, filter your MetaPhlAn table so that it only contains species level features.
- “Trick” LEfSe into not testing features at all taxonomic levels. You can achieve this by manipulating the species names. One option is to remove all the other taxonomic levels preceding the species name, e.g., “k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Corynebacteriaceae|g__Corynebacterium|s__Corynebacterium_matruchotii” -> “s__Corynebacterium_matruchotii”. Alternatively, replace the taxon separator “|” with something LEfSe won’t recognize, such as “;”, so that the same name becomes “k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Corynebacteriaceae;g__Corynebacterium;s__Corynebacterium_matruchotii”. The latter might be simpler to do, but it gives you less pretty feature names (the full lineage will appear instead of just s__Corynebacterium_matruchotii).
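As a rough sketch of the two renaming options above, something like the following awk one-liners might work. The toy table, its sample names, and the output file names are made up for illustration, and the header is assumed to start with clade_name (MetaPhlAn2 tables use ID instead), so adjust these to your own file.

```shell
# Toy stand-in for a merged MetaPhlAn table (tab-separated).
# Sample names and abundances are invented for illustration only.
printf 'clade_name\tsampleA\tsampleB\n' > toy_table.txt
printf 'k__Bacteria\t100\t100\n' >> toy_table.txt
printf 'k__Bacteria|p__Actinobacteria\t60\t40\n' >> toy_table.txt
printf 'k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Corynebacteriaceae|g__Corynebacterium|s__Corynebacterium_matruchotii\t60\t40\n' >> toy_table.txt

# Both commands keep the header row plus rows whose first column
# contains s__ but not t__ (species rows, no strain rows).

# Option 1: drop every level before the species, keeping the s__ prefix.
awk 'BEGIN{FS=OFS="\t"} $1 == "clade_name" || ($1 ~ /s__/ && $1 !~ /t__/) {sub(/^.*[|]s__/, "s__", $1); print}' \
    toy_table.txt > toy_species_short.txt

# Option 2: keep the full lineage but replace the "|" separator with ";".
awk 'BEGIN{FS=OFS="\t"} $1 == "clade_name" || ($1 ~ /s__/ && $1 !~ /t__/) {gsub(/[|]/, ";", $1); print}' \
    toy_table.txt > toy_species_semicolon.txt
```

Only the first column is edited, so the abundance values are passed through unchanged.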
Let me know if you need help with any of this.
Hi @sma, I have already tested the first option, and it did not work. The first step (“format data for lefse”) completed successfully, but the second step (“LDA Effect size”) failed.
You can also run something like the following on the command line.
For MetaPhlAn3 output:
$ grep -E "s__|clade" merged_abundance_table.txt | sed 's/^.*s__//g' | cut -f1,3-8 | sed -e 's/clade_name/body_site/g' > merged_abundance_table_species.txt
For MetaPhlAn2 output:
$ grep -E "(s__)|(^ID)" merged_abundance_table.txt | grep -v "t__" | sed 's/^.*s__//g' > merged_abundance_table_species.txt
These commands produce feature tables that contain only the species-level results. Both are from the MetaPhlAn tutorials (https://github.com/biobakery/biobakery/wiki/metaphlan2; https://github.com/biobakery/biobakery/wiki/metaphlan3), which give more information on what exactly the commands are doing.
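To see what each stage of the MetaPhlAn2-style pipeline does, here is a small sketch run on a made-up table. The file names, lineages, and sample names are hypothetical and much shorter than real MetaPhlAn output.

```shell
# Toy MetaPhlAn2-style merged table; the header row starts with "ID".
# All names and values below are invented for illustration.
printf 'ID\tsampleA\tsampleB\n' > toy_mpa2.txt
printf 'k__Bacteria|g__Genus_x\t70\t30\n' >> toy_mpa2.txt
printf 'k__Bacteria|g__Genus_x|s__Species_x\t70\t30\n' >> toy_mpa2.txt
printf 'k__Bacteria|g__Genus_x|s__Species_x|t__Strain_1\t70\t30\n' >> toy_mpa2.txt

# Stage 1: keep the header line and every row that mentions s__
#          (this still includes strain rows).
# Stage 2: drop strain rows, i.e. anything containing t__.
# Stage 3: delete everything up to and including "s__",
#          so the feature name becomes just "Species_x".
grep -E "(s__)|(^ID)" toy_mpa2.txt \
    | grep -v "t__" \
    | sed 's/^.*s__//g' > toy_mpa2_species.txt

cat toy_mpa2_species.txt
```

Note that the sed step removes the s__ prefix as well, so the remaining feature names contain no “|” at all.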
Hi @Kelsey_Thompson !!! This is the command I have used to generate species level merged abundance file:
grep -E "(s__)|(^clade_name)" merged_abundance_table.txt | grep -v "t__" | sed 's/^.*s__//g' > merge_abundance_species.txt
Here is a reproducible part of my data:
Hi @sma, I have also replaced “|” with “;”, but it didn’t help. In this case, even the first step fails. Please tell me what to do now.
Would you mind sharing your input table with us? It’s a bit difficult to see what the issue might be without the file. You can reach me at firstname.lastname@example.org to share the file.
Hi @sma, I have sent the data to your email. Please have a look at it. My email address is email@example.com