The bioBakery help forum

KEGG pathways for HUMAnN 3 outputs


I want to get metabolic pathways, specifically KEGG pathways, from my HUMAnN output, but I didn’t find a good way to do so. I tried:

humann_renorm_table -i test_genefamilies.tsv -u relab -p -o test.ra.tsv
humann_regroup_table -i test.ra.tsv -o test.tsv -g uniref90_go -e 6

humann_rename_table -i test.tsv -o test.ko.tsv -n kegg-pathway

But 0 of ~2900 entries were renamed. I get the same result if I change “kegg-pathway” to “metacyc-pwy”. However, if I use “humann_rename_table -i test.tsv -o test.ko.tsv -n kegg-orthology”, 86.15% are renamed. Did I choose the wrong database? How can I get pathway results? I really appreciate your help!

In your example you are regrouping to GO (Gene Ontology) terms. Maybe you meant to regroup to KOs (KEGG Orthogroups)? Even in that case, though, KOs are not the same as KEGG pathways, although you could attach names to them.

From HUMAnN’s view, KOs are closer to MetaCyc’s reactions than to pathways. If you’ve quantified KOs with HUMAnN and have your own KEGG pathway definition file, you could use that to quantify KEGG pathways. We don’t bundle KEGG pathway definitions with HUMAnN, though.
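To illustrate the idea of going from quantified KOs to pathways with your own definitions, here is a minimal Python sketch. It assumes you already have KO abundances in memory and a pathway-to-KO mapping of your own; the pathway names below are hypothetical placeholders, not real KEGG definitions. It simply sums member KOs per pathway, which is not the algorithm HUMAnN uses internally for pathway abundance, and it ignores stratified (per-species) rows.

```python
from collections import defaultdict

# Hypothetical pathway definitions: pathway name -> member KO IDs.
# These are placeholders for illustration, not real KEGG content.
PATHWAY_TO_KOS = {
    "pathway_A": ["K00001", "K00002"],
    "pathway_B": ["K00002", "K00003"],
}


def sum_kos_into_pathways(ko_abundance, pathway_to_kos):
    """Naively sum KO abundances per pathway.

    A KO that belongs to several pathways is counted in each of them.
    KOs missing from the abundance table contribute 0.
    """
    totals = defaultdict(float)
    for pathway, kos in pathway_to_kos.items():
        for ko in kos:
            totals[pathway] += ko_abundance.get(ko, 0.0)
    return dict(totals)


# Toy KO abundances (e.g. relative abundances for one sample).
ko_abundance = {"K00001": 0.25, "K00002": 0.5, "K00004": 0.125}

pathway_totals = sum_kos_into_pathways(ko_abundance, PATHWAY_TO_KOS)
for pathway, total in sorted(pathway_totals.items()):
    print(pathway, total)
```

Because KOs are shared between pathways, totals from this kind of summing will not add up to the total KO abundance, so treat them as rough per-pathway signals rather than a partition of the sample.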

Got it, thank you so much for your help!

@franzosa we ran into the same problem, which caught us by surprise. We can work around this a bit by, for example, using the KO IDs and a secondary tool like limma’s kegga to perform some simple pathway enrichment analysis.

Are any other pathway-level conversions supported? The ‘humann_regroup_table’ script is documented to use either gene or pathway abundance estimates as input, so it might be worth noting in the documentation if pathway conversions aren’t supported in general, or (if it’s primarily KEGG) that MetaCyc → KEGG pathway conversion is explicitly not supported.

Sorry for the late reply. In principle, you can use regroup_table to sum or average any class of entities into any other set of entities (provided that you give the script a custom mapping file). We only support (via the included mapping files) regrouping from UniRefs to broader gene/functional categories via summing. HUMAnN’s internal logic can be used to compute pathway abundance, and does so by default using the supplied MetaCyc pathway definitions, but you can pass custom pathway definitions to HUMAnN itself, e.g. for KEGG.
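As a rough sketch of what preparing such a custom mapping file might look like, the Python snippet below writes a tab-separated groups file. The layout is an assumption on my part (group ID in the first column, member features in the following columns), as is passing the result to humann_regroup_table through its custom-groups option; please check the script's help output before relying on either. The pathway names are hypothetical placeholders.

```python
import csv
import io


def write_groups_file(group_to_features, handle):
    """Write one line per group: group ID, then its member features, tab-separated.

    Assumed layout for a custom groups file; verify against the tool's help text.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for group in sorted(group_to_features):
        writer.writerow([group, *group_to_features[group]])


# Hypothetical pathway -> KO mapping (placeholder names, not real KEGG definitions).
custom_groups = {
    "pathway_A": ["K00001", "K00002"],
    "pathway_B": ["K00002", "K00003"],
}

# Write to an in-memory buffer here; swap in open("custom_groups.tsv", "w") for a real file.
buffer = io.StringIO()
write_groups_file(custom_groups, buffer)
groups_text = buffer.getvalue()
print(groups_text, end="")
```

With a file like this, the input table's feature IDs would need to match the member IDs in the file (here, KOs), so the table would first have to be regrouped to KOs.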