KEGG on HUMAnN2

Hi,

I was wondering if there is a way to get KEGG pathways from HUMAnN2 without installing HUMAnN Legacy. I am trying to confirm data I obtained containing MetaCyc pathways.

Is this possible without using the humann_legacy program? If not, does anyone know where I can find a KEGG database folder that I could use?

Best,
Dan

I am also interested in mapping the MetaCyc output to KEGG pathways. Most of the online instructions for this process seem outdated. Any help would be appreciated!

Thanks,
Nastassia

HUMAnN 2.0 does not automatically generate KEGG pathway output. The manual provides instructions on how to generate KEGG pathway output using Legacy HUMAnN 1.0 functionality, but I agree that those results would be outdated at this point.

If you are able to download and reformat modern KEGG pathways on your own, you can use the regroup script to convert UniRef abundance to KO abundance and then provide KO abundance + KEGG pathway definitions to HUMAnN 2.0 to compute pathway abundance.
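To illustrate the regrouping step conceptually, here is a minimal Python sketch. It is not the code of the regroup script itself. It assumes a mapping file in which each line lists a group ID followed by its tab-separated members, and it assumes that a feature belonging to several groups contributes its abundance to each of them; both points should be checked against your own files and the script's documentation. The function names `parse_mapping` and `regroup`, the `UNGROUPED` label, and the example IDs are hypothetical.

```python
from collections import defaultdict


def parse_mapping(lines):
    """Parse 'group<TAB>member1<TAB>member2...' lines into member -> [groups].

    The line layout is an assumption; verify it against the actual mapping file.
    """
    member_to_groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        group, members = fields[0], fields[1:]
        for member in members:
            member_to_groups[member].append(group)
    return member_to_groups


def regroup(abundance, member_to_groups, ungrouped="UNGROUPED"):
    """Sum feature abundances into their groups.

    Features without any group are collected under the `ungrouped` label.
    """
    totals = defaultdict(float)
    for feature, value in abundance.items():
        groups = member_to_groups.get(feature)
        if not groups:
            totals[ungrouped] += value
            continue
        for group in groups:
            totals[group] += value
    return dict(totals)


# Hypothetical toy data: two KOs and three UniRef90 families.
mapping_lines = [
    "K00001\tUniRef90_A\tUniRef90_B\n",
    "K00002\tUniRef90_B\tUniRef90_C\n",
]
uniref_abundance = {"UniRef90_A": 10.0, "UniRef90_B": 5.0, "UniRef90_D": 2.0}

ko_abundance = regroup(uniref_abundance, parse_mapping(mapping_lines))
print(ko_abundance)
```

In this toy example UniRef90_B counts toward both KOs, and UniRef90_D, which has no KO annotation, ends up in the ungrouped bin.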

We’re looking into including KEGG pathways + modules as a MetaCyc alternative in the next iteration of HUMAnN.

Hi Eric,
Can you provide a link to download the following mapping files to use the regroup script? Thank you.

Mappings are available for both UniRef90 and UniRef50 gene families to the following systems:

* MetaCyc Reactions
* KEGG Orthogroups (KOs)
* Pfam domains
* Level-4 enzyme commission (EC) categories
* EggNOG (including COGs)
* Gene Ontology (GO)
* Informative GO

If you have a HUMAnN 2.0 install you can get these with the humann2_databases script (look for the “utility_mapping” option).

I’m using the Galaxy platform, where the mapping file has to be supplied to the regroup script manually. Is there any way to download the database? I found a similar thread in the old Google Groups discussion, but the link you provided there has expired. Thank you!

@erichx Gotcha. Here is a link to all the mapping files:

http://huttenhower.sph.harvard.edu/humann2_data/full_mapping_1_1.tar.gz

Thanks for HUMAnN2. I have a question: you provide UniProt ID <-> KO and GO mappings for UniRef90 and UniRef50. How did you obtain those?

Thanks!

Mappings from UniRef to other systems (including KEGG) are sourced from UniProt’s annotations of the representative proteins.

Hi,
I want to get abundance values for KEGG pathways as well. At the moment, I am able to get the KO IDs from UniRef90. In order to get the KEGG pathway abundance, what should be my next step?
When you say download and “reformat” the pathway, what does that mean exactly?
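One plausible reading of "reformat", offered here as a sketch rather than a confirmed answer: convert a KO-to-pathway listing into a pathway definitions file with one pathway per line followed by its member KOs. The Python below assumes a two-column, tab-separated input of the form `ko:K00001<TAB>path:ko00010` and writes a simple tab-delimited layout. Both the input layout and the output layout are assumptions, so check the HUMAnN manual for the exact pathways file format it expects. The function name `build_pathway_definitions` and the example rows are hypothetical.

```python
from collections import defaultdict


def build_pathway_definitions(link_lines, keep_prefix="path:ko"):
    """Turn 'ko:<KO><TAB>path:<pathway>' rows into 'pathway<TAB>KO1<TAB>KO2...' lines.

    Only pathway IDs starting with `keep_prefix` are kept, so that duplicate
    entries listed under another prefix (e.g. 'path:map') are not counted twice.
    """
    pathway_to_kos = defaultdict(set)
    for line in link_lines:
        fields = line.strip().split("\t")
        if len(fields) != 2:
            continue  # skip blank or malformed rows
        ko, pathway = fields
        if not pathway.startswith(keep_prefix):
            continue
        # Drop the 'ko:' / 'path:' database prefixes.
        pathway_id = pathway.split(":", 1)[-1]
        ko_id = ko.split(":", 1)[-1]
        pathway_to_kos[pathway_id].add(ko_id)
    return [
        "\t".join([pathway_id] + sorted(kos))
        for pathway_id, kos in sorted(pathway_to_kos.items())
    ]


# Hypothetical example rows.
link_lines = [
    "ko:K00001\tpath:ko00010",
    "ko:K00001\tpath:map00010",
    "ko:K00002\tpath:ko00010",
    "ko:K00002\tpath:ko00040",
]

definitions = build_pathway_definitions(link_lines)
for row in definitions:
    print(row)
```

The resulting lines could then be written to a text file and supplied, together with the KO abundance table, as the custom pathway definitions mentioned earlier in the thread.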

Hi, does HUMAnN 3.0 support this feature?

HUMAnN 3.alpha is currently packaged with HUMAnN 2-era MetaCyc pathways. I’m hoping to add newer MetaCyc pathways and KEGG modules to the non-alpha release.

I can see a lot of mapping files in the full_mapping directory, but I could not find the mapping file for MetaCyc Reactions. Can you provide a link to download it?
Thanks!

Mapping files in full_mapping directory:

map_ec_name.txt.gz          map_go_name.txt.gz      map_ko_uniref50.txt.gz        map_pfam_name.txt.gz       map_uniref50_uniref90.txt.gz
map_eggnog_name.txt.gz      map_go_uniref50.txt.gz  map_ko_uniref90.txt.gz        map_pfam_uniref50.txt.gz   map_uniref90_name.txt.bz2
map_eggnog_uniref50.txt.gz  map_go_uniref90.txt.gz  map_level4ec_uniref50.txt.gz  map_pfam_uniref90.txt.gz   uniref50-tol-lca.dat.bz2
map_eggnog_uniref90.txt.gz  map_ko_name.txt.gz      map_level4ec_uniref90.txt.gz  map_uniref50_name.txt.bz2  uniref90-tol-lca.dat.bz2

The MetaCyc reaction mapping is packaged under the pathways/ folder since mapping from UniRefs to reactions is part of pathway quantification. The regroup_table script knows to look for that specific file in the pathways/ location (to avoid storing it twice).

Hi guys,

I’m using this thread as my question is linked to the utilization of the different types of mapping files packed with HUMAnN.

I have annotated a bunch of predicted ORFs using eggNOG-mapper, and now I want to map each eggNOG ID to a single UniRef90/50 ID. Given that one eggNOG ID can be mapped to multiple UniRef IDs, the question is: how can I get a simple 1:1 mapping? Could you please give me some hints as to how to go about this, or perhaps point me to a dedicated script to achieve this?

Thanks a lot for your feedback!

As you say, because the UniRef:eggNOG mapping is many:1, there is no single best UniRef corresponding to a given eggNOG. You could alternatively annotate your ORFs directly against UniRef (e.g. by DIAMOND blastp’ing them against the HUMAnN database and taking ~the best hit for each one). That would give you something like a 1:1 mapping/pairing in that each ORF would then be mapped to 1 UniRef and 1 eggNOG.
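The "best hit per ORF" step could be sketched in Python roughly as follows. This assumes the DIAMOND results are in the default 12-column tabular format (query ID in column 1, subject ID in column 2, bit score in column 12) and treats the highest bit score as "best", which is only one possible criterion. The function name `best_hits`, the ORF names, and the IDs in the example are hypothetical.

```python
def best_hits(tabular_lines):
    """Return {query: subject} keeping the highest-bit-score hit per query.

    Expects 12-column tabular alignment rows; rows with fewer columns are skipped.
    Ties are resolved in favour of the hit seen first.
    """
    best = {}
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical alignment rows (12 tab-separated columns each).
rows = [
    "orf1\tUniRef90_A\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t180.0",
    "orf1\tUniRef90_B\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-60\t200.0",
    "orf2\tUniRef90_C\t90.0\t80\t8\t0\t1\t80\t1\t80\t1e-30\t120.0",
]

# Hypothetical eggNOG-mapper results for the same ORFs.
orf_to_eggnog = {"orf1": "COG0001", "orf2": "COG0002"}

orf_to_uniref = best_hits(rows)

# Pair each ORF with its single UniRef hit and its eggNOG annotation.
pairs = {
    orf: (uniref, orf_to_eggnog.get(orf))
    for orf, uniref in orf_to_uniref.items()
}
print(pairs)
```

Note that the pairing is per ORF: two ORFs with the same eggNOG annotation can still end up with different UniRef hits, so this does not produce a single UniRef per eggNOG ID.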