Mapping KOs to UniRef90 in HUMAnN 3

Dear developers and users,

I would like to annotate the gene families output from HUMAnN with KEGG Orthology (KO) terms. To do so, I would like to use the most up-to-date annotations linking KO terms to UniRef90 clusters from the UniProt database.

In other posts on this forum I have read that the HUMAnN utility mapping (accessory) files are built by parsing the UniProt DAT files: Index of /pub/databases/uniprot/knowledgebase

From here I have a couple of questions:

  1. To build the map_ko_uniref90.txt file, did you parse the UniProtKB/Swiss-Prot file (uniprot_sprot.dat.gz) or the UniProtKB/TrEMBL file (uniprot_trembl.dat.gz)?

  2. I can’t find the KO terms in these files. Am I doing something wrong, or do I need to perform an extra step?

Thank you very much in advance,
Irene

Historically, the UniProt-to-KO mapping was taken from the DR (database cross-reference) lines of these DAT files. However, checking a current example:

https://rest.uniprot.org/uniprotkb/Q99798.txt

This no longer appears to be the case. The only KEGG cross-reference is to the KEGG gene ID, not the KO term, which makes it challenging to rebuild the mapping from these files alone.
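
If you want to check this yourself on a downloaded entry or DAT file, here is a minimal sketch of how the DR lines could be scanned for cross-references. It is not the script used to build the HUMAnN mapping files. The sample entry is synthetic (placeholder identifiers, not a real record), the function name `parse_dr_lines` is made up for illustration, and the assumption that KO terms would show up under a database label `KO` is based on my recollection of older releases rather than on the current file format.

```python
import re
from collections import defaultdict

# Synthetic flat-file excerpt for illustration only; the accession and
# identifiers are placeholders, not copied from a real UniProt entry.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
AC   P00000;
DR   KEGG; hsa:0000; -.
DR   GO; GO:0000000; F:example; IEA:Example.
//
"""

# A DR line looks like "DR   <database>; <primary id>; <more fields>."
DR_LINE = re.compile(r"^DR\s+([^;]+);\s*([^;]+);")


def parse_dr_lines(text):
    """Return {database: [primary identifiers]} for the DR lines in one entry."""
    xrefs = defaultdict(list)
    for line in text.splitlines():
        match = DR_LINE.match(line)
        if match:
            database = match.group(1).strip()
            primary_id = match.group(2).strip()
            xrefs[database].append(primary_id)
    return dict(xrefs)


xrefs = parse_dr_lines(SAMPLE_ENTRY)
print("KEGG gene IDs:", xrefs.get("KEGG", []))
# Assumption: KO terms, if present, would be listed under a "KO" label.
print("KO terms:", xrefs.get("KO", []))
```

Running the same function over the text of a real entry should show quickly whether any KO cross-reference is still present, or only the KEGG gene ID.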