HUMAnN3 - mapping between UniRef90 and EC enzymes

Vadim_Dubinsky · October 1, 2023, 12:17pm

Dear developers and users,

Among the utility_mapping accessory files for HUMAnN3, there is a file called map_level4ec_uniref90.txt.gz, which maps EC enzymes (four-number codes) to UniRef90 protein IDs.

I was wondering how, and from what source, I could generate such a mapping table myself. For example, I'd like to use the latest UniRef90 database (which is updated roughly every 8 weeks) and map its entries to EC annotations.

I know this topic is not directly related to HUMAnN3, but I'd still appreciate any help!

Thanks
VadumDu

franzosa · October 5, 2023, 4:20pm

We build these files by parsing the big “DAT” file that comes with each UniProt release. It is the file that looks like a concatenation of all these per-protein details:

https://rest.uniprot.org/uniprotkb/P06493.txt

Most EC annotations come from the lines starting with DE, but they can occasionally be found in comment lines (CC) and via the cross-references to the BRENDA database (DR BRENDA).
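
As a rough illustration of the DE-line parsing described above, here is a minimal Python sketch. It is not the actual HUMAnN build script: the function name `parse_ec_from_dat`, the regular expression, and the abbreviated sample record are all assumptions made for this example. It only looks at DE lines, keeps only complete four-number ECs (so partial ones such as 3.4.21.- are skipped), and ignores the CC and DR BRENDA sources mentioned above.

```python
import re

# Assumption: a complete level-4 EC appears in DE lines as "EC=a.b.c.d".
EC_PATTERN = re.compile(r"EC=(\d+\.\d+\.\d+\.\d+)")


def parse_ec_from_dat(lines):
    """Yield (primary_accession, sorted EC list) for each annotated record.

    `lines` is any iterable of text lines in UniProt flat-file (DAT) format.
    Records without a complete EC on a DE line are skipped.
    """
    accession = None
    ecs = set()
    for line in lines:
        if line.startswith("AC   ") and accession is None:
            # The first accession on the first AC line is the primary one.
            accession = line[5:].split(";")[0].strip()
        elif line.startswith("DE   "):
            ecs.update(EC_PATTERN.findall(line))
        elif line.startswith("//"):
            # "//" terminates a record.
            if accession and ecs:
                yield accession, sorted(ecs)
            accession, ecs = None, set()


# Hand-trimmed, illustrative record in the style of the P06493 entry.
SAMPLE = """\
ID   CDK1_HUMAN              Reviewed;         297 AA.
AC   P06493; A8K7C4;
DE   RecName: Full=Cyclin-dependent kinase 1;
DE            EC=2.7.11.22 {ECO:0000269};
DE            EC=2.7.11.23;
//
"""

records = list(parse_ec_from_dat(SAMPLE.splitlines()))
print(records)
```

For a real release file, the same function could be fed a text-mode handle, for example `gzip.open("uniprot_sprot.dat.gz", "rt")`, so that the file is streamed rather than loaded into memory.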

Vadim_Dubinsky · October 5, 2023, 6:11pm

Many thanks for the answer, @franzosa!

I can't access the parent directory of this link, though. It was just an example of a single record, yes?

Are you referring to the huge file that comes with each UniProtKB release, uniprot_trembl.dat.gz?
There is also a much smaller uniprot_sprot.dat.gz, but it only covers Swiss-Prot, so I guess it would result in only a partial mapping?

Thanks again
VadimD

franzosa · October 5, 2023, 6:31pm

Correct, that was just an example of the formatting. You will want to consider both the full SwissProt and TrEMBL files. Note that if HUMAnN reports a gene family like UniRef90_XYZ, then XYZ will be an accession number in one of those files (unless the sequence has been retired).
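
To make the last point concrete, here is a hedged Python sketch of how accession-level EC annotations might be turned into an EC-to-UniRef90 table. The helper names (`build_ec_to_uniref90`, `format_mapping`) and the example IDs are hypothetical. The output layout (one EC per line, followed by tab-separated UniRef90 IDs) is an assumption, so it is worth checking against the existing map_level4ec_uniref90.txt.gz before relying on it.

```python
from collections import defaultdict

PREFIX = "UniRef90_"


def build_ec_to_uniref90(accession_to_ecs, uniref90_ids):
    """Group UniRef90 IDs by the ECs annotated on the accession in their name."""
    ec_to_members = defaultdict(list)
    for uid in uniref90_ids:
        # UniRef90_XYZ -> XYZ, the accession to look up in the DAT-derived table.
        accession = uid[len(PREFIX):] if uid.startswith(PREFIX) else uid
        # Accessions with no EC annotation (or retired ones) contribute nothing.
        for ec in accession_to_ecs.get(accession, ()):
            ec_to_members[ec].append(uid)
    return ec_to_members


def format_mapping(ec_to_members):
    """Return one tab-separated line per EC: the EC, then its member IDs."""
    return [
        "\t".join([ec] + sorted(members))
        for ec, members in sorted(ec_to_members.items())
    ]


# Illustrative inputs: the cluster IDs below are hypothetical examples.
accession_to_ecs = {"P06493": ["2.7.11.22", "2.7.11.23"]}
uniref90_ids = ["UniRef90_P06493", "UniRef90_XYZ"]

mapping_lines = format_mapping(build_ec_to_uniref90(accession_to_ecs, uniref90_ids))
print("\n".join(mapping_lines))
```

Here `accession_to_ecs` would be filled from both the Swiss-Prot and TrEMBL files, and `UniRef90_XYZ` simply drops out because XYZ has no annotation in the table.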
