Only ~3% of UniRef90 gene families are regrouped to KO in HUMAnN3 — advice appreciated

TKurakawa · July 4, 2025, 9:08pm

Hello,

I am currently analyzing metagenomic data using HUMAnN3. After generating and merging the gene families table, I attempted to regroup UniRef90 gene families to KEGG Orthologs (KOs) with the following command:

humann_regroup_table --input genefamilies.tsv --output genefamilies_ko.tsv --groups uniref90_ko

The output included the message:

Original Feature Count: 503760; Grouped 1+ times: 16702 (3.3%); Grouped 2+ times: 78 (0.0%)

indicating that only about 3.3% of the UniRef90 features were assigned to KOs.
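
Note that the 3.3% in this message is a share of feature count, not of abundance. As a rough cross-check, the share of abundance that ended up in a KO can be computed from the regrouped table. The following is a minimal sketch, not part of HUMAnN: the function name is hypothetical, and it assumes a tab-separated table with a single header line, an `UNGROUPED` row for features that matched no group, an `UNMAPPED` row, and stratified rows marked with `|`.

```python
import csv
import io


def grouped_abundance_fraction(tsv_text):
    """Per-sample share of community-level abundance assigned to a group.

    Assumptions (verify against your own table): one header line,
    tab-separated columns, an UNGROUPED row for features that matched no
    group, an UNMAPPED row, and stratified rows containing '|'.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]
    grouped = [0.0] * len(samples)
    total = [0.0] * len(samples)
    for row in reader:
        if not row:
            continue
        feature = row[0]
        # Skip per-species (stratified) rows so abundance is not double counted.
        if "|" in feature:
            continue
        # Leave reads that never mapped to a gene family out of the denominator.
        if feature == "UNMAPPED":
            continue
        for i, value in enumerate(row[1:]):
            abundance = float(value)
            total[i] += abundance
            if feature != "UNGROUPED":
                grouped[i] += abundance
    return {
        sample: (g / t if t else 0.0)
        for sample, g, t in zip(samples, grouped, total)
    }


# Toy table for illustration only; these are not real results.
toy_table = (
    "# Gene Family\tsampleA\n"
    "UNMAPPED\t100.0\n"
    "UNGROUPED\t60.0\n"
    "UNGROUPED|g__Example\t60.0\n"
    "K00001\t30.0\n"
    "K00002\t10.0\n"
)
print(grouped_abundance_fraction(toy_table))
```

If the abundance-weighted share is much higher than the feature-count share, the low percentage mostly reflects many low-abundance UniRef90 families without a KO annotation.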

I am using the UniRef90 database version uniref90_201901b_full.dmnd, which I believe is up-to-date.
The input gene families table appears to be correctly generated and contains tens of thousands of UniRef90 IDs.
No errors or warnings occurred during the regrouping or normalization steps.
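
A further check I could run is counting how many of the table's UniRef90 IDs occur in the KO mapping at all. The sketch below is only illustrative: the function name is hypothetical, and the mapping layout (one group per line, group ID first, member UniRef90 IDs after it, tab-separated) is an assumption that should be verified against the actual mapping file.

```python
def mapped_feature_fraction(feature_ids, mapping_lines):
    """Share of unstratified UniRef90 features that appear in a mapping.

    Assumed mapping layout (check your file): each line is
    'GROUP<TAB>member1<TAB>member2...'.
    """
    members = set()
    for line in mapping_lines:
        fields = line.rstrip("\n").split("\t")
        members.update(fields[1:])  # everything after the group ID
    # Keep community-level UniRef90 rows only; drop stratified rows
    # (containing '|') and special rows such as UNMAPPED.
    ids = {
        f for f in feature_ids
        if "|" not in f and f.startswith("UniRef90_")
    }
    if not ids:
        return 0.0
    return len(ids & members) / len(ids)


# Toy inputs for illustration only; the IDs are made up.
toy_features = [
    "UNMAPPED",
    "UniRef90_A",
    "UniRef90_B",
    "UniRef90_B|g__Example",
    "UniRef90_unknown",
]
toy_mapping = ["K00001\tUniRef90_A\tUniRef90_Z\n"]
print(mapped_feature_fraction(toy_features, toy_mapping))
```

With real data, the feature IDs would come from the first column of genefamilies.tsv and the mapping lines from the file behind the uniref90_ko group.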

However, this KO assignment rate (~3.3%) seems unexpectedly low compared to literature reports and other analyses, where KO mapping rates often range between 50% and 80%.

Could you please advise on:

Common reasons or factors that might cause such a low KO regrouping rate?
Recommended checks or steps to troubleshoot or improve the KO assignment?
Whether the sample type or environment could significantly affect the KO mapping rate?

Any insights or suggestions would be greatly appreciated.

Thank you very much for your support!

