Humann2_regroup_table for kegg : UNGROUPED!

fgndd · November 11, 2022, 8:07am

Hi, I’m using humann2_regroup_table to convert UniRef90 into KO with the mapping file “map_ko_uniref90.txt.gz”. My command is:

humann2_regroup_table -i input_genefamily.tsv -c map_ko_uniref90.txt.gz -o output.tsv

But I found that most of the reads are UNGROUPED in the output file.

I then found that 317015 out of 352836 UniRef90 IDs in input_genefamily.tsv don’t have any record in the mapping file “map_ko_uniref90.txt.gz”.
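For anyone who wants to repeat this check, here is a rough Python sketch of how such a count could be made. It is not part of HUMAnN. It assumes the mapping file is tab-delimited with the group (KO) in the first column and its member UniRef90 IDs in the remaining columns, and that stratified rows in the gene family table carry a `|` after the feature ID. The function names (`open_text`, `mapped_features`, `table_features`, `coverage`) are made up for illustration.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def mapped_features(mapping_path):
    """Collect every member feature listed in the mapping file.

    Assumed layout (one group per line): group<TAB>member1<TAB>member2...
    """
    members = set()
    with open_text(mapping_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            members.update(f for f in fields[1:] if f)
    return members


def table_features(table_path):
    """Collect the unique UniRef90 IDs in a gene family table.

    Header lines (starting with '#') are skipped, the part after '|' on
    stratified rows is dropped, and the UNMAPPED / UniRef90_unknown
    placeholder rows are ignored.
    """
    ids = set()
    with open_text(table_path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            feature = line.split("\t", 1)[0].split("|", 1)[0]
            if feature.startswith("UniRef90_") and feature != "UniRef90_unknown":
                ids.add(feature)
    return ids


def coverage(table_path, mapping_path):
    """Return (number of IDs found in the mapping, total number of IDs)."""
    ids = table_features(table_path)
    covered = ids & mapped_features(mapping_path)
    return len(covered), len(ids)
```

Calling `coverage("input_genefamily.tsv", "map_ko_uniref90.txt.gz")` would return the number of UniRef90 IDs that have a KO record and the total number of IDs. With the figures above, 352836 - 317015 = 35821 IDs have a record, which is about 10% of the table.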

I also tried building a custom protein reference database from KEGG and obtaining the KO abundance manually with MinPath. When I compared that output with the regrouped one, they didn’t show any consistency or statistical correlation.

Is it reasonable to get the KO abundance from the original gene family table using humann2_regroup_table?
Do you have any suggestions for how to get the KO abundance of a metagenome sample?

Thanks for your reply!

franzosa · December 1, 2022, 9:39pm

This is a reasonable approach. KO annotations are relatively rare among UniRef90s (perhaps 10%?), which is why you’re seeing so many UniRef90s not regrouped to KOs. The same is true of ECs. If you want to regroup to something broader than a UniRef90, you can use UniRef50s or (my preference) Pfam domains. Both UniRef50s and Pfams have strong coverage of UniRef90 but Pfams are less numerous and better annotated.
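To make the effect concrete, here is a simplified Python sketch of sum-based regrouping. It is an illustration under assumptions, not the tool's actual implementation: abundances of features that map to a group are summed into that group, and everything without a mapping is collected under UNGROUPED. The IDs and abundance values are toy data invented for the example, and `regroup` is a hypothetical helper name.

```python
from collections import defaultdict


def regroup(abundance, feature_to_groups):
    """Sum feature abundances into groups; unmapped features go to UNGROUPED."""
    totals = defaultdict(float)
    for feature, value in abundance.items():
        groups = feature_to_groups.get(feature)
        if groups:
            # A feature may belong to several groups; it contributes to each.
            for group in groups:
                totals[group] += value
        else:
            totals["UNGROUPED"] += value
    return dict(totals)


# Toy data (made up): only two of the four features carry a KO annotation.
toy_abundance = {
    "UniRef90_A": 10.0,
    "UniRef90_B": 5.0,
    "UniRef90_C": 20.0,
    "UniRef90_D": 15.0,
}
toy_map = {"UniRef90_A": ["K00001"], "UniRef90_B": ["K00001"]}

result = regroup(toy_abundance, toy_map)
print(result)
```

In this toy case K00001 receives 15.0 and UNGROUPED receives 35.0, so most of the abundance ends up ungrouped simply because most features have no annotation.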

fgndd · December 13, 2022, 10:04am

It seems regrouping UniRef90s to KOs is not a good idea. Thanks for your answer anyway!

franzosa · December 15, 2022, 3:23pm

Regrouping tends to introduce a trade-off: you end up with a smaller number of better understood features (e.g. KOs) but you lose the resolution and coverage provided by the original UniRef90 units. I do tend to analyze some sort of regrouped features by default unless I’m specifically interested in pointing out individual genes for follow-up analysis. For example, if you want to knock out a potentially interesting gene from a bug of interest, then analysis at the UniRef90 level would make more sense.
