Example for `File Type 4 (gene table) tsv` as input to HUMAnN?

Harithaa-Anandakumar · November 23, 2021, 7:04am

I am unable to find an example file showing what a gene table should look like as input for HUMAnN.
Is there an example file somewhere?
Can I use output mapped from GMGC?

Thanks!
Harithaa

w_ceasea · November 25, 2021, 7:53am

Hi,

The command is

humann -i genetable.tsv --input-format genetable -o functions --diamond ../../

A gene table looks like this:

# Gene Family	E100002696_L01_1_kneaddata_Abundance-RPKs
UNMAPPED	4887640.0000000000
UniRef50_A0A125UBK5	2135.6903965600
UniRef50_A0A125UBK5|g__Escherichia.s__Escherichia_coli	1652.1739130435
UniRef50_A0A125UBK5|g__Enterococcus.s__Enterococcus_faecalis	483.5164835165
UniRef50_A0A1D7PV13	2135.6122163071
UniRef50_A0A1D7PV13|g__Escherichia.s__Escherichia_coli	2135.6122163071
UniRef50_F4NR79	1762.3762376238
UniRef50_F4NR79|g__Escherichia.s__Escherichia_coli	1762.3762376238
UniRef50_Q54HB2	1585.8948562197
UniRef50_Q54HB2|g__Neisseria.s__Neisseria_sp_oral_taxon_014	464.2864737279
UniRef50_Q54HB2|g__Aggregatibacter.s__Aggregatibacter_segnis	436.5079365079
UniRef50_Q54HB2|g__Neisseria.s__Neisseria_macacae	354.8387096774
UniRef50_Q54HB2|g__Neisseria.s__Neisseria_subflava	124.4813278008
UniRef50_Q54HB2|g__Neisseria.s__Neisseria_canis	66.6666666667
UniRef50_Q54HB2|g__Neisseria.s__Neisseria_canis	66.6666666667
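
If it helps, here is a minimal Python sketch for sanity-checking a table like the one above before passing it to HUMAnN. It is not part of HUMAnN; the function name `read_gene_table` is made up for illustration, and the checks reflect only what the example shows (assumptions: one `#`-prefixed header line, exactly two tab-separated columns, a numeric second column, and an optional `|taxon` suffix on the feature name).

```python
import io


def read_gene_table(handle):
    """Parse a gene table shaped like the example above.

    Hypothetical helper, not a HUMAnN function. Returns the sample
    column name and a list of (gene_family, taxon_or_None, abundance).
    """
    header = handle.readline().rstrip("\n")
    fields = header.split("\t")
    if len(fields) != 2 or not fields[0].startswith("#"):
        raise ValueError(f"expected a two-column tab-separated header, got {header!r}")
    sample = fields[1]

    rows = []
    for line_no, line in enumerate(handle, start=2):
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated fields, got {len(parts)}"
            )
        feature, value = parts
        # Stratified rows carry the taxon after a '|'; community totals do not.
        family, _, taxon = feature.partition("|")
        rows.append((family, taxon or None, float(value)))  # float() rejects non-numeric values
    return sample, rows


# First rows of the example table, written with explicit tabs.
EXAMPLE = (
    "# Gene Family\tE100002696_L01_1_kneaddata_Abundance-RPKs\n"
    "UNMAPPED\t4887640.0000000000\n"
    "UniRef50_A0A125UBK5\t2135.6903965600\n"
    "UniRef50_A0A125UBK5|g__Escherichia.s__Escherichia_coli\t1652.1739130435\n"
    "UniRef50_A0A125UBK5|g__Enterococcus.s__Enterococcus_faecalis\t483.5164835165\n"
)

sample, rows = read_gene_table(io.StringIO(EXAMPLE))
```

A table copied from a web page often ends up with spaces instead of tabs; the sketch raises a `ValueError` in that case rather than silently reading one long field.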

annotatebio · December 10, 2024, 2:49pm

Hi! I am also trying to run HUMAnN on a gene table and need some directions. Unfortunately, the exact format of the required TSV file is not well documented in the repository.

I tried using a snippet of the file you provided above, @w_ceasea, yet I get no matched results:

# Pathway	gene_table_Abundance
UNMAPPED	0.0000000000
UNINTEGRATED	0.0000000000

Is this just a coincidence, with this subset of proteins not being annotated in the database, or does HUMAnN in fact require a different kind of input table? It is quite confusing to start with, as the description says "gene table", yet the example above uses UniRef50 identifiers, which refer to proteins.

If I have a collection of UniProtKB identifiers, what would be the best way to go about translating them to UniRef IDs? Can UniRef90 also be used?
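
To make the question concrete, this is roughly what I have in mind, as a sketch only. It assumes a UniProtKB-to-UniRef90 mapping is already available as a dictionary (for example, built from an ID-mapping export), that per-protein abundance values exist, and that UniRef90 cluster IDs are acceptable to HUMAnN, which is exactly what I am unsure about. The function name `to_gene_table` and all accessions below are placeholders, not real identifiers.

```python
from collections import defaultdict


def to_gene_table(abundances, uniprot_to_uniref, sample="sample"):
    """Collapse per-accession abundances into a gene-table-shaped TSV string.

    Hypothetical helper: `abundances` maps UniProtKB accession -> value,
    `uniprot_to_uniref` maps accession -> UniRef cluster ID.
    Accessions without a mapping are skipped.
    """
    totals = defaultdict(float)
    for accession, value in abundances.items():
        cluster = uniprot_to_uniref.get(accession)
        if cluster is None:
            continue  # no UniRef cluster known for this accession
        totals[cluster] += value  # several proteins can share one cluster

    # Header layout copied from the example table earlier in the thread.
    lines = [f"# Gene Family\t{sample}_Abundance-RPKs"]
    for cluster in sorted(totals):
        lines.append(f"{cluster}\t{totals[cluster]:.10f}")
    return "\n".join(lines) + "\n"


# Placeholder accessions and values, for illustration only.
table = to_gene_table(
    {"ACC_A": 1.5, "ACC_B": 2.5, "ACC_C": 4.0, "ACC_X": 9.0},
    {"ACC_A": "UniRef90_ACC_A", "ACC_B": "UniRef90_ACC_A", "ACC_C": "UniRef90_ACC_C"},
    sample="demo",
)
```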

Any help will be greatly appreciated, as we do not have sequencing reads at hand and can only use CDS/protein information.
