Custom databases usage

Sergey_Petrov · March 2, 2021, 2:35pm

Hello

I’ve created a custom, non-UniRef-based protein database with arbitrary sequence headers. How can I link it to a custom pathway database?

For example:

protein db:

ref_prot1
<…>
ref_prot2
<…>

Pathway db, file 1:
Rxn-1 ref_prot1
Rxn-2 ref_prot2

Pathway db, file 2:
PWS-TEST Rxn-1 Rxn-2
(in accordance with the HUMAnN documentation on GitHub, biobakery/humann)

DIAMOND generates an output TSV file with alignments against the custom database, but HUMAnN reports that the total gene families count is zero. What am I doing wrong?
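Before looking at HUMAnN itself, it can help to rule out a simple mismatch between the three files. The following is only a rough sketch of such a check, assuming the whitespace-delimited layout shown in the example above and FASTA headers that start with ">". It does not reproduce HUMAnN's own parsing, and all function names and the placeholder sequences are hypothetical.

```python
def parse_mapping(lines):
    """Parse 'KEY member1 member2 ...' lines into a dict of key -> members."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if fields:
            mapping[fields[0]] = fields[1:]
    return mapping


def fasta_ids(lines):
    """Collect the first whitespace-delimited token of every FASTA header."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def find_unlinked(protein_ids, reactions, pathways):
    """Return (proteins missing from the FASTA, reactions missing from file 1)."""
    used_proteins = {p for members in reactions.values() for p in members}
    used_reactions = {r for members in pathways.values() for r in members}
    missing_proteins = sorted(used_proteins - set(protein_ids))
    missing_reactions = sorted(used_reactions - set(reactions))
    return missing_proteins, missing_reactions


# Toy data mirroring the example in this post (sequences are made up).
protein_ids = fasta_ids([">ref_prot1", "MKT", ">ref_prot2", "MSA"])
reactions = parse_mapping(["Rxn-1 ref_prot1", "Rxn-2 ref_prot2"])
pathways = parse_mapping(["PWS-TEST Rxn-1 Rxn-2"])

print(find_unlinked(protein_ids, reactions, pathways))
```

Two empty lists mean every ID referenced in the pathway files resolves. That only confirms the files agree with each other, not that HUMAnN accepts the headers.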

franzosa · March 11, 2021, 1:58pm

Are you testing on a very small metagenome? It’s possible that none of your custom proteins are being covered at 50% of sites, in which case HUMAnN will conservatively not report them. You can lower this threshold on the command-line (translated subject coverage threshold) such that any valid read-hit will add its protein to the output file.
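To make the idea of "covered at 50% of sites" concrete, here is a minimal sketch that estimates per-protein subject coverage from a tabular alignment file. It is an illustration only, not HUMAnN's implementation. It assumes the standard 12-column BLAST-style tabular layout (subject ID in column 2, subject start and end in columns 9 and 10), and the protein lengths and example rows are made up.

```python
def subject_coverage(m8_lines, subject_lengths):
    """Percent of each subject's positions covered by at least one hit."""
    covered = {}
    for line in m8_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip lines that are not 12-column tabular rows
        sseqid = fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        lo, hi = min(sstart, send), max(sstart, send)
        covered.setdefault(sseqid, set()).update(range(lo, hi + 1))
    return {
        subject: 100.0 * len(covered.get(subject, ())) / length
        for subject, length in subject_lengths.items()
    }


# Made-up rows: qseqid sseqid pident length mismatch gapopen
#               qstart qend sstart send evalue bitscore
rows = [
    "read1\tref_prot1\t100.0\t60\t0\t0\t1\t180\t1\t60\t1e-30\t120.0",
    "read2\tref_prot2\t95.0\t20\t1\t0\t1\t60\t11\t30\t1e-5\t40.0",
]
lengths = {"ref_prot1": 100, "ref_prot2": 100}  # hypothetical protein lengths

coverage = subject_coverage(rows, lengths)
threshold = 50.0  # the 50% figure mentioned above
passing = sorted(s for s, pct in coverage.items() if pct >= threshold)
print(coverage, passing)
```

In this toy case ref_prot1 is 60% covered and would be kept, while ref_prot2 is 20% covered and would be dropped. The command-line option for lowering the threshold should be `--translated-subject-coverage-threshold`, but confirm the exact name with `humann --help` for your version.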

Sergey_Petrov · March 11, 2021, 4:39pm

I created a nucleotide dataset of 11321 sequences, where each sequence corresponds to a protein. Then I translated them into amino acid form and created a DIAMOND database from the resulting file.
Then I ran HUMAnN with the source FASTA file as input.
The file diamond_aligned.tsv, which stores the alignments against the DIAMOND database, shows that each nucleotide sequence was aligned against itself in amino acid form (just as expected). So each protein in the database is fully covered by one read.

Kang_Da · October 21, 2023, 5:59am

Hi, did you solve the problem? I ran into the same issue: humann3 cannot align against my own protein database, but I can do it using DIAMOND alone.

franzosa · October 26, 2023, 8:21pm

How have you formatted your custom database? Are you seeing an error, or just a lack of meaningful output?
