The bioBakery help forum

Problem creating a custom DB with KEGG


I recently subscribed to the KEGG database.
However, I am having trouble creating a humann2 custom database from it.

Specifically, I am stuck at the following step:

$ humann2_build_custom_database --input genes.pep --output custom_database --id-mapping legacy_kegg_idmapping.tsv --format diamond --taxonomic-profile max_taxonomic_profile.tsv

What format should the ‘genes.pep’ input file be in? Does it need identifiers, gene sequences, gene lengths, or anything else?
I would appreciate it if anyone could answer this question.

genes.pep should be a FASTA file whose sequence headers (i.e. the strings that appear after the > that begin sequence entries) appear in your legacy_kegg_idmapping.tsv file.
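If it helps, here is a rough Python sketch for checking that condition before running the build step. It is not part of humann2. It assumes that the sequence ID is the first whitespace-delimited token after the > and that the IDs sit in the first column of the tab-delimited mapping file. Both are assumptions, so adjust the `column` argument or the header parsing to match your files. The function names are made up for this example.

```python
import csv
import sys


def fasta_ids(path):
    """Yield one ID per FASTA record: the first token after '>' (assumed convention)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                yield fields[0] if fields else ""


def mapping_ids(path, column=0):
    """Collect the IDs found in one column of a tab-delimited mapping file."""
    ids = set()
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) > column:
                ids.add(row[column])
    return ids


def missing_ids(fasta_path, mapping_path, column=0):
    """Return FASTA IDs that do not appear in the chosen mapping column."""
    known = mapping_ids(mapping_path, column)
    return [seq_id for seq_id in fasta_ids(fasta_path) if seq_id not in known]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_headers.py genes.pep legacy_kegg_idmapping.tsv
    missing = missing_ids(sys.argv[1], sys.argv[2])
    print(f"{len(missing)} FASTA headers not found in the mapping file")
    for seq_id in missing[:10]:
        print(seq_id)
```

An empty result from `missing_ids` means every header in the FASTA file was found in that column of the mapping file.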

Thank you very much, Eric.