Annotate predicted gene sequences

pnmns · October 31, 2021, 8:54pm

I have a set of predicted gene sequences in a FASTA file and a table with the number of reads mapping to each sequence (or another abundance measure). Is it possible to use part of HUMAnN 3 to do the gene annotation and pathway inference? I mostly want to take advantage of the pathway inference and abundance estimation. Thanks!

franzosa · November 5, 2021, 6:23pm

If you annotate your custom genes against either the UniRef90 or UniRef50 database bundled with HUMAnN, then you’d be able to use those genes (and their abundances) to plug into downstream analyses in HUMAnN.

You would need to align the translations of your genes against the UniRef90/50 database using DIAMOND in blastp mode, accepting an alignment as an annotation only if it meets the UniRef90/50 clustering criteria (at least 80% coverage, and at least 90% or 50% identity, respectively).
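As a rough illustration of the filtering step, here is a minimal Python sketch that keeps, for each gene, the best-scoring hit passing both thresholds. It assumes the DIAMOND hits were written as a tab-separated table with the columns listed in `COLUMNS` (that column order is my assumption, so adjust it to match your own output), and it treats "coverage" as the aligned fraction of both the query and the subject, which is also an assumption. The function name `best_annotations` and the example IDs are hypothetical.

```python
import csv
import io

# Assumed column order of the tab-separated hit table (hypothetical; it must
# match the fields you asked the aligner to report).
COLUMNS = ["qseqid", "sseqid", "pident", "qstart", "qend", "qlen",
           "sstart", "send", "slen", "bitscore"]


def best_annotations(tsv_handle, min_identity=90.0, min_coverage=0.80):
    """Return {gene_id: uniref_id}, keeping the best-scoring passing hit per gene.

    Use min_identity=90.0 for UniRef90 or min_identity=50.0 for UniRef50.
    """
    best = {}  # gene_id -> (bitscore, uniref_id)
    reader = csv.DictReader(tsv_handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        identity = float(row["pident"])
        # Fraction of the query and of the subject spanned by the alignment.
        q_cov = (int(row["qend"]) - int(row["qstart"]) + 1) / int(row["qlen"])
        s_cov = (int(row["send"]) - int(row["sstart"]) + 1) / int(row["slen"])
        # Assumption: both sequences must be covered, not just one of them.
        if identity < min_identity or min(q_cov, s_cov) < min_coverage:
            continue
        score = float(row["bitscore"])
        gene = row["qseqid"]
        if gene not in best or score > best[gene][0]:
            best[gene] = (score, row["sseqid"])
    return {gene: hit for gene, (_, hit) in best.items()}


# Tiny made-up example: only the first hit passes both thresholds.
example = io.StringIO(
    "gene1\tUniRef90_A\t95.0\t1\t95\t100\t1\t95\t100\t180.0\n"
    "gene1\tUniRef90_B\t99.0\t1\t40\t100\t1\t40\t300\t80.0\n"
    "gene2\tUniRef90_C\t70.0\t1\t100\t100\t1\t100\t100\t120.0\n"
)
annotations = best_annotations(example)
print(annotations)
```

Genes with no passing hit are simply left out of the result, so they stay unannotated rather than being assigned a weak match.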

You would also need to convert your existing gene abundances to RPK units by dividing the read counts by the gene lengths in kilobases, since this is the format expected by HUMAnN if you’re starting from previously quantified genes.
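The RPK conversion itself is just reads divided by gene length in kilobases. A minimal sketch in Python, assuming gene lengths are given in nucleotides (bp) and that genes sharing a UniRef annotation should have their RPK values summed; the function names and example values are hypothetical, and summing per family is my assumption about how to collapse several genes onto one UniRef ID.

```python
from collections import defaultdict


def reads_to_rpk(read_count, gene_length_bp):
    """Reads per kilobase: read count divided by gene length in kb."""
    return read_count * 1000.0 / gene_length_bp


def family_rpk(read_counts, gene_lengths_bp, annotations):
    """Sum RPK over all genes annotated to the same UniRef family.

    Genes without an annotation are skipped here (an assumption; you may
    prefer to track them separately).
    """
    totals = defaultdict(float)
    for gene, count in read_counts.items():
        family = annotations.get(gene)
        if family is None:
            continue
        totals[family] += reads_to_rpk(count, gene_lengths_bp[gene])
    return dict(totals)


# Made-up example values.
read_counts = {"gene1": 150, "gene2": 30, "gene3": 10}
gene_lengths_bp = {"gene1": 1500, "gene2": 600, "gene3": 900}
gene_to_family = {"gene1": "UniRef90_A", "gene2": "UniRef90_A"}

rpk_by_family = family_rpk(read_counts, gene_lengths_bp, gene_to_family)
print(rpk_by_family)
```

In this example gene1 contributes 150 reads / 1.5 kb and gene2 contributes 30 reads / 0.6 kb, both to the same family, while gene3 is dropped because it has no annotation.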

Some of the information here might be of use to you:
