Annotate predicted gene sequences

I have a set of predicted gene sequences in a FASTA file and a table with the number of reads mapping to each sequence (or another abundance calculation). Is it possible to use part of HUMAnN 3 to do the gene annotation and pathway inference? I mostly want to take advantage of the pathway inference and abundance estimation. Thanks!

If you annotate your custom genes against either the UniRef90 or UniRef50 database bundled with HUMAnN, then you’d be able to use those genes (and their abundances) to plug into downstream analyses in HUMAnN.

You would need to align the translations of your genes against the UniRef90/50 database using DIAMOND in blastp mode, accepting an alignment as an annotation only if it meets the clustering criteria of the database you chose: at least 80% coverage, and at least 90% identity for UniRef90 or 50% identity for UniRef50.
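As a rough illustration of that filtering step, here is a minimal Python sketch. It assumes (this is not from the thread) that the DIAMOND hits were written as tab-separated text with six columns in this order: qseqid, sseqid, pident, qcovhsp, scovhsp, bitscore. It also assumes the 80% coverage cutoff is applied to both the query and the subject, which is one possible reading of the criteria. The function name `annotate_hits` is hypothetical.

```python
import csv
import io


def annotate_hits(tsv_text, min_identity=90.0, min_coverage=80.0):
    """Return {gene_id: uniref_id} for hits passing the thresholds.

    Assumed column order (an assumption, adjust to your own output):
    qseqid, sseqid, pident, qcovhsp, scovhsp, bitscore.
    Use min_identity=90.0 for UniRef90 or 50.0 for UniRef50.
    """
    best = {}  # gene_id -> (uniref_id, bitscore)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if not row:
            continue  # skip blank lines
        qseqid, sseqid = row[0], row[1]
        pident, qcov, scov, bitscore = (float(x) for x in row[2:6])
        # Reject hits that miss the identity or coverage criteria.
        if pident < min_identity or qcov < min_coverage or scov < min_coverage:
            continue
        # Keep only the highest-scoring passing hit per gene.
        if qseqid not in best or bitscore > best[qseqid][1]:
            best[qseqid] = (sseqid, bitscore)
    return {gene: hit[0] for gene, hit in best.items()}
```

Genes with no passing hit are simply absent from the returned mapping, so they can be treated as unannotated in whatever way suits your workflow.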

You would also need to convert your existing gene abundances to RPK units by dividing the read counts by the gene lengths in kilobases, since this is the format expected by HUMAnN if you’re starting from previously quantified genes.
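The RPK conversion is simple arithmetic; a minimal sketch is below. The function names and the idea of summing RPK over genes that share a UniRef annotation are illustrative assumptions, not something HUMAnN provides under these names.

```python
def reads_to_rpk(read_count, gene_length_bp):
    """Convert a read count to reads per kilobase (RPK)."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    return read_count / (gene_length_bp / 1000.0)


def sum_rpk_by_family(gene_rpk, gene_to_family):
    """Sum per-gene RPK values over genes sharing an annotation.

    gene_rpk: {gene_id: rpk}
    gene_to_family: {gene_id: uniref_id}; genes missing from this
    mapping are skipped here (an assumption of this sketch).
    """
    totals = {}
    for gene, rpk in gene_rpk.items():
        family = gene_to_family.get(gene)
        if family is None:
            continue
        totals[family] = totals.get(family, 0.0) + rpk
    return totals
```

For example, 150 reads mapping to a 1,500 bp gene gives 150 / 1.5 = 100 RPK.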

Some of the information here might be of use to you:
