UniRef90 and MetaCyc database versions and dates

Hi guys!
First of all, thank you for making an amazing tool for functional profiling of metagenomes. HUMAnN2 is awesome. I have a question about database versions. When I download the database using the humann2_databases script, it downloads this version: “//huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref_annotated/uniref90_annotated_1_1.tar.gz”. However, it seems this file was last updated in 2016. Do you update your databases periodically? And if I want to cite the version of the UniRef90 database used for my analysis in a paper, which version number or date should I mention?
Lastly, the current MetaCyc database version is v23.5, released in Dec 2019. Is HUMAnN2 using the most recent version or an older one? Can you please let me know the version number and date of MetaCyc’s mapping file in your package? Thank you once again for the amazing pipeline! I look forward to hearing back from you.

HUMAnN 2.0 is based on UniProt/UniRef 2014_07 and MetaCyc 19.1. HUMAnN 3.0 (currently in alpha release) is based on UniProt/UniRef 2019_01 and MetaCyc 19.1; we intend to update the MetaCyc pathways for the final 3.0 release.

Thank you very much for the reply. Appreciate it! Looking forward to the updated databases in HUMAnN 3.0.

Do you mean UniRef 2014_01? I don’t see any 2014_07 release on the UniProt/UniRef FTP site: ftp://ftp.uniprot.org/pub/databases/uniprot/previous_releases/

It looks like they only have yearly releases archived before 2015. What do you need from 2014_07? We still have the original sequence files and could host/share them for download.

I’d like to have that release so that I can map some data from the GTDB to those UniRef IDs. It would be great if you could make those sequences available, at least for UniRef50.

Congrats on HUMAnN 3.0! The same goes for your PhyloPhlAn 3.0 paper. I just wanted to follow up on this. For the time being, we will still be using HUMAnN2, so it would be great to get the data from 2014_07.

Thank you and confirmed - I haven’t forgotten. In the middle of teaching a workshop this week but will look into getting those hosted when we’re done.

Sounds good. Thanks for your quick reply! Good luck with the workshop!

@nick-youngblut As promised (if late) here are the FASTAs for UniRef90/50 v2014_07 as used in HUMAnN 2.0:

http://huttenhower.sph.harvard.edu/humann2_data/uniref_201407/uniref90_201407_annotated.fasta.gz (5.7 GB)

http://huttenhower.sph.harvard.edu/humann2_data/uniref_201407/uniref50_201407_annotated.fasta.gz (2.4 GB)
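
For anyone who wants to map external data (e.g. from the GTDB) onto these UniRef IDs, the first step is usually to list the identifiers in the downloaded FASTA. Below is a minimal Python sketch, not part of HUMAnN. It assumes only that the files are standard gzipped FASTA and that the identifier is the first whitespace-delimited token after `>`; the exact header layout of the annotated files has not been checked here, so inspect a few headers first. The function names are made up for illustration.

```python
import gzip
from typing import Iterator, Set, TextIO


def iter_fasta_ids(handle: TextIO) -> Iterator[str]:
    """Yield the first whitespace-delimited token of each FASTA header line.

    Assumption: the sequence identifier is the first token after '>'.
    """
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip malformed headers that are just '>'
                yield fields[0]


def read_ids_from_gzip(path: str) -> Set[str]:
    """Stream a gzipped FASTA from disk and return the set of header IDs.

    The file is read line by line, so the sequences themselves are never
    held in memory; only the IDs are kept.
    """
    with gzip.open(path, "rt") as handle:
        return set(iter_fasta_ids(handle))
```

Usage would look something like `ids = read_ids_from_gzip("uniref50_201407_annotated.fasta.gz")`, where the path is wherever the file was saved locally. Holding every ID in a set may still take a fair amount of memory for files of this size; iterating with `iter_fasta_ids` directly avoids that.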

Thank you so much for the sequence data!

Hi @franzosa, has MetaCyc been updated from version 19.1 to a more recent release? Also, when will the post-alpha version of HUMAnN 3 be released? Thanks!

We’ve been post-alpha for a while now! :) Please see the post below for details. Version 3.0.0 added MetaCyc 24.0 and it is now the default pathway set in v3.0.1.