Where to install HUMAnN database?

Hi @franzosa @fbeghini - I am trying to install HUMAnN3.0. In which directory should I put the database? In the HUMAnN3.0 tutorial page the directory has not been specified.


The HUMAnN config file is automatically updated to point to the location of both databases. You have to provide the destination folder to the humann_databases command; otherwise a default location is used.

Please look at this section of the user manual, which explains how to download the databases correctly.

Using command:
humann_databases --download chocophlan full --update-config yes

Getting response like:

usage: humann_databases [-h] [--available]
[--download <install_location>]
[--update-config {yes,no}]
humann_databases: error: argument --download: expected 3 arguments

Also, metaphlan --install doesn’t directly download all the databases. Getting this error:

Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

File /home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/file_list.txt already present!
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/biobakery3/bin/metaphlan", line 10, in <module>
  File "/home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 1187, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 610, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 463, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
KeyError: 'mpa_mpa_v30_CHOCOPhlAn_201901.tar'

Sorry, my fault, you have to include the target directory. I’d suggest using a folder external to the conda environment.
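For reference, a minimal sketch of the download with the target directory included as the third argument to --download. The folder $HOME/humann_dbs is only a placeholder chosen for illustration (any location outside the conda environment should work), and the guard just skips the download if the HUMAnN tools are not on the PATH.

```shell
# Hypothetical install location outside the conda environment; change as needed.
DB_DIR="$HOME/humann_dbs"
mkdir -p "$DB_DIR"

if command -v humann_databases >/dev/null 2>&1; then
    # --download expects three arguments: database, build, install location.
    humann_databases --download chocophlan full "$DB_DIR" --update-config yes
    # Print the current settings to confirm the config now points at the new location.
    humann_config --print
else
    echo "humann_databases not found; activate the HUMAnN environment first" >&2
fi
```

The same pattern should apply to the protein database, with the database and build names taken from the list printed by humann_databases --available.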

It seems that you have installed an older version. Can you check with metaphlan --version and conda list | grep metaphlan which one you have installed? You should try to upgrade to 3.0.2.
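A rough sketch of that version check, in case it helps. The version_lt helper is made up for this example and relies on sort -V (GNU coreutils); pulling the first dotted number out of the metaphlan --version output is an assumption about how that output is formatted.

```shell
# Hypothetical helper: succeeds when version $1 is older than version $2.
version_lt() {
    [ "$1" != "$2" ] && [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

REQUIRED="3.0.2"

if command -v metaphlan >/dev/null 2>&1; then
    # Assumption: the first dotted number in the output is the version.
    installed=$(metaphlan --version 2>&1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if version_lt "$installed" "$REQUIRED"; then
        echo "MetaPhlAn $installed is older than $REQUIRED; consider upgrading"
    else
        echo "MetaPhlAn $installed is recent enough"
    fi
    # Show which package build conda has installed.
    conda list | grep -i metaphlan
fi
```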

Hi, I think this might belong in the same thread. I had both the UniRef50 and UniRef90 dbs in the same directory and pointed humann3.0 at that one directory. The resulting output files contained hits to both the UR50 and UR90 databases. I am wondering whether this will produce erroneous results with respect to read counts, or whether I can simply parse them and continue on to cpm normalization with separate UR50 and UR90 outputs (i.e. did humann ‘know’ to run separately for each db?).

I tried rerunning and specifying the individual db (e.g. /db/humann3.0/uniref/uniref50_201901.dmnd) rather than the directory containing it (/db/humann3.0/uniref/). The analyses finished without producing proper output files, but also without producing any errors. So my second question is: when pointing humann at the protein database via --protein-database, does the filepath have to be a directory, or can it point to an actual db file?