The HUMAnN config file is automatically updated to point to the location of both databases. You have to provide the destination folder to the
humann_databases command; otherwise a default location is used.
Please look at this section of the user manual, which explains how to download the databases correctly.
humann_databases --download chocophlan full --update-config yes
I am getting a response like:
usage: humann_databases [-h] [--available]
humann_databases: error: argument --download: expected 3 arguments
metaphlan --install doesn’t directly download all the databases. Getting this error:
Downloading MetaPhlAn database
Please note due to the size this might take a few minutes
File /home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/file_list.txt already present!
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/biobakery3/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 1187, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 610, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/home/ubuntu/anaconda3/envs/biobakery3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 463, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
KeyError: 'mpa_mpa_v30_CHOCOPhlAn_201901.tar'
Sorry, my fault: you have to include the target directory. I’d suggest using a folder external to the conda environment.
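For reference, here is a minimal sketch of the same download with the install location added as the third argument to --download. The DB_DIR variable and the folder name are hypothetical placeholders, not something from the manual; pick any location outside the conda environment that has enough free space.

```shell
# Hypothetical destination outside the conda environment; adjust to your system.
DB_DIR="${DB_DIR:-$HOME/humann_dbs}"
mkdir -p "$DB_DIR"

# --download expects three arguments: database, build, install location.
if command -v humann_databases >/dev/null 2>&1; then
    humann_databases --download chocophlan full "$DB_DIR" --update-config yes
else
    # Nothing is downloaded if the tool is not on PATH.
    echo "humann_databases not found; activate the HUMAnN environment first" >&2
fi
```

With --update-config yes, the config should then point at the new location, so later humann runs do not need the path spelled out again.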
It seems that you have installed an older version. Which one do you have installed? You can check with
metaphlan --version and
conda list | grep metaphlan
You should try to upgrade to
Hi, I think this might belong in the same thread. I had both the UniRef50 and UniRef90 dbs in the same directory, and pointed humann3.0 towards that one directory. The resulting output files contained hits to both the UR50 and UR90 databases. I am wondering whether this will produce erroneous results with respect to read counts, or whether I can simply parse them and continue on to cpm normalization with separate UR50 and UR90 outputs (i.e. did humann ‘know’ to run separately for each db).
I tried rerunning and specifying the individual db (e.g. /db/humann3.0/uniref/uniref50_201901.dmnd) rather than the directory in which it was contained (/db/humann3.0/uniref/). The analyses concluded without producing proper output files, but also without producing any errors. So my second question is: when pointing humann towards the protein database via --protein-database, does the filepath have to be a directory, or can it point to an actual db file?
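On the second question, my understanding is that --protein-database is documented as taking a directory rather than a single .dmnd file, and that HUMAnN searches whatever databases it finds there; please treat that as an assumption to check against the manual for your version. Under that assumption, one way to avoid mixed UR50/UR90 hits is to keep each build in its own directory. The sketch below uses a hypothetical UNIREF_ROOT location and subdirectory names; the sample and output names in the final comment are placeholders too.

```shell
# Hypothetical layout: one directory per UniRef build. Adjust UNIREF_ROOT to your setup.
UNIREF_ROOT="${UNIREF_ROOT:-$HOME/humann_dbs/uniref}"
mkdir -p "$UNIREF_ROOT/uniref50" "$UNIREF_ROOT/uniref90"

# Move each .dmnd file out of the shared directory into its own build directory.
for build in uniref50 uniref90; do
    for f in "$UNIREF_ROOT"/"$build"_*.dmnd; do
        # Skip the unexpanded pattern when no matching file exists.
        if [ -e "$f" ]; then
            mv "$f" "$UNIREF_ROOT/$build/"
        fi
    done
done

# Then run HUMAnN once per build, passing the directory (placeholder file names):
# humann --input sample.fastq --output out_uniref90 --protein-database "$UNIREF_ROOT/uniref90"
```

Running once per directory gives separate UR50 and UR90 outputs that can each be normalized on their own.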