Fix for KeyError: 'mpa_mpa_v30_CHOCOPhlAn_201901.tar'

Hello all!
I installed what I hoped was the latest version of MetaPhlAn by running the following command:

conda install -c bioconda python=3.7 metaphlan

Next, I ran MetaPhlAn on raw reads and got this error:

Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

File /home/timyerg/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/metaphlan_databases/file_list.txt already present!
Traceback (most recent call last):
  File "/home/timyerg/anaconda3/envs/metaphlan/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/home/timyerg/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 1187, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/home/timyerg/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 610, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/home/timyerg/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 463, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
KeyError: 'mpa_mpa_v30_CHOCOPhlAn_201901.tar'

I noticed that the file name in the error looks wrong: in ‘mpa_mpa_v30_CHOCOPhlAn_201901.tar’ the “mpa_” prefix appears twice.
So I opened the file “/home/timyerg/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/metaphlan.py” and edited this code on lines 462-468:

    tar_file = os.path.join(folder, "mpa_" + download_file_name + ".tar")
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
    download(url_tar_file, tar_file)

    # download MD5 checksum
    md5_file = os.path.join(folder, "mpa_" + download_file_name + ".md5")
    url_md5_file = ls_f["mpa_" + download_file_name + ".md5"]

I replaced it with the following, which drops the extra "mpa_" prefix:

    tar_file = os.path.join(folder, download_file_name + ".tar")
    url_tar_file = ls_f[download_file_name + ".tar"]
    download(url_tar_file, tar_file)

    # download MD5 checksum
    md5_file = os.path.join(folder, download_file_name + ".md5")
    url_md5_file = ls_f[download_file_name + ".md5"]

After this change, the database installed successfully.
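
For anyone curious why the lookup fails, here is a minimal sketch of the problem outside of MetaPhlAn. It assumes that download_file_name already carries the "mpa_" prefix, which is what the doubled name in the KeyError suggests. The dictionary contents, the placeholder URL, and the archive_key helper are hypothetical and only for illustration; they are not part of metaphlan.py.

```python
# Hypothetical stand-in for the mapping that metaphlan.py builds from
# file_list.txt. The URL is a placeholder, not the real download location.
ls_f = {
    "mpa_v30_CHOCOPhlAn_201901.tar": "https://example.org/mpa_v30_CHOCOPhlAn_201901.tar",
}

# Assumption: the index name already starts with "mpa_".
download_file_name = "mpa_v30_CHOCOPhlAn_201901"


def archive_key(name, ext):
    """Return name + ext, adding the 'mpa_' prefix only when it is missing.

    Hypothetical helper: a defensive alternative to hard-coding the prefix.
    """
    prefix = "mpa_"
    if not name.startswith(prefix):
        name = prefix + name
    return name + ext


# Behaviour of the affected build: the prefix is always prepended,
# so the key ends up with "mpa_" twice and the lookup raises KeyError.
old_key = "mpa_" + download_file_name + ".tar"
try:
    ls_f[old_key]
    old_lookup_failed = False
except KeyError:
    old_lookup_failed = True

# With the prefix added only when needed, the lookup succeeds.
new_key = archive_key(download_file_name, ".tar")
url_tar_file = ls_f[new_key]
```

My actual edit simply removes the hard-coded prefix, which is enough when the index name already includes it; the helper above is just one way to handle both cases.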

I may not have installed the latest version, and the team may have already fixed this, but if anyone else runs into the same problem, this topic may help them proceed.


It seems you have installed an older build of MetaPhlAn 3, probably pyh5ca1d4c_2. This error has been corrected in newer builds; the latest one is version 3.0.7.


Hi, it looks that way.
Here is the output of metaphlan -v:
MetaPhlAn version 3.0 (25 Feb 2020)

I had assumed I was getting the latest version from bioconda. I should install it from git instead.
Thank you for your reply.

Hi, I ran into the same error and was able to resolve it thanks to this post.

However, I’m wondering whether you plan to update the HUMAnN version available in conda (via either the biobakery or bioconda channel), since most users find it easier to manage software and dependencies through conda.
Thank you very much.