HUMAnN 3 / MetaPhlAn 3.0 error: MD5 checksums do not match

Hi, I keep getting this error:

Downloading MetaPhlAn database

Please note due to the size this might take a few minutes

File /Users/curanga/opt/anaconda3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/file_list.txt already present!

File /Users/curanga/opt/anaconda3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901.tar already present!

File /Users/curanga/opt/anaconda3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901.md5 already present!

MD5 checksums do not correspond! If this happens again, you should remove the database files and rerun MetaPhlAn so they are re-downloaded
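For anyone who wants to confirm the mismatch by hand before deleting anything, here is a minimal Python sketch that recomputes the MD5 of the downloaded tar and compares it with the hash stored next to it. It assumes the .md5 file begins with the hex digest (the usual md5sum layout); the function names are my own and are not part of MetaPhlAn.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tar_path, md5_path):
    """Compare the tar's MD5 with the first token of the .md5 file.

    Assumption: the .md5 file starts with the hex digest, as md5sum writes it.
    """
    expected = Path(md5_path).read_text().split()[0].lower()
    return md5_of_file(tar_path) == expected
```

Calling checksum_matches() on the .tar and .md5 paths from the messages above and getting False would point to a truncated or corrupted download rather than a problem with the installation itself.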

I even installed the new build described in another discussion post:
pyh5ca1d4c_4 (https://anaconda.org/bioconda/metaphlan/3.0/download/noarch/metaphlan-3.0-pyh5ca1d4c_4.tar.bz2)

To no avail! Is there a fix for this? I have tried uninstalling everything. Running --version on the humann3 tools such as diamond and metaphlan yields good results. Running --version on bowtie2, however, yields the following:

(base) curangalt-osx:humann3 curanga$ bowtie2 --version
/Users/curanga/bowtie2/bowtie2-align-s version 2.4.1
64-bit
Built on
Fri Feb 28 22:21:49 UTC 2020
Compiler: InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Options: -O3 -msse2 -funroll-loops -g3 -mmacosx-version-min=10.9 -DPOPCNT_CAPABILITY -DWITH_TBB -std=c++11 -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
(base) curangalt-osx:humann3 curanga$

Please help! I am under pressure to get this to work for our lab! Thank you in advance.

Best wishes,
Carla Uranga

Hi Carla,
have you tried deleting the three files and re-running metaphlan --install?
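If it helps, the cleanup can be scripted. This is only a sketch: it assumes the three files are the file_list.txt, .tar and .md5 named in the messages above, and remove_stale_downloads is a hypothetical helper, not a MetaPhlAn function.

```python
from pathlib import Path


def remove_stale_downloads(db_dir, index="mpa_v30_CHOCOPhlAn_201901"):
    """Delete file_list.txt and the index's .tar/.md5 if present.

    Returns the names that were actually removed. The default index name is
    taken from the error messages in this thread.
    """
    removed = []
    for name in ("file_list.txt", index + ".tar", index + ".md5"):
        target = Path(db_dir) / name
        if target.is_file():
            target.unlink()
            removed.append(name)
    return removed
```

Pass it the metaphlan_databases directory printed in the "already present!" messages, then run metaphlan --install again so the files are downloaded afresh.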

Here is the output from running metaphlan --install (no databases were installed):

curangalt-osx:metaphlan_analysis curanga$ metaphlan --install
Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Downloading file of size: 0.00 MB
0.01 MB 232.33 % 3.05 MB/sec 0 min -0 sec
Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAAyoJpOgcjop41VIHAGWIVLa/mpa_latest?dl=1
Downloading file of size: 0.00 MB
0.01 MB 31507.69 % 2.92 MB/sec 0 min -0 sec
Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

File /Users/curanga/opt/anaconda3/envs/mpa/lib/python3.6/site-packages/metaphlan/metaphlan_databases/file_list.txt already present!
Traceback (most recent call last):
  File "/Users/curanga/opt/anaconda3/envs/mpa/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/Users/curanga/opt/anaconda3/envs/mpa/lib/python3.6/site-packages/metaphlan/metaphlan.py", line 1187, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/Users/curanga/opt/anaconda3/envs/mpa/lib/python3.6/site-packages/metaphlan/metaphlan.py", line 610, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/Users/curanga/opt/anaconda3/envs/mpa/lib/python3.6/site-packages/metaphlan/metaphlan.py", line 463, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
KeyError: 'mpa_mpa_v30_CHOCOPhlAn_201901.tar'
(mpa) curangalt-osx:metaphlan_analysis curanga$
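A note for readers on the KeyError: the key has the "mpa_" prefix twice. The sketch below only reproduces the string concatenation visible in the last traceback frame, to show how a name that already starts with "mpa_" ends up doubled. The contents of file_list are hypothetical, and this is not the actual MetaPhlAn source.

```python
def tar_key(download_file_name):
    """Build the lookup key the way the traceback line does."""
    return "mpa_" + download_file_name + ".tar"


# Hypothetical file list: the URL is a placeholder, the key has one prefix.
file_list = {"mpa_v30_CHOCOPhlAn_201901.tar": "https://example.org/placeholder"}

# A name that already carries the prefix produces a key that is not listed.
doubled = tar_key("mpa_v30_CHOCOPhlAn_201901")
print(doubled, doubled in file_list)

# A name without the prefix produces a key that is listed.
single = tar_key("v30_CHOCOPhlAn_201901")
print(single, single in file_list)
```

So the lookup fails because the build prepends a prefix to a name that already includes it, which fits a mismatch between the installed code and the database naming it is given.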

Hi, do I need a bitbucket.org password to download databases? MetaPhlAn2 is asking for a password! For humann3/metaphlan3, running metaphlan --install gives exactly the same output as in my previous post, ending in the same KeyError: 'mpa_mpa_v30_CHOCOPhlAn_201901.tar'.

Hi Carla,
you have installed a very old MetaPhlAn build, still pointing to the old repository.
You should:

  • remove the current mpa conda environment: conda deactivate && conda env remove -n mpa
  • create a new one with Python 3.7: conda create -n metaphlan-3.0 python=3.7
  • install MetaPhlAn 3 in the new environment: conda activate metaphlan-3.0 && conda install -c bioconda metaphlan

OK, I tried it, but no cigar! Here is the output:

(base) curangalt-osx:humann3 curanga$ conda activate metaphlan-3.0
(metaphlan-3.0) curangalt-osx:humann3 curanga$ conda install -c bioconda metaphlan
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

(metaphlan-3.0) curangalt-osx:humann3 curanga$

Can you post the output of conda list from the activated metaphlan-3.0 environment?

Thank you for responding! Here is the output of "conda list":

(metaphlan-3.0) curangalt-osx:humann3 curanga$ conda list
# packages in environment at /Users/curanga/opt/anaconda3/envs/metaphlan-3.0:

It looks like Python is not even installed. Have you run conda install python=3.7?

(metaphlan-3.0) curangalt-osx:humann3 curanga$ conda install python=3.7
Collecting package metadata (current_repodata.json): done
Solving environment: done

All requested packages already installed.

(metaphlan-3.0) curangalt-osx:humann3 curanga$

This is strange: no packages are installed in the environment, yet conda install says that Python 3.7 is already installed.
Can you post the output of conda info here?

Hi, thank you for your patience! I installed that huge package, Anaconda Navigator. If there is a better way of installing Python, please let me know. Here is the output of conda info:

(metaphlan-3.0) curangalt-osx:humann3 curanga$ conda info
active environment : metaphlan-3.0
active env location : /Users/curanga/opt/anaconda3/envs/metaphlan-3.0
shell level : 2
user config file : /Users/curanga/.condarc
populated config files :
conda version : 4.8.3
conda-build version : 3.18.11
python version : 3.7.6.final.0
virtual packages : __osx=10.13.6
base environment : /Users/curanga/opt/anaconda3 (writable)
channel URLs : https://repo.anaconda.com/pkgs/main/osx-64
               https://repo.anaconda.com/pkgs/main/noarch
               https://repo.anaconda.com/pkgs/r/osx-64
               https://repo.anaconda.com/pkgs/r/noarch
package cache : /Users/curanga/opt/anaconda3/pkgs
                /Users/curanga/.conda/pkgs
envs directories : /Users/curanga/opt/anaconda3/envs
                   /Users/curanga/.conda/envs
platform : osx-64
user-agent : conda/4.8.3 requests/2.22.0 CPython/3.7.6 Darwin/17.7.0 OSX/10.13.6
UID:GID : 3737:63
netrc file : None
offline mode : False
(metaphlan-3.0) curangalt-osx:humann3 curanga$

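Conda is not configured correctly to fetch packages from bioconda and conda-forge: the channel URLs above list only the default repo.anaconda.com channels. You should run the following commands in this exact order (each --add puts its channel at the top of the list, so conda-forge ends up with the highest priority):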

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

Then, from the activated metaphlan-3.0 environment, try installing Python 3.7 and metaphlan again:

conda install python=3.7
conda install metaphlan

Check with conda list | grep metaphlan whether the installed build (3rd column) is pyh5ca1d4c_4. If not, try running conda update metaphlan.
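The same build check can be done programmatically. A small sketch, assuming the whitespace-separated name / version / build / channel columns that conda list prints; installed_build is a hypothetical helper name, not a conda API.

```python
def installed_build(conda_list_output, package):
    """Return (version, build) for `package` from `conda list` text, or None.

    Comment lines (starting with '#') and blank lines are skipped.
    """
    for line in conda_list_output.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0] == package and len(fields) >= 3:
            return fields[1], fields[2]
    return None
```

Feed it the text captured from conda list (for example via subprocess.run(["conda", "list"], capture_output=True, text=True).stdout) and compare the second element of the result with pyh5ca1d4c_4.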

Ok, almost there! Thank you so much for helping me get this far… Now it is an issue with Bowtie2. Please be aware that I had to install humann with “pip install --no-binary :all:” because conda didn’t work. So how do I include Bowtie2?

Best,

Carla

(metaphlan3.0) curangalt-osx:humann3 curanga$ conda list|grep metaphlan
# packages in environment at /Users/curanga/opt/anaconda3/envs/metaphlan3.0:
metaphlan 3.0 pyh5ca1d4c_4 bioconda
(metaphlan3.0) curangalt-osx:humann3 curanga$ humann --version
humann v3.0.0.alpha.2
(metaphlan3.0) curangalt-osx:humann3 curanga$ humann --input demo.fastq --output demo_fastq
Output files will be written to: /Users/curanga/Desktop/humann3/demo_fastq

CRITICAL ERROR: Can not call software version for bowtie2

(metaphlan3.0) curangalt-osx:humann3 curanga$

I have bowtie2 installed, but the output of the call with --version is odd: no version number is printed after "version".

(metaphlan3.0) curangalt-osx:humann3 curanga$ bowtie2 --version
/Users/curanga/opt/anaconda3/envs/metaphlan3.0/bin/bowtie2-align-s version
64-bit
Built on
Sat Jun 6 20:43:32 UTC 2020
Compiler: InstalledDir: /Users/distiller/project/miniconda/conda-bld/bowtie2_1591476031995/_build_env/bin
Options: -O3 -msse2 -funroll-loops -g3 -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -std=c++14 -fmessage-length=0 -isystem /Users/curanga/opt/anaconda3/envs/metaphlan3.0/include -fdebug-prefix-map=/Users/distiller/project/miniconda/conda-bld/bowtie2_1591476031995/work=/usr/local/src/conda/bowtie2-2.4.1 -fdebug-prefix-map=/Users/curanga/opt/anaconda3/envs/metaphlan3.0=/usr/local/src/conda-prefix -DPOPCNT_CAPABILITY -DWITH_TBB -std=c++11 -DNO_SPINLOCK -DWITH_QUEUELOCK=1
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

Here is the conda list call:

(metaphlan3.0) curangalt-osx:humann3 curanga$ conda list | grep bowtie2

bowtie2 2.4.1 py37h8d6d27b_2 bioconda

Best,

Carla

You can install it using conda, but you need to specify the version:
conda install bowtie2=2.3.5.1

The version-reporting function is broken in 2.4.1.
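To illustrate why the missing number matters: a tool that reads the version from the first line of bowtie2 --version finds nothing to parse in the conda 2.4.1 output posted above. The sketch below is my own illustration of that kind of check, not HUMAnN's actual code, and parse_bowtie2_version is a hypothetical name.

```python
import re


def parse_bowtie2_version(version_output):
    """Return the version from the first line of `bowtie2 --version`, or None."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"version\s+(\d+(?:\.\d+)+)", first_line)
    return match.group(1) if match else None


# First line as printed by the self-built binary earlier in this thread.
working = "/Users/curanga/bowtie2/bowtie2-align-s version 2.4.1\n64-bit\n"
# First line as printed by the conda 2.4.1 build: nothing after "version".
broken = "/Users/curanga/opt/anaconda3/envs/metaphlan3.0/bin/bowtie2-align-s version\n64-bit\n"

print(parse_bowtie2_version(working))
print(parse_bowtie2_version(broken))
```

The first call finds a version string and the second finds none, which is consistent with the "Can not call software version for bowtie2" error.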

Fantastic, thank you, this worked. However, I am now trying to use humann3 for peptide identification, and I think my FASTA file is being parsed incorrectly:

1
LLLARKLKLNLLLLLVLAAK
2
DEARRRMRRWWNARHNHPRRRLWVSRMPR
3
HDMCKWGEWWYQYLPTAYYDCMMAK
4
VKLLKKKKKGAKLLVSVCK

Is there a required FASTA format? I would greatly appreciate knowing what it is!

Best,

Carla

#CPU threads: 1
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /Users/curanga/Desktop/humann3/C1peps/C1novor_humann_temp/tmp4bngxi3p
Opening the database… [0s]
Percentage range of top alignment score to report hits: 1
Reference = /Users/curanga/opt/anaconda3/envs/metaphlan3.0/lib/python3.7/site-packages/humann/data/uniref_DEMO/uniref90_demo_prots_v201901.dmnd
Sequences = 44
Letters = 18842
Block size = 2000000000
Opening the input file… [0.001s]
Opening the output file… [0s]
Loading query sequences… [0s]
Error: Error reading input stream at line 2: Invalid character (L) in sequence

HUMAnN assumes you’re starting from nucleotide sequences, hence that first L is raising an eyebrow. You could try just directly searching your peptides against the UniRef90 database using diamond in blastp mode (i.e. protein-to-protein).
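A quick way to see the mismatch is to test whether a sequence contains only nucleotide letters. This is a deliberately simplified sketch: it accepts only A, C, G, T, U and N and ignores the other IUPAC ambiguity codes, and it is not how HUMAnN or diamond validate input.

```python
NUCLEOTIDE_CHARS = set("ACGTUN")


def looks_like_nucleotide(sequence):
    """True if every character is A, C, G, T, U or N (case-insensitive)."""
    residues = sequence.strip().upper()
    return bool(residues) and set(residues) <= NUCLEOTIDE_CHARS


# The first peptide from the post above is rejected at its very first letter.
print(looks_like_nucleotide("LLLARKLKLNLLLLLVLAAK"))
print(looks_like_nucleotide("ACGTTGCAANNN"))
```

Peptide input fails this kind of check immediately, which matches the "Invalid character (L)" message reported for line 2 of the file.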

Yes, I have been doing this as well and have an m8 file I have been working on, but for some reason I get this error. Do you all offer a detailed manual for humann3 by chance? There seems to be a lot to it! Thank you so much for helping with this.

(metaphlan3.0) curangalt-osx:humann3 curanga$ humann --input C1uniref.m8 --input-format blastm8 --output C1peps
Output files will be written to: /Users/curanga/Desktop/humann3/C1peps

Process the blastm8 mapping results …
Traceback (most recent call last):
  File "/Users/curanga/opt/anaconda3/envs/metaphlan3.0/bin/humann", line 33, in <module>
    sys.exit(load_entry_point('humann==3.0.0a2', 'console_scripts', 'humann')())
  File "/Users/curanga/opt/anaconda3/envs/metaphlan3.0/lib/python3.7/site-packages/humann/humann.py", line 1074, in main
    unaligned_reads_store, args.input, alignments)
  File "/Users/curanga/opt/anaconda3/envs/metaphlan3.0/lib/python3.7/site-packages/humann/search/translated.py", line 293, in unaligned_reads
    identity_threshold = config.identity_threshold)
  File "/Users/curanga/opt/anaconda3/envs/metaphlan3.0/lib/python3.7/site-packages/humann/search/blastx_coverage.py", line 40, in blastx_coverage
    for alignment_info in utilities.get_filtered_translated_alignments(blast6out, alignments, apply_filter=apply_filter, log_filter = log_messages, query_coverage_threshold = query_coverage_threshold, identity_threshold = identity_threshold):
  File "/Users/curanga/opt/anaconda3/envs/metaphlan3.0/lib/python3.7/site-packages/humann/utilities.py", line 1308, in get_filtered_translated_alignments
    identity=alignment_info[config.blast_identity_index]
IndexError: list index out of range
(metaphlan3.0) curangalt-osx:humann3 curanga$
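A note on the IndexError: the last frame indexes into the fields of an alignment line, so the error suggests that at least one line in the .m8 file has fewer fields than HUMAnN expects. Below is a minimal sketch for finding such lines. It assumes the standard 12-column, tab-separated BLAST tabular layout, which is my assumption about the expected input rather than something confirmed in this thread; short_m8_lines is a hypothetical helper.

```python
def short_m8_lines(lines, expected_columns=12):
    """Return (line_number, column_count) for data lines with too few fields.

    Blank lines and '#' comment lines are ignored. Fields are split on tabs,
    so a space-separated line counts as a single column.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith("#"):
            continue
        count = len(stripped.split("\t"))
        if count < expected_columns:
            problems.append((number, count))
    return problems
```

For example, with open("C1uniref.m8") as handle: print(short_m8_lines(handle)) lists the offending line numbers. An empty list would mean the column count is not the cause and the problem lies elsewhere.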