How to install the latest metaphlan reference DB?

Hi there:

MetaPhlAn is very cool and has been very helpful to me. I have a question about installing ‘mpa_vJan21_CHOCOPhlAnSGB_202103’. After downloading ‘mpa_vJan21_CHOCOPhlAnSGB_202103.tar’ to the specified location, I used the command from GitHub, "metaphlan --install --index mpa_vJan21_CHOCOPhlAnSGB_202103 --bowtie2db <The database folder>", to build the MetaPhlAn database. But it does not work:


The MetaPhlAn database is installed at the following location:

So I deleted ‘mpa_vJan21_CHOCOPhlAnSGB_202103.fna’ and ran the metaphlan --install command again:

But metaphlan does not generate the .bt2 files (like mpa_v31_CHOCOPhlAn_201901.1.bt2). So how do I build mpa_vJan21_CHOCOPhlAnSGB_202103 (the latest MetaPhlAn reference DB)?

Any help would be appreciated!
Thanks!

Why did it create mpa_v31_CHOCOPhlAn_201901.1.bt2 for you?
For me it is still installing mpa_v30_CHOCOPhlAn_201901
because that is set in: http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_latest

I found the version you mentioned here:
http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/

Hi @shengxinf ,
Thanks for getting in touch. The code to use the mpa_vJan21 version of the MetaPhlAn database (i.e., MetaPhlAn 4) is not available yet; we hope to release it in the coming weeks. In the meantime, the latest stable version of MetaPhlAn is mpa_v3.0. To install this version correctly, you should just run metaphlan --install

I have installed MetaPhlAn version 4.0.2 (22 Sep 2022) and humann v3.5 in a conda environment. I tried to run the command

humann --input demo.fastq.gz --output demo_fastq --threads 4

but I get the same error “No MetaPhlAn BowTie2 database found (--index option)!” with different MetaPhlAn databases: mpa_v30_CHOCOPhlAn_201901 and mpa_v31_CHOCOPhlAn_201901.

In the case of mpa_vJan21_CHOCOPhlAnSGB_202103, metaphlan --install creates bt2l.tmp files instead of bt2.

If I downgrade to metaphlan 3.1 and humann 3.1, the problems disappear.

Hi @imontero
The bowtie2 database for mpa_vJan21 is a large database, so the output format is bt2l instead of bt2. If you are still seeing the .tmp files, it might mean that the database build did not finish correctly.
My suggestion would be to remove the mpa_vJan21_CHOCOPhlAnSGB_202103 files and try to download the db again with metaphlan --force_download
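
A minimal sketch of how the leftover temporary files could be spotted before retrying, assuming a POSIX shell. The function name list_partial_index_files and its two arguments are made up for illustration and are not part of MetaPhlAn or bowtie2; it only looks for files that start with the index name and end in .tmp.

```shell
# Hypothetical helper: print leftover temporary index files for one index name.
# $1 = database folder, $2 = index name (e.g. mpa_vJan21_CHOCOPhlAnSGB_202103)
list_partial_index_files() {
    db_dir=$1
    index=$2
    for f in "$db_dir/$index".*.tmp; do
        # If nothing matches, the pattern itself is left in $f; skip it.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Example use, with a folder name chosen for illustration:
#   list_partial_index_files ./metaphlan_db mpa_vJan21_CHOCOPhlAnSGB_202103
```

If the function prints anything, the build most likely stopped partway and those files can be removed before downloading again.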

metaphlan --install --force_download --bowtie2db ./ --nproc 4

Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_latest
Downloading file of size: 0.00 MB
0.01 MB 25600.00 % 130.03 MB/sec 0 min -0 sec
Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103.tar
Downloading file of size: 2623.07 MB
2623.07 MB 100.00 % 12.46 MB/sec 0 min -0 sec
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103.md5
Downloading file of size: 0.00 MB
0.01 MB 11702.86 % 64.38 MB/sec 0 min -0 sec

Decompressing ./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.fna.bz2 into ./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.fna

Decompressing ./mpa_vJan21_CHOCOPhlAnSGB_202103_VSG.fna.bz2 into ./mpa_vJan21_CHOCOPhlAnSGB_202103_VSG.fna

Joining FASTA databases

Building Bowtie2 indexes
Fatal error running 'bowtie2-build --quiet --threads 4 -f ./mpa_vJan21_CHOCOPhlAnSGB_202103.fna ./mpa_vJan21_CHOCOPhlAnSGB_202103'
Error message: 'Command '['bowtie2-build', '--quiet', '--threads', '4', '-f', './mpa_vJan21_CHOCOPhlAnSGB_202103.fna', './mpa_vJan21_CHOCOPhlAnSGB_202103']' returned non-zero exit status 247.'

Hi @imontero
From your error, it seems you ran out of resources (probably RAM) when generating the indexes. The MetaPhlAn 4 database is significantly larger than the version 3 one, and building it typically requires around 16-20 GB of RAM
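
One possible reading of exit status 247, offered as an assumption and not confirmed against the bowtie2 source: when a process is killed by signal N and a wrapper passes on a negative return code -N, the shell sees it truncated to one byte, i.e. 256 - N. Under that assumption, 247 corresponds to signal 9 (SIGKILL), which is the signal the Linux out-of-memory killer sends. The helper name below is hypothetical.

```shell
# Hypothetical helper: undo the one-byte wrap-around of a negative return code.
# Assumes the status came from "-N" being truncated to 256 - N.
status_to_wrapped_signal() {
    echo $(( 256 - $1 ))
}

status_to_wrapped_signal 247   # prints 9
```

This would be consistent with the process being killed for using too much memory, but it is only a hint; the system log is the place to confirm it.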

Thank you for your advice. It is strange, because my PC has 32 GB of RAM. However, I will upgrade it to 64 GB soon and try again.

I also ran into the same sudden error (process exit status 247) when executing bowtie2-build for MetaPhlAn 4.
I had around 28 GB of RAM available for a virtual machine, but the process terminated suddenly.
Bowtie2 did not finish and left me with the following files:

(base) bernhard@system metaphlan_db % ls -lah
total 109544408
drwxr-xr-x  30 bernhard  staff   960B 24 Okt 13:03 .
drwxr-xr-x@ 10 bernhard  staff   320B 22 Okt 22:05 ..
-rw-r--r--@  1 bernhard  staff    10K 22 Okt 23:44 .DS_Store
-rw-r--r--   1 bernhard  staff    32B 22 Okt 22:06 mpa_latest
-rw-r--r--   1 bernhard  staff   4,1G 24 Okt 04:45 mpa_vJan21_CHOCOPhlAnSGB_202103.1.bt2l.tmp
-rw-r--r--   1 bernhard  staff   4,4G 24 Okt 04:32 mpa_vJan21_CHOCOPhlAnSGB_202103.2.bt2l.tmp
-rw-r--r--   1 bernhard  staff    85M 23 Okt 16:46 mpa_vJan21_CHOCOPhlAnSGB_202103.3.bt2l.tmp
-rw-r--r--   1 bernhard  staff   2,2G 23 Okt 16:46 mpa_vJan21_CHOCOPhlAnSGB_202103.4.bt2l.tmp
-rw-r--r--   1 bernhard  staff    10G 23 Okt 13:07 mpa_vJan21_CHOCOPhlAnSGB_202103.fna
-rw-r--r--   1 bernhard  staff    70B 22 Okt 22:22 mpa_vJan21_CHOCOPhlAnSGB_202103.md5
-rw-rw-r--   1 bernhard  staff    53M  1 Apr  2022 mpa_vJan21_CHOCOPhlAnSGB_202103.pkl
-rw-r--r--   1 bernhard  staff   1,7G 24 Okt 13:06 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.1.bt2l.tmp
-rw-r--r--   1 bernhard  staff   2,3G 24 Okt 13:06 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.2.bt2l.tmp
-rw-r--r--   1 bernhard  staff   2,2G 24 Okt 11:33 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.24.sa
-rw-r--r--   1 bernhard  staff   2,2G 24 Okt 12:22 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.25.sa
-rw-r--r--   1 bernhard  staff   1,8G 24 Okt 12:26 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.26.sa
-rw-r--r--   1 bernhard  staff   1,7G 24 Okt 12:38 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.27.sa
-rw-r--r--   1 bernhard  staff   2,0G 24 Okt 13:01 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.28.sa
-rw-r--r--   1 bernhard  staff   1,4G 24 Okt 12:40 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.29.sa
-rw-r--r--   1 bernhard  staff   1,0G 24 Okt 13:06 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.30.sa
-rw-r--r--   1 bernhard  staff   918M 24 Okt 13:05 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.31.sa
-rw-r--r--   1 bernhard  staff   667M 24 Okt 13:06 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.32.sa
-rw-r--r--   1 bernhard  staff   568M 24 Okt 13:06 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.33.sa
-rw-r--r--   1 bernhard  staff   133M 24 Okt 13:06 mpa_vJan21_CHOCOPhlAnSGB_202103.rev.34.sa
-rw-r--r--@  1 bernhard  staff   2,6G 22 Okt 22:22 mpa_vJan21_CHOCOPhlAnSGB_202103.tar
-rw-r--r--   1 bernhard  staff   9,2G 23 Okt 11:33 mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.fna
-rw-rw-r--   1 bernhard  staff    43K 11 Jun  2021 mpa_vJan21_CHOCOPhlAnSGB_202103_VINFO.csv
-rw-r--r--   1 bernhard  staff   841M 23 Okt 10:39 mpa_vJan21_CHOCOPhlAnSGB_202103_VSG.fna
-rw-r--r--@  1 bernhard  staff    29M 22 Aug 16:51 mpa_vJan21_CHOCOPhlAnSGB_202103_marker_info.txt.bz2
-rw-r--r--@  1 bernhard  staff   380K 25 Aug 10:25 mpa_vJan21_CHOCOPhlAnSGB_202103_species.txt.bz2

I guess it is better to stick with the older MetaPhlAn3 version then or move to a system with more RAM.
Best regards, Bernhard

Had the same issue. 32 GB on the remote cluster was not enough. Allocated 64 GB and it worked (probably you don’t need 64 :grinning: ).
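
A rough way to check free memory before starting the build, as a sketch that assumes a Linux machine where /proc/meminfo reports MemAvailable. The function name and the optional file argument are illustrative only; the threshold to compare against is left open, since the reports in this thread differ on how much is needed.

```shell
# Print available memory in whole GiB, read from a meminfo-style file.
# Defaults to /proc/meminfo (Linux only); the argument exists for testing.
mem_available_gib() {
    meminfo=${1:-/proc/meminfo}
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "$meminfo"
}

# Example use on a Linux host (prints one integer):
#   mem_available_gib
```

On a cluster, the memory granted to the job by the scheduler matters more than what the node reports, so this check is only meaningful inside the allocation.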

Yes, building the new MetaPhlAn 4 database requires a large amount of RAM. We are currently working on making the prebuilt database available for download to avoid this inconvenience. We will keep you posted.

I uploaded a precomputed version of the bt2 database here: http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/bowtie2_indexes/mpa_vJan21_CHOCOPhlAnSGB_202103_bt2.tar
Let me know if you have any problems with it
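
A sketch of how the precomputed archive might be unpacked into the database folder, assuming it has already been downloaded (for example with wget or curl) and that it is a plain tar file whose members are the index files themselves. The archive layout is an assumption, not something stated in this thread, and unpack_index_tar is a hypothetical helper name.

```shell
# Hypothetical helper: unpack a downloaded index archive into the database
# folder and list what ended up there so the result can be checked by eye.
# $1 = path to the tar file, $2 = database folder (created if missing)
unpack_index_tar() {
    tar_file=$1
    db_dir=$2
    mkdir -p "$db_dir" || return 1
    tar -xf "$tar_file" -C "$db_dir" || return 1
    ls "$db_dir"
}

# Example use, with paths chosen for illustration:
#   unpack_index_tar mpa_vJan21_CHOCOPhlAnSGB_202103_bt2.tar ./metaphlan_db
```

The folder can then be passed to metaphlan with --bowtie2db, together with --index mpa_vJan21_CHOCOPhlAnSGB_202103, which should avoid running bowtie2-build locally if the index files are found.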