Dear MetaPhlAn developers,
I am having problems downloading and using the MetaPhlAn 4 database.
I installed MetaPhlAn with conda, then tried to install the database using:
metaphlan --install --bowtie2db [my folder]
I then got an error:
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_latest
Downloading file of size: 0.00 MB
0.01 MB 25600.00 % 43.29 MB/sec 0 min -0 sec
Downloading MetaPhlAn database
Please note due to the size this might take a few minutes
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103.tar
Downloading file of size: 2623.07 MB
2623.05 MB 100.00 % 261.05 MB/sec 0 min 0 sec
Warning: Unable to download http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103.tar
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103.md5
Downloading file of size: 0.00 MB
MD5 checksums do not correspond! If this happens again, you should remove the database files and rerun MetaPhlAn so they are re-downloaded
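The checksum can also be checked by hand on the downloaded files. This is only a minimal sketch: it assumes GNU `md5sum` is available and that the `.md5` file begins with the hex digest, which I have not confirmed. `md5_matches` is a helper name made up for illustration, not part of MetaPhlAn.

```shell
# Compare a file's MD5 digest with the first field of a .md5 file.
# Assumption: the .md5 file starts with the hex digest (the usual md5sum layout).
md5_matches() {
    file=$1
    md5_file=$2
    # Expected digest: first whitespace-separated field on the first line.
    expected=$(awk 'NR == 1 { print $1 }' "$md5_file")
    # Actual digest of the downloaded file.
    actual=$(md5sum "$file" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}

# Example call (run inside the database folder):
# md5_matches mpa_vJan21_CHOCOPhlAnSGB_202103.tar mpa_vJan21_CHOCOPhlAnSGB_202103.md5 \
#     && echo "checksum OK" || echo "checksum mismatch"
```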
After trying multiple times, I downloaded the files manually from the "Index of /biobakery4/metaphlan_databases" page, extracted them, and got four files:
mpa_vJan21_CHOCOPhlAnSGB_202103.pkl
mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.fna.bz2
mpa_vJan21_CHOCOPhlAnSGB_202103_VINFO.csv
mpa_vJan21_CHOCOPhlAnSGB_202103_VSG.fna.bz2
I built a Bowtie2 index from the larger FASTA file (SGB).
The database folder now contains:
./mpa_vJan21_CHOCOPhlAnSGB_202103.md5
./mpa_vJan21_CHOCOPhlAnSGB_202103.pkl
./mpa_vJan21_CHOCOPhlAnSGB_202103.tar
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.1.bt2l
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.2.bt2l
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.3.bt2l
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.4.bt2l
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.fna
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.rev.1.bt2l
./mpa_vJan21_CHOCOPhlAnSGB_202103_SGB.rev.2.bt2l
./mpa_vJan21_CHOCOPhlAnSGB_202103_VINFO.csv
./mpa_vJan21_CHOCOPhlAnSGB_202103_VSG.fna
./mpa_vJan21_CHOCOPhlAnSGB_202103_marker_info.txt
./mpa_vJan21_CHOCOPhlAnSGB_202103_species.txt
I then ran this command:
metaphlan example.fq.gz --input_type fastq -o example.txt --index mpa_vJan21_CHOCOPhlAnSGB_202103_SGB --bowtie2db ~/metaphlan4_db/
I got this error:
Error: Unable to find the mpa_pkl file at: mpa_pklExiting...
Maybe the problem is that the name of the pkl file does not match the index name I used (my Bowtie2 index files have an extra _SGB suffix). However, when I remove the suffix from --index, MetaPhlAn does not recognize the database at all.
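To make the suspected mismatch concrete, here is a small sketch that reports which files in the folder share a given index name. It rests on my assumption, taken only from the error message, that MetaPhlAn looks for `<index>.pkl` and `<index>.*.bt2l` in the `--bowtie2db` folder. `check_index_files` is a hypothetical helper, not a MetaPhlAn command.

```shell
# Report which database files in a folder share a given index name.
# Assumption: MetaPhlAn expects "<index>.pkl" plus "<index>.*.bt2l" files.
check_index_files() {
    db_dir=$1
    index=$2
    if [ -e "$db_dir/$index.pkl" ]; then
        echo "pkl: found"
    else
        echo "pkl: missing"
    fi
    # Count Bowtie2 large-index files that start with the same prefix.
    n=$(find "$db_dir" -maxdepth 1 -name "$index.*.bt2l" | wc -l | tr -d ' ')
    echo "bt2l files: $n"
}

# Example calls:
# check_index_files ~/metaphlan4_db mpa_vJan21_CHOCOPhlAnSGB_202103_SGB
# check_index_files ~/metaphlan4_db mpa_vJan21_CHOCOPhlAnSGB_202103
```

With the file listing above, the first call should report a missing pkl and six bt2l files, and the second a pkl but no bt2l files, which would match both failures I see.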
Also, why couldn’t I install the database in the first place?
I would appreciate your help.
Thanks in advance!