Unable to download metaphlan_databases

Hi,
I'm trying to use MetaPhlAn2 to look at microbiome abundance in several samples, but when I run it I get the following error. At first I used this command:
metaphlan2.py /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq --bowtie2out SRR7828865.bowtie2.bz2 --nproc 4 --input_type fastq > profiled_SRR7828865.txt

Downloading MetaPhlAn2 database
Please note due to the size this might take a few minutes

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Traceback (most recent call last):
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 1580, in <module>
    metaphlan2()
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 1373, in metaphlan2
    check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['offline'])
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 842, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 749, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
UnboundLocalError: local variable 'ls_f' referenced before assignment

It seems the server cannot access Dropbox, so I downloaded the databases manually and used this command:
metaphlan2.py /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq --bowtie2_exe /pub/yuanjian/lsd/Biosoft/bowtie -x /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/mpa_v20_m200 --bowtie2db /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/ --bowtie2out SRR7828865.bowtie2.bz2 --nproc 4 --input_type fastq > profiled_SRR7828865.txt
Then I got:
Downloading MetaPhlAn2 database
Please note due to the size this might take a few minutes
File /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/file_list.txt already present!
Traceback (most recent call last):
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 1580, in <module>
    metaphlan2()
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 1373, in metaphlan2
    check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['offline'])
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 842, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan2.py", line 749, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
KeyError: 'mpa_/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/mpa_v20_m200.tar'
Any help would be greatly appreciated.

Have you built the bowtie2 indexes from mpa_v20_m200.fna?

hi, fbeghini,
I have built the bowtie2 indexes from mpa_v20_m200.fna:
yuanjian@localhost:/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases$ bowtie2-build --threads 4 mpa_v20_m200.fna mpa_v20_m200

I've added the --offline flag to avoid downloading the database. Also, if you specify -x mpa_v20_m200, it should not download anything and should just use the manually built database. Have you tried using these parameters?

I added the --offline parameter, so the command is now:
metaphlan2.py /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq --offline --bowtie2_exe /pub/yuanjian/lsd/Biosoft/bowtie -x /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/mpa_v20_m200 --bowtie2db /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/ --bowtie2out SRR7828865.bowtie2.bz2 --nproc 4 --input_type fastq > profiled_SRR7828865.txt
Then it shows the following:
Warning! Biom python library not detected!
Exporting to biom format will not work!
No database files found in /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/. Exiting.

Can you post the output of ls -l /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases ?
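
Alongside the ls -l output, it can help to count the Bowtie2 index files directly. The sketch below assumes the index was built with the prefix mpa_v20_m200, as in the bowtie2-build command earlier in the thread, and that it is a small index (large indexes use the .bt2l extension instead of .bt2). The function name count_index_files is made up for this example.

```shell
# Hypothetical helper: count Bowtie2 index files for one prefix in one directory.
# $1: directory that would be passed to --bowtie2db
# $2: index prefix as given to bowtie2-build, e.g. mpa_v20_m200
count_index_files() {
    n=0
    for f in "$1"/"$2"*.bt2; do
        # If nothing matches, the pattern stays literal and -e is false.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

count_index_files /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases mpa_v20_m200
```

A complete bowtie2-build run for a small index writes six .bt2 files (.1, .2, .3, .4, .rev.1 and .rev.2), so a lower count suggests the build did not finish or wrote its output somewhere else.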

You should use only -x mpa_v20_m200; -x specifies the name of the database located under --bowtie2db.

Hi fbeghini,
I have tried what you advised, and the command is:
metaphlan2.py /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq --offline --bowtie2_exe /pub/yuanjian/lsd/Biosoft/bowtie -x mpa_v20_m200 --bowtie2db /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/ --bowtie2out SRR7828865.bowtie2.bz2 --nproc 4 --input_type fastq > profiled_SRR7828865.txt
But it shows the same error:
Warning! Biom python library not detected!
Exporting to biom format will not work!
No database files found in /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/. Exiting.
:sob:

My bad, sorry, I thought you were using MetaPhlAn3. With MetaPhlAn2 you need to specify just v20_m200 with -x.

Yes, I am using MetaPhlAn2, version 2.8.1, and I did specify just v20_m200 with -x:
metaphlan2.py /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq --offline --bowtie2_exe /pub/yuanjian/lsd/Biosoft/bowtie -x /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/mpa_v20_m200 --bowtie2out SRR7828865.bowtie2.bz2 --nproc 4 --input_type fastq > profiled_SRR7828865.txt
And it still shows this:
Warning! Biom python library not detected!
Exporting to biom format will not work!
No database files found in /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases. Exiting.
I do not know what is wrong with it.

The command you used is wrong. As I mentioned in my previous post, -x takes only the database name, not the path of the directory containing it. The correct command line is

metaphlan2.py \
  /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq \
  --offline --bowtie2_exe /pub/yuanjian/lsd/Biosoft/bowtie \
  -x v20_m200 \
  --bowtie2db /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases \
  --bowtie2out SRR7828865.bowtie2.bz2 \
  --nproc 4 --input_type fastq > profiled_SRR7828865.txt
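
The split between the two options can be sketched in shell. This is only an illustration, assuming you start from the full index prefix that was passed to bowtie2-build: the directory part goes to --bowtie2db, and the file name without its leading mpa_ goes to -x (the tracebacks above show the script prepending "mpa_" to the name itself). The variable names are made up for this example.

```shell
# Full prefix of the manually built index, as used earlier in the thread.
index_prefix="/pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/mpa_v20_m200"

# Directory that holds the index files: value for --bowtie2db.
bowtie2db="$(dirname "$index_prefix")"

# File-name part with the leading "mpa_" stripped: value for -x.
index_name="$(basename "$index_prefix")"
index_name="${index_name#mpa_}"

echo "--bowtie2db $bowtie2db -x $index_name"
```

Passing the full path to -x instead would produce a lookup key like the one in the KeyError above, which starts with mpa_ followed by the entire path.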

Hi fbeghini,
Sorry to bother you again. I have tried the command you suggested, and now it shows a new error:

Traceback (most recent call last):
  File "/pub/yuanjian/lsd/Biosoft/metaphlan/utils/read_fastx.py", line 9, in <module>
    from Bio import SeqIO
ModuleNotFoundError: No module named 'Bio'
OSError: fatal error running '/pub/yuanjian/lsd/Biosoft/metaphlan/utils/read_fastx.py'. Is it in the system path?
This error also appeared earlier, when I used this command line:
metaphlan2.py /pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_1.fastq,/pub/yuanjian/lsd/Biosoft/bowtie/alignment_out/unaligned_fq/SRR7828865_unaligned_R_2.fastq --bowtie2_exe /pub/yuanjian/lsd/Biosoft/bowtie --bowtie2db /pub/yuanjian/lsd/Biosoft/metaphlan/metaphlan_databases/MetaPhlAn_databases/ --bowtie2out SRR7828865.bowtie2.bz2 --nproc 4 --input_type fastq > profiled_SRR7828865.txt

How was MetaPhlAn installed? Have you installed all the dependencies?
The command line used is correct, but the program fails to run because biopython is missing and read_fastx.py (a utility script used to parse the metagenome) is not found in the path.
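
A rough way to check both points before rerunning is sketched below. It assumes python3 is the interpreter that runs metaphlan2.py, which may not be the case on every system; on_path is a made-up helper name.

```shell
# Hypothetical helper: succeeds if the given command name resolves on PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Scripts and tools mentioned in this thread.
for tool in metaphlan2.py read_fastx.py bowtie2; do
    if on_path "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done

# biopython provides the Bio module that read_fastx.py imports.
if python3 -c "import Bio" >/dev/null 2>&1; then
    echo "biopython: importable"
else
    echo "biopython: not importable by python3"
fi
```

If read_fastx.py is reported as missing, adding the utils folder of the cloned repository to PATH is one possible fix; the exact folder depends on where the repository was cloned.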

Sorry, I don't remember exactly how I installed MetaPhlAn2; probably by cloning the repository with the following command: $ git clone https://github.com/biobakery/metaphlan
I have just installed biopython and rerun the command, but now it fails at bowtie2: OSError: "[Errno 13] Permission denied: '/pub/yuanjian/lsd/Biosoft/bowtie'"
Fatal error running BowTie2. Is BowTie2 in the system path?
Bowtie2 is already in the path: bowtie2: /pub/yuanjian/lsd/Biosoft/bowtie/bowtie2-2.4.1-linux-x86_64/bowtie2

If the folder is not listed in the output of echo $PATH, you can tell MetaPhlAn which executable to use with the --bowtie2_exe parameter.
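
One thing worth checking here: the path in the Permission denied message looks like the folder that contains the Bowtie2 release rather than the bowtie2 binary itself, and trying to execute a folder typically fails in this way. The sketch below classifies a candidate --bowtie2_exe value; check_exe is a hypothetical helper, not part of MetaPhlAn.

```shell
# Hypothetical helper: report what kind of path $1 is.
check_exe() {
    if [ -d "$1" ]; then
        echo "directory"      # a folder cannot be executed
    elif [ -f "$1" ] && [ -x "$1" ]; then
        echo "executable"     # a regular file with execute permission
    else
        echo "missing"        # absent, or a file without execute permission
    fi
}

# The two paths that appear in this thread.
check_exe /pub/yuanjian/lsd/Biosoft/bowtie
check_exe /pub/yuanjian/lsd/Biosoft/bowtie/bowtie2-2.4.1-linux-x86_64/bowtie2
```

On the poster's machine the first call would be expected to report a directory, in which case pointing --bowtie2_exe at the second path is the natural thing to try.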

Thank you so much!!! The problem has been solved. It works!!!

Hi fbeghini,
Sorry to bother you again.
I have run into a new problem: how can I solve the issue above when I use HUMAnN2 to analyse the functional profile of a metagenome? I see that it automatically runs metaphlan2.py first!
Looking forward to your reply!

The first step run by HUMAnN2 is MetaPhlAn2, which detects the species that can be profiled. To use it from HUMAnN2, you need to update your MetaPhlAn2 install to the latest build (2.8.1) from GitHub and run HUMAnN2 with the option --metaphlan-options "--offline". If you already have the MetaPhlAn2 profile for the sample, you can use HUMAnN2's --taxonomic-profile <path to MetaPhlAn2 profile> option.
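
The two routes could look roughly like the sketch below, which only prints the command lines so they can be inspected before running. The file and folder names are placeholders, humann2_cmd is a made-up helper, and --input and --output are HUMAnN2's usual input and output arguments. The option value is written with = so that a value beginning with dashes stays attached to its option; treat that spelling as an assumption and check it against your HUMAnN2 version.

```shell
# Hypothetical helper: print a HUMAnN2 command line for one sample.
# $1: input reads, $2: output directory, $3: optional existing MetaPhlAn2 profile
humann2_cmd() {
    if [ -n "${3:-}" ]; then
        # Reuse an existing profile, so MetaPhlAn2 is not run again.
        echo "humann2 --input $1 --output $2 --taxonomic-profile $3"
    else
        # Let HUMAnN2 run MetaPhlAn2, but without the database download.
        echo "humann2 --input $1 --output $2 --metaphlan-options=\"--offline\""
    fi
}

humann2_cmd sample.fastq humann2_out
humann2_cmd sample.fastq humann2_out profiled_SRR7828865.txt
```

Reusing the existing profile avoids the MetaPhlAn2 step entirely, which may be the simpler choice when the profile has already been produced as in this thread.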

Thank you so much, I will give it a try!