The database should be installed using metaphlan --install. For additional installation options, check the manual here: https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0#installation
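As a rough sketch of the command above, the wrapper below installs the database into a directory of your choice through the --bowtie2db option. The function name install_db and the example directory in the usage line are illustrative assumptions, not part of MetaPhlAn.

```shell
# install_db is a hypothetical helper name, not a MetaPhlAn command.
# It creates the target directory and asks MetaPhlAn to download and
# build the database there instead of inside the package folder.
install_db() {
    db_dir="$1"
    mkdir -p "$db_dir" || return 1
    metaphlan --install --bowtie2db "$db_dir"
}
```

Usage, with a made-up target directory: install_db /data/metaphlan_databases. The install step still needs network access to the download host.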
Hi fbeghini,
I installed MetaPhlAn 3, but I cannot download the database, as shown below, because Dropbox is not accessible to us.
(py37) [zhangwenping@localhost clean_data_wuqi]$ metaphlan CRR055205_paired.1.fastq CRR055205_paired.2.fastq --input_type fastq -o CRR055205_paired.txt
Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Traceback (most recent call last):
  File "/data/xmjd/miniconda2/envs/py37/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 916, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/__init__.py", line 269, in check_and_install_database
    index = resolve_latest_database(bowtie2_db, ls_f['mpa_latest'], force_redownload_latest)
UnboundLocalError: local variable 'ls_f' referenced before assignment
Can you tell me how to get the metaphlan_database?
Thank you!
Wenping
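The warning printed before the traceback suggests the crash is a side effect of the failed download: ls_f appears to be assigned only when file_list.txt is fetched successfully. A quick way to confirm that the host is unreachable from your machine is sketched below. can_reach is a hypothetical helper name, and the sketch assumes curl is installed.

```shell
# can_reach is a hypothetical helper, not part of MetaPhlAn.
# It returns 0 when the URL can be fetched within 15 seconds, and
# curl's non-zero exit status otherwise (for example on a timeout).
can_reach() {
    curl -sSfL --max-time 15 -o /dev/null "$1"
}
```

Usage: call can_reach with the Dropbox URL from the warning; a non-zero exit status means the automatic download cannot work from that machine and the database has to be obtained another way.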
I've uploaded the very same files to Zenodo; you can get them from this link.
In the next few days I'll update the code so that it downloads the database from there.
Thanks, I downloaded the data from Zenodo.
Hi fbeghini,
I downloaded the database from Zenodo and ran "bowtie2-build mpa_v30_CHOCOPhlAn_201901.fna mpa_v30_CHOCOPhlAn_201901" inside /data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/metaphlan_databases:
Settings:
Output files: "mpa_v30_CHOCOPhlAn_201901.*.bt2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
mpa_v30_CHOCOPhlAn_201901.fna
Building a SMALL index
Reading reference sizes
Time reading reference sizes: 00:00:09
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:10
bmax according to bmaxDivN setting: 299330357
Using parameters --bmax 224497768 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 224497768 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:30
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:08
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:17
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 1.19732e+09 (target: 224497767)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 1197321429 for bucket 1
(Using difference cover)
Sorting block time: 00:13:28
Returning block of 1197321430 for bucket 1
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 308036136
fchr[G]: 602056561
fchr[T]: 916319996
fchr[$]: 1197321429
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 629609227 bytes to primary EBWT file: mpa_v30_CHOCOPhlAn_201901.1.bt2
Wrote 299330364 bytes to secondary EBWT file: mpa_v30_CHOCOPhlAn_201901.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 1197321429
bwtLen: 1197321430
sz: 299330358
bwtSz: 299330358
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 74832590
offsSz: 299330360
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 6236050
numLines: 6236050
ebwtTotLen: 399107200
ebwtTotSz: 399107200
color: 0
reverse: 0
Total time for call to driver() for forward index: 00:18:13
Reading reference sizes
Time reading reference sizes: 00:00:07
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:09
Time to reverse reference sequence: 00:00:01
bmax according to bmaxDivN setting: 299330357
Using parameters --bmax 224497768 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 224497768 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:29
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:08
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:18
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 1.19732e+09 (target: 224497767)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 1197321429 for bucket 1
(Using difference cover)
Sorting block time: 00:13:29
Returning block of 1197321430 for bucket 1
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 308036136
fchr[G]: 602056561
fchr[T]: 916319996
fchr[$]: 1197321429
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 629609227 bytes to primary EBWT file: mpa_v30_CHOCOPhlAn_201901.rev.1.bt2
Wrote 299330364 bytes to secondary EBWT file: mpa_v30_CHOCOPhlAn_201901.rev.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 1197321429
bwtLen: 1197321430
sz: 299330358
bwtSz: 299330358
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 74832590
offsSz: 299330360
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 6236050
numLines: 6236050
ebwtTotLen: 399107200
ebwtTotSz: 399107200
color: 0
reverse: 1
Total time for backward call to driver() for mirror index: 00:18:12
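The log above reports the forward files (.1.bt2, .2.bt2) and the mirror files (.rev.1.bt2, .rev.2.bt2); a small bowtie2 index also includes .3.bt2 and .4.bt2. A minimal sketch for checking that all six exist and are non-empty follows. check_bt2_index is a hypothetical helper name.

```shell
# check_bt2_index is a hypothetical helper, not a bowtie2 command.
# $1 = index prefix (path without the .N.bt2 suffix).
# Returns 0 if all six small-index files exist and are non-empty,
# 1 otherwise, naming each missing file on stderr.
check_bt2_index() {
    prefix="$1"
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -s "${prefix}.${suffix}" ]; then
            echo "missing or empty: ${prefix}.${suffix}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: check_bt2_index followed by the full path to the mpa_v30_CHOCOPhlAn_201901 prefix inside the metaphlan_databases directory.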
Because of limited space on the /home disk, I installed conda and MetaPhlAn on the /data disk. I ran "bowtie2 --sam-no-hd --sam-no-sq --no-unal --very-sensitive -S metagenome.sam -x /data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901 -U /data/liying_metagenome/clean_data_liying/SRR5130527_paired.1.fastq" to get the SAM file.
However, when I ran "metaphlan metagenome.sam --input_type sam -o profiled_metagenome.txt", I got the following errors:
(py37) [zhangwenping@localhost try_humann]$ metaphlan metagenome.sam --input_type sam -o profiled_metagenome.txt
Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Traceback (most recent call last):
  File "/data/xmjd/miniconda2/envs/py37/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 916, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/__init__.py", line 269, in check_and_install_database
    index = resolve_latest_database(bowtie2_db, ls_f['mpa_latest'], force_redownload_latest)
UnboundLocalError: local variable 'ls_f' referenced before assignment
What should I do to fix the errors?
Thank you very much!
Wenping
Hi Wenping,
just launch metaphlan with -x mpa_v30_CHOCOPhlAn_201901 --bowtie2db /data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/metaphlan_databases/
This avoids downloading the files: if you specify the index name and the database is already built, MetaPhlAn will use it automatically.
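Put together with the SAM command from the question, the suggestion above could look like the sketch below. The paths and index name are the ones quoted in this thread; profile_sam is a hypothetical helper name.

```shell
# Values taken from this thread; adjust them to your own installation.
DB_DIR="/data/xmjd/miniconda2/envs/py37/lib/python3.7/site-packages/metaphlan/metaphlan_databases"
INDEX="mpa_v30_CHOCOPhlAn_201901"

# profile_sam is a hypothetical helper name, not a MetaPhlAn command.
# $1 = SAM file produced by bowtie2, $2 = output profile.
# Naming the index with -x and its folder with --bowtie2db lets
# MetaPhlAn use the local database rather than try to download one.
profile_sam() {
    metaphlan "$1" --input_type sam -x "$INDEX" --bowtie2db "$DB_DIR" -o "$2"
}
```

Usage: profile_sam metagenome.sam profiled_metagenome.txt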
Thank you very much. With your help, I fixed the errors.