Describe the bug
MetaPhlAn totally ignores the installed Bowtie2 database and tries (and fails) to re-download one. I have reproduced this behavior with multiple databases, both on a cluster running Alpine Linux and with the official MetaPhlAn Docker images.
- Make a folder called my-awesome-data and, in it, place corresponding R1 and R2 fastq.gz files, e.g.,
- Install the docker image:
docker pull biobakery/metaphlan
- Run the docker image with the provided files, like so (replacing /path/to/me with the result of
sudo docker run -it --rm \
  -v /path/to/me/my-awesome-data:/data \
  biobakery/metaphlan metaphlan \
  /data/AP0F_ST2_MetaAir_S451_L004_R1_001.trimmed.fastq.gz, \
  /data/AP0F_ST2_MetaAir_S451_L004_R2_001.trimmed.fastq.gz \
  --input_type fastq \
  --bowtie2db /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103 \
  --bowtie2out /data/bowtie-output \
  --nproc 25 \
  -o /data/arbitrary-output.txt
And observe as it totally ignores the installed databases in
/usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103 and instead tries and fails to reinstall them:
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_latest
Downloading file of size: 0.00 MB
0.01 MB 25600.00 % 31.36 MB/sec 0 min -0 sec
Downloading MetaPhlAn database
Please note due to the size this might take a few minutes
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vOct22_CHOCOPhlAnSGB_202212.tar
Downloading file of size: 2884.91 MB
2884.91 MB 100.00 % 8.55 MB/sec 0 min -0 sec
Downloading http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vOct22_CHOCOPhlAnSGB_202212.md5
Downloading file of size: 0.00 MB
0.01 MB 11702.86 % 45.57 MB/sec 0 min -0 sec
Decompressing /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103/mpa_vOct22_CHOCOPhlAnSGB_202212_VSG.fna.bz2 into /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103/mpa_vOct22_CHOCOPhlAnSGB_202212_VSG.fna
Decompressing /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103/mpa_vOct22_CHOCOPhlAnSGB_202212_SGB.fna.bz2 into /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103/mpa_vOct22_CHOCOPhlAnSGB_202212_SGB.fna
Joining FASTA databases
Building Bowtie2 indexes
Removing uncompressed databases
Download complete
No MetaPhlAn BowTie2 database found (--index option)!
Expecting location /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103/mpa_vOct22_CHOCOPhlAnSGB_202212
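For anyone trying to reproduce this, the final error names an index basename (mpa_vOct22_CHOCOPhlAnSGB_202212) that differs from the folder name passed to --bowtie2db. A quick way to see which Bowtie2 index basenames actually exist in a folder is sketched below. This is only a rough helper, not part of MetaPhlAn: the function name list_bt2_indexes is made up for illustration, and it relies solely on the standard Bowtie2 naming convention, where an index consists of files such as <basename>.1.bt2 (or <basename>.1.bt2l for large indexes).

```shell
# Hypothetical helper (not part of MetaPhlAn or Bowtie2): print the basename
# of every Bowtie2 index found directly inside the given directory.
list_bt2_indexes() {
  dir="$1"
  for f in "$dir"/*.1.bt2 "$dir"/*.1.bt2l; do
    # Skip glob patterns that matched nothing.
    [ -e "$f" ] || continue
    # Skip the reverse-index files (<basename>.rev.1.bt2), which also match the glob.
    case "$f" in
      *.rev.1.bt2|*.rev.1.bt2l) continue ;;
    esac
    b=$(basename "$f")
    b=${b%.1.bt2l}   # strip the large-index suffix, if present
    b=${b%.1.bt2}    # strip the small-index suffix, if present
    printf '%s\n' "$b"
  done
}

# Example (assumes the function is available in a shell inside the container):
# list_bt2_indexes /usr/local/lib/python3.6/dist-packages/metaphlan/metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103
```

Comparing the printed basenames with the "Expecting location" line in the log would show whether the Jan21 index is present in that folder while MetaPhlAn is looking for the Oct22 one.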
Platform (please complete the following information):
- Version is the latest image on Docker Hub. But I also replicated this behavior installing via conda and pip (I tried both!) on Alpine Linux. In the Docker image, I get:
(base) max@max-XPS-13-9310:~/projects/emily/get-phyloflash-to-work$ sudo docker run -it --rm biobakery/metaphlan metaphlan --version
MetaPhlAn version 4.0.2 (22 Sep 2022)
- Download source is Docker Hub but also I tried both pip and conda.
Happy to provide additional context if needed, but seeing as how the error can be directly reproduced using the publicly available Docker image, I think this should be sufficient.
I also reported this on GitHub Issues, but it was auto-closed. When I was directed to post here, I noticed there is no tag for bug reports. This should be rectified, IMHO. That being said, thank you for the help! We appreciate it.