Metaphlan4 --mpa3 --add_viruses failed

Hi all! I am trying to run MetaPhlAn 4 with the --add_viruses and --mpa3 parameters to enable profiling of viral organisms, but it returns the error: No MetaPhlAn BowTie2 database found (--index option)! If I remove these two parameters, it runs smoothly. Below are my command, the error, and the contents of the Bowtie2 database directory I used.

  1. my commands:
metaphlan  $FQ_DIRECTORY/$SAMPLE.fastq.gz --nproc 16 --read_min_len 30 --input_type fastq --bowtie2out $SAMPLE.bowtie2.out -o $SAMPLE.metaphlan4.txt --bowtie2db /Databases/Metaphlan4 --mpa3 --add_viruses 
  2. the error:
No MetaPhlAn BowTie2 database found (--index option)!
Expecting location bowtie2db
  3. the files in the Bowtie2 database directory:
mpa_vJan21_CHOCOPhlAnSGB_202103.4.bt2l      mpa_vJan21_CHOCOPhlAnSGB_202103.rev.2.bt2l
mpa_vJan21_CHOCOPhlAnSGB_202103.1.bt2l  mpa_vJan21_CHOCOPhlAnSGB_202103.md5         mpa_vJan21_CHOCOPhlAnSGB_202103.tar
mpa_vJan21_CHOCOPhlAnSGB_202103.2.bt2l  mpa_vJan21_CHOCOPhlAnSGB_202103.pkl         mpa_vJan21_CHOCOPhlAnSGB_202103_VINFO.csv
mpa_vJan21_CHOCOPhlAnSGB_202103.3.bt2l  mpa_vJan21_CHOCOPhlAnSGB_202103.rev.1.bt2l  mpa_vJan21_CHOCOPhlAnSGB_202103_VSG.fna

Really appreciate your help and time! Thanks in advance!

Hi @Nirvana
Currently, the MetaPhlAn 4 vJan21 database does not profile viral species. To use the --add_viruses parameter (together with the --mpa3 parameter), you should use the MetaPhlAn 3.0 (or 3.1) database, available here: Index of /biobakery3/metaphlan_databases
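
As a rough sketch of what that could look like: the snippet below first checks that a directory actually holds Bowtie2 index files for a MetaPhlAn 3.0 index, and only then prints a metaphlan command that points at it. The directory /Databases/Metaphlan3, the sample file name, the output file name, and the index name mpa_v30_CHOCOPhlAn_201901 are assumptions for illustration, not values confirmed in this thread, so adjust them to whatever you downloaded. The has_index helper is hypothetical and not part of MetaPhlAn.

```shell
#!/bin/sh
# Sketch only. DB_DIR, INDEX, and SAMPLE_FQ are placeholders, not values from this thread.
DB_DIR="${DB_DIR:-/Databases/Metaphlan3}"       # assumed location of the 3.0 database
INDEX="${INDEX:-mpa_v30_CHOCOPhlAn_201901}"     # assumed MetaPhlAn 3.0 index name
SAMPLE_FQ="${SAMPLE_FQ:-sample.fastq.gz}"       # placeholder input file

# Hypothetical helper: succeed if the directory holds at least one
# Bowtie2 index file (.bt2 or .bt2l) for the given index name.
has_index() {
    for f in "$1/$2".*.bt2*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

if has_index "$DB_DIR" "$INDEX"; then
    # Print the command instead of running it, so it can be reviewed first.
    echo metaphlan "$SAMPLE_FQ" --input_type fastq --nproc 16 \
        --bowtie2db "$DB_DIR" --index "$INDEX" --mpa3 --add_viruses \
        -o sample.metaphlan3.txt
else
    echo "No $INDEX index files found in $DB_DIR" >&2
fi
```

Run against the directory listed in the original post, this check would report that no 3.0 index is present, since only mpa_vJan21_CHOCOPhlAnSGB_202103 files are there. Once the printed command looks right, remove the echo to actually run it.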

Is this still the case with MetaPhlAn 4.0.5? And if not, what is the timeline for adding them back to the database? The removal of virus sequences was not mentioned in the changelog or the MetaPhlAn 4 announcement. The manuscript implies that they are present but not extensively represented, rather than entirely absent:

The current methods also do not extensively incorporate viral or eukaryotic microbial sequences, due to their unique genomic architectures and quality control requirements relative to bacterial and archaeal genomes.

I now need to downgrade and reprocess several hundred samples.

Hi @nickp60
We apologise for the inconvenience. In all releases of version 4, we decided not to include the previous viral markers, which had not changed since MetaPhlAn2. Part of the reason for this (apart from the database already being quite old) is that we are currently working on a new version that integrates both known and unknown viral species clusters, which we hope to release later this year as version 4.1.