Hi all! I am trying to run MetaPhlAn 4 with the --add_viruses and --mpa3 parameters to enable profiling of viral organisms, but it returns the error: No MetaPhlAn BowTie2 database found (--index option)! If I remove these two parameters, it runs smoothly. Here I attach my command, the error, and the bowtie2 database I used.
Hi @Nirvana
Currently, the MetaPhlAn 4 vJan21 database does not profile viral species. To use the --add_viruses parameter (together with the --mpa3 parameter), you should use the MetaPhlAn 3.0 (or 3.1) database available here: Index of /biobakery3/metaphlan_databases
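In case it helps, here is a minimal sketch of how such a command could be assembled from Python. The file paths and the helper name build_metaphlan_cmd are hypothetical, and the index name mpa_v30_CHOCOPhlAn_201901 is an assumption about which 3.0 database you downloaded; adjust both to match your setup.

```python
import shlex


def build_metaphlan_cmd(reads, bowtie2db, index, output, input_type="fastq"):
    """Assemble a MetaPhlAn command line that requests viral profiling.

    Hypothetical helper for illustration only; it builds the argument
    list and does not run MetaPhlAn.
    """
    return [
        "metaphlan", reads,
        "--input_type", input_type,
        "--bowtie2db", bowtie2db,  # folder containing the downloaded database files
        "--index", index,          # must name a 3.0/3.1 database, not the v4 one
        "--add_viruses",
        "--mpa3",
        "-o", output,
    ]


# Assumed paths and index name; replace with your own.
cmd = build_metaphlan_cmd(
    reads="sample.fastq",
    bowtie2db="/path/to/metaphlan3_db",
    index="mpa_v30_CHOCOPhlAn_201901",
    output="sample_profile.txt",
)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the database folder actually contains the files for the index you name.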
Is this still the case with MetaPhlAn 4.0.5? And if not, what is the timeline for adding them back to the database? The removal of virus sequences was omitted from the changelog and the MetaPhlAn 4 announcement. The manuscript implies that they are present but not extensively represented, rather than entirely absent:
The current methods also do not extensively incorporate viral or eukaryotic microbial sequences, due to their unique genomic architectures and quality control requirements relative to bacterial and archaeal genomes.
I now need to downgrade and reprocess several hundred samples.
Hi @nickp60
We apologise for the inconvenience. In all releases of version 4, we decided not to include the previous viral markers, which had not changed since MetaPhlAn2. Part of the reason for this (apart from the database already being quite old) is that we are currently working on a new version that integrates both known and unknown viral species clusters, which we hope to release later this year as the new version 4.1.