Use humann-downloaded chocophlan with metaphlan

Is it possible to use the humann-downloaded chocophlan for metaphlan? I know it’s possible to go the other way, but I already have it downloaded (and it took about an hour), so I would rather not repeat that step. I tried adding

--bowtie2db /path/to/humann/chocophlan --index mpa_v30_CHOCOPhlAn_201901

to metaphlan, but this had an issue with the md5 checksum:

Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

File /babbage/biobakery_databases/humann/chocophlan/mpa_v30_CHOCOPhlAn_201901.tar already present!

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AACTzoUYDqZps8u2JqWCNCODa/mpa_v30_CHOCOPhlAn_201901.md5?dl=1
Downloading file of size: 0.00 MB
MD5 checksums do not correspond! If this happens again, you should remove the database files and rerun MetaPhlAn so they are re-downloaded
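
To check by hand whether a local tar really matches a downloaded .md5 file, something like the following could be used. This is only a minimal sketch: the function names are hypothetical, it assumes the .md5 file starts with the hex digest (the usual `md5sum` layout), and it is not MetaPhlAn's own verification code.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_expected_md5(md5_path):
    """Return the first whitespace-separated token of an .md5 file.

    Assumption: the digest comes first on the line, as md5sum writes it.
    An empty file yields an empty string, which never matches a digest.
    """
    text = Path(md5_path).read_text().strip()
    return text.split()[0].lower() if text else ""


def checksum_matches(tar_path, md5_path):
    """True if the tar's MD5 equals the digest stored in the .md5 file."""
    return md5_of_file(tar_path) == read_expected_md5(md5_path)
```

Calling `checksum_matches` with the tar and .md5 paths from the log above would show whether the file on disk is the one the checksum was published for.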

Similar to this post, but that discussion went in a bunch of different places, so I thought a new post was warranted.

❯ metaphlan --version
MetaPhlAn version 3.0.2 (23 Jul 2020)
❯ humann --version
humann v3.0.0.alpha.3

HUMAnN’s ChocoPhlAn is the full set of pangenomes (hence the long download). The MetaPhlAn “ChocoPhlAn” is just the marker genes. The marker file has “ChocoPhlAn” in the name since it’s relative to a specific pangenome set. It sounds like you just need to download the MetaPhlAn marker database and not a new HUMAnN database?

Ahh, that makes sense. Indeed, downloading the metaphlan one was only ~2 min.

Hmm - I’m now getting

CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( file_list.txt ) that are not of the expected version. Please install the latest version of the database: 201901

from humann. The trick of --metaphlan-options '--index mpa_v30_CHOCOPhlAn_201901' is not working.

This option should work. I suspect that file_list.txt points to a different URL, so I would try manually deleting it and re-downloading the MetaPhlAn database.
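
Before deleting anything, it may help to see which files in the directory could be tripping the version check. The sketch below is a guess at what to look for, not HUMAnN's actual logic: it assumes the check is based on file names containing the version string from the error message (`201901`), and the function name is hypothetical. It only lists files and removes nothing.

```python
from pathlib import Path


def files_without_version(directory, version="201901"):
    """List regular files in `directory` whose names lack `version`.

    Assumption: a file is "not of the expected version" when its name
    does not contain the version string. HUMAnN's real check may differ.
    """
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and version not in entry.name
    )
```

Running `files_without_version` on the ChocoPhlAn directory should, under that assumption, report file_list.txt along with any other stray files worth reviewing.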