Cannot run humann v3.7 using the latest Chocophlan database

On my institute's computer cluster I ran metaphlan v4.0.6 with the mpa_vOct22_CHOCOPhlAnSGB_202212 database (downloaded from http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/) on a collection of samples. Now I want to use the resulting taxonomic profiles to run humann v3.7 on the same samples. However, when I try to do this with the same Chocophlan database version that I used for metaphlan, humann stops with a critical error telling me to install the latest version of the database: v201901_v31.

Here’s the humann command line:
humann --taxonomic-profile …/…/metaphlan/${name}.tax_profile.txt --input ${input_dir}/${name}_kneaddata.fastq.gz --output /vast/scratch/users/schulze.a/humann_output/MIT --threads 8 --metaphlan-options '--index mpa_vOct22_CHOCOPhlAnSGB_202212'

Here’s humann’s error message:
CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( mpa_latest ) that are not of the expected version. Please install the latest version of the database: v201901_v31

If I use Chocophlan v201901_v31 (downloaded from the /humann_data/chocophlan/ index) in humann, I don’t get this error. However, because the input taxonomic profiles were produced using mpa_vOct22_CHOCOPhlAnSGB_202212, I understand that doing this would be a mistake. Right? I guess I should use the same database version in both the metaphlan and humann runs?

Also, just to be sure, mpa_vOct22_CHOCOPhlAnSGB_202212 is the latest available metaphlan database, right?

Thanks in advance.

Hello, thank you for the detailed post! Yes, you want to use HUMAnN and MetaPhlAn versions that are compatible with respect to their databases. Yes, vOct22 is the latest MetaPhlAn database and v201901_v31 is the latest HUMAnN database. If you have the latest HUMAnN v3.7 and MetaPhlAn v4, you should be set using the default databases they download (v201901_v31 and vOct22). Sorry for the confusion: the version naming conventions of the two tools make it hard to tell which databases are in sync. We tried to add code to HUMAnN so that it alerts you if you are using a database that is not in sync with your MetaPhlAn version. Please post if you have other issues or questions.
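
One way to confirm which database a taxonomic profile was built with, before handing it to humann, is to read the profile's header. The sketch below assumes that MetaPhlAn records the database name as a `#mpa_...` comment line at the top of the profile. `profile_database` and the `IN_SYNC` table are hypothetical names used only for illustration, and the single pairing in the table is taken from this thread alone.

```python
import re

# Pairing stated in this thread only (assumption: not an official lookup table).
IN_SYNC = {"mpa_vOct22_CHOCOPhlAnSGB_202212": "v201901_v31"}


def profile_database(lines):
    """Return the database name found in the profile header, or None."""
    for line in lines:
        if not line.startswith("#"):
            break  # header comments are over, stop looking
        match = re.match(r"#(mpa_\w+)", line)
        if match:
            return match.group(1)
    return None


# Illustrative header; the exact layout of a real profile is assumed here.
example = [
    "#mpa_vOct22_CHOCOPhlAnSGB_202212",
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species",
    "k__Bacteria\t2\t100.0\t",
]

database = profile_database(example)
print("MetaPhlAn database:", database)
print("HUMAnN ChocoPhlAn paired with it in this thread:", IN_SYNC.get(database))
```

For a real profile, the same function can be given an open file handle, for example `profile_database(open(path))`, where `path` points at the `.tax_profile.txt` file.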

Thanks!
Lauren

Hi Lauren,

Thanks a lot for your message! This clarifies everything. So there wasn’t a problem when I thought there was one.

Yes, I agree that it would be easier if the equivalent metaphlan and humann Chocophlan database versions had similar names, or perhaps if there were a note somewhere in the humann manual. I had assumed that mpa_vOct22_CHOCOPhlAnSGB_202212 was the latest version and that v201901_v31 wasn’t, in part because of the creation or modification dates shown on their respective download sites.

Can I just ask one more thing? In the log files of the humann runs that did work (because I used the v201901_v31 database) I get a few messages of this kind:

8/22/2023 06:45:54 AM - humann.search.prescreen - DEBUG: Taxon not in mapping file: k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Lachnospiraceae|g__GGB9176|s__GGB9176_SGB14114|t__SGB14114 2|1239|186801|186802|186803||| 2.88808

Should I worry about this? Could this mean that I’m not using the correct mapping file?

Definitely! No, you don’t need to worry about that. Some MetaPhlAn v4.0 SGBs don’t have a direct mapping in HUMAnN v3.7.
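
To see how much of a sample these messages cover, the log can be scanned for them. This is a minimal sketch that assumes every such message has the same layout as the one quoted above (clade name, taxonomy IDs, then relative abundance, separated by whitespace); `unmapped_taxa` is a hypothetical helper, not part of humann.

```python
MARKER = "Taxon not in mapping file:"


def unmapped_taxa(log_lines):
    """Collect (last clade level, abundance) pairs from the DEBUG messages."""
    found = []
    for line in log_lines:
        _, sep, rest = line.partition(MARKER)
        if not sep:
            continue  # not one of the messages of interest
        fields = rest.split()
        if len(fields) < 3:
            continue  # layout differs from the assumed one, skip the line
        try:
            abundance = float(fields[2])
        except ValueError:
            continue  # third field is not a number, skip the line
        found.append((fields[0].split("|")[-1], abundance))
    return found


# The message quoted in this thread, used as sample input.
example_log = [
    "8/22/2023 06:45:54 AM - humann.search.prescreen - DEBUG: "
    "Taxon not in mapping file: k__Bacteria|p__Firmicutes|c__Clostridia|"
    "o__Eubacteriales|f__Lachnospiraceae|g__GGB9176|s__GGB9176_SGB14114|"
    "t__SGB14114 2|1239|186801|186802|186803||| 2.88808",
]

taxa = unmapped_taxa(example_log)
print(len(taxa), "unmapped taxa, total abundance", sum(a for _, a in taxa))
```

The summed abundance gives a rough idea of how much of the profile has no direct mapping in a given sample.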

Thanks!
Lauren

OK, perfect.

Thanks a lot,
Enrique