HUMAnN users - I’m making a quick pinned post about MetaPhlAn 4 + HUMAnN 4 compatibility since we’re seeing a lot of posts raising similar issues. We will follow up with more detailed information in the short term, and longer-term we’re also improving our version checking at install and runtime to avoid some of these issues going forward.
Key point 1: The current HUMAnN 4 release (v4.0.0.alpha.1) should work with MetaPhlAn 4 releases up to v4.1.1. It does not support v4.2+, which introduced an API change we still need to adapt to.
Key point 2: Each version of HUMAnN works only with specific MetaPhlAn marker databases, since we require compatibility between MetaPhlAn’s markers + taxonomy and HUMAnN’s pangenomes + functional annotations. To use HUMAnN 4.0.0.alpha.1, you should be working with the mpa_vOct22_CHOCOPhlAnSGB_202403 MetaPhlAn marker database. If you install (or update to) a newer marker database, it will break HUMAnN 4 compatibility.
If you’re working with older versions of HUMAnN (e.g. v3.9) in MetaPhlAn 4 compatibility mode, please see the release notes for your specific version of HUMAnN for the correct MetaPhlAn software and marker versions.
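The version constraint in key point 1 can be checked before running anything. The following is a minimal sketch, not an official tool: the helper names mpa_parse_version and mpa_version_ok are hypothetical, and the parsing assumes that `metaphlan --version` prints a line of the form "MetaPhlAn version 4.1.1 (11 Mar 2024)".

```shell
#!/bin/sh
# Hypothetical helper: pull "X.Y.Z" out of a version banner read from stdin.
# Assumes the banner contains the word "version" followed by the number.
mpa_parse_version() {
    sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: succeed only for MetaPhlAn 4.0.x through 4.1.1,
# the range stated above for HUMAnN v4.0.0.alpha.1.
mpa_version_ok() {
    _v=${1#v}                 # drop an optional leading "v"
    _old_ifs=$IFS; IFS=.
    set -- $_v                # split on dots into $1 $2 $3
    IFS=$_old_ifs
    _major=${1:-0}; _minor=${2:-0}; _patch=${3:-0}
    [ "$_major" -eq 4 ] || return 1
    [ "$_minor" -eq 0 ] && return 0
    [ "$_minor" -eq 1 ] && [ "$_patch" -le 1 ]
}

# Only runs the live check if metaphlan is on the PATH.
if command -v metaphlan >/dev/null 2>&1; then
    ver=$(metaphlan --version | mpa_parse_version)
    if mpa_version_ok "$ver"; then
        echo "MetaPhlAn $ver: within the supported range"
    else
        echo "MetaPhlAn $ver: not supported by HUMAnN v4.0.0.alpha.1"
    fi
fi
```

This only covers the software version. The marker database from key point 2 still has to be checked separately.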
Apologies to those that have been struggling with HUMAnN installation as a result of the constraints above, and thanks for raising awareness here.
Apologies if this isn’t the right thread for my question — please let me know if I should post it separately.
I’ve recently started working on metatranscriptomics with a focus on the gut microbiota, and I’m currently running HUMAnN v4.0.0.alpha.1 with MetaPhlAn v4.1.1 (11 Mar 2024).
However, I’ve noticed something odd: in my utility_mapping subdirectory, I have the files mpa_vJan21_CHOCOPhlAnSGB_202103.tsv and vOct22_SGB_mapping.tsv, even though my MetaPhlAn database is mpa_vOct22_CHOCOPhlAnSGB_202403.
Did I miss a step during installation or database setup?
Should I manually download the corresponding mpa_vOct22_CHOCOPhlAnSGB_202403.tsv file from somewhere else?
Any guidance or clarification would be greatly appreciated!
I struggled with similar issues described here when trying to get humann4 v4.0.0.alpha.1 running in my compute environment. I eventually got it working and these were the key things that helped me:
Specify versions when installing via conda:

    conda create -n humann4 python=3.12
    conda activate humann4
    conda install humann=4.0.0a1
    conda install metaphlan=4.1.1

Download the correct database versions to specific paths:

    metaphlan --install --db_dir metaphlan_databases/vOct22 --index mpa_vOct22_CHOCOPhlAnSGB_202403
    humann_databases --download uniref uniref90_ec_filtered_diamond humann4_dbs/
    humann_databases --download chocophlan full humann4_dbs/
    humann_databases --download utility_mapping full humann4_dbs/

Specify database paths for both humann and metaphlan in the command:

    humann -r -i SAMPLE.fq.gz -o ./SAMPLE/ --threads 16 \
        --protein-database humann4_dbs/uniref \
        --nucleotide-database humann4_dbs/chocophlan \
        --metaphlan-options "-t rel_ab_w_read_stats --bowtie2db metaphlan_databases/vOct22 --index mpa_vOct22_CHOCOPhlAnSGB_202403"
For me it was critical to include -t rel_ab_w_read_stats in the --metaphlan-options string. Otherwise MetaPhlAn reverted to its default -t value, and HUMAnN 4 did not recognize the output as a valid taxonomic profile.
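One way to avoid forgetting the -t setting is to assemble the --metaphlan-options string in one place. This is just a sketch of how I would script it: build_mpa_opts is a hypothetical helper, and the paths are the ones from my setup above.

```shell
#!/bin/sh
# Hypothetical helper: build the value for humann's --metaphlan-options.
# $1 = MetaPhlAn database directory, $2 = marker index name.
# "-t rel_ab_w_read_stats" is always included, as described above.
build_mpa_opts() {
    printf '%s' "-t rel_ab_w_read_stats --bowtie2db $1 --index $2"
}

MPA_DB_DIR=metaphlan_databases/vOct22
MPA_INDEX=mpa_vOct22_CHOCOPhlAnSGB_202403
MPA_OPTS=$(build_mpa_opts "$MPA_DB_DIR" "$MPA_INDEX")

# Print the string so it can be inspected before launching a long run.
echo "$MPA_OPTS"

# The humann call would then look like this (left commented out here):
# humann -r -i SAMPLE.fq.gz -o ./SAMPLE/ --threads 16 \
#     --protein-database humann4_dbs/uniref \
#     --nucleotide-database humann4_dbs/chocophlan \
#     --metaphlan-options "$MPA_OPTS"
```

Keeping the database directory and index in variables also makes it harder for the two to drift apart between samples.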
Hopefully this will all be outdated once a non-alpha release of HUMAnN 4 drops!
This might be version-specific behavior or environment-dependent, but I thought it worth mentioning for others who might encounter the same issue.
Also, the root issue I was originally referring to relates to the utility_mapping database: the file mpa_vOct22_CHOCOPhlAnSGB_202403.tsv appears to be missing from the full_mapping_v4_alpha.tar.gz archive. I’ve opened a separate thread to discuss this issue in detail, as it affects compatibility with MetaPhlAn v4.1.1 and the mpa_vOct22_CHOCOPhlAnSGB_202403 database.
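For anyone who wants to confirm this on their own download, here is a rough sketch of how the archive contents can be listed. archive_has_member is a hypothetical helper, and it assumes the archive has been saved in the current directory under the name full_mapping_v4_alpha.tar.gz.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a gzipped tar archive lists a member
# whose path contains the given name.
# $1 = path to the .tar.gz file, $2 = file name to look for.
archive_has_member() {
    tar -tzf "$1" 2>/dev/null | grep -F -- "$2" >/dev/null
}

ARCHIVE=full_mapping_v4_alpha.tar.gz            # assumed local file name
WANTED=mpa_vOct22_CHOCOPhlAnSGB_202403.tsv

# Only runs if the archive is actually present.
if [ -f "$ARCHIVE" ]; then
    if archive_has_member "$ARCHIVE" "$WANTED"; then
        echo "$WANTED is present in $ARCHIVE"
    else
        echo "$WANTED is missing from $ARCHIVE"
    fi
fi
```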
Thanks again for documenting your working setup; it’s been very helpful!