HUMAnN users - I’m making a quick pinned post about MetaPhlAn 4 + HUMAnN 4 compatibility since we’re seeing a lot of posts raising similar issues. We will follow up with more detailed information in the short term, and longer-term we’re also improving our version checking at install and runtime to avoid some of these issues going forward.
Key point 1: The current HUMAnN 4 release (v4.0.0.alpha.1) should work with MetaPhlAn 4 releases up to v4.1.1. It does not support v4.2+, which introduced an API change we still need to adapt to.
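A minimal sketch of a version guard for the constraint above, in case it helps when scripting installs. The function name metaphlan_version_ok is hypothetical (it is not part of HUMAnN or MetaPhlAn), and it assumes you pass a plain version string such as 4.1.1 by hand. It treats every 4.0.x and 4.1.x release as compatible and everything else as unsupported.

```shell
# Hypothetical helper: succeeds (exit status 0) only for MetaPhlAn 4.0.x and 4.1.x.
metaphlan_version_ok() {
    version=$1
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # text before the next dot
    # Refuse to judge anything that is not plain digits.
    case "$major" in ''|*[!0-9]*) return 2 ;; esac
    case "$minor" in ''|*[!0-9]*) return 2 ;; esac
    [ "$major" -eq 4 ] && [ "$minor" -le 1 ]
}

# Example: report on two version strings.
for v in 4.1.1 4.2.0; do
    if metaphlan_version_ok "$v"; then
        echo "$v: should work with HUMAnN v4.0.0.alpha.1"
    else
        echo "$v: not supported by HUMAnN v4.0.0.alpha.1"
    fi
done
```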
Key point 2: All versions of HUMAnN work with specific MetaPhlAn marker databases, since we require compatibility between MetaPhlAn’s markers + taxonomy and HUMAnN’s pangenomes + functional annotations. To use HUMAnN 4.0.0.alpha.1, you should be working with the mpa_vOct22_CHOCOPhlAnSGB_202403 MetaPhlAn marker database. If you install (or update to) a newer marker database, it will break HUMAnN 4 compatibility.
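As a rough way to confirm which marker database a directory holds, the sketch below looks for files whose names start with the required index. The helper index_present is hypothetical, and it assumes that MetaPhlAn database files are named after the index (for example a .pkl file); the directory in the usage lines is only an example location.

```shell
required_index="mpa_vOct22_CHOCOPhlAnSGB_202403"

# Hypothetical helper: succeeds if directory $1 holds at least one file
# named "<index>.<something>" for the index given in $2.
index_present() {
    db_dir=$1
    index=$2
    for f in "$db_dir/$index".*; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example: the directory below is an assumed location, adjust it to your setup.
if index_present metaphlan_databases/vOct22 "$required_index"; then
    echo "found $required_index"
else
    echo "no files for $required_index in that directory"
fi
```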
If you’re working with older versions of HUMAnN (e.g. v3.9) in MetaPhlAn 4 compatibility mode, please see the release notes for your specific version of HUMAnN for the correct MetaPhlAn software and marker versions.
Apologies to those that have been struggling with HUMAnN installation as a result of the constraints above, and thanks for raising awareness here.
Apologies if this isn’t the right thread for my question — please let me know if I should post it separately.
I’ve recently started working on metatranscriptomics with a focus on the gut microbiota, and I’m currently running HUMAnN v4.0.0.alpha.1 with MetaPhlAn v4.1.1 (11 Mar 2024).
However, I’ve noticed something odd: in my utility_mapping subdirectory, I have the files mpa_vJan21_CHOCOPhlAnSGB_202103.tsv and vOct22_SGB_mapping.tsv, even though my MetaPhlAn database is mpa_vOct22_CHOCOPhlAnSGB_202403.
Did I miss a step during installation or database setup?
Should I manually download the corresponding mpa_vOct22_CHOCOPhlAnSGB_202403.tsv file from somewhere else?
Any guidance or clarification would be greatly appreciated!
I struggled with similar issues described here when trying to get humann4 v4.0.0.alpha.1 running in my compute environment. I eventually got it working and these were the key things that helped me:
Specify versions when installing via conda:
    conda create -n humann4 python=3.12
    conda activate humann4
    conda install humann=4.0.0a1
    conda install metaphlan=4.1.1
Download the correct database versions to specific paths:
    metaphlan --install --db_dir metaphlan_databases/vOct22 --index mpa_vOct22_CHOCOPhlAnSGB_202403
    humann_databases --download uniref uniref90_ec_filtered_diamond humann4_dbs/
    humann_databases --download chocophlan full humann4_dbs/
    humann_databases --download utility_mapping full humann4_dbs/
Specify database paths for both humann and metaphlan in the command:
    humann -r -i SAMPLE.fq.gz -o ./SAMPLE/ --threads 16 --protein-database humann4_dbs/uniref --nucleotide-database humann4_dbs/chocophlan --metaphlan-options "-t rel_ab_w_read_stats --bowtie2db metaphlan_databases/vOct22 --index mpa_vOct22_CHOCOPhlAnSGB_202403"
For me it was critical to include -t rel_ab_w_read_stats in the --metaphlan-options string; otherwise metaphlan reverted to its default -t value, which causes humann4 to not recognize the output as a valid taxonomic profile.
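To avoid forgetting that flag, something like the following could build the option string. This is only a sketch: ensure_read_stats is a hypothetical helper, and it does not try to handle an option string that already carries a different -t value.

```shell
# Hypothetical helper: print the given MetaPhlAn option string, adding
# "-t rel_ab_w_read_stats" in front if that exact option is not already there.
ensure_read_stats() {
    opts=$1
    case " $opts " in
        *" -t rel_ab_w_read_stats "*)
            printf '%s\n' "$opts" ;;
        *)
            # ${opts:+ $opts} adds a separating space only when opts is non-empty.
            printf '%s\n' "-t rel_ab_w_read_stats${opts:+ $opts}" ;;
    esac
}

# Example: build the string to pass to humann via --metaphlan-options.
mpa_opts=$(ensure_read_stats "--bowtie2db metaphlan_databases/vOct22 --index mpa_vOct22_CHOCOPhlAnSGB_202403")
echo "$mpa_opts"
```

The result can then be passed as --metaphlan-options "$mpa_opts" in the humann command.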
Hopefully this will all be outdated when a non-alpha release of HUMAnN 4 drops soon!
This might be version-specific behavior or environment-dependent, but I thought it worth mentioning for others who might encounter the same issue.
Also, the root issue I was originally referring to relates to the utility_mapping database: the file mpa_vOct22_CHOCOPhlAnSGB_202403.tsv appears to be missing from the full_mapping_v4_alpha.tar.gz archive. I’ve opened a separate thread to discuss this issue in detail, as it affects compatibility with MetaPhlAn v4.1.1 and the mpa_vOct22_CHOCOPhlAnSGB_202403 database.
Thanks again for documenting your working setup; it’s been very helpful!
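For anyone who wants to check their own copy of the utility_mapping archive before extracting it, here is a small sketch. The helper archive_has_file is hypothetical; it relies only on the standard tar listing mode, and it assumes the archive sits in the current directory under the name mentioned above.

```shell
# Hypothetical helper: succeeds if the gzipped tar archive $1 lists an entry
# whose path contains the fixed string $2.
archive_has_file() {
    archive=$1
    name=$2
    tar -tzf "$archive" | grep -F -- "$name" >/dev/null
}

# Example: only runs if the archive is present in the current directory.
if [ -f full_mapping_v4_alpha.tar.gz ]; then
    for f in mpa_vJan21_CHOCOPhlAnSGB_202103.tsv mpa_vOct22_CHOCOPhlAnSGB_202403.tsv; do
        if archive_has_file full_mapping_v4_alpha.tar.gz "$f"; then
            echo "$f: present in archive"
        else
            echo "$f: not listed in archive"
        fi
    done
fi
```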
Thanks for the detailed explanation. I followed all these steps (with the exception of using --bowtie2db instead of --db_dir). However, I get the following error:
CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( chocophlan.v4_alpha.tar.gz ) that are not of the expected version. Please install the latest version of the database: SGB
Has it happened to anyone?
P.S. I had to download all files manually (instead of using the --install and --download commands) because our server blocks downloads from the associated links. Does that matter?
Thanks for your reply @fmerinocasallo! I use chocophlan.v4_alpha.tar.gz for ChocoPhlAn and mpa_vOct22_CHOCOPhlAnSGB_202403 for index. Am I using anything wrong?
Thanks for your help @fmerinocasallo! Would you please specify what I need to include in the ChocoPhlAn directory then? Here is what I have included in each of the directories:
In my utility_mapping subdirectory, I see the mpa_vJan21_CHOCOPhlAnSGB_202103.tsv file instead of mpa_vOct22_CHOCOPhlAnSGB_202403.tsv, which I’d expected from the vOct22 CHOCOPhlAn SGB database. I’ve raised that potential issue in a separate thread with the maintainers, but there’s been no confirmation or fix so far.
Do you also have this mpa_vJan21_CHOCOPhlAnSGB_202103.tsv file in your utility_mapping subdirectory, or does your setup match the vOct22 naming instead?
Thank you so much @fmerinocasallo for the detailed and prompt response! Can’t express my appreciation enough! With your help, I could solve the error I was receiving. Now I get the following error:
It seems that you do not have Internet access.\nERROR: Cannot find a local database. Please run MetaPhlAn using option “-x <database_name>”
Is it because our server is blocking the bioBakery website? Do we need internet access to run humann even if we have already downloaded all the databases?
These options let you manually provide the locations of the ChocoPhlAn and UniRef databases.
Note: I’m not a heavy HUMAnN user, so please treat this advice with a grain of salt. There might be inaccuracies or things I’m overlooking. Still, I hope it helps point you in the right direction!
Admin note: This discussion may be diverging from the original topic. @sagunmaharjann @franzosa: could you please consider moving these posts into a new thread to keep things organized per the forum guidelines? Thank you!
Thanks and apologies if it is not related to this topic. Still I believe it relates to databases…
I had updated the databases, and “$ humann_config --print” points me to the right directory, which has the files that you guided me to. However, here is the output of what you asked for: