Title: Help with HUMAnN 3.9 & MetaPhlAn 4 Profile Incompatibility
Hello BioBakery community,
I am working on functional profiling of human metagenomic samples and have run into a version incompatibility issue between MetaPhlAn 4.0.3 and HUMAnN 3.9. I would appreciate any guidance on the correct workflow.
My Setup:
- Software: HUMAnN v3.9, MetaPhlAn v4.0.3
- Environment: HPC server where I cannot update to HUMAnN 4.
- Input Files: Paired-end FASTQ files (e.g., ALF001_1_unmapped.fastq.gz)
Step 1: Taxonomic Profiling (Successful)
I have already generated taxonomic profiles for all my samples using MetaPhlAn v4.0.3. I used the latest mpa_vJan25 database, which I downloaded successfully with: metaphlan --install --index latest --bowtie2db /pathway/metaphlan_databases. The database directory (chocophlan_2025_db/) contains these files:
mpa_latest
mpa_vJan25_CHOCOPhlAnSGB_202503.md5
mpa_vJan25_CHOCOPhlAnSGB_202503.pkl
mpa_vJan25_CHOCOPhlAnSGB_202503.tar
mpa_vJan25_CHOCOPhlAnSGB_202503.1.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.2.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.3.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.4.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.rev.1.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.rev.2.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503_VINFO.csv
mpa_vJan25_CHOCOPhlAnSGB_202503_VSG.fna
Step 2: Functional Profiling (The Problem)
Next, I tried to run HUMAnN v3.9 using the taxonomic profile generated in Step 1.
Here is the command I used:
humann \
--input /pathway/ALF001_1_unmapped.fastq.gz \
-o /pathway/humann_output/ALF001_1 \
--taxonomic-profile /pathway/metaphlan_output/ALF001_1_profile.txt \
--nucleotide-database /pathway/metaphlan_databases/chocophlan_2025_db \
--protein-database /pathway/metaphlan_databases/uniref \
--threads 8 \
--remove-temp-output
This command failed with the following error, correctly identifying a database version mismatch:
CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( mpa_vJan25_CHOCOPhlAnSGB_202503.1.bt2l ) that are not of the expected version. Please install the latest version of the database: v201901_v31
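To see which files in a directory would trip this check, something like the following could be run before launching HUMAnN. This is only a sketch: it assumes the check is based on the version tag (v201901_v31) appearing in the database file names, which is what the error message suggests but which I have not confirmed in the HUMAnN source. The function name check_chocophlan_version is my own and is not part of HUMAnN.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HUMAnN): list files in a database directory
# whose names do NOT contain the expected version tag.
# Assumption: the version check is filename-based, as the error message implies.
check_chocophlan_version() {
    local db_dir="$1"
    local expected="${2:-v201901_v31}"
    local bad=0
    local f
    for f in "$db_dir"/*; do
        [ -e "$f" ] || continue              # nothing matched (empty directory)
        case "$(basename "$f")" in
            *"$expected"*) ;;                # name carries the expected tag
            *) echo "unexpected: $(basename "$f")"; bad=1 ;;
        esac
    done
    return "$bad"                            # 0 if every file matched, 1 otherwise
}

# Example with the directory from my command:
# check_chocophlan_version /pathway/metaphlan_databases/chocophlan_2025_db
```

A non-zero return status means at least one file name lacks the expected tag, so the directory is probably not one HUMAnN v3.9 will accept as its nucleotide database.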
Step 3: Troubleshooting Attempts
Based on the error message, I downloaded the recommended databases for HUMAnN v3.9 using the following commands:
# Download ChocoPhlAn v201901_v31
humann_databases --download chocophlan full /pathway/humann_databases_v3 --update-config yes
# Download UniRef90
humann_databases --download uniref uniref90_diamond /pathway/humann_databases_v3 --update-config yes
# Download Utility Mapping
humann_databases --download utility_mapping full /pathway/humann_databases_v3 --update-config yes
I now have the correct v201901_v31 databases in a new directory (/pathway/humann_databases_v3/).
However, my HUMAnN command still fails. It seems the profile from MetaPhlAn 4 is fundamentally incompatible with the older database that HUMAnN 3.9 expects. The run also fails if I remove the --taxonomic-profile flag and let HUMAnN attempt its own internal MetaPhlAn run.
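In case it helps with diagnosis, the database a profile was built against can be read from the profile itself. A minimal sketch, assuming the profile contains a header comment line of the form #mpa_<database name>, as MetaPhlAn output normally does. The name profile_db_version is just a hypothetical helper, not a BioBakery tool.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the database name recorded in a MetaPhlAn profile.
# Assumption: the profile has a header line starting with "#mpa_".
profile_db_version() {
    # Take the first line that starts with "#mpa_" and strip the leading "#".
    grep -m 1 '^#mpa_' "$1" | sed 's/^#//'
}

# Example with the profile from my command:
# profile_db_version /pathway/metaphlan_output/ALF001_1_profile.txt
```

If this prints nothing, the file has no such header line, which would itself be worth reporting alongside the error.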
My Questions:
- Is there any way to make a MetaPhlAn 4 profile compatible with HUMAnN 3.9 (e.g., a conversion script or special flags)?
- Given that I am restricted to HUMAnN v3.9, is my only viable option to re-run the taxonomic profiling using MetaPhlAn 3.0?
- If I need to use MetaPhlAn 3.0, can you confirm that the correct database would be the mpa_v30_CHOCOPhlAn_201901 version?
- Is there another solution I am missing to get my samples processed with my current software and database constraints?
Any advice on the recommended workflow would be greatly appreciated. Thank you.