Compatibility of MetaPhlAn 4.0.3 output with HUMAnN 3.9

Title: Help with HUMAnN 3.9 & MetaPhlAn 4 Profile Incompatibility

Hello BioBakery community,

I am working on functional profiling of human metagenomic samples and have run into a version incompatibility issue between MetaPhlAn 4.0.3 and HUMAnN 3.9. I would appreciate any guidance on the correct workflow.

My Setup:

  • Software: HUMAnN v3.9, MetaPhlAn v4.0.3

  • Environment: HPC server where I cannot update to HUMAnN 4.

  • Input Files: Paired-end FASTQ files (e.g., ALF001_1_unmapped.fastq.gz)

Step 1: Taxonomic Profiling (Successful)

I have already generated taxonomic profiles for all my samples using MetaPhlAn v4.0.3. I used the latest mpa_vJan25 database, which I downloaded successfully (metaphlan --install --index latest --bowtie2db /pathway/metaphlan_databases). The database directory (chocophlan_2025_db/) contains these files:

mpa_latest
mpa_vJan25_CHOCOPhlAnSGB_202503.md5
mpa_vJan25_CHOCOPhlAnSGB_202503.pkl
mpa_vJan25_CHOCOPhlAnSGB_202503.tar
mpa_vJan25_CHOCOPhlAnSGB_202503.1.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.2.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.3.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.4.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.rev.1.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503.rev.2.bt2l
mpa_vJan25_CHOCOPhlAnSGB_202503_VINFO.csv
mpa_vJan25_CHOCOPhlAnSGB_202503_VSG.fna
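
To confirm which index name a directory like this holds, the name can be read off the Bowtie2 file names. The following is a minimal sketch, assuming the files follow the <index>.1.bt2l (or <index>.1.bt2) pattern seen in the listing above; the function name list_mpa_indexes is made up for illustration and is not part of MetaPhlAn.

```shell
# Hypothetical helper: print the index name(s) found in a MetaPhlAn
# database directory, based only on Bowtie2 file names.
list_mpa_indexes() {
    local db_dir="$1"
    local f base
    for f in "$db_dir"/*.1.bt2l "$db_dir"/*.1.bt2; do
        [ -e "$f" ] || continue                  # skip glob patterns that matched nothing
        base="$(basename "$f")"
        case "$base" in
            *.rev.1.bt2l|*.rev.1.bt2) continue ;; # reverse-index files repeat the same name
        esac
        base="${base%.1.bt2l}"                   # strip large-index suffix, if present
        base="${base%.1.bt2}"                    # strip small-index suffix, if present
        printf '%s\n' "$base"
    done
}
```

Called as list_mpa_indexes /pathway/metaphlan_databases/chocophlan_2025_db, this should print mpa_vJan25_CHOCOPhlAnSGB_202503 for the listing above.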

Step 2: Functional Profiling (The Problem)

Next, I tried to run HUMAnN v3.9 using the taxonomic profile generated in Step 1.

Here is the command I used:

humann \
   --input /pathway/ALF001_1_unmapped.fastq.gz \
   -o /pathway/humann_output/ALF001_1 \
   --taxonomic-profile /pathway/metaphlan_output/ALF001_1_profile.txt \
   --nucleotide-database /pathway/metaphlan_databases/chocophlan_2025_db \
   --protein-database /pathway/metaphlan_databases/uniref \
   --threads 8 \
   --remove-temp-output

This command failed with the following error, correctly identifying a database version mismatch:

CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( mpa_vJan25_CHOCOPhlAnSGB_202503.1.bt2l ) that are not of the expected version. Please install the latest version of the database: v201901_v31
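
The message suggests that HUMAnN inspects the file names in the directory and rejects anything without its expected version tag. A rough pre-flight check along those lines is sketched below. It assumes, based only on the wording of the error and not on HUMAnN's actual code, that a usable directory contains only files whose names include v201901_v31; count_unexpected_files is a hypothetical name.

```shell
# Hypothetical pre-flight check: count regular files in a directory whose
# names lack the version tag that the error message says HUMAnN expects.
count_unexpected_files() {
    local db_dir="$1"
    local tag="${2:-v201901_v31}"   # default tag copied from the error message
    local f
    local count=0
    for f in "$db_dir"/*; do
        [ -f "$f" ] || continue     # ignore subdirectories and unmatched globs
        case "$(basename "$f")" in
            *"$tag"*) ;;                      # name carries the expected tag
            *) count=$((count + 1)) ;;        # name does not carry the tag
        esac
    done
    printf '%s\n' "$count"
}
```

A non-zero count for the directory passed to --nucleotide-database would be consistent with the error above.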

Step 3: Troubleshooting Attempts

Based on the error message, I downloaded the recommended databases for HUMAnN v3.9 using the following commands:

# Download ChocoPhlAn v201901_v31
humann_databases --download chocophlan full /pathway/humann_databases_v3 --update-config yes

# Download UniRef90
humann_databases --download uniref uniref90_diamond /pathway/humann_databases_v3 --update-config yes

# Download Utility Mapping
humann_databases --download utility_mapping full /pathway/humann_databases_v3 --update-config yes

I now have the correct v201901_v31 databases in a new directory (/pathway/humann_databases_v3/).
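
For reference, the Step 2 command pointed at the new directory would look roughly like the sketch below, with only the two database paths changed. The chocophlan/ and uniref/ subdirectory names are an assumption about where humann_databases placed the files and should be checked against the actual layout. The function build_humann_cmd is a hypothetical wrapper that only prints the command for review.

```shell
# Sketch: assemble the HUMAnN call against the newly downloaded databases.
# ASSUMPTION: humann_databases created chocophlan/ and uniref/ under DB_ROOT.
DB_ROOT="/pathway/humann_databases_v3"
SAMPLE="ALF001_1"

build_humann_cmd() {
    # Print each argument followed by a space, then end the line.
    printf '%s ' humann \
        --input "/pathway/${SAMPLE}_unmapped.fastq.gz" \
        --output "/pathway/humann_output/${SAMPLE}" \
        --taxonomic-profile "/pathway/metaphlan_output/${SAMPLE}_profile.txt" \
        --nucleotide-database "${DB_ROOT}/chocophlan" \
        --protein-database "${DB_ROOT}/uniref" \
        --threads 8 \
        --remove-temp-output
    printf '\n'
}

build_humann_cmd   # dry run: prints the command, does not execute HUMAnN
```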

However, my HUMAnN command still fails. It seems the profile from MetaPhlAn 4 is fundamentally incompatible with the older database that HUMAnN 3.9 expects. The run also fails if I remove the --taxonomic-profile flag and let HUMAnN attempt its own internal MetaPhlAn run.
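
One quick way to see which MetaPhlAn database a given profile was built against is to read the tag on its first line. This is a minimal sketch, assuming the profile begins with a comment line of the form #mpa_... as MetaPhlAn output normally does; profile_db_tag is a hypothetical helper name.

```shell
# Hypothetical helper: print the database tag from a MetaPhlAn profile.
# Assumes the first line is a comment such as "#mpa_v...". Prints nothing
# if the first line does not match that pattern.
profile_db_tag() {
    head -n 1 "$1" | sed -n 's/^#\(mpa_[^[:space:]]*\).*/\1/p'
}
```

Running profile_db_tag /pathway/metaphlan_output/ALF001_1_profile.txt shows the tag that any downstream tool would see in that file.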

My Questions:

  1. Is there any way to make a MetaPhlAn 4 profile compatible with HUMAnN 3.9 (e.g., a conversion script or special flags)?

  2. Given that I am restricted to HUMAnN v3.9, is my only viable option to re-run the taxonomic profiling using MetaPhlAn 3.0?

  3. If I need to use MetaPhlAn 3.0, can you confirm that the correct database would be the mpa_v30_CHOCOPhlAn_201901 version?

  4. Is there another solution I am missing to get my samples processed with my current software and database constraints?

Any advice on the recommended workflow would be greatly appreciated. Thank you.

I just answered a similar post earlier today; please see this:

You can find information about the “unexpected files” question in previous posts, although it sounds like that has been solved.