HUMAnN tutorials out of date; unable to complete a run with either 3.9 or 4.0

Hi there,

I’ve been trying to get HUMAnN running on my computers, but I’ve encountered several issues across different operating systems (all on Intel-based CPUs).

Initially, I followed the HUMAnN 3.0 tutorial on macOS. I was able to complete all the steps but did not obtain the expected output: none of my reads were mapped, which led me to suspect that I was missing a required MetaPhlAn database. Before investigating that further, I tried running the pipeline on Linux. There, the job was terminated with a generic “Killed” message, and the log files contained no obvious or informative errors.

While reviewing the documentation for both HUMAnN and MetaPhlAn, I noticed that newer versions (4.0) are available. I then attempted to follow the HUMAnN 4.0 tutorials, but ran into several issues:

  • The HUMAnN 4.0 tutorial appears to be out of sync with the current pip installation process. By default, pip installs version 3.9, since versions 4.0.0a1 and 4.0.0a2 are marked as pre-releases. To install HUMAnN 4.0, users must explicitly specify the version (e.g., pip install humann==4.0.0a2).

  • Installing via conda does retrieve a 4.0 version of HUMAnN.

  • Attempting to download the demo databases results in the error:
    ERROR: Please select an available build.
    The only workaround I found while staying on 4.0 was to replace the demo option with full in the --download command. However, this forces users to download much larger datasets instead of the intended smaller demo files.

  • Alternatively, downgrading to HUMAnN 3.9 allows the demo databases to download correctly.

  • When running HUMAnN 4.0, I cannot get past the MetaPhlAn database version check. It consistently expects vOct22_CHOCOPhlAnSGB_202403. This appears to be a required (possibly hard-coded) database version, but the only CHOCOPhlAn database I’ve been able to download successfully is mpa_v31_CHOCOPhlAn_201901.
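For anyone trying to reproduce the pip behaviour described in the first bullet, here is a minimal sketch of why 3.9 is selected by default. It is a simplified model of the PEP 440 pre-release rule, not pip's actual resolver; the function names are my own, and the version list is just the three versions mentioned above.

```python
import re

# Simplified version pattern: a numeric release, optionally followed by
# an alpha/beta/rc marker (e.g. "4.0.0a2"). Real PEP 440 parsing handles more.
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def parse(version):
    """Split a version string into (release tuple, pre-release marker or None)."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version}")
    release = tuple(int(part) for part in match.group(1).split("."))
    pre = None
    if match.group(2) is not None:
        pre = (match.group(2), int(match.group(3)))
    return release, pre


def is_prerelease(version):
    return parse(version)[1] is not None


def sort_key(version):
    release, pre = parse(version)
    # A final release sorts after any pre-release of the same release number.
    return (release, pre is None, pre or ("", 0))


def default_pick(versions, allow_pre=False):
    """Pick the newest version, skipping pre-releases unless explicitly allowed."""
    candidates = [v for v in versions if allow_pre or not is_prerelease(v)]
    if not candidates:
        # If only pre-releases exist, fall back to them.
        candidates = list(versions)
    return max(candidates, key=sort_key)


available = ["3.9", "4.0.0a1", "4.0.0a2"]
print(default_pick(available))                  # 3.9
print(default_pick(available, allow_pre=True))  # 4.0.0a2
```

If this model is right, either pinning the version explicitly or passing pip's --pre option should select a 4.0 alpha, but I have only tested the explicit pin.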

Assuming HUMAnN 4.0 depends on the latest MetaPhlAn version, I installed MetaPhlAn v4.2.4 along with its database. (Notably, installing HUMAnN via conda pulls MetaPhlAn v3.1 by default, so I manually upgraded it.) However, running HUMAnN resulted in the following MetaPhlAn error:

unrecognized arguments: --bowtie2out demo_humann_temp/demo_metaphlan_bowtie2.txt

I then downgraded MetaPhlAn to v3.1 and installed the corresponding database. After rerunning HUMAnN, I encountered additional errors (not clearly explained in the logs) that still prevented completion.
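Regarding the --bowtie2out error above, a quick way to check whether a given MetaPhlAn install still accepts that option might look like the sketch below. It assumes the executable is named metaphlan and is on PATH, and that the option (if supported) is listed in the --help output; the helper names are hypothetical.

```python
import shutil
import subprocess


def help_mentions_flag(help_text, flag):
    """Return True if the flag appears as a whole token in the help text."""
    tokens = help_text.replace(",", " ").split()
    return flag in tokens


def metaphlan_help(executable="metaphlan"):
    """Return the --help output, or None if the executable is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--help"], capture_output=True, text=True)
    # Help text may be written to stdout or stderr, so keep both.
    return result.stdout + result.stderr


text = metaphlan_help()
if text is None:
    print("metaphlan not found on PATH")
else:
    print("--bowtie2out supported:", help_mentions_flag(text, "--bowtie2out"))
```

Running this in each environment (MetaPhlAn 3.1 and 4.2.4) would show whether the option was removed or renamed between versions, which is what the error suggests.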

Since I was unable to get HUMAnN 4.0 working reliably, I attempted to set up a stable configuration using HUMAnN 3.9 with MetaPhlAn 3.1. On Linux, I encountered the error:

MD5 checksums not found, something went wrong!

On macOS, I encountered a different issue where the pipeline runs but produces no mapped reads, similar to my initial attempt.

At this point, I’m unsure whether the issues are due to:

  • mismatched HUMAnN and MetaPhlAn versions,

  • incorrect or incomplete database installations,

  • or inconsistencies between the tutorials and current releases.
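To make the first possibility easier to rule out, a minimal sketch like the following could report which versions are actually installed in a given environment. It assumes the packages are registered under the distribution names humann and metaphlan, which I have not verified for every install method.

```python
from importlib import metadata


def installed_versions(packages=("humann", "metaphlan")):
    """Return {package: version string, or None if it is not installed}."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

I can post the output of this from each of my environments if that would help.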

Could you please clarify:

  1. Which versions of HUMAnN and MetaPhlAn are currently recommended to use together?

  2. Whether the 4.0 tutorials are up to date, particularly regarding installation and database downloads?

  3. Whether there are known issues with demo database downloads or the --bowtie2out argument error?

Any guidance would be greatly appreciated. Thanks in advance for your help!