Question on Running HUMAnN4 with Limited Memory on HPC

Hi HUMAnN Developers,

I’m trying to run HUMAnN4 on my metagenomic reads following the tutorial provided here:

I strictly followed the instructions for downloading the database and installing HUMAnN4.

However, when I ran the commands below using 4 threads with 6 GB of memory per thread, the run failed with this error:

Loading database information...Failed attempt to allocate 308038999768bytes;
you may not have enough free memory to load this database.
If your computer has enough RAM, perhaps reducing memory usage from
other programs could help you load this database?
classify: unable to allocate hash table memory

The code I used, based on the suggestion at https://forum.biobakery.org/t/humann-4-cant-work-with-metaphlan-4/8201/2:

> conda activate biobakery4 
> metaphlan sample.fq.gz \
>     --input_type fastq \
>     -x mpa_vOct22_CHOCOPhlAnSGB_202403 \
>     -t rel_ab_w_read_stats \
>     -o sample_rel_ab_w_read_stats.tsv
> humann -i sample.fq.gz \
>     --threads 4 \
>     --taxonomic-profile sample_rel_ab_w_read_stats.tsv \
>     --metaphlan-options "--input_type fastq -x mpa_vOct22_CHOCOPhlAnSGB_202403 -t rel_ab_w_read_stats" \
>     -o sample

I’m running this in a shared HPC environment, but memory and threads are unfortunately quite limited. I would appreciate your advice on the following:

  1. Is there any way to reduce memory usage?
  2. Do you recommend any alternative approaches when working in memory-constrained environments?

Thank you very much for your help!
Ivy

That’s a tough one - even the MetaPhlAn marker database now has a pretty big memory footprint, as our understanding of the microbial universe has expanded. For what it’s worth, a recent benchmark of MetaPhlAn + HUMAnN 4 required 25 GB of RAM (MaxRSS), so it’s possible your 4 x 6 GB (24 GB total) was JUST shy of enough?
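
If it helps, here is a minimal sketch of a batch request that clears that ~25 GB mark. It assumes a SLURM scheduler; the 32G memory request, walltime, and conda setup are placeholders to adapt to your site:

> #!/bin/bash
> #SBATCH --job-name=humann4
> #SBATCH --cpus-per-task=8
> #SBATCH --mem=32G            # total memory for the job, not per CPU
> #SBATCH --time=24:00:00
>
> # Activate the environment; some clusters need a 'module load' or
> # 'source .../conda.sh' line before 'conda activate' works in a batch job.
> conda activate biobakery4
>
> humann -i sample.fq.gz \
>     --threads ${SLURM_CPUS_PER_TASK} \
>     --taxonomic-profile sample_rel_ab_w_read_stats.tsv \
>     -o sample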

Thank you for the quick suggestion — it’s now working with 8 threads and 6 GB each.

I have one further question, related to databases: I tried using the Struo2-released HUMAnN3 database built from GTDB release 207, which is around 180 GB for UniRef90. However, I also downloaded the database from the official HUMAnN3 tutorial, and that one appears to be smaller.

As someone fairly new to this, would you recommend using the official HUMAnN3 database or the GTDB r207 version from Struo2? Will the results differ significantly?

If my taxonomic profiling was done using GTDB 207, should I also use the GTDB 207 database for HUMAnN to ensure consistency? Conversely, if I’m using MetaPhlAn for taxonomic assignment, should I stick with the database recommended in the official HUMAnN tutorial?

From what I’ve seen of Struo(2) it seems very reasonable and useful, but I don’t have any hands-on experience with it from which to offer an informed comparison. That also means we’ll be very limited in the tech support we can provide here if problems arise with non-bioBakery databases.
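
That said, if you do go with the official databases, the bundled humann_databases utility is the supported route for fetching them, and you can point an individual run at a custom database location explicitly. A rough sketch only; the paths are placeholders, and the flag names below are as used in HUMAnN 3.x, so please check humann_databases --help and humann --help for your installed version:

> # Download the official pangenome (ChocoPhlAn) and UniRef90 databases
> humann_databases --download chocophlan full /path/to/humann_dbs
> humann_databases --download uniref uniref90_diamond /path/to/humann_dbs
>
> # Or point a single run at databases installed elsewhere (e.g. a Struo2 build)
> humann -i sample.fq.gz \
>     --nucleotide-database /path/to/custom/chocophlan \
>     --protein-database /path/to/custom/uniref \
>     -o sample

Whether a Struo2/GTDB build drops in cleanly via those flags is exactly the part we can’t vouch for, per the above.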