High memory usage and low CPU utilization in HUMAnN 3

Hello everyone,

I ran HUMAnN 3 and found that the ‘Running metaphlan …’ step takes a very long time.

The command I used is:
humann --input-format fastq --input XXX --output XXX
I noticed the default setting for ‘--memory-use’ is minimum, but the run still uses a lot of memory.

I checked the system monitor: the bowtie2-align-l process used 18 GB of memory, humann itself only 2 GB, bowtie2-build-s 2 GB, and diamond 18 GB. Also, only 5~6 CPUs were busy (less than 20% utilization).

My computer runs Ubuntu 22.04.3 LTS with 32 GiB of memory and 24 CPUs.
I can successfully run a single sample, but when I add the ‘--threads’ option while running multiple samples, the job crashes during the bowtie2-align-l step. Even with --threads 2, the process was killed : (

I have around 300 shotgun sequencing samples and plan to run them in parallel. Any ideas on reducing the memory use of the bowtie2-align-l step? I hope to use the ‘--threads’ option to run many samples together.
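To make the setup concrete, here is a sketch of the batch loop I have in mind (file and directory names are placeholders for my actual data layout; it runs samples one after another only to stay within 32 GiB):

```shell
# Hypothetical batching sketch: process samples sequentially so that only
# one bowtie2/diamond process holds memory at a time. Paths are placeholders.
for fq in raw_data/*.fastq; do
    sample=$(basename "$fq" .fastq)
    humann --input "$fq" --input-format fastq \
           --output "results/$sample" --threads 8
done
```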

Each raw input file is around 20 GB; could that be the reason?

humann v3.8

MetaPhlAn version 4.0.6 (1 Mar 2023)

bowtie version:
/usr/bin/bowtie-align-s version 1.3.1
Built on Debian-reproducible
Tue, 14 Sep 2021 07:01:35 +0200
Compiler: gcc version 11.2.0
Options: -O3 -Wl,--hash-style=both -DPOPCNT_CAPABILITY -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -ffile-prefix-map=.=. -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -g -O2 -ffile-prefix-map=.=. -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -std=c++03 -Wl,-Bsymbolic-functions -flto=auto -Wl,-z,relro -Wl,-z,now
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

diamond version 2.1.8

Thanks for your suggestion in advance.

Chris M

Usually the alignment during MetaPhlAn is not a rate-limiting step. Note that the very first time you run MetaPhlAn it has to index the marker database, which is slow. I recommend doing that in the context of a single test run before scaling up to many samples, so that you know that step is good to go.
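For example, a single test run along these lines (file names are placeholders) will trigger the one-time marker-database indexing, after which later runs reuse the built index:

```shell
# Hypothetical test run on one sample; the first invocation builds the
# MetaPhlAn marker index, which subsequent samples then reuse.
humann --input sample1.fastq --input-format fastq --output sample1_out
```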

Thanks for the explanation @franzosa ,

I noticed that running MetaPhlAn within HUMAnN 3 is not that time-consuming (about 2 hours to produce the ‘bugs_list.tsv’ file).
But the ‘Running diamond…’ step seems to require a lot of memory (16 GiB RAM) and a long time (an estimated 16 hours against the UniRef90 database).

Given that each input sample is quite large (~20 GB) and I want to use the ‘--threads’ option to run samples in parallel, is the only solution to upgrade my computer, e.g., by adding more RAM? Or is there some way to reduce DIAMOND’s memory requirement?

Thank you for the suggestion.


Sorry for the delayed reply here. There ARE options to tune memory use in DIAMOND (-b and -c, if memory serves). You can tune these and pass them to HUMAnN via the --diamond-options flag. I don’t have much experience with them, as we’ve usually relied on the default values.
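For example, something along these lines might work (the values are illustrative guesses, not tested recommendations; a smaller block size lowers DIAMOND’s peak memory at some cost in speed):

```shell
# Illustrative only: pass DIAMOND memory-tuning flags through HUMAnN.
# --block-size (-b) is the sequence block size in billions of letters;
# smaller values reduce peak RAM. --index-chunks (-c) splits the index;
# more chunks also reduce RAM. The exact values below are assumptions.
humann --input sample1.fastq --input-format fastq --output sample1_out \
       --diamond-options "--block-size 1 --index-chunks 4"
```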