Humann3 computation speed

srikanthkris · September 29, 2020, 12:13am

Hi, I have 96 interleaved FASTA files of approximately 9.5 GB each (chicken fecal metagenome).

My server has 1 TB of memory and 48 CPUs with 12 cores per socket. I have already run MetaPhlAn.

The command I am running is:

humann3 --input input_interleaved.fa --taxonomic-profile metaphlan_output.txt --output output_name --nucleotide-database path/to/chocophlan --protein-database path/to/uniref --threads 48

(I have both uniref50 and uniref90 in the database folder.)

Each sample is taking more than 10 days to complete. Is there a way to speed this up, or am I doing something wrong here? Running multiple commands in parallel using Python scripts doesn’t help either.

MetaPhlAn was run with the CHOCOPhlan_201901 database.

I am unable to figure it out.

I deeply appreciate any help.

regards
Kris

lauren.j.mciver · September 29, 2020, 7:48pm

Hi Kris, With files of that size, I would expect that HUMAnN could take a couple of days per sample with 8 cores, but 10 days does seem long. If both of the UniRef databases are in the same folder, it is possible HUMAnN is using both for the run, which would increase the run time. By default, HUMAnN uses all of the databases in the folder provided, to allow for alignment methods or use cases where a database is split into multiple files. Could you check the log for one of the runs to see if this is the case? If so, move one of the databases to another folder so that only one is used for each run. I would recommend running with just uniref90.
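The check and the move suggested here could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `databases_mentioned` and `move_database` are hypothetical helpers (not part of HUMAnN), the log is assumed to mention the database file names somewhere in its text, and the database files are assumed to have names starting with `uniref50` or `uniref90`. Adjust the prefixes to match the actual file names in your folder.

```python
import shutil
from pathlib import Path


def databases_mentioned(log_path, names=("uniref50", "uniref90")):
    """Return the entries of `names` that occur anywhere in the log text.

    Assumption: the log contains the database file names. The match is a
    plain case-insensitive substring search, not a parse of the log format.
    """
    text = Path(log_path).read_text(errors="replace").lower()
    return [name for name in names if name in text]


def move_database(db_dir, prefix, dest_dir):
    """Move every file in `db_dir` whose name starts with `prefix` to `dest_dir`.

    Assumption: all files of one database share a common file-name prefix.
    Returns the names of the files that were moved.
    """
    db_dir, dest_dir = Path(db_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materialises the listing before any file is moved
    for path in sorted(db_dir.iterdir()):
        if path.is_file() and path.name.lower().startswith(prefix.lower()):
            shutil.move(str(path), str(dest_dir / path.name))
            moved.append(path.name)
    return moved


# Example with hypothetical paths:
# print(databases_mentioned("path/to/run.log"))
# move_database("path/to/uniref", "uniref50", "path/to/uniref50_only")
```

If the first call reports both names, moving the uniref50 files out and leaving `--protein-database` pointed at the original folder would leave only uniref90 in use for the next run.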

Thank you,
Lauren
