HUMAnN 3 computation speed

Hi, I have 96 interleaved FASTA files of approx. 9.5 GB each (chicken fecal metagenome).

My server has 1 TB of memory and 48 CPUs with 12 cores per socket. I have already run MetaPhlAn.

The command I am running is:

humann3 --input input_interleaved.fa --taxonomic-profile metaphlan_output.txt --output output_name --nucleotide-database path/to/chocophlan --protein-database path/to/uniref --threads 48

(I have both uniref50 and uniref90 in the database folder.)

Each sample is taking more than 10 days to complete. Is there a way to speed this up, or am I doing something wrong here? Running multiple commands in parallel using Python scripts doesn’t help either.

MetaPhlAn was run with the CHOCOPhlan_201901 database.

I am unable to figure it out.

I would deeply appreciate any help.


Hi Kris,

With files of that size, I would expect that HUMAnN could take a couple of days per sample with 8 cores, but 10 days does seem long.

If both of the UniRef databases are in the same folder, it is possible that HUMAnN is using both for the run, which would increase the run time. By default, HUMAnN uses all of the databases in the folder provided, to allow for alignment methods or use cases where a database is split into multiple files.

Could you check the log for one of the runs to see whether this is the case? If so, move one of the databases to another folder so that only one is used for each run. I would recommend just running with uniref90.
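
As a quick check before rerunning, here is a minimal Python sketch that lists which UniRef versions are present in the protein database folder. It is only an illustration, not part of HUMAnN: the function name find_uniref_versions is made up for this example, and it assumes the database filenames contain "uniref50" or "uniref90". If your files are named differently, adjust the pattern.

```python
import re
import sys
from collections import defaultdict
from pathlib import Path


def find_uniref_versions(db_folder):
    """Group the files in db_folder by the UniRef version in their filename.

    Hypothetical helper for illustration; not a HUMAnN function.
    """
    versions = defaultdict(list)
    for path in sorted(Path(db_folder).iterdir()):
        if not path.is_file():
            continue
        # Assumption: filenames contain "uniref50", "uniref90", etc.
        match = re.search(r"uniref(\d+)", path.name.lower())
        if match:
            versions["uniref" + match.group(1)].append(path.name)
    return dict(versions)


# Usage: python check_uniref.py path/to/uniref
if __name__ == "__main__" and len(sys.argv) > 1:
    found = find_uniref_versions(sys.argv[1])
    for version, files in found.items():
        print(version, "->", ", ".join(files))
    if len(found) > 1:
        print("More than one UniRef version is in this folder; "
              "consider moving one of them elsewhere.")
```

If it reports more than one version, that would be consistent with the folder containing both databases. The run log remains the place to confirm what HUMAnN actually used.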

Thank you,