HUMAnN2 computation speed-up

Greetings,

We have a few large metagenomic datasets of up to ~20 GB. Each takes more than 3 days to run on an AWS cloud instance (34 threads, 70 GB of memory used). Is there any way to speed up the computation? For example, could we split the input data into many small files and distribute them across multiple nodes?

Regards,

-Han

Sorry for the delay. That does feel very slow - can you clarify what options you’re using for the run? The simplest speedup in HUMAnN 2.0 (that doesn’t alter the final results) is to use more threads, which will speed up the various mapping steps (esp. translated search). Another option is --bypass-translated-search, but this is only advisable if you’re working with a very well-studied community (with lots of reference genomes / pangenome coverage).
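To make the two options concrete, here is a minimal sketch in Python that assembles a humann2 command line. The helper name `build_humann2_command`, the file name `sample.fastq`, and the output directory `humann2_out` are placeholders made up for illustration; only the `--input`, `--output`, `--threads`, and `--bypass-translated-search` flags come from HUMAnN2 itself.

```python
import shlex


def build_humann2_command(input_path, output_dir, threads=8,
                          bypass_translated_search=False):
    """Return a humann2 command as an argument list (hypothetical helper)."""
    cmd = [
        "humann2",
        "--input", input_path,
        "--output", output_dir,
        "--threads", str(threads),  # more threads speeds up the mapping steps
    ]
    if bypass_translated_search:
        # Only advisable for very well-studied communities.
        cmd.append("--bypass-translated-search")
    return cmd


if __name__ == "__main__":
    # Placeholder paths; substitute your own sample and output directory.
    cmd = build_humann2_command("sample.fastq", "humann2_out", threads=34)
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where humann2 is installed.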

I don’t think splitting up the input file would be any more effective than working with multiple threads, and there are some steps that wouldn’t translate well to this approach (e.g. coverage filtering, which benefits from “seeing” all the sample reads in the same run).

Hi Eric,

Thanks for the suggestion! It turns out the slow speed was caused by a weird Docker problem: the HUMAnN2 database was wrapped into a Docker image with multiple layers. We changed the way we manage Docker and the database, and the speed problem was solved.

We frequently process hundreds of metagenomic samples on the cloud, and we want to scale up and down whenever possible. I noticed that during HUMAnN2 processing, CPU utilization becomes very low for a while and then spikes. This suggests that certain steps of HUMAnN2 could be run on different cloud instances. Do you have any suggestions for which modules (e.g. single-threaded steps) we should look at?

Thanks,

-Han

Looking at the timestamps from a HUMAnN log file, the starred ones are parallelized (while the others are not):

prescreen*
custom database creation
database index
nucleotide alignment*
nucleotide alignment post-processing
translated alignment*
translated alignment post-processing
computing gene families
computing pathways*

Of these, translated alignment is by far the bottleneck in the process, followed by translated alignment post-processing.
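If you want to check this against your own runs, a rough sketch like the one below can rank the steps by duration. It assumes the log contains lines of the form `TIMESTAMP: Completed <step> : <N> seconds`; please verify that against your own log file before relying on it. The function names and the sample lines (including their timings) are invented for illustration and are not real measurements.

```python
import re

# Assumed log line shape: "... TIMESTAMP: Completed <step> : <N> seconds"
TIMESTAMP_RE = re.compile(
    r"TIMESTAMP:\s*Completed\s+(?P<step>.+?)\s*:\s*(?P<seconds>\d+)\s*seconds"
)


def step_durations(log_lines):
    """Return (step, seconds) pairs for every matching line, in log order."""
    durations = []
    for line in log_lines:
        match = TIMESTAMP_RE.search(line)
        if match:
            durations.append((match.group("step"), int(match.group("seconds"))))
    return durations


def slowest_first(durations):
    """Sort (step, seconds) pairs from longest to shortest."""
    return sorted(durations, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up example lines; replace with open("your_sample.log") in practice.
    sample_log = [
        "INFO: TIMESTAMP: Completed \tprescreen \t:\t 120\t seconds",
        "INFO: TIMESTAMP: Completed \ttranslated alignment \t:\t 9000\t seconds",
        "DEBUG: some unrelated message",
    ]
    for step, seconds in slowest_first(step_durations(sample_log)):
        print(f"{step}: {seconds} s")
```

Steps that are both long and unstarred in the list above are the ones where extra threads will not help.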