MetaPhlAn version 4.0.4 (17 Jan 2023)
Hi friends,
I would like to get some perspective on the normal run time of MetaPhlAn4, as I am concerned it is taking much longer than it should.
For reference, I have shotgun sequencing data in the form of paired-end fastq.gz files. Each individual fastq.gz file is about 3 GB on average.
I am running MetaPhlAn4 on each paired sample, allocating 25 GB of memory and 6 cores per sample. This is an example of the command I use for one sample:
metaphlan path/to/sampleX_R1.fastq.gz,path/to/sampleX_R2.fastq.gz --bowtie2db /path/to/metaphlan4_database --bowtie2out path/to/sampleX_metagenome.bowtie2.bz2 --nproc 6 --input_type fastq -o /path/to/sampleX_profiled_metagenome.txt
I am finding that a single sample takes around 2 hours to process, which seems like a long time. This is a concern because I have a couple thousand samples to process on a 32-core, 128 GB machine, so by my estimate it will take almost 3 weeks for MetaPhlAn to run on all of them.
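For anyone who wants to check the throughput side of this, here is a rough sketch of the arithmetic in shell. It assumes every job needs the same 6 cores and 25 GB and takes about 2 hours, and that jobs run side by side until either cores or memory run out. The function name estimate_throughput is made up for illustration and is not part of MetaPhlAn.

```shell
# Rough throughput estimate (illustrative helper, not a MetaPhlAn tool).
# Usage: estimate_throughput HOURS_PER_SAMPLE TOTAL_CORES TOTAL_MEM_GB CORES_PER_JOB MEM_PER_JOB_GB
estimate_throughput() {
  hours=$1; cores=$2; mem=$3; job_cores=$4; job_mem=$5

  # How many jobs fit at once, limited separately by cores and by memory
  by_cores=$(( cores / job_cores ))
  by_mem=$(( mem / job_mem ))

  # The tighter of the two limits decides the real concurrency
  if [ "$by_cores" -lt "$by_mem" ]; then
    parallel=$by_cores
  else
    parallel=$by_mem
  fi

  # Samples finished per 24 hours at that concurrency (integer division)
  per_day=$(( parallel * 24 / hours ))

  echo "$parallel concurrent jobs, about $per_day samples per day"
}

# Numbers from this post: 2 h per sample, 32 cores, 128 GB, 6 cores and 25 GB per job
estimate_throughput 2 32 128 6 25
```

Dividing the total number of samples by the samples-per-day figure gives the expected number of days, so the overall time depends heavily on how many jobs the machine can hold at once.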
If anyone has experience running MetaPhlAn4, I would appreciate your input on whether what I am seeing is normal.
Thank you!