MetaPhlAn4 nproc effect

Hello,
I have tested MetaPhlAn4 with the example files and different --nproc settings (none, 4, and 40 threads). I do not see any performance difference. Is this expected?

The bash script looked like this (changing the --nproc setting between tests):
#!/bin/bash
date
metaphlan SRS014476-Supragingival_plaque.fasta.gz --input_type fasta --nproc 4 > SRS014476-Supragingival_plaque_profile.txt
date
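
For anyone repeating this comparison, here is a minimal sketch that measures the wall-clock time per run instead of reading the `date` stamps by hand. `time_run` is a hypothetical helper, not part of MetaPhlAn, and the thread counts in the loop are only illustrative. The name of the intermediate mapping file is an assumption about MetaPhlAn's default behaviour; adjust it if your version names the file differently.

```shell
#!/bin/bash
# Sketch: time a command in wall-clock seconds.

# time_run is a hypothetical helper (not part of MetaPhlAn).
# It runs the given command and stores the elapsed seconds in ELAPSED.
time_run() {
    local start end status
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    ELAPSED=$(( end - start ))
    return "$status"
}

input="SRS014476-Supragingival_plaque.fasta.gz"

# Only run the comparison when metaphlan and the example file are available.
if command -v metaphlan >/dev/null 2>&1 && [ -f "$input" ]; then
    for n in 1 4 40; do
        # Assumption: MetaPhlAn writes an intermediate mapping file next to
        # the input and refuses to overwrite it, so remove it between runs.
        rm -f "${input}.bowtie2out.txt"
        time_run metaphlan "$input" --input_type fasta --nproc "$n" \
            > "profile_nproc${n}.txt"
        echo "--nproc ${n}: ${ELAPSED} s"
    done
fi
```

The helper keeps the exit status of the timed command, so a failed run is not mistaken for a fast one.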

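These are the performance results. There is no meaningful difference in the duration of the jobs: each run took about two minutes (118 s with --nproc 4, 114 s with --nproc 40, and 119 s without --nproc).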
I would appreciate any comments on this.
Best regards,

--nproc 4

Fri Aug 4 09:15:38 AM CEST 2023
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
Fri Aug 4 09:17:36 AM CEST 2023

--nproc 40

Fri Aug 4 09:19:25 AM CEST 2023
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
Fri Aug 4 09:21:19 AM CEST 2023

w/o --nproc setting

Fri Aug 4 09:23:59 AM CEST 2023
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
Fri Aug 4 09:25:58 AM CEST 2023

Short update:
The effect of multiple processors is visible with larger datasets. It seems the example sequences are so small that bowtie2 finishes within a short time.
I have a larger dataset with ca. 700k sequences, and all 40 threads are busy!
Sorry for posting too quickly…
Best,
