For phylophlan v3.0.3, it appears that phylophlan_write_config_file does not have a --threads parameter, so the diamond jobs in the output config file are all set to --threads 1. An example:
[map_aa]
program_name = /usr/local/bin/diamond
params = blastp --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0
Moreover, it appears that phylophlan will use the --threads value set via the config file (so --threads 1) even if phylophlan --nproc is set to a greater number of threads.
If this is indeed the case, it would be helpful to include a --threads parameter for phylophlan_write_config_file that sets the thread count for all multi-threaded jobs specified by the config.
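In the meantime, one possible workaround is to post-process the generated config file. The following is only a minimal sketch: set_threads is a hypothetical helper, not part of PhyloPhlAn, and it assumes the thread count always appears in the params lines as --threads followed by an integer, as in the example above.

```python
import re


def set_threads(config_text: str, threads: int) -> str:
    """Return config_text with every '--threads N' rewritten to '--threads <threads>'.

    Hypothetical helper (not part of PhyloPhlAn); it only edits the text and
    does not check which program each config section refers to.
    """
    return re.sub(r"(--threads)\s+\d+", rf"\g<1> {threads}", config_text)


# The [map_aa] section from the example above.
example = """[map_aa]
program_name = /usr/local/bin/diamond
params = blastp --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0
"""

print(set_threads(example, 4))
```

To apply this to a real file, one would read the config written by phylophlan_write_config_file, pass its text through set_threads, and write the result back. Sections that do not contain --threads are left unchanged.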
Hi there! That's by design. I believe that multi-threading (almost) never yields a speed-up that is linear in the number of threads specified. So, for the sub-jobs within PhyloPhlAn, I prefer running --nproc of them in parallel, each with a single thread, rather than running them sequentially one after the other, each using --nproc threads. That's why the config specifies only 1 thread. If you look at the RAxML definition in the config (for either [tree1] or [tree2]), you'll see that the number of threads is not set there; it will be set by PhyloPhlAn using the --nproc parameter specified by the user (when the multi-threaded version of RAxML is found on the system).
I hope this clarifies the issue.
Many thanks,
Francesco
Thanks for helping to clarify! So --nproc is per-genome, correct? I can then see the multiplication issue of --nproc x --threads (e.g., threads for blastp), which could overload the compute resources. Assuming one has many genomes, parallelizing at the level of genome instead of intra-genome (e.g., multi-threaded blastp per genome) is probably best.
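To make the multiplication concern concrete, here is a rough sketch of the arithmetic. The function names are hypothetical, and it assumes that every one of the --nproc parallel jobs runs with the same --threads value.

```python
import os
from typing import Optional


def total_threads(nproc: int, threads_per_job: int) -> int:
    """Threads in use when nproc jobs run at once, each with threads_per_job threads."""
    return nproc * threads_per_job


def is_oversubscribed(nproc: int, threads_per_job: int,
                      available_cores: Optional[int] = None) -> bool:
    """True if the combined thread count exceeds the available cores.

    If available_cores is not given, fall back to os.cpu_count().
    """
    cores = available_cores if available_cores is not None else (os.cpu_count() or 1)
    return total_threads(nproc, threads_per_job) > cores


# 8 parallel jobs with 4 threads each on a 16-core machine: 32 threads requested.
print(total_threads(8, 4), is_oversubscribed(8, 4, available_cores=16))
# 16 parallel single-threaded jobs on the same machine: 16 threads requested.
print(total_threads(16, 1), is_oversubscribed(16, 1, available_cores=16))
```

With --threads 1 in the config, the total is simply --nproc, so the user controls the overall load with a single parameter.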
Yes, --nproc is per-genome when mapping, extracting, selecting, aligning, and trimming markers (and also for single-gene tree reconstruction, if someone uses that pipeline). It is then passed on to jobs (like RAxML and IQ-TREE) that cannot be parallelized per genome, so that they can exploit their own internal multi-threading.