Grid Jobs Benchmarking

I’m testing grid jobs with biobakery_workflows on a computing cluster, using one of the demo samples, and have found that benchmarking each step is really slowing down the analysis. After each step finishes, collecting the benchmarking data takes anywhere from 1 to 10 minutes before the next job is submitted. I’m mostly concerned that with a big batch of samples the benchmarking would balloon the analysis time. Is this typically the case, or could there be an issue on my end?

  • Currently I’m using biobakery_workflows 3.1 and anadama2 0.10.0
sbatch -c16 biobakery_workflows wmgx \
  --input ./test_files --output ./biobake_output \
  --functional-profiling-option='--bypass-translated-search' \
  --threads 32 \
  --taxonomic-profiling-options "--bowtie2db miniconda3/envs/biobake/lib/python3.10/site-packages/metaphlan/metaphlan_databases/ --index mpa_vJun23_CHOCOPhlAnSGB_202403" \
  --grid slurm \
  --grid-jobs 2 --grid-partition node \
  --bypass-strain-profiling

Here’s an example from the log:

2025-10-03 17:50:43,657 root log_grid_output INFO: Grid 16 from task id return code:0
2025-10-03 17:50:43,658 LoggerReporter log_event INFO: task 16, humann_renorm_ecs_relab____HD32R1_subsample.gz :  grid job id 4836412 has status Getting benchmarking data
2025-10-03 17:59:43,499 root get_queue_status INFO: Getting latest queue info to refresh job status
2025-10-03 17:59:43,566 root record_benchmark INFO: Benchmark information for job id 16:
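
To put a number on the delay in the log excerpt above, here is a minimal Python sketch that subtracts the timestamp of the "Getting benchmarking data" line from the timestamp of the next queue refresh. The timestamp format string is inferred from the log lines shown here, and the variable names are just for illustration.

```python
from datetime import datetime

# Format inferred from the log lines above: "YYYY-MM-DD HH:MM:SS,mmm"
FMT = "%Y-%m-%d %H:%M:%S,%f"

# Task 16 enters the "Getting benchmarking data" status
start = datetime.strptime("2025-10-03 17:50:43,658", FMT)
# Next log entry: the queue status refresh
end = datetime.strptime("2025-10-03 17:59:43,499", FMT)

gap = (end - start).total_seconds()
print(f"Waited {gap:.1f} s (~{gap / 60:.1f} min) for benchmarking data")
```

For this one task the gap is roughly 9 minutes, during which nothing else is logged.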