The bioBakery help forum

How to correctly run humann2 with the 'resume' option

Hi there,

Could you explain how to correctly ‘resume’ failed humann2 runs? I have a bunch of samples that hit the walltime limit (10 hrs, 12 CPU, 48 GB RAM). I resubmitted these with the ‘--resume’ flag, but the run seems to be re-computing everything. For the example below, the job died while creating the ‘diamond_aligned.tsv’ file, so I expected humann2 to detect that all the bowtie2 steps had completed and resume from diamond. Please advise how I can resume this job correctly.

Sample directory after failed run:

122M May 16 4:28 T1051B_9022006_D1_D2.targetReads.interleaved_metaphlan_bowtie2.txt
35K May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_metaphlan_bugs_list.tsv
526M May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_custom_chocophlan_database.ffn
113M May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.4.bt2
4.6M May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.3.bt2
113M May 16 4:36 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.2.bt2
230M May 16 4:36 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.1.bt2
113M May 16 4:43 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.rev.2.bt2
230M May 16 4:43 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.rev.1.bt2
36G May 16 5:08 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_aligned.sam
8.4G May 16 5:43 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_unaligned.fa
8.4G May 16 5:43 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_aligned.tsv
4.0K May 16 11:40 tmp7aswifh0
59G May 16 12:38 T1051B_9022006_D1_D2.targetReads.interleaved_diamond_aligned.tsv

Sample directory after resuming the failed run (the bowtie2 aligned and unaligned files are being re-created):

122M May 16 4:28 T1051B_9022006_D1_D2.targetReads.interleaved_metaphlan_bowtie2.txt
35K May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_metaphlan_bugs_list.tsv
526M May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_custom_chocophlan_database.ffn
113M May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.4.bt2
4.6M May 16 4:29 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.3.bt2
113M May 16 4:36 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.2.bt2
230M May 16 4:36 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.1.bt2
113M May 16 4:43 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.rev.2.bt2
230M May 16 4:43 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_index.rev.1.bt2
36G May 16 5:08 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_aligned.sam
4.0K May 16 11:40 tmp7aswifh0
59G May 16 12:38 T1051B_9022006_D1_D2.targetReads.interleaved_diamond_aligned.tsv
18K May 17 9:34 T1051B_9022006_D1_D2.targetReads.interleaved.log
4.0K May 17 9:34 tmp0082z62t
3.8G May 17 9:49 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_unaligned.fa
3.9G May 17 9:49 T1051B_9022006_D1_D2.targetReads.interleaved_bowtie2_aligned.tsv

Job command:

humann2 --threads {NCPUS} \
  --input {in}/{fastq} \
  --output {out} \
  --metaphlan-options "--bowtie2db /usr/local/metaphlan2/2.6.0/db_v20" \
  --protein-database /project/1062911_114738_RDS/humann2_databases/uniref90_diamond \
  --nucleotide-database /project/1062911_114738_RDS/humann2_databases/chocophlan \
  --resume

Many thanks,
Cali

Hi Cali,

Thank you for the detailed post, and sorry for any confusion about the resume option. Looking at your files and their timestamps, it does appear that the resume option is working as expected.

The resume option does not create the bowtie2 database again (see the files with "index" near the end of the name, whose timestamps do not change), and it does not run bowtie2 again (see the file named "bowtie2_aligned.sam"). However, the software will read through and reprocess the bowtie2 results (just in case you had changed any of your input parameters for filtering). As part of this step it writes new aligned/unaligned files and then proceeds to run diamond.

So with the resume option you should see the run bypass building the custom bowtie2 database, bypass running bowtie2, and then proceed to run diamond.
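The timestamp check described above can also be done programmatically. The following is only a minimal sketch and is not part of humann2: the helper names `parse_listing` and `compare_listings` are made up for illustration, the listing format is assumed to match the `size month day time name` lines posted above, and the file names are shortened to a hypothetical `sample` prefix.

```python
def parse_listing(text):
    """Map file name -> (size, timestamp) for lines like '36G May 16 5:08 name'."""
    entries = {}
    for line in text.strip().splitlines():
        size, month, day, time, name = line.split(maxsplit=4)
        entries[name] = (size, f"{month} {day} {time}")
    return entries


def compare_listings(before, after):
    """Split files into unchanged, rewritten, and new between two listings."""
    unchanged = sorted(n for n in after if n in before and before[n] == after[n])
    rewritten = sorted(n for n in after if n in before and before[n] != after[n])
    new = sorted(n for n in after if n not in before)
    return unchanged, rewritten, new


# Abbreviated copies of the two directory listings from the question.
BEFORE = """
230M May 16 4:43 sample_bowtie2_index.rev.1.bt2
36G May 16 5:08 sample_bowtie2_aligned.sam
8.4G May 16 5:43 sample_bowtie2_unaligned.fa
8.4G May 16 5:43 sample_bowtie2_aligned.tsv
59G May 16 12:38 sample_diamond_aligned.tsv
"""

AFTER = """
230M May 16 4:43 sample_bowtie2_index.rev.1.bt2
36G May 16 5:08 sample_bowtie2_aligned.sam
59G May 16 12:38 sample_diamond_aligned.tsv
18K May 17 9:34 sample.log
3.8G May 17 9:49 sample_bowtie2_unaligned.fa
3.9G May 17 9:49 sample_bowtie2_aligned.tsv
"""

unchanged, rewritten, new = compare_listings(parse_listing(BEFORE), parse_listing(AFTER))
print("unchanged:", unchanged)
print("rewritten:", rewritten)
print("new:", new)
```

On these listings, the index and `bowtie2_aligned.sam` files come out as unchanged, while the aligned/unaligned `.tsv` and `.fa` files come out as rewritten, which matches the behaviour described for the resume option. The diamond output is also unchanged here, simply because the resumed run had not yet reached the diamond step when the listing was taken.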

Thank you,
Lauren