BioBakery 3 tutorial questions

Hi, we recently installed bioBakery 3 and ran some demo datasets from the tutorial here (GitHub - biobakery/biobakery_workflows: bioBakery workflows is a collection of workflows and tasks for executing common microbial community analyses using standardized, validated tools and parameters.). I tried to run the Whole Metagenome and Metatranscriptome Shotgun (wmgx_wmtx) workflow, but I got errors similar to those in this post (error_report:biobakery_workflows wmgx · Issue #22 · biobakery/biobakery · GitHub).

1> I have attached the error log to this post. Could you please take a look and see whether my issue is the same as the one in the linked post?

2> If the errors are the same, I need to use an older version of Kneaddata, if I understand correctly. My current installation is bioBakery 3.0. Which version of Kneaddata does bioBakery 3.0 include?

Which version of Kneaddata should I use to run the bioBakery tutorials without issues?

3> Do you have a more recent demo or tutorial dataset that I can run through the bioBakery workflow without problems? According to the post here (error_report:biobakery_workflows wmgx · Issue #22 · biobakery/biobakery · GitHub), we have to install an older version of Kneaddata to run this wmgx_wmtx workflow.
Is there any way to use the latest Kneaddata by modifying the settings/parameters?

Thanks,
Ben



2021-04-15 16:15:49,490 LoggerReporter started INFO: Beginning AnADAMA run with 65 tasks.
2021-04-15 16:15:49,493 LoggerReporter started INFO: Workflow description = A workflow for whole metagenome shotgun sequences
2021-04-15 16:15:49,493 LoggerReporter started INFO: Workflow version = 0.1
2021-04-15 16:15:49,493 LoggerReporter started INFO: Workflow configuration options
2021-04-15 16:15:49,493 LoggerReporter started INFO: input_extension = fastq.gz
2021-04-15 16:15:49,493 LoggerReporter started INFO: barcode_file =
2021-04-15 16:15:49,493 LoggerReporter started INFO: dual_barcode_file =
2021-04-15 16:15:49,493 LoggerReporter started INFO: index_identifier = _I1_001
2021-04-15 16:15:49,494 LoggerReporter started INFO: min_pred_qc_score = 2
2021-04-15 16:15:49,494 LoggerReporter started INFO: threads = 1
2021-04-15 16:15:49,494 LoggerReporter started INFO: pair_identifier = .R1
2021-04-15 16:15:49,494 LoggerReporter started INFO: interleaved = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: bypass_quality_control = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: contaminate_databases = /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database
2021-04-15 16:15:49,494 LoggerReporter started INFO: qc_options =
2021-04-15 16:15:49,494 LoggerReporter started INFO: functional_profiling_options =
2021-04-15 16:15:49,494 LoggerReporter started INFO: remove_intermediate_output = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: bypass_functional_profiling = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: bypass_strain_profiling = True
2021-04-15 16:15:49,494 LoggerReporter started INFO: run_strain_gene_profiling = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: bypass_taxonomic_profiling = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: run_assembly = False
2021-04-15 16:15:49,494 LoggerReporter started INFO: strain_profiling_options =
2021-04-15 16:15:49,494 LoggerReporter started INFO: max_strains = 20
2021-04-15 16:15:49,494 LoggerReporter started INFO: strain_list =
2021-04-15 16:15:49,494 LoggerReporter started INFO: assembly_options =
2021-04-15 16:15:49,494 LoggerReporter started INFO: output = output_data
2021-04-15 16:15:49,495 LoggerReporter started INFO: input = input
2021-04-15 16:15:49,495 LoggerReporter started INFO: config = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: jobs = 1
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid_jobs = 0
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid = aws
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid_partition = general
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid_benchmark = on
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid_options = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid_environment = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: grid_scratch = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: dry_run = False
2021-04-15 16:15:49,495 LoggerReporter started INFO: skip_nothing = False
2021-04-15 16:15:49,495 LoggerReporter started INFO: quit_early = False
2021-04-15 16:15:49,495 LoggerReporter started INFO: until_task = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: exclude_task = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: target = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: exclude_target = None
2021-04-15 16:15:49,495 LoggerReporter started INFO: log_level = INFO
2021-04-15 16:15:50,997 LoggerReporter log_event INFO: task 3, kneaddata____LV20R4_subsample : ready and waiting for resources
2021-04-15 16:15:50,998 LoggerReporter log_event INFO: task 3, kneaddata____LV20R4_subsample : starting to run
2021-04-15 16:15:51,148 LoggerReporter task_command INFO: Tracked executable version: kneaddata v0.7.8.1

2021-04-15 16:15:51,148 LoggerReporter task_command INFO: Executing with shell: kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/LV20R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix LV20R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV20R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV20R4_subsample.fastq
2021-04-15 16:15:51,300 LoggerReporter task_failed ERROR: task 3, kneaddata____LV20R4_subsample : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/helpers.py", line 89, in actually_sh
    ret = _sh(s, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
anadama2.util.ShellException: [Errno 2] Command `kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/LV20R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix LV20R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV20R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV20R4_subsample.fastq' failed.
Out: b''
Err: b'usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR\n [-db REFERENCE_DB] [--bypass-trim]\n [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>]\n [-q {phred33,phred64}] [--run-bmtagger] [--bypass-trf]\n [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]\n [--remove-intermediate-output] [--cat-final-output]\n [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]\n [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]\n [--trimmomatic-options TRIMMOMATIC_OPTIONS]\n [--bypass-trim-repetitive] [--bowtie2 BOWTIE2_PATH]\n [--bowtie2-options BOWTIE2_OPTIONS] [--no-discordant]\n [--reorder] [--serial] [--bmtagger BMTAGGER_PATH]\n [--trf TRF_PATH] [--match MATCH] [--mismatch MISMATCH]\n [--delta DELTA] [--pm PM] [--pi PI] [--minscore MINSCORE]\n [--maxperiod MAXPERIOD] [--fastqc FASTQC_PATH]\nkneaddata: error: unrecognized arguments: --run-trf\n'

2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 16, metaphlan____LV20R4_subsample : Failed! Error message : Task failed because parent task `3' failed
2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 25, humann____LV20R4_subsample : Failed! Error message : Task failed because parent task `16' failed
2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 32, humann_regroup_UniRef2EC____LV20R4_subsample : Failed! Error message : Task failed because parent task `25' failed
2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 47, humann_renorm_ecs_relab____LV20R4_subsample : Failed! Error message : Task failed because parent task `32' failed
2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 41, humann_renorm_genes_relab____LV20R4_subsample : Failed! Error message : Task failed because parent task `25' failed
2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 53, humann_renorm_pathways_relab____LV20R4_subsample : Failed! Error message : Task failed because parent task `25' failed
2021-04-15 16:15:51,302 LoggerReporter log_event INFO: task 5, kneaddata____HD48R4_subsample : ready and waiting for resources
2021-04-15 16:15:51,302 LoggerReporter log_event INFO: task 5, kneaddata____HD48R4_subsample : starting to run
2021-04-15 16:15:51,452 LoggerReporter task_command INFO: Tracked executable version: kneaddata v0.7.8.1

2021-04-15 16:15:51,452 LoggerReporter task_command INFO: Executing with shell: kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/HD48R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix HD48R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD48R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD48R4_subsample.fastq
2021-04-15 16:15:51,603 LoggerReporter task_failed ERROR: task 5, kneaddata____HD48R4_subsample : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/helpers.py", line 89, in actually_sh
    ret = _sh(s, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
anadama2.util.ShellException: [Errno 2] Command `kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/HD48R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix HD48R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD48R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD48R4_subsample.fastq' failed.
Out: b''
Err: b'usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR\n [-db REFERENCE_DB] [--bypass-trim]\n [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>]\n [-q {phred33,phred64}] [--run-bmtagger] [--bypass-trf]\n [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]\n [--remove-intermediate-output] [--cat-final-output]\n [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]\n [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]\n [--trimmomatic-options TRIMMOMATIC_OPTIONS]\n [--bypass-trim-repetitive] [--bowtie2 BOWTIE2_PATH]\n [--bowtie2-options BOWTIE2_OPTIONS] [--no-discordant]\n [--reorder] [--serial] [--bmtagger BMTAGGER_PATH]\n [--trf TRF_PATH] [--match MATCH] [--mismatch MISMATCH]\n [--delta DELTA] [--pm PM] [--pi PI] [--minscore MINSCORE]\n [--maxperiod MAXPERIOD] [--fastqc FASTQC_PATH]\nkneaddata: error: unrecognized arguments: --run-trf\n'

2021-04-15 16:15:51,603 LoggerReporter task_failed ERROR: task 17, metaphlan____HD48R4_subsample : Failed! Error message : Task failed because parent task `5' failed
2021-04-15 16:15:51,603 LoggerReporter task_failed ERROR: task 26, humann____HD48R4_subsample : Failed! Error message : Task failed because parent task `17' failed
2021-04-15 16:15:51,603 LoggerReporter task_failed ERROR: task 33, humann_regroup_UniRef2EC____HD48R4_subsample : Failed! Error message : Task failed because parent task `26' failed
2021-04-15 16:15:51,603 LoggerReporter task_failed ERROR: task 48, humann_renorm_ecs_relab____HD48R4_subsample : Failed! Error message : Task failed because parent task `33' failed
2021-04-15 16:15:51,604 LoggerReporter task_failed ERROR: task 42, humann_renorm_genes_relab____HD48R4_subsample : Failed! Error message : Task failed because parent task `26' failed
2021-04-15 16:15:51,604 LoggerReporter task_failed ERROR: task 54, humann_renorm_pathways_relab____HD48R4_subsample : Failed! Error message : Task failed because parent task `26' failed
2021-04-15 16:15:51,604 LoggerReporter log_event INFO: task 7, kneaddata____HD42R4_subsample : ready and waiting for resources
2021-04-15 16:15:51,604 LoggerReporter log_event INFO: task 7, kneaddata____HD42R4_subsample : starting to run
2021-04-15 16:15:51,752 LoggerReporter task_command INFO: Tracked executable version: kneaddata v0.7.8.1

2021-04-15 16:15:51,752 LoggerReporter task_command INFO: Executing with shell: kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/HD42R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix HD42R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD42R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD42R4_subsample.fastq
2021-04-15 16:15:51,902 LoggerReporter task_failed ERROR: task 7, kneaddata____HD42R4_subsample : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/helpers.py", line 89, in actually_sh
    ret = _sh(s, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
anadama2.util.ShellException: [Errno 2] Command `kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/HD42R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix HD42R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD42R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD42R4_subsample.fastq' failed.
Out: b''
Err: b'usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR\n [-db REFERENCE_DB] [--bypass-trim]\n [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>]\n [-q {phred33,phred64}] [--run-bmtagger] [--bypass-trf]\n [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]\n [--remove-intermediate-output] [--cat-final-output]\n [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]\n [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]\n [--trimmomatic-options TRIMMOMATIC_OPTIONS]\n [--bypass-trim-repetitive] [--bowtie2 BOWTIE2_PATH]\n [--bowtie2-options BOWTIE2_OPTIONS] [--no-discordant]\n [--reorder] [--serial] [--bmtagger BMTAGGER_PATH]\n [--trf TRF_PATH] [--match MATCH] [--mismatch MISMATCH]\n [--delta DELTA] [--pm PM] [--pi PI] [--minscore MINSCORE]\n [--maxperiod MAXPERIOD] [--fastqc FASTQC_PATH]\nkneaddata: error: unrecognized arguments: --run-trf\n'

2021-04-15 16:15:51,903 LoggerReporter task_failed ERROR: task 18, metaphlan____HD42R4_subsample : Failed! Error message : Task failed because parent task `7' failed
2021-04-15 16:15:51,903 LoggerReporter task_failed ERROR: task 27, humann____HD42R4_subsample : Failed! Error message : Task failed because parent task `18' failed
2021-04-15 16:15:51,903 LoggerReporter task_failed ERROR: task 34, humann_regroup_UniRef2EC____HD42R4_subsample : Failed! Error message : Task failed because parent task `27' failed
2021-04-15 16:15:51,903 LoggerReporter task_failed ERROR: task 49, humann_renorm_ecs_relab____HD42R4_subsample : Failed! Error message : Task failed because parent task `34' failed
2021-04-15 16:15:51,903 LoggerReporter task_failed ERROR: task 43, humann_renorm_genes_relab____HD42R4_subsample : Failed! Error message : Task failed because parent task `27' failed
2021-04-15 16:15:51,903 LoggerReporter task_failed ERROR: task 55, humann_renorm_pathways_relab____HD42R4_subsample : Failed! Error message : Task failed because parent task `27' failed
2021-04-15 16:15:51,903 LoggerReporter log_event INFO: task 9, kneaddata____LV16R4_subsample : ready and waiting for resources
2021-04-15 16:15:51,903 LoggerReporter log_event INFO: task 9, kneaddata____LV16R4_subsample : starting to run
2021-04-15 16:15:52,052 LoggerReporter task_command INFO: Tracked executable version: kneaddata v0.7.8.1

2021-04-15 16:15:52,052 LoggerReporter task_command INFO: Executing with shell: kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/LV16R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix LV16R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV16R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV16R4_subsample.fastq
2021-04-15 16:15:52,203 LoggerReporter task_failed ERROR: task 9, kneaddata____LV16R4_subsample : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/helpers.py", line 89, in actually_sh
    ret = _sh(s, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
anadama2.util.ShellException: [Errno 2] Command `kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/LV16R4_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix LV16R4_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV16R4_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LV16R4_subsample.fastq' failed.
Out: b''
Err: b'usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR\n [-db REFERENCE_DB] [--bypass-trim]\n [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>]\n [-q {phred33,phred64}] [--run-bmtagger] [--bypass-trf]\n [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]\n [--remove-intermediate-output] [--cat-final-output]\n [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]\n [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]\n [--trimmomatic-options TRIMMOMATIC_OPTIONS]\n [--bypass-trim-repetitive] [--bowtie2 BOWTIE2_PATH]\n [--bowtie2-options BOWTIE2_OPTIONS] [--no-discordant]\n [--reorder] [--serial] [--bmtagger BMTAGGER_PATH]\n [--trf TRF_PATH] [--match MATCH] [--mismatch MISMATCH]\n [--delta DELTA] [--pm PM] [--pi PI] [--minscore MINSCORE]\n [--maxperiod MAXPERIOD] [--fastqc FASTQC_PATH]\nkneaddata: error: unrecognized arguments: --run-trf\n'

2021-04-15 16:15:52,204 LoggerReporter task_failed ERROR: task 19, metaphlan____LV16R4_subsample : Failed! Error message : Task failed because parent task `9' failed
2021-04-15 16:15:52,204 LoggerReporter task_failed ERROR: task 28, humann____LV16R4_subsample : Failed! Error message : Task failed because parent task `9' failed
2021-04-15 16:15:52,204 LoggerReporter task_failed ERROR: task 35, humann_regroup_UniRef2EC____LV16R4_subsample : Failed! Error message : Task failed because parent task `28' failed
2021-04-15 16:15:52,204 LoggerReporter task_failed ERROR: task 50, humann_renorm_ecs_relab____LV16R4_subsample : Failed! Error message : Task failed because parent task `35' failed
2021-04-15 16:15:52,204 LoggerReporter task_failed ERROR: task 44, humann_renorm_genes_relab____LV16R4_subsample : Failed! Error message : Task failed because parent task `28' failed
2021-04-15 16:15:52,204 LoggerReporter task_failed ERROR: task 56, humann_renorm_pathways_relab____LV16R4_subsample : Failed! Error message : Task failed because parent task `28' failed
2021-04-15 16:15:52,204 LoggerReporter log_event INFO: task 11, kneaddata____HD32R1_subsample : ready and waiting for resources
2021-04-15 16:15:52,204 LoggerReporter log_event INFO: task 11, kneaddata____HD32R1_subsample : starting to run
2021-04-15 16:15:52,356 LoggerReporter task_command INFO: Tracked executable version: kneaddata v0.7.8.1

2021-04-15 16:15:52,356 LoggerReporter task_command INFO: Executing with shell: kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/HD32R1_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix HD32R1_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD32R1_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD32R1_subsample.fastq
2021-04-15 16:15:52,506 LoggerReporter task_failed ERROR: task 11, kneaddata____HD32R1_subsample : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/helpers.py", line 89, in actually_sh
    ret = _sh(s, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
anadama2.util.ShellException: [Errno 2] Command `kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/HD32R1_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix HD32R1_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD32R1_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/HD32R1_subsample.fastq' failed.
Out: b''
Err: b'usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR\n [-db REFERENCE_DB] [--bypass-trim]\n [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>]\n [-q {phred33,phred64}] [--run-bmtagger] [--bypass-trf]\n [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]\n [--remove-intermediate-output] [--cat-final-output]\n [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]\n [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]\n [--trimmomatic-options TRIMMOMATIC_OPTIONS]\n [--bypass-trim-repetitive] [--bowtie2 BOWTIE2_PATH]\n [--bowtie2-options BOWTIE2_OPTIONS] [--no-discordant]\n [--reorder] [--serial] [--bmtagger BMTAGGER_PATH]\n [--trf TRF_PATH] [--match MATCH] [--mismatch MISMATCH]\n [--delta DELTA] [--pm PM] [--pi PI] [--minscore MINSCORE]\n [--maxperiod MAXPERIOD] [--fastqc FASTQC_PATH]\nkneaddata: error: unrecognized arguments: --run-trf\n'

2021-04-15 16:15:52,507 LoggerReporter task_failed ERROR: task 20, metaphlan____HD32R1_subsample : Failed! Error message : Task failed because parent task `11' failed
2021-04-15 16:15:52,507 LoggerReporter task_failed ERROR: task 29, humann____HD32R1_subsample : Failed! Error message : Task failed because parent task `11' failed
2021-04-15 16:15:52,507 LoggerReporter task_failed ERROR: task 36, humann_regroup_UniRef2EC____HD32R1_subsample : Failed! Error message : Task failed because parent task `29' failed
2021-04-15 16:15:52,507 LoggerReporter task_failed ERROR: task 51, humann_renorm_ecs_relab____HD32R1_subsample : Failed! Error message : Task failed because parent task `36' failed
2021-04-15 16:15:52,507 LoggerReporter task_failed ERROR: task 45, humann_renorm_genes_relab____HD32R1_subsample : Failed! Error message : Task failed because parent task `29' failed
2021-04-15 16:15:52,507 LoggerReporter task_failed ERROR: task 57, humann_renorm_pathways_relab____HD32R1_subsample : Failed! Error message : Task failed because parent task `29' failed
2021-04-15 16:15:52,507 LoggerReporter log_event INFO: task 0, kneaddata____LD96R2_subsample : ready and waiting for resources
2021-04-15 16:15:52,508 LoggerReporter log_event INFO: task 0, kneaddata____LD96R2_subsample : starting to run
2021-04-15 16:15:52,661 LoggerReporter task_command INFO: Tracked executable version: kneaddata v0.7.8.1

2021-04-15 16:15:52,661 LoggerReporter task_command INFO: Executing with shell: kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/LD96R2_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix LD96R2_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LD96R2_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LD96R2_subsample.fastq
2021-04-15 16:15:52,812 LoggerReporter task_failed ERROR: task 0, kneaddata____LD96R2_subsample : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/helpers.py", line 89, in actually_sh
    ret = _sh(s, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
anadama2.util.ShellException: [Errno 2] Command `kneaddata --input /work/izardlabhcc/sdpapet/Tutorials/workflows/input/LD96R2_subsample.fastq.gz --output /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main --threads 1 --output-prefix LD96R2_subsample --reference-db /lustre/work/izardlabhcc/sdpapet/Tutorials/workflows/database --serial --run-trf && mv /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LD96R2_subsample.repeats.removed.fastq /work/izardlabhcc/sdpapet/Tutorials/workflows/output_data/kneaddata/main/LD96R2_subsample.fastq' failed.
Out: b''
Err: b'usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR\n [-db REFERENCE_DB] [--bypass-trim]\n [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>]\n [-q {phred33,phred64}] [--run-bmtagger] [--bypass-trf]\n [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output]\n [--remove-intermediate-output] [--cat-final-output]\n [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG]\n [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY]\n [--trimmomatic-options TRIMMOMATIC_OPTIONS]\n [--bypass-trim-repetitive] [--bowtie2 BOWTIE2_PATH]\n [--bowtie2-options BOWTIE2_OPTIONS] [--no-discordant]\n [--reorder] [--serial] [--bmtagger BMTAGGER_PATH]\n [--trf TRF_PATH] [--match MATCH] [--mismatch MISMATCH]\n [--delta DELTA] [--pm PM] [--pi PI] [--minscore MINSCORE]\n [--maxperiod MAXPERIOD] [--fastqc FASTQC_PATH]\nkneaddata: error: unrecognized arguments: --run-trf\n'

2021-04-15 16:15:52,812 LoggerReporter task_failed ERROR: task 13, kneaddata_read_count_table : Failed! Error message : Task failed because parent task `0' failed
2021-04-15 16:15:52,812 LoggerReporter task_failed ERROR: task 14, metaphlan____LD96R2_subsample : Failed! Error message : Task failed because parent task `0' failed
2021-04-15 16:15:52,812 LoggerReporter task_failed ERROR: task 21, metaphlan_join_taxonomic_profiles : Failed! Error message : Task failed because parent task `14' failed
2021-04-15 16:15:52,812 LoggerReporter task_failed ERROR: task 22, metaphlan_count_species : Failed! Error message : Task failed because parent task `21' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 23, humann____LD96R2_subsample : Failed! Error message : Task failed because parent task `0' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 30, humann_count_alignments_species : Failed! Error message : Task failed because parent task `23' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 31, humann_regroup_UniRef2EC____LD96R2_subsample : Failed! Error message : Task failed because parent task `23' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 38, humann_join_tables_ecs : Failed! Error message : Task failed because parent task `32' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 46, humann_renorm_ecs_relab____LD96R2_subsample : Failed! Error message : Task failed because parent task `31' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 59, humann_join_tables_ecs_relab : Failed! Error message : Task failed because parent task `46' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 62, humann_count_features_ecs : Failed! Error message : Task failed because parent task `59' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 37, humann_join_tables_genefamilies : Failed! Error message : Task failed because parent task `23' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 39, humann_join_tables_pathabundance : Failed! Error message : Task failed because parent task `23' failed
2021-04-15 16:15:52,813 LoggerReporter task_failed ERROR: task 40, humann_renorm_genes_relab____LD96R2_subsample : Failed! Error message : Task failed because
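
For anyone triaging a log like the one above: most of the ERROR lines are downstream tasks that failed only because a parent task failed, so the few failures that carry their own error message are the ones worth reading. Below is a minimal Python sketch, assuming the `task_failed` line layout seen in this log. The name `root_failures` and the two-line `SAMPLE_LOG` are illustrative only and are not part of AnADAMA2.

```python
import re

# Matches "task_failed" lines in the layout seen in the log above.
# This layout is an assumption based on this one log, not a documented format.
FAILED = re.compile(
    r"task_failed ERROR: task (\d+), (\S+) : Failed! Error message : (.*)"
)

# Prefix used by failures that are only a consequence of a failed parent task.
CASCADE_PREFIX = "Task failed because parent task"


def root_failures(log_text):
    """Return (task_id, task_name) for failures not caused by a failed parent."""
    roots = []
    for match in FAILED.finditer(log_text):
        task_id, name, message = match.groups()
        if not message.startswith(CASCADE_PREFIX):
            roots.append((int(task_id), name))
    return roots


# Two lines taken from the log above: one root failure and one cascade failure.
SAMPLE_LOG = (
    "2021-04-15 16:15:51,300 LoggerReporter task_failed ERROR: task 3, "
    "kneaddata____LV20R4_subsample : Failed! Error message : "
    "Error executing action 0. Original Exception:\n"
    "2021-04-15 16:15:51,301 LoggerReporter task_failed ERROR: task 16, "
    "metaphlan____LV20R4_subsample : Failed! Error message : "
    "Task failed because parent task `3' failed\n"
)

print(root_failures(SAMPLE_LOG))
```

Applied to the full log, this would leave only the kneaddata tasks, whose stderr ends with `unrecognized arguments: --run-trf`.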

Has anyone run into this problem?

Thanks

Can anyone help with this?

Hi Ben, Thank you for the detailed post. Upgrading to the latest version of Kneaddata, v0.10.0, should resolve the error you are seeing when running the bioBakery workflows.

Sorry for any confusion. A recent change to Kneaddata removed one of the flags that the workflows use, so the two were a bit out of sync. We have updated the latest version of Kneaddata so that its options are back in sync with the workflows.

Thank you,
Lauren
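
One way to check for this kind of mismatch before launching a workflow is to look for the flag in the tool's help text. In the log above the workflow passes `--run-trf`, while the usage message printed by kneaddata v0.7.8.1 lists `--bypass-trf` but not `--run-trf`. The following is a minimal Python sketch of that check. The function names are hypothetical, and `USAGE_EXCERPT` is a shortened copy of the usage text from the log rather than the full help output.

```python
import re
import subprocess

# Shortened excerpt of the kneaddata v0.7.8.1 usage text from the log above.
USAGE_EXCERPT = (
    "usage: kneaddata [-h] [--version] [-v] -i INPUT -o OUTPUT_DIR "
    "[-db REFERENCE_DB] [--bypass-trim] [--run-bmtagger] [--bypass-trf] "
    "[--reorder] [--serial] [--trf TRF_PATH]"
)


def help_mentions_flag(help_text, flag):
    """True if `flag` appears as a whole option name in the help text."""
    # The lookarounds stop "--trf" from matching inside a longer option name.
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


def installed_tool_accepts(tool, flag):
    """Run `<tool> --help` and look for `flag` in its combined output.

    Raises FileNotFoundError if `tool` is not on the PATH.
    """
    result = subprocess.run(
        [tool, "--help"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return help_mentions_flag(result.stdout, flag)


print(help_mentions_flag(USAGE_EXCERPT, "--run-trf"))
print(help_mentions_flag(USAGE_EXCERPT, "--bypass-trf"))
```

On a machine where Kneaddata is installed, `installed_tool_accepts("kneaddata", "--run-trf")` would run the same check against the real help output. Inside a container, it would need to be run with the container's own Python so that it sees the same kneaddata executable as the workflow.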

Hi Lauren,

Thanks for the reply. I googled a bit and found some suggestions to revert Kneaddata to a version lower than 0.7.8. I tried that, but it didn’t work.

By the way, could you double-check the demo datasets in section 2.1 of the bioBakery 3.0 tutorials?

The six demo datasets seem very small, even for subsampled data. I just want to make sure I downloaded the correct datasets. If they are not correct, could you give me the download links for the demo datasets?

Ben

Hi Ben, I just double-checked, and the links do point to the current tutorial data sets. They are deliberately very small so that running the workflow as a tutorial does not take much time. Each file should be approximately 2-3 MB. If yours are smaller, try downloading them again and see if that fixes it.

Thank you,
Lauren
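
To compare downloaded files against the "approximately 2-3 MB" figure, the sizes can be listed with a few lines of Python. This is only a sketch: the function name `undersized_files` and the 1.5 MB default cutoff are assumptions chosen for illustration, and the `*.fastq.gz` pattern follows the `input_extension = fastq.gz` setting in the log above.

```python
from pathlib import Path


def undersized_files(directory, pattern="*.fastq.gz", min_bytes=1500000):
    """Return names of files matching `pattern` that are smaller than `min_bytes`.

    The default cutoff is an illustrative guess at "noticeably below 2-3 MB",
    not an official threshold.
    """
    return sorted(
        path.name
        for path in Path(directory).glob(pattern)
        if path.stat().st_size < min_bytes
    )
```

For example, `undersized_files("input")` would list any tutorial files in the `input` directory that look truncated and may be worth downloading again.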

Hi Lauren,

Thanks for the help. I have upgraded Kneaddata to v0.10.0, but it still doesn’t work. I also tried reverting to a version lower than v0.7.8 (someone posted that a lower version works), but that didn’t work either.

I used the example data here (GitHub - biobakery/biobakery_workflows: bioBakery workflows is a collection of workflows and tasks for executing common microbial community analyses using standardized, validated tools and parameters.) and ran the metagenome and metatranscriptome workflow together.

biobakery_workflows wmgx_wmtx --input-metagenome examples/wmgx_wmtx/wms/ --input-metatranscriptome examples/wmgx_wmtx/wts/ --input-mapping examples/wmgx_wmtx/mapping.tsv --output workflow_output

The example data files are very small, less than 1 MB each, so I assume there is no problem with the examples themselves. I am working on an HPC cluster and installed bioBakery as a container image. I don’t think that should make much difference.

This is all I did (we run the Docker container through Singularity; if you think we didn’t install it properly, please let me know the best way to install it):

module load singularity

singularity exec docker://unlhcc/biobakery biobakery_workflows wmgx_wmtx --input-metagenome examples/wmgx_wmtx/wms/ --input-metatranscriptome examples/wmgx_wmtx/wts/ --input-mapping examples/wmgx_wmtx/mapping.tsv --output workflow_output

I have attached the full log file to this post; the errors seem similar to the earlier ones.

Thanks,
Ben

Error log.txt (24.1 KB)

Hi Ben - Thank you for the detailed post. Looking through the log, the workflow appears to be failing because it can’t find the databases needed for the Kneaddata tasks. We don’t include the databases in the Docker image because they are too large: in total they are on the order of tens of GB, which would not fit within the default size of a Docker container. You would need to first install the databases before running the Docker container. Sorry for any confusion this might have caused.

Thank you,
Lauren
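
A quick way to confirm whether a database directory is actually populated is to look for index files in it. The sketch below assumes the Kneaddata reference database is a Bowtie 2 index, whose files end in `.bt2` (or `.bt2l` for large indexes). That assumption rests on the `--bowtie2` options in the usage text earlier in this thread, and `bowtie2_index_files` is a hypothetical helper name.

```python
from pathlib import Path

# File suffixes used by Bowtie 2 index files (small and large index formats).
BOWTIE2_SUFFIXES = {".bt2", ".bt2l"}


def bowtie2_index_files(db_dir):
    """Return the Bowtie 2 index file names found directly inside `db_dir`.

    An empty list means the directory is missing or holds no index files,
    in which case a task pointed at it has no database to read.
    """
    db_path = Path(db_dir)
    if not db_path.is_dir():
        return []
    return sorted(
        path.name
        for path in db_path.iterdir()
        if path.suffix in BOWTIE2_SUFFIXES
    )
```

When the workflow runs inside a container, the check is only meaningful if it is run from inside that container too, since a directory on the host may not be visible at the same path there.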