Hello.
I am trying to reinstall bioBakery workflows with the following commands:
conda create --name biobakery
conda activate biobakery
conda install -c biobakery biobakery_workflows
conda install -c biobakery leveldb ### This did not resolve the missing leveldb dependency
pip install leveldb ### This did resolve it
biobakery_workflows wmgx --help
The last command produced the following output:
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 1
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 2
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 3
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 4
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 5
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution
sinfo: error: slurm_set_addr: Unable to resolve "localcluster"
sinfo: error: Unable to establish control machine address
slurm_load_partitions: Resource temporarily unavailable
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 1
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 2
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 3
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 4
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 5
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution
sinfo: error: slurm_set_addr: Unable to resolve "localcluster"
sinfo: error: Unable to establish control machine address
slurm_load_partitions: Resource temporarily unavailable
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 1
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 2
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 3
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 4
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 5
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution
sinfo: error: slurm_set_addr: Unable to resolve "localcluster"
sinfo: error: Unable to establish control machine address
slurm_load_partitions: Resource temporarily unavailable
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 1
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 2
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 3
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 4
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution: Resource temporarily unavailable, attempt number 5
sinfo: error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution
sinfo: error: slurm_set_addr: Unable to resolve "localcluster"
sinfo: error: Unable to establish control machine address
slurm_load_partitions: Resource temporarily unavailable
usage: wmgx.py [-h] [--version]
[--input-extension {fastq.gz,fastq,fq.gz,fq,fasta,fasta.gz,fastq.bz2,fq.bz2,bam}]
[--barcode-file BARCODE_FILE]
[--dual-barcode-file DUAL_BARCODE_FILE]
[--index-identifier INDEX_IDENTIFIER]
[--min-pred-qc-score MIN_PRED_QC_SCORE] [--threads THREADS]
[--pair-identifier PAIR_IDENTIFIER] [--interleaved]
[--bypass-quality-control]
[--contaminate-databases CONTAMINATE_DATABASES]
[--qc-options QC_OPTIONS] [--qc-scratch QC_SCRATCH]
[--functional-profiling-options FUNCTIONAL_PROFILING_OPTIONS]
[--remove-intermediate-output] [--bypass-functional-profiling]
[--bypass-strain-profiling] [--run-strain-gene-profiling]
[--bypass-taxonomic-profiling] [--run-assembly]
[--strain-profiling-options STRAIN_PROFILING_OPTIONS]
[--taxonomic-profiling-options TAXONOMIC_PROFILING_OPTIONS]
[--max-strains MAX_STRAINS] [--strain-list STRAIN_LIST]
[--assembly-options ASSEMBLY_OPTIONS] -o OUTPUT [-i INPUT]
[--config CONFIG] [--local-jobs JOBS] [--grid-jobs GRID_JOBS]
[--grid GRID] [--grid-partition GRID_PARTITION]
[--grid-benchmark {on,off}] [--grid-options GRID_OPTIONS]
[--grid-submit-sleep GRID_SUBMIT_SLEEP]
[--grid-environment GRID_ENVIRONMENT]
[--grid-scratch GRID_SCRATCH] [--grid-time-max GRID_TIME_MAX]
[--grid-mem-max GRID_MEM_MAX] [--dry-run] [--skip-nothing]
[--quit-early] [--until-task UNTIL_TASK]
[--exclude-task EXCLUDE_TASK] [--target TARGET]
[--exclude-target EXCLUDE_TARGET]
[--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
A workflow for whole metagenome shotgun sequences
options:
-h, --help show this help message and exit
--version show program's version number and exit
--input-extension {fastq.gz,fastq,fq.gz,fq,fasta,fasta.gz,fastq.bz2,fq.bz2,bam}
the input file extension
[default: fastq.gz]
--barcode-file BARCODE_FILE
the barcode file
[default: ]
--dual-barcode-file DUAL_BARCODE_FILE
the string to identify the dual barcode file
[default: ]
--index-identifier INDEX_IDENTIFIER
the string to identify the index files
[default: _I1_001]
--min-pred-qc-score MIN_PRED_QC_SCORE
the min phred quality score to use for demultiplexing
[default: 2]
--threads THREADS number of threads/cores for each task to use
[default: 1]
--pair-identifier PAIR_IDENTIFIER
the string to identify the first file in a pair, must proceed the file extension (ie R1_001.fastq.gz)
[default: .R1]
--interleaved indicates whether or not sequence files are interleaved
[default: False]
--bypass-quality-control
do not run the quality control tasks
--contaminate-databases CONTAMINATE_DATABASES
the path (or comma-delimited paths) to the contaminate
reference databases for QC
[default: /home/microviable/biobakery_workflows_databases/kneaddata_db_human_genome]
--qc-options QC_OPTIONS
additional options when running the QC step
[default: ]
--qc-scratch QC_SCRATCH
scratch space to be used when running the QC step
[default: ]
--functional-profiling-options FUNCTIONAL_PROFILING_OPTIONS
additional options when running the functional profiling step
[default: ]
--remove-intermediate-output
remove intermediate output files
--bypass-functional-profiling
do not run the functional profiling tasks
--bypass-strain-profiling
do not run the strain profiling tasks (StrainPhlAn)
--run-strain-gene-profiling
run the gene-based strain profiling tasks (PanPhlAn)
--bypass-taxonomic-profiling
do not run the taxonomic profiling tasks (a tsv profile for each sequence file must be included in the input folder using the same sample name)
--run-assembly run the assembly and annotation tasks
--strain-profiling-options STRAIN_PROFILING_OPTIONS
additional options when running the strain profiling step
[default: ]
--taxonomic-profiling-options TAXONOMIC_PROFILING_OPTIONS
additional options when running the taxonomic profiling step
[default: ]
--max-strains MAX_STRAINS
the max number of strains to profile
[default: 20]
--strain-list STRAIN_LIST
input file with list of strains to profile
[default: ]
--assembly-options ASSEMBLY_OPTIONS
additional options when running the assembly step
[default: ]
-o OUTPUT, --output OUTPUT
Write output to this directory
-i INPUT, --input INPUT
Find inputs in this directory
[default: /media/microviable/g/DATOSSECUENCIACION/Script test]
--config CONFIG Find workflow configuration in this folder
[default: only use command line options]
--local-jobs JOBS Number of tasks to execute in parallel locally
[default: 1]
--grid-jobs GRID_JOBS
Number of tasks to execute in parallel on the grid
[default: 0]
--grid GRID Run gridable tasks on this grid type
[default: slurm]
--grid-partition GRID_PARTITION
Partition/queue used for gridable tasks.
Provide a single partition or a comma-delimited list
of short/long partitions with a cutoff.
[default: serial_requeue,serial_requeue,240]
--grid-benchmark {on,off}
Benchmark gridable tasks
[default: on]
--grid-options GRID_OPTIONS
Grid specific options that will be applied to each grid task
--grid-submit-sleep GRID_SUBMIT_SLEEP
Number of seconds to wait between job submissions on grid
[default: 5]
--grid-environment GRID_ENVIRONMENT
Commands that will be run before each grid task to set up environment
--grid-scratch GRID_SCRATCH
The folder to write intermediate scratch files for grid jobs
--grid-time-max GRID_TIME_MAX
The max time allowed for a grid task (in minutes)
--grid-mem-max GRID_MEM_MAX
The max memory allowed for a grid task (in MB)
--dry-run Print tasks to be run but don't execute their actions
--skip-nothing Run all tasks. Rerun tasks that have already been run.
--quit-early Stop if a task fails. By default,
all tasks (except sub-tasks of failed tasks) will run.
--until-task UNTIL_TASK
Stop after running this task. Use task name or number.
--exclude-task EXCLUDE_TASK
Don't run these tasks. Add multiple times to append.
--target TARGET Only run tasks that generate these targets.
Add multiple times to append.
Patterns with ? and * are allowed.
--exclude-target EXCLUDE_TARGET
Don't run tasks that generate these targets.
Add multiple times to append.
Patterns with ? and * are allowed.
--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set the level of output for the log
[default: INFO]
Any idea what is causing this?
Also, when I tried to install the wmgx databases, I received this:
biobakery_workflows_databases --install wmgx --location /media/microviable/e/bwfdb
Installing humann utility mapping database
Download URL: http://huttenhower.sph.harvard.edu/humann_data/full_mapping_v201901b.tar.gz
CRITICAL ERROR: Unable to download and extract from URL: http://huttenhower.sph.harvard.edu/humann_data/full_mapping_v201901b.tar.gz
WARNING: Unable to install database. Error running command: humann_databases --download utility_mapping full /media/microviable/e/bwfdb/humann
Unable to find strainphlan install.
However, StrainPhlAn is installed:
strainphlan --version
Mon Dec 18 17:39:20 2023: StrainPhlAn version 4.0.6 (1 Mar 2023)
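
In case it helps with diagnosis, here is a minimal sketch of how to check which of the executables mentioned above are visible on PATH in the active environment. The `check_tool` helper is a hypothetical name introduced only for this illustration; it is not part of bioBakery, and it only reports PATH lookup, not whether the tool actually works.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not part of bioBakery).
# It prints "found" or "missing" for a command name, based only on PATH lookup.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Executables that appear in the commands and output above.
for tool in strainphlan humann_databases sinfo; do
    printf '%s: %s\n' "$tool" "$(check_tool "$tool")"
done
```

Running this inside the activated `biobakery` environment would show whether `strainphlan` is visible there, and whether a `sinfo` executable is present on this machine.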