FileNotFoundError when installing 16s databases

Hello,
I followed the instructions to install the dependencies and databases for the 16s workflow. I am getting the following error:
"""
Traceback (most recent call last):
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/bin/biobakery_workflows_databases", line 10, in <module>
    sys.exit(main())
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/lib/python3.7/site-packages/biobakery_workflows/biobakery_workflows_databases.py", line 244, in main
    run_command(["download_picrust_files.py"])
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/lib/python3.7/site-packages/biobakery_workflows/biobakery_workflows_databases.py", line 73, in run_command
    id=subprocess.check_call(command, shell=shell)
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/lib/python3.7/subprocess.py", line 342, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/lib/python3.7/subprocess.py", line 323, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/cluster/tufts/bio/tools/conda_envs/biobakery/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'download_picrust_files.py': 'download_picrust_files.py'
"""
I have picrust2 installed:
picrust2 2.0.3_b py_0 bioconda

but I was unable to install picrust (1) because of unresolvable conda conflicts between picrust and biobakery_workflows.

I would appreciate any help on resolving this.

Thank you,
Rebecca

In addition, I get the same error when using the Docker container. Apologies if I’m doing something basic wrong and thanks for your help.
"""
root@e8ebf633d935:/# biobakery_workflows_databases --install 16s_usearch
Downloading green genes database files
Downloading ftp://greengenes.microbio.me/greengenes_release/gg_13_8_otus/rep_set/97_otus.fasta
Downloading file of size: 136.58 MB

136.59 MB 100.00 % 5.19 MB/sec 0 min -0 sec

Downloading ftp://greengenes.microbio.me/greengenes_release/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
Downloading file of size: 9.61 MB

9.61 MB 100.01 % 4.66 MB/sec 0 min -0 sec

Traceback (most recent call last):
  File "/usr/local/bin/biobakery_workflows_databases", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/biobakery_workflows/biobakery_workflows_databases.py", line 244, in main
    run_command(["download_picrust_files.py"])
  File "/usr/local/lib/python3.6/dist-packages/biobakery_workflows/biobakery_workflows_databases.py", line 73, in run_command
    id=subprocess.check_call(command, shell=shell)
  File "/usr/lib/python3.6/subprocess.py", line 306, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib/python3.6/subprocess.py", line 287, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'download_picrust_files.py': 'download_picrust_files.py'
"""

Hi Rebecca, the workflows use a PICRUSt v1 script (download_picrust_files.py) to install the PICRUSt databases, even though the workflows themselves can run either PICRUSt 1 or 2. If you install PICRUSt v1, the script the workflows are looking for will be available, which should resolve the error you are seeing.

Thank you,
Lauren
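
A quick way to confirm whether the script is visible before re-running the database install is to look it up on the PATH. The snippet below is only a minimal sketch: find_missing_scripts is a hypothetical helper written for this thread, not part of biobakery_workflows, and it assumes the installer resolves download_picrust_files.py through the PATH (which is what the FileNotFoundError from subprocess suggests).

```python
import shutil


def find_missing_scripts(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # download_picrust_files.py is the PICRUSt v1 script named in the traceback.
    missing = find_missing_scripts(["download_picrust_files.py"])
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All required scripts were found on PATH.")
```

Run it with the Python interpreter of the activated environment, so that the PATH it inspects is the one the workflows will see.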

Thank you for your quick and helpful reply. I would like to install picrust 1 but I seem to have dependency conflicts. I’ve tried with both python 2.7 and python 3.9, and can’t seem to get both biobakery_workflows and picrust 1 installed in a conda env. This is what I see:

(/cluster/tufts/bio/tools/conda_envs/biobakery) rbator01@c1cmp047:~$ conda install -c bioconda picrust
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: - 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - picrust -> python[version='2.7.*|>=2.7,<2.8.0a0']

Your python: python=3.7

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

Thank you for the detailed error messages. I am not sure whether PICRUSt v1 is Python 3 compatible, so you might want to use Python 2.7. If there are no compatible conda packages for PICRUSt v1, you might consider installing it from source. It is an easy, well-documented process that takes only a couple of commands. See "Installing PICRUSt" in the PICRUSt 1.1.4 documentation.

Thank you,
Lauren

Hello,
I have the same problem: the file download_picrust_files.py is missing after installing the conda biobakery environment.
I can't have biobakery_workflows and PICRUSt v1 in the same environment because of the Python version conflict.
I can't install PICRUSt v1 from source on my cluster. Is there an alternative solution? I want to try the biobakery 16s workflow (dada2 and usearch).

Thanks in advance for the help.

Regards

Hi! I have the same issue, how did you solve it?
Thank you.

After stumbling on the same error (to the team: please, please update your READMEs and tutorials so the information is accurate), here is how I proceeded:

Create a biobakery_py27_16S.yml file to define a conda environment:

name: biobakery_py27_16S

channels:
  - bioconda
  - biobakery

dependencies:
  - python=2.7
  - picrust
  - biobakery_workflows
  - clustalo

The environment installed properly with mamba (did not test with conda).
biobakery_workflows_databases --install 16s_dada2 then worked without error.

Well… The database install worked, but the pipeline itself does not run.

~/sciliciumtheo$ biobakery_workflows 16s --input test_data --method dada2 --output bb_dada2
Traceback (most recent call last):
  File "/sciliciumtheo/miniconda3/envs/biobakery_py27_16S/bin/16s.py", line 184, in <module>
    workflow.go()
  File "/sciliciumtheo/miniconda3/envs/biobakery_py27_16S/lib/python2.7/site-packages/anadama2/workflow.py", line 803, in go
    _runner.run_tasks(task_idxs)
  File "/sciliciumtheo/miniconda3/envs/biobakery_py27_16S/lib/python2.7/site-packages/anadama2/runners.py", line 144, in run_tasks
    self.ctx._handle_task_result(result)
  File "/sciliciumtheo/miniconda3/envs/biobakery_py27_16S/lib/python2.7/site-packages/anadama2/workflow.py", line 822, in _handle_task_result
    self._backend.save(result.dep_keys, result.dep_compares)
  File "/sciliciumtheo/miniconda3/envs/biobakery_py27_16S/lib/python2.7/site-packages/anadama2/backends.py", line 149, in save
    decoded_val.append(v.decode("utf-8"))
  File "/sciliciumtheo/miniconda3/envs/biobakery_py27_16S/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 43: ordinal not in range(128)
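
The error type is a hint here. In Python 2, calling .decode("utf-8") on a value that is already a unicode string makes Python first encode it with the default ASCII codec, and that step fails on a character such as u'\u2018' (a left single quotation mark). Below is a minimal sketch of a guard that decodes only byte strings. to_text is a hypothetical helper for illustration, not part of AnADAMA2, and this is not presented as the fix the developers applied.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only when it is a byte string."""
    # On Python 2, bytes is an alias of str, so unicode objects skip the decode.
    # On Python 3, str has no decode method at all, so the check is also needed.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A mix of byte strings and text, including the character from the traceback.
values = [b"plain bytes", u"\u2018quoted\u2019", b"\xe2\x80\x98"]
decoded = [to_text(v) for v in values]
```

The same helper behaves identically on Python 2.7 and Python 3, which is why this kind of check is a common way to keep code working on both.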

Making progress…

One of the problems was a buggy dependency of DADA2. Pinning it to an earlier version fixed the issue (the only pinned package in the file below is r-matrix=1.3_2).
Here is the conda configuration file that I used to create the environment:

name: biobakery_py27_16S

channels:
  - bioconda
  - biobakery
  - conda-forge

dependencies:
  - python=2.7
  - picrust
  - biobakery_workflows
  - bioconductor-dada2
  - r-gridextra
  - r-ggplot2
  - r-seqinr
  - r-matrix=1.3_2
  - clustalo
  - fasttree
  - ea-utils

The 16S workflow (DADA2 version) runs normally with this configuration. Command used:
biobakery_workflows 16s --input test_data --method dada2 --output bb_dada2 --bypass-functional-profiling --bypass-primers-removal --log-level DEBUG --picrust-version 1

As a note for the devs, the option --bypass-functional-profiling seems to have no effect (I was forced to tell the workflow which picrust version to use, even though it should not have to use it…).

Hello @SciLiciumTheo, thanks for your detailed post and follow-up! Sorry for the slow response on our end. Your most recent error looks to be a Python 2/3 issue. The latest version of the workflows is Python 3 compliant, so upgrading to Python 3.7 might make installation easier. If you are required to use Python 2.7 (which I totally understand, as we sometimes have to be backwards compatible too), please post if you run into any other issues.

Yes, you are right. For older versions of the workflows PICRUSt was not compatible with DADA2 so functional profiling was not run by default. Sorry for any confusion.

Thanks!
Lauren