Biobakery Workflow database

Hi, I’m having a bit of an issue using biobakery workflows. The main issue is that the database (for wmgx) is stored in a different path for memory-related reasons. How do I make biobakery workflows use the database stored elsewhere?

Hello, you can set environment variables to point to databases that are in a custom location.

If all databases are under one custom folder, set the variable $BIOBAKERY_WORKFLOWS_DATABASES.

Alternatively, if you have multiple custom folders, set one or more of these environment variables:
$KNEADDATA_DB_HUMAN_GENOME, $KNEADDATA_DB_RIBOSOMAL_RNA, $KNEADDATA_DB_HUMAN_TRANSCRIPTOME, $STRAINPHLAN_DB_REFERENCE, and $STRAINPHLAN_DB_MARKERS.
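
As a rough sketch of the two approaches above, the variables could be exported in the shell before launching the workflow. The folder paths here are hypothetical placeholders, not locations the workflow expects; substitute wherever your databases actually live. Normally you would use only one of the two options.

```shell
# Hypothetical root folder; replace with your real database location.
DB_ROOT="/path/to/custom/databases"

# Option 1: all databases live under a single custom folder.
export BIOBAKERY_WORKFLOWS_DATABASES="$DB_ROOT"

# Option 2: databases are spread across several folders.
# Set only the variables that apply; subfolder names below are made up.
export KNEADDATA_DB_HUMAN_GENOME="$DB_ROOT/human_genome"
export KNEADDATA_DB_RIBOSOMAL_RNA="$DB_ROOT/ribosomal_rna"
export KNEADDATA_DB_HUMAN_TRANSCRIPTOME="$DB_ROOT/human_transcriptome"
export STRAINPHLAN_DB_REFERENCE="$DB_ROOT/strainphlan_reference"
export STRAINPHLAN_DB_MARKERS="$DB_ROOT/strainphlan_markers"
```

Adding these lines to a shell startup file such as ~/.bashrc would keep them set across sessions.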

If your HUMAnN databases are stored in a custom folder, there is nothing you need to do: the HUMAnN configuration file gets updated when you select the folder and download the database through the HUMAnN database tool. If your MetaPhlAn database is installed in a custom folder, you need to provide that location as a --metaphlan-option with the --index option when running the workflow.

Please post if you have additional questions or are still running into issues.

Thanks!
Lauren

Hi Lauren,
Just a follow-up to this.

Does running the command below install all the needed, up-to-date databases (i.e., if we were to run MetaPhlAn 4 through the workflow and wanted the updated databases)?
$ biobakery_workflows_databases --install wmgx

Or do we need to download the individual databases and provide their locations?

Hi Arya, yes, that should install all the required databases for the wmgx workflow except MetaPhlAn. The MetaPhlAn database will be installed the first time you run it. Alternatively, you can install that database by running $ metaphlan --install.

Thanks!
Lauren