Conda installed workflow: topological_sort() & unexpected keyword argument "reverse"

I ran the following commands to create a conda env with the biobakery workflow.
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --add channels biobakery
conda install -c biobakery biobakery_workflows

conda install python=3.7
conda install -c biobakery leveldb

### download demo database and input files
biobakery_workflows_databases --install wmgx_demo
wget https://github.com/biobakery/biobakery_workflows/raw/master/examples/tutorial/input/HD42R4_subsample.fastq.gz
wget https://github.com/biobakery/biobakery_workflows/raw/master/examples/tutorial/input/HD32R1_subsample.fastq.gz
wget https://github.com/biobakery/biobakery_workflows/raw/master/examples/tutorial/input/HD48R4_subsample.fastq.gz
wget https://github.com/biobakery/biobakery_workflows/raw/master/examples/tutorial/input/LD96R2_subsample.fastq.gz
wget https://github.com/biobakery/biobakery_workflows/raw/master/examples/tutorial/input/LV16R4_subsample.fastq.gz
wget https://github.com/biobakery/biobakery_workflows/raw/master/examples/tutorial/input/LV20R4_subsample.fastq.gz
mkdir -p fastq
mv *fastq.gz fastq/


## run demo 
biobakery_workflows wmgx --input fastq/  --output demo_output/ --bypass-strain-profiling --local-jobs 6 --threads 2

On the last command (biobakery_workflows), I get the following error:

Traceback (most recent call last):
  File "/home/emilyw/.conda/envs/biobakery/bin/wmgx.py", line 183, in <module>
    workflow.go()
  File "/home/emilyw/.conda/envs/biobakery/lib/python3.7/site-packages/anadama2/workflow.py", line 775, in go
    task_idxs = nx.algorithms.dag.topological_sort(self.dag, reverse=True)
TypeError: topological_sort() got an unexpected keyword argument 'reverse'

Based on similar issues with other packages, it looks like this might be a version inconsistency. Maybe a Python/leveldb issue?

Hi - Thank you for your detailed post! I think you are right that the error is likely due to the version of a dependency. Rolling back your install of the Python package networkx to a version below v2.0 should resolve the issue. It looks like the v2.0 release of networkx changed the topological_sort function so that it no longer accepts the reverse keyword argument, which is why you see the error with the newer package. We will work on our end to update the AnADAMA2 software to work with the latest networkx release.
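For illustration only, the behavior change can be bridged with a small wrapper. This is a sketch, not the AnADAMA2 fix: reverse_topological_order is a hypothetical helper name, and it assumes only what is described above, namely that networkx 1.x accepts reverse=True and networkx 2.x rejects it with a TypeError.

```python
def reverse_topological_order(graph, topological_sort):
    """Return the nodes of `graph` in reverse topological order.

    `topological_sort` is the library function to call, for example
    networkx.topological_sort. Passing it in keeps this sketch
    independent of the installed networkx version.
    """
    try:
        # networkx 1.x: the function accepts reverse=True directly.
        return list(topological_sort(graph, reverse=True))
    except TypeError:
        # networkx 2.x: the keyword is gone, so sort first and reverse after.
        return list(reversed(list(topological_sort(graph))))
```

With networkx installed, a call would look like reverse_topological_order(dag, networkx.topological_sort). Catching TypeError is deliberately broad here; it would also hide an unrelated TypeError raised inside the sort.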

Thank you,
Lauren


Got it! Now I’m having an issue where the biobakery_workflows install is getting hung up on the environment solve. I deleted the old conda env and created a new one with the following code.

conda create -n biobakery python=3.7
conda activate biobakery
pip install networkx==1.11

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --add channels biobakery

conda install -c biobakery biobakery_workflows  ## this is the step that will not process

Do I need to downgrade Python to 2.7? I tried that, and the environment solve hung for a long time as well.

Great! No need to downgrade to python v2.7; you should be okay using python 3.7. The workflows do have a number of dependencies so the install might take a bit of time with conda. If you are familiar with Docker we host a workflows Dockerfile with all the dependencies.

Thank you,
Lauren


Great! I was able to set up the environment by removing the Python version flag altogether (with the flag, it would not resolve).

For anyone who may have this issue later, this code worked:

conda create -n metagenome 
conda activate metagenome
conda config --add channels biobakery
conda install -c biobakery biobakery_workflows

Update: I thought this worked but it doesn’t! Neither the demo nor the full wmgx database installs completely (see below), and I get the topological_sort() error when I try to run anyway with the part of the database that downloads successfully.

Python is version 3.8.6.

biobakery_workflows_databases --install wmgx
Installing humann2 utility mapping database
Download URL: http://huttenhower.sph.harvard.edu/humann2_data/full_mapping_1_1.tar.gz
Downloading file of size: 0.58 GB

0.58 GB 100.00 %   9.28 MB/sec  0 min -0 sec
Extracting: /home/emilyw/biobakery_workflows_databases/humann2/full_mapping_1_1.tar.gz

Database installed: /home/emilyw/biobakery_workflows_databases/humann2/utility_mapping

HUMAnN2 configuration file updated: database_folders : utility_mapping = /home/emilyw/biobakery_workflows_databases/humann2/utility_mapping
Generating strainphlan fasta database
Could not locate a Bowtie index corresponding to basename "/home/emilyw/.conda/envs/metagenome/bin/metaphlan_databases/mpa_v20_m200"
Error: Encountered internal Bowtie 2 exception (#1)
Command: /home/emilyw/.conda/envs/metagenome/bin/bowtie2-inspect-s --wrapper basic-0 /home/emilyw/.conda/envs/metagenome/bin/metaphlan_databases/mpa_v20_m200
Unable to install database. Error running command: b o w t i e 2 - i n s p e c t   / h o m e / e m i l y w / . c o n d a / e n v s / m e t a g e n o m e / b i n / m e t a p h l a n _ d a t a b a s e s / m p a _ v 2 0 _ m 2 0 0   >   / h o m e / e m i l y w / b i o b a k e r y _ w o r k f l o w s _ d a t a b a s e s / s t r a i n p h l a n _ d b _ m a r k e r s / a l l _ m a r k e r s . f a s t a

Then the topological_sort() error, the same as the initial error which prompted this post:

biobakery_workflows wmgx --input fastq/ --output demo_output/ --bypass-strain-profiling --local-jobs 6 --threads 2
Traceback (most recent call last):
  File "/home/emilyw/.conda/envs/metagenome/bin/wmgx.py", line 183, in <module>
    workflow.go()
  File "/home/emilyw/.conda/envs/metagenome/lib/python2.7/site-packages/anadama2/workflow.py", line 775, in go
    task_idxs = nx.algorithms.dag.topological_sort(self.dag, reverse=True)
TypeError: topological_sort() got an unexpected keyword argument 'reverse'

Hi - Thank you for the update, and sorry that you are still seeing errors with the install. It looks like the workflow has an error when trying to build the StrainPhlAn databases. Is the MetaPhlAn database installed? Sorry if that database dependency is confusing. We will update the code in the future to install the MetaPhlAn databases by default when the StrainPhlAn databases are installed, and also to provide a better error message if the MetaPhlAn database cannot be found.
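One way to check whether the MetaPhlAn Bowtie 2 index is present is to look for the index files next to the basename from the error message. The sketch below is not part of biobakery_workflows; missing_bowtie2_index_files is a hypothetical helper, and it assumes the standard Bowtie 2 index layout of six files per basename (.bt2, or .bt2l for large indexes).

```python
import os

# Suffixes of a standard (small) Bowtie 2 index; large indexes end in ".bt2l".
BOWTIE2_SUFFIXES = [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2"]


def missing_bowtie2_index_files(basename, large=False):
    """Return the index files expected for `basename` that are not on disk."""
    suffixes = [s + "l" if large else s for s in BOWTIE2_SUFFIXES]
    return [basename + s for s in suffixes if not os.path.isfile(basename + s)]


if __name__ == "__main__":
    # Basename copied from the Bowtie 2 error message in this thread.
    basename = "/home/emilyw/.conda/envs/metagenome/bin/metaphlan_databases/mpa_v20_m200"
    missing = missing_bowtie2_index_files(basename)
    if missing:
        print("Missing index files:")
        for path in missing:
            print("  " + path)
    else:
        print("All Bowtie 2 index files found.")
```

If every file is reported missing, the MetaPhlAn database has most likely not been installed in that environment yet.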

The other error looks to be from a package installed with Python 2.7, but it sounds like you are running everything else with Python 3. Can you check your environment to make sure you don’t have two installs of networkx? If there is only one, reinstalling the older version of networkx should resolve the issue.
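A quick way to see which networkx an interpreter actually picks up is to print its version and file location. The following is a minimal sketch, with describe_module as a hypothetical helper name; it sticks to syntax that runs on both Python 2.7 and Python 3, so it can be run with each interpreter found in the environment.

```python
import importlib
import sys


def describe_module(name):
    """Return (version, file path) for an importable module, or None."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "unknown")
    return version, path


if __name__ == "__main__":
    print("Interpreter: %s (Python %s)" % (sys.executable, sys.version.split()[0]))
    for name in ("networkx", "anadama2"):
        print("%s -> %s" % (name, describe_module(name)))
```

If the reported networkx path sits under a different pythonX.Y directory than expected, or the version is 2.0 or later, that would point to the mixed install described above.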

Thank you,
Lauren

I used conda install -c bioconda networkx=1.11.
