16S workflow install

Dear Biobakery team

I have spent three days trying to work around incompatibilities in the 16S workflow installation, and I am now reaching out for help. I am trying to install the 16S workflow on a shared HPC without sudo access. I know there is another post on this [FileNotFoundError when installing 16s databases], but the solution proposed there did not work for me, and I’m not sure whether it’s better etiquette to start a new thread or hijack an existing one. Sorry if I’m duplicating. With the proposed conda yml, conda unfortunately just sat for hours trying to solve the environment, with no success.

There just doesn’t seem to be a way to install the 16S databases and the R requirements in one conda environment. The conda install pulls in Python 2.7 and R 3.5, which means that various R packages don’t work. Using the pip install approach with a Python 3 venv means that PICRUSt doesn’t work. In the aforementioned post, Lauren McIver suggested installing PICRUSt1 separately. I have installed PICRUSt1 (using a Python 2.7 venv), and it has the utility script download_picrust_files.py, but the script doesn’t work because the pip environment is using Python 3.6 (I didn’t install PICRUSt from source - I used their pip install - perhaps that is a mistake on my part?). I’ve also installed PICRUSt2, which is listed as an option on your main GitHub page for the 16S workflows, but it doesn’t come with the download script.

I have all the R libraries installed in my local user R (4.3) and all the packages installed in the pip venv.

I’m happy to forgo functional profiling with PICRUSt. I just want to install the 16S databases to run DADA2. I’m guessing that if I can download the required files using Python 2.7 and PICRUSt1, then I should be able to place them in the venv (install_workflows). But where? Given that the bioBakery workflows are now all Python 3 compatible, is there perhaps a conda yml file for Python 3 and, critically, for R 4 or later so that ggplot2/gridExtra can work? I tried making my own, but conda eventually gave up with conflicts.
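
For the "But where?" question, a useful first step is to confirm which environment is active and where a given package is installed inside it. The sketch below is only a diagnostic. It assumes the workflows are importable under the module name `biobakery_workflows` (an assumption, not confirmed in this thread), and it does not show where the 16S databases are expected to live, which the thread does not state.

```python
import sys
from importlib.util import find_spec
from pathlib import Path
from typing import Optional


def package_location(name: str) -> Optional[Path]:
    """Return the install directory of a top-level package or module, or None if absent."""
    spec = find_spec(name)
    if spec is None:
        # Not importable from the currently active environment.
        return None
    if spec.submodule_search_locations:
        # Regular package: the first search location is the package directory.
        return Path(list(spec.submodule_search_locations)[0])
    if spec.origin and spec.origin not in ("built-in", "frozen"):
        # Single-file module: report the directory that contains it.
        return Path(spec.origin).parent
    return None


if __name__ == "__main__":
    print("Active environment:", sys.prefix)
    print("Python version:", sys.version.split()[0])
    # "biobakery_workflows" is an assumed module name used for illustration.
    print("biobakery_workflows:", package_location("biobakery_workflows"))
```

Run it with the venv activated. If the last line prints `None`, the package is not visible to that interpreter, which usually means a different Python is first on the PATH.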

BTW I have successfully used the wmgx pipeline and it worked wonderfully.

Thank you!

Kind Regards

Hi Jonathan, Thanks for your detailed post! Since you are on a shared HPC platform without sudo access, I would recommend installing the tools with pip. The latest version of the workflows is compatible with the latest version of PICRUSt, and both should work with Python 3. With this approach you would have to manually install the DADA2 package in R. Let me know if this sounds like it will work for you, and if so, please reach out with any questions!
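
Before running anything with the pip-based setup described above, it can help to check that the pieces are visible from the same shell. The following is a minimal sketch, not part of the workflows themselves. The executable names in `TOOLS` are assumptions about what a pip install of the workflows and PICRUSt2 places on the PATH, and the R check simply asks `Rscript` whether the `dada2` package can be loaded.

```python
import shutil
import subprocess
import sys
from typing import Dict, Iterable, Optional

# Assumed executable names; adjust to whatever your installation provides.
TOOLS = ["biobakery_workflows", "picrust2_pipeline.py", "Rscript"]


def find_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to its full path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


def r_package_installed(package: str, rscript: str = "Rscript") -> bool:
    """Return True if Rscript is available and reports that the package is installed."""
    exe = shutil.which(rscript)
    if exe is None:
        return False
    # R exits with status 0 when the namespace is available, 1 otherwise.
    expr = f'quit(status = if (requireNamespace("{package}", quietly = TRUE)) 0 else 1)'
    try:
        result = subprocess.run([exe, "-e", expr], capture_output=True, timeout=120)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Python 3 interpreter:", sys.version_info.major >= 3)
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path or 'not found on PATH'}")
    print("dada2 installed in R:", r_package_installed("dada2"))
```

A `not found on PATH` line for a tool that was installed usually means the venv is not activated in the current shell, or that R was loaded from a different module than the one where DADA2 was installed.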


Dear Lauren

Thank you so much for your reply!

Just wondering how I would download the databases? Or are you saying that something has changed since I posted this query, and that the databases will now install using PICRUSt2? Sorry for the confusion.

Kind Regards