Hi everyone,
I’m interested in using ShortBRED to identify the most abundant genes in my metagenomic dataset using a reference protein database. Here’s a breakdown of what I’ve done so far:
- Installed ShortBRED by downloading the file from GitHub: ShortBRED GitHub Repository
- Loaded dependencies including CD-HIT, BLAST+, USEARCH, MUSCLE, and NumPy with the following module loads:
- module load CBI miniconda3/23.5.2-0-py311
- module load Sali blast+
- module load Sali usearch
- module load Sali muscle
- module load python3/numpy/1.19.5
Currently, I’m following the ShortBRED tutorial on GitHub, specifically using method 1 to generate new markers from a set of proteins of interest and a reference set of proteins. However, I’ve hit a couple of roadblocks along the way and could use some assistance in resolving them.
Error 1
python /wynton/group/lynch/c-gomez/software/shortbred/shortbred_identify.py --goi example/input_prots.faa --ref example/ref_prots.faa --markers mytestmarkers.faa --tmp example_identify --cdhit /wynton/group/lynch/c-gomez/software/cd-hit-v4.8.1-2019-0228/cd-hit
Traceback (most recent call last):
File "/wynton/group/lynch/c-gomez/software/shortbred/shortbred_identify.py", line 56, in <module>
from Bio.Alphabet import IUPAC
File "/wynton/home/lynchlab/c-gomez/.local/lib/python3.11/site-packages/Bio/Alphabet/__init__.py", line 20, in <module>
raise ImportError(
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the ``molecule_type`` as an annotation on a SeqRecord for your script to work correctly. Please see https://biop
I managed to get past this by commenting out the import of Bio.Alphabet in the shortbred_identify.py script.
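For anyone who wants to reproduce the workaround without deleting the line outright, a guarded import is one option. This is only a sketch: it assumes nothing later in the script actually needs IUPAC (I have not verified that), and load_iupac is a hypothetical helper name, not part of ShortBRED.

```python
def load_iupac():
    """Return Bio.Alphabet.IUPAC if it is importable, otherwise None.

    Bio.Alphabet was removed in Biopython 1.78, so on newer versions
    (or when Biopython is not installed at all) the import raises
    ImportError and we fall back to None.
    """
    try:
        from Bio.Alphabet import IUPAC  # only present in older Biopython
    except ImportError:
        return None
    return IUPAC


# Assumption: downstream code can cope with IUPAC being None.
IUPAC = load_iupac()
print("IUPAC available:", IUPAC is not None)
```

With this in place the script keeps working on older Biopython installs, where the alphabet module still exists.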
Error 2
After that, I hit a second issue, which I think may be related to the deprecation of the dumb_consensus method in Biopython, though I’m unsure how to address it.
Here’s the error message I’m currently dealing with:
Making BLAST database for the family consensus sequences...
Traceback (most recent call last):
File "/wynton/group/lynch/c-gomez/software/shortbred/shortbred_identify.py", line 279, in <module>
subprocess.check_call([
File "/wynton/home/cbi/shared/software/CBI/miniconda3-23.5.2-0-py311/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['makeblastdb', '-in', 'example_identify/clust/clust.faa', '-out', 'example_identify/clustdb/goidb', '-dbtype', 'prot', '-logfile', 'example_identify/goidb.log']' died with <Signals.SIGBUS: 7>.
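To check whether the crash comes from makeblastdb itself rather than from ShortBRED, the same command can be run on its own. Below is a minimal sketch, assuming the paths from the traceback above still exist in the working directory; describe_returncode and run_isolated are hypothetical helper names. A negative return code from subprocess means the process was killed by that signal number.

```python
import signal
import subprocess

# Same arguments as in the traceback above.
CMD = [
    "makeblastdb",
    "-in", "example_identify/clust/clust.faa",
    "-out", "example_identify/clustdb/goidb",
    "-dbtype", "prot",
    "-logfile", "example_identify/goidb.log",
]


def describe_returncode(returncode):
    """Turn a subprocess return code into a readable message."""
    if returncode < 0:
        # Negative values mean the child was terminated by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_isolated(cmd):
    """Run cmd outside ShortBRED and report how it ended."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return f"{cmd[0]} not found on PATH"
    return describe_returncode(proc.returncode)


if __name__ == "__main__":
    print(run_isolated(CMD))
```

If this also reports SIGBUS, the problem would seem to be in the BLAST+ module or the filesystem rather than in ShortBRED or Biopython.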
Has anyone run into this issue?
Thank you in advance for the assistance!
Best,
Carlos