Hello Dr. Franzosa,
Thank you very much for your kind reply! I have double-checked, and it appears that my marker.faa file contains nucleotide sequences. It looks like my original input was the nucleotide FASTA for the protein homolog model rather than the protein FASTA, so I have corrected the command below:
#1. Make markers from CARD
shortbred_identify.py --goi protein_fasta_protein_homolog_model.fasta --ref uniref100.fasta --markers my_markers.faa
With this input, I am now getting the following error:
One or more of the sequences in your input file has an id that ShortBRED cannot use as a valid folder name during the clustering step, so ShortBRED has stopped. Please edit ** protein_fasta_protein_homolog_model.fasta ** to remove any slashes,asterisks, etc. from the fasta ids. The program utils/AdjustFastaHeadersForShortBRED.py in the ShortBRED folder can do this for you. ShortBRED halted on this gene/protein:gb|ACT97415.1|ARO:3002999|CblA-1
I suppose at this point I just need to parse the output marker.faa to remove the slashes and underscores. Is this a common step? I am hoping this fixes the error so that I can move forward without any hiccups.
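For reference, here is a rough sketch of the kind of header cleanup I have in mind, in case it helps show what I am attempting. It is not the utils/AdjustFastaHeadersForShortBRED.py script mentioned in the error, and the set of "safe" characters is my own assumption rather than a documented ShortBRED rule. The function names and the output filename are placeholders I made up. Since the error message names protein_fasta_protein_homolog_model.fasta, I would run it on that file:

```python
import re

# Assumption: only letters, digits, '.', '_' and '-' are safe in a folder name.
# This is my guess, not a rule taken from the ShortBRED documentation.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def clean_header(line):
    """Replace unsafe characters in the id of a FASTA header line with '_'.

    Sequence lines (anything not starting with '>') are returned unchanged.
    Only the id (text before the first space) is altered; any description
    after the first space is kept as-is.
    """
    if not line.startswith(">"):
        return line
    body = line[1:].rstrip("\r\n")
    seq_id, sep, desc = body.partition(" ")
    return ">" + UNSAFE.sub("_", seq_id) + sep + desc + "\n"


def clean_fasta(in_path, out_path):
    """Write a copy of the FASTA file at in_path with cleaned header ids."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(clean_header(line))
```

I would call it as clean_fasta("protein_fasta_protein_homolog_model.fasta", "cleaned_input.fasta"), where cleaned_input.fasta is just a placeholder name, and then pass the cleaned file to --goi. One caveat is that two different ids could collapse to the same cleaned id, which this sketch does not check for.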
Thank you so much for your attention. Please let me know if I can provide any more information that would help with troubleshooting!