ShortBRED identify performance with CARD and UniRef90

Hi,

I want to create a set of markers with shortbred-identify, using the newly released CARD database (May 2020 version) as my proteins of interest and UniRef90 as my reference protein sequences.

The process seems likely to take a long time: my first attempt had still not finished after three weeks. So my questions are:

  1. Do you already have these markers built and available for download (as you did for the mid-2017 ones)?
  2. Do you have a rough idea of how long the run should take?
  3. Is there any way to speed up the process? For example, could I split the CARD FASTA file and run multiple processes on a compute grid (i.e. one process per split FASTA file)?
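
To make question 3 concrete, here is a minimal sketch of the kind of split I have in mind. It only divides the FASTA records into chunks and does not call ShortBRED itself. The function name `split_fasta` and the file names in the comments are hypothetical, and I am assuming (without knowing) that markers built per chunk would be comparable to markers built from the full file.

```python
def split_fasta(fasta_text, n_chunks):
    """Split FASTA text into up to n_chunks pieces of whole records.

    Records are dealt out round-robin so chunk sizes stay balanced.
    Hypothetical helper for illustration, not part of ShortBRED.
    """
    records = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # A header line starts a new record.
            records.append([line])
        elif records and line.strip():
            # Sequence lines belong to the most recent header.
            records[-1].append(line)

    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append("\n".join(rec))

    # Drop empty chunks (fewer records than n_chunks).
    return ["\n".join(c) + "\n" for c in chunks if c]


# Example usage (file names are placeholders):
# with open("card_proteins.fasta") as fh:
#     parts = split_fasta(fh.read(), 20)
# for i, part in enumerate(parts):
#     with open(f"card_part_{i:02d}.fasta", "w") as out:
#         out.write(part)
```

Each resulting file would then be given to its own shortbred-identify job on the grid, all using the same UniRef90 reference.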

Thanks,
Erwan
