Virulence Factor Annotations

Hi,

I’m using ShortBRED with the provided database to identify virulence factors. Some family annotations include a description of the virulence factor or an ID that I can look up at NCBI. However, the entries derived from MvirDB in particular often carry no helpful information, or at least I don’t know how to use what is given to learn anything about the virulence factor. I tried BLASTing the protein sequences, but this returns many matches and I don’t know how to pick the right one.

So I have two questions:

  1. How can I get information about results that have no gb/gi ID and no textual description?
  2. Is there a way to attach the virulence factor information automatically, without looking up each ID manually?

Examples:

virulence|9992|vfid|16697|vsiid|21770|ssid|RecName__virulence|25075|vfid|60054|vsiid|80921|ssid|tetracycline_virulence|25078|vfid|60060|vsiid|80927|ssid|RecName__TM_#01
MNRTVMMALVIIFLDA
→ Here I have the “tetracycline_virulence” information which is great. I extract those descriptions via regular expressions.

VFDB|VFG000049(gb|NP_880889)_virulence|13251|vfid|20418|vsiid|39888|ssid|type_virulence|13251|vfid|20418|vsiid|39889|ssid|type
→ Here I have NP_880889, which is not ideal but still usable.

virulence|9883|vfid|16588|vsiid|21130|ssid|SubName_
→ I have no idea how to get any information about this virulence factor.
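
For illustration, here is a minimal Python sketch of the kind of regular-expression parsing described above. It is not necessarily the exact approach used here. It assumes that every family name follows the pattern visible in the three examples: blocks of the form `virulence|<n>|vfid|<n>|vsiid|<n>|ssid|<description>` joined by `_`, an optional `(gb|<accession>)` part, and an optional marker suffix such as `_TM_#01`. The function name `parse_family`, the treatment of `RecName` and `SubName` as uninformative placeholders, and the suffix pattern are all assumptions made for this sketch.

```python
import re

# Assumed layout of one MvirDB-style block, inferred only from the examples:
#   virulence|<n>|vfid|<n>|vsiid|<n>|ssid|<description>
# The description ends at the next "_virulence|" block or at the end of the name.
MVIR_BLOCK = re.compile(
    r"virulence\|(\d+)\|vfid\|(\d+)\|vsiid\|(\d+)\|ssid\|"
    r"([^|]*?)(?=_virulence\||$)"
)
# GenBank accession written as "(gb|NP_880889)".
GB_ACCESSION = re.compile(r"\(gb\|([^)|]+)\)")
# Assumed marker suffix such as "_TM_#01" at the end of a marker name.
MARKER_SUFFIX = re.compile(r"_[TQJ]M_#\d+$")
# Truncated prefixes that carry no information by themselves (assumption).
PLACEHOLDERS = {"RecName", "SubName"}


def parse_family(name):
    """Split one family/marker name into accessions, numeric IDs and descriptions."""
    name = MARKER_SUFFIX.sub("", name.strip())
    blocks = MVIR_BLOCK.findall(name)
    descriptions = []
    for block in blocks:
        text = block[3].strip("_")
        if text and text not in PLACEHOLDERS and text not in descriptions:
            descriptions.append(text)
    return {
        "gb": GB_ACCESSION.findall(name),
        # The three numbers of each block, in the order they appear in the name.
        "ids": [block[:3] for block in blocks],
        "descriptions": descriptions,
    }


EXAMPLES = [
    "virulence|9992|vfid|16697|vsiid|21770|ssid|RecName__virulence|25075|vfid|60054"
    "|vsiid|80921|ssid|tetracycline_virulence|25078|vfid|60060|vsiid|80927"
    "|ssid|RecName__TM_#01",
    "VFDB|VFG000049(gb|NP_880889)_virulence|13251|vfid|20418|vsiid|39888|ssid|type"
    "_virulence|13251|vfid|20418|vsiid|39889|ssid|type",
    "virulence|9883|vfid|16588|vsiid|21130|ssid|SubName_",
]

# Print one tab-separated row per family: accessions, descriptions, first ID triple.
for example in EXAMPLES:
    parsed = parse_family(example)
    print(
        ",".join(parsed["gb"]) or "-",
        ",".join(parsed["descriptions"]) or "-",
        "|".join(parsed["ids"][0]),
        sep="\t",
    )
```

Names like the third example come out with no accession and no description, so parsing alone does not resolve them. Only the numeric IDs remain as possible lookup keys.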

Thanks in advance!

I am interested in this question also.

I would also like clarification.