ShortBRED for Nanopore data

Noah_Greenman · September 26, 2023, 2:31pm

shortbred_quantify.py v0.9.5

Hello! I have been attempting to use shortBRED to quantify relative gene abundance for a small set of genes (4 total) in ONT Nanopore sequence data.

I am curious if anyone has any advice regarding settings to optimize returns on hits without compromising the results? If anyone has any experience applying this tool to long read sequence data (particularly nanopore data) I’d be grateful to hear about it!

My most recent run used commands like so:

shortbred_quantify.py \
    --markers markers.faa \
    --wgs {nanopore_reads.faa} \
    --results results.txt \
    --tmp tmp_shortbred_dir \
    --usearch <usearch_path>

I have not yet tried playing with settings like --id or --minreadBP and so forth, but plan on doing so for optimization. Hence the call for aid.
Thank you!
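For anyone planning a similar optimization, here is a minimal Python sketch of how an --id sweep could be set up. It only builds and prints the command lines and does not run ShortBRED. The flags are the ones mentioned in the post above; the output file names, the tmp directory names, the "usearch" path, and the specific identity values are assumptions made for illustration.

```python
import shlex

def build_quantify_cmd(reads, identity, markers="markers.faa", usearch="usearch"):
    """Build one shortbred_quantify.py command for a given --id value.

    Output and tmp names are hypothetical; they are tagged with the
    identity so that runs in a sweep do not overwrite each other.
    """
    tag = f"id{int(round(identity * 100))}"
    return [
        "shortbred_quantify.py",
        "--markers", markers,
        "--wgs", reads,
        "--results", f"results_{tag}.txt",
        "--tmp", f"tmp_shortbred_{tag}",
        "--usearch", usearch,  # placeholder; substitute the real usearch path
        "--id", str(identity),
    ]

# Identity thresholds to compare (assumed values, from strict to permissive).
commands = [build_quantify_cmd("nanopore_reads.faa", i) for i in (0.95, 0.90, 0.85, 0.80)]

for cmd in commands:
    # Print a copy-pasteable shell line for each run.
    print(shlex.join(cmd))
```

Keeping each threshold's results in its own file makes it easy to compare how counts for each marker change as the threshold is relaxed.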

franzosa · October 20, 2023, 6:57pm

Sorry for the slow reply! We don’t have much experience with nanopore reads, but off the top of my head I wouldn’t expect that you’d need to make major changes to the parameters. I know nanopore has a higher error rate than other methods, but I don’t think it’s high enough to merit lowering the --id threshold (for example).

Please report back if you learn anything interesting / have any helpful tips in this process as I’m sure it will be useful to other ShortBRED users!

Noah_Greenman · October 20, 2023, 7:23pm

Hello!

We found that lowering the threshold did make an appreciable difference in the results. Given that Nanopore’s R9 basecalling accuracy was on average Q9, we tried a threshold of 80% just to see what would happen. We also added a housekeeping gene (gyrA) as a control marker. What surprised us was that, before lowering the threshold, gyrA was not detected at all.

Upon lowering the threshold, we saw some of our target genes emerge (which we knew were present from BLAST searches), but we decided to take an alternative approach because lowering the % identity almost felt like we were forcing an outcome.

I think shortBRED will benefit heavily from the newer R10 chemistry, which brings nanopore basecalling accuracy up to 99%+.

From our experience, the best option is to find a % identity threshold a user is comfortable with. Given Nanopore’s accuracy (for R9 chemistry), I’d say 80% feels the most “fair”.
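For context on the numbers above, a Phred quality score Q corresponds to a per-base error probability of 10^(-Q/10). The short Python sketch below converts a few Q scores to expected per-base accuracy. It is only a rough guide for choosing --id: the assumption here is that nucleotide-level accuracy is a reasonable proxy, whereas the identity ShortBRED reports is computed on alignments against protein markers, so the two will not match exactly.

```python
def phred_to_accuracy(q):
    """Expected per-base accuracy for a Phred quality score q."""
    # Phred definition: error probability = 10 ** (-q / 10)
    return 1.0 - 10.0 ** (-q / 10.0)

# Q9 is the R9 average quoted above; Q10 and Q20 are included for comparison.
for q in (9, 10, 20):
    print(f"Q{q}: {phred_to_accuracy(q):.1%}")
```

Q9 works out to roughly 87% per-base accuracy, which is consistent with the default threshold missing most reads and an 80% threshold recovering them.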

franzosa · October 20, 2023, 8:09pm

Very interesting - thanks for the update and suggestions! Including gyrA as a control for read map-ability was very clever.
