The bioBakery help forum

Problem with NovaSeq sequenced data


I have a question about using kneaddata (v0.7.10) on paired-end data sequenced on a NovaSeq. Your tool works very nicely (thanks for designing it!) on sequence data with standard Phred scores, but I run into issues with binned Phred scores. In my case only four score bins (2, 12, 23, 37) indicate the quality of a base; see the figure below:

While the tool still seems to run fine, the cleaned output files (the ones ending in kneaddata_paired_1.fastq) are completely empty. I guess something goes wrong during trimming because of quality scores the tool perhaps does not expect, but I am not sure. I hope my question is clear; if not, please let me know. Looking forward to hearing from you!
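
For anyone who wants to check whether a FASTQ file uses binned scores like these, here is a minimal sketch that tallies the distinct Phred values in the quality lines. It assumes Phred+33 encoding and unwrapped four-line records. The function name `phred_score_counts` and the example read are made up for illustration and are not part of kneaddata.

```python
from collections import Counter
import io


def phred_score_counts(handle, offset=33):
    """Count each Phred score found in the quality lines of a FASTQ stream.

    Assumes four-line records and an ASCII offset of 33 (Phred+33).
    """
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 3:  # the fourth line of each record is the quality string
            counts.update(ord(ch) - offset for ch in line.rstrip("\r\n"))
    return counts


# Made-up single record whose quality string uses the four bins from the post.
example = io.StringIO("@read1\nACGT\n+\n#-8F\n")
print(sorted(phred_score_counts(example)))  # → [2, 12, 23, 37]
```

For a real file, an open handle can be passed instead of the `io.StringIO` object, for example one returned by `gzip.open(path, "rt")` for a compressed file. If only a handful of distinct values come back, the scores are binned.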