Kneaddata Outputting Disordered Paired Reads

Hello,

We are running Kneaddata to decontaminate our Illumina paired-end sequences and are having an issue with the decontaminated paired read files: the reads are not in the same order in both files.

We are working with a conda install of Kneaddata from the bioconda channel.

Kneaddata version 0.12.2 build pyhdfd78af_0

Running the command:

$ kneaddata --input1 ${PREFIX}_L001_R1_001.fastq --input2 ${PREFIX}_L001_R2_001.fastq -db Kneaddatabase/Armadilloref_mDasNov1hap2_db --output ${PREFIX} --trimmomatic /usr/local/apps/Trimmomatic-0.39/ --trimmomatic-options="SLIDINGWINDOW:4:12" --trimmomatic-options="MINLEN:30" --bypass-trf -t 14

When we try to run bwa mem on the decontaminated paired reads, we get the following error.

BWA error:

[mem_sam_pe] paired reads have different names: "VL00587:30:AAGGKWHM5:1:1101:19367:1000#0", "VL00587:30:AAGGKWHM5:1:1101:20314:1000#0"

When we run bwa on the unprocessed paired reads, it completes without errors.

When we look at the first few read names in each file, this is what we get:

$ grep "@" *paired_1.fastq | head -n 5
@VL00587:30:AAGGKWHM5:1:1101:19367:1000#0/1
@VL00587:30:AAGGKWHM5:1:1101:20314:1000#0/1
@VL00587:30:AAGGKWHM5:1:1101:22208:1000#0/1
@VL00587:30:AAGGKWHM5:1:1101:23647:1000#0/1
@VL00587:30:AAGGKWHM5:1:1101:24139:1000#0/1

$ grep "@" *paired_2.fastq | head -n 5
@VL00587:30:AAGGKWHM5:1:1101:20314:1000#0/2
@VL00587:30:AAGGKWHM5:1:1101:22208:1000#0/2
@VL00587:30:AAGGKWHM5:1:1101:23647:1000#0/2
@VL00587:30:AAGGKWHM5:1:1101:24139:1000#0/2
@VL00587:30:AAGGKWHM5:1:1101:24404:1000#0/2

Both files contain the same number of reads, and when we grep the paired_2.fastq file for the mismatched header (@VL00587:30:AAGGKWHM5:1:1101:19367:1000#0), it does show up, so the read is present in that file, just at a different position.
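
In case it helps anyone reproduce the comparison, here is a rough sketch of how the first out-of-sync record could be located. It is only a sketch: the function name first_mismatch is made up for illustration, and it assumes standard four-line FASTQ records with read names ending in /1 and /2 as in our files. Header lines are picked by position rather than with grep, since a quality line can also begin with "@".

```shell
# Hypothetical helper: report the first record whose read names differ
# between two FASTQ files. Assumes 4 lines per record and /1, /2 suffixes.
first_mismatch() {
    awk -v mate2="$2" '
        {
            # Read the line at the same position from the second file.
            if ((getline other < mate2) <= 0) {
                print "second file ended early at line " NR
                exit 1
            }
        }
        NR % 4 == 1 {
            # Header line: keep the name up to the first whitespace and
            # drop the trailing /1 or /2 before comparing.
            a = $1
            b = other
            sub(/[ \t].*$/, "", b)
            sub(/\/[12]$/, "", a)
            sub(/\/[12]$/, "", b)
            if (a != b) {
                printf "record %d: %s vs %s\n", (NR + 3) / 4, a, b
                exit 1
            }
        }
    ' "$1"
}

# Usage (file names are placeholders):
#   first_mismatch sample_paired_1.fastq sample_paired_2.fastq
```

The function prints nothing and returns 0 when every record has matching names, and returns a non-zero status as soon as it finds a mismatch.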

Has anyone else experienced this, or does anyone know how we could fix it?

Thanks!
Taylor