When I run kneaddata, my paired-end input ends up entirely unpaired. I have tried adding “/1” and “/2” as suffixes to the read headers, but the kneaddata_paired.fastq output files are empty while the kneaddata_unmatched.fastq files contain all the reads.
It appears that trimmomatic tracks the R1/R2 pairs just fine, but the pairing is lost at some point after that step.
My files are named Sample1.R1.fastq.gz and Sample1.R2.fastq.gz
An example of my R1 and R2 file headers:
@A00261:821:H75NNDSX7:2:1101:2862:10001:N:0:ATAGCGAAGA+CGCACATTGT/1
@A00261:821:H75NNDSX7:2:1101:2862:10002:N:0:ATAGCGAAGA+CGCACATTGT/2
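To make the header comparison concrete, here is a minimal Python sketch of how R1/R2 identifiers could be checked for agreement. The helper names (`read_id`, `fastq_headers`, `count_id_mismatches`) are hypothetical, and the normalization (drop the leading "@", drop anything after the first whitespace, drop a trailing "/1" or "/2") is an assumption about what a pairing check might compare, not a description of what kneaddata does internally.

```python
import gzip
from itertools import islice, zip_longest


def read_id(header: str) -> str:
    """Return the identifier with '@', any comment, and a trailing /1 or /2 removed."""
    parts = header.strip().lstrip("@").split()
    token = parts[0] if parts else ""
    # Assumption: mates are distinguished only by a /1 or /2 suffix.
    if token.endswith(("/1", "/2")):
        token = token[:-2]
    return token


def fastq_headers(path: str):
    """Yield the header line of each record (every 4th line) of a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:
                yield line.rstrip("\n")


def count_id_mismatches(r1_headers, r2_headers, limit=100_000):
    """Count positions among the first `limit` records where the R1 and R2 IDs differ."""
    pairs = islice(zip_longest(r1_headers, r2_headers, fillvalue=""), limit)
    return sum(read_id(h1) != read_id(h2) for h1, h2 in pairs)
```

Calling `count_id_mismatches(fastq_headers("Sample1.R1.fastq.gz"), fastq_headers("Sample1.R2.fastq.gz"))` would report how many of the first 100,000 records disagree. Under this normalization, the two example headers exactly as pasted above do not reduce to the same identifier: the field before ":N:0" reads 10001 in R1 and 10002 in R2.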
Here is my kneaddata command:
kneaddata --input1 Sample1.R1.fastq.gz --input2 Sample1.R2.fastq.gz -db /pathtodb/kneaddata_db --output Sample1_kneaddata_output
Here is a count of reads in all the output files created:
File | R1 | R2 |
---|---|---|
Input Reads | 85,362,781 | 85,362,781 |
Sample1_bowtie2_paired_contam.fastq | - | - |
Sample1_bowtie2_unmatched.fastq | 23,682 | 6,117 |
Sample1_kneaddata_paired.fastq | - | - |
Sample1_kneaddata.repeats.removed.fastq | 80,698,325 | 80,698,490 |
Sample1_kneaddata.repeats.removed.unmatched.fastq | 1,909,623 | 1,291,362 |
Sample1_kneaddata.trimmed.fastq | 81,153,744 | 81,153,744 |
Sample1_kneaddata.trimmed.single.fastq | 1,921,519 | 1,299,530 |
Sample1_kneaddata_unmatched.fastq | 82,584,266 | 81,983,735 |
And version info:
kneaddata v0.12.0
Any help would be greatly appreciated! It’s OK if this is just a bug in kneaddata, because I plan to concatenate the R1/R2 reads for HUMAnN analysis anyway, but I do want to make sure that nothing is going wrong with my analysis before I move on to the next step.