The bioBakery help forum

Paired-end data results in unpaired output


After running Kneaddata with Bowtie2 on paired-end data, the final output seems to be unpaired (the first read file has over 9x as many reads as the second). Is there a way to force paired-end handling in the analysis and throw out any reads that are unpaired?

The command I ran was the following:
kneaddata --input sample_1.fastq --input sample_2.fastq --output /path/to/mydir --bypass-trim --run-trf -db /kneaddataGenome/SILVA_128_LSUParc_SSUParc_ribosomal_RNA


Hi, thanks for the post. By default, Kneaddata should track read pairs when a pair of input files is provided. You should see paired output files (with the same number of reads in each) and orphan files. I think in your case kneaddata may be having trouble tracking the pairs because the sequence identifiers are in an unexpected format. Can you check whether there are spaces in the sequence ids, or whether they are missing the read number?
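For anyone who wants to run that check on their own files, here is a minimal Python sketch. It is not part of kneaddata. The function names are made up for illustration, and the rule it applies (no whitespace in the id, and an id ending in /1 or /2) is an assumption about what a trackable identifier looks like, so adjust it if your pipeline expects something else.

```python
import re
from collections import Counter

# Assumption (not taken from the kneaddata docs): a trackable paired id
# contains no whitespace and ends with the read number as /1 or /2.
PAIR_SUFFIX = re.compile(r"/[12]$")


def classify_header(header):
    """Return 'ok', 'has_space', or 'no_read_number' for one FASTQ header line."""
    seq_id = header.rstrip("\r\n")
    if re.search(r"\s", seq_id):
        return "has_space"
    if not PAIR_SUFFIX.search(seq_id):
        return "no_read_number"
    return "ok"


def summarize_headers(lines):
    """Tally header categories, assuming 4-line FASTQ records (header first)."""
    counts = Counter()
    for index, line in enumerate(lines):
        if index % 4 == 0 and line.strip():
            counts[classify_header(line)] += 1
    return counts
```

Calling summarize_headers(open("sample_1.fastq")) on each input file gives a quick tally of how many ids would need attention.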

On our end we will work on updating kneaddata to catch the case where sequence ids of an unexpected format are provided and throw an informative error message. Sorry for the confusion.

Thank you,

Hi Lauren,

Thank you so much for your reply. I changed the sequence identifiers and sure enough, that solved the issue. Thanks again!
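The thread does not say exactly how the identifiers were changed. As one possible approach, the Python sketch below drops everything after the first whitespace in each header and appends the read number. The function names are hypothetical, and the target format (an id ending in /1 or /2) is an assumption rather than a documented kneaddata requirement.

```python
import re


def normalize_header(header, read_number):
    """Keep the id up to the first whitespace and end it with /1 or /2."""
    seq_id = header.split()[0]
    # Remove an existing /1 or /2 so the suffix is never doubled.
    seq_id = re.sub(r"/[12]$", "", seq_id)
    return f"{seq_id}/{read_number}"


def normalize_fastq(lines, read_number):
    """Yield FASTQ lines with rewritten headers; other lines pass through.

    Assumes 4-line records with the header as the first line of each record.
    """
    for index, line in enumerate(lines):
        if index % 4 == 0 and line.strip():
            yield normalize_header(line, read_number) + "\n"
        else:
            yield line
```

Writing the output of normalize_fastq(handle, 1) and normalize_fastq(handle, 2) to new files, then passing those files to kneaddata, keeps the original data untouched in case the assumed format turns out to be wrong.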