A small number of unmatched reads in the paired output of kneaddata

Dear bioBakery team,

I am using kneaddata v0.12.0 from bioconda for QC of my raw metagenomic data.

To date, kneaddata has worked perfectly for all my purposes.

However, when I recently tried profiling my QCed data with mOTUs4, I got an error for [kneaddata_result]_paired_1.fastq and [kneaddata_result]_paired_2.fastq, indicating that they are not paired.

This error is due to a mismatch in the header between forward and reverse reads.
This problem did not occur with MetaPhlAn and Kraken, which I have been using for a long time.

As a workaround, I removed the inconsistent reads and proceeded to the next step with mOTUs. There were only 88 unmatched reads in each of the forward and reverse files, which is a very small number compared to the total of 31,276,232 reads.
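
For reference, the removal step can be sketched in Python roughly as follows. This is only an illustration, not part of kneaddata or mOTUs: the function names are hypothetical, it assumes uncompressed FASTQ files with strict four-line records, it assumes mates share the same read name apart from an optional /1 or /2 suffix, and it holds all read names in memory.

```python
def read_id(header):
    """Read name without the leading '@', any comment, or a /1 or /2 suffix."""
    fields = header.strip()[1:].split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def records(path):
    """Yield (read_id, four_lines), assuming strict four-line FASTQ records."""
    with open(path) as handle:
        while True:
            block = [handle.readline() for _ in range(4)]
            if not block[0].strip():
                return  # end of file (or trailing blank line)
            yield read_id(block[0]), block


def keep_shared_reads(in_1, in_2, out_1, out_2):
    """Write only reads whose name occurs in both inputs; return their count."""
    ids_1 = {rid for rid, _ in records(in_1)}
    shared = {rid for rid, _ in records(in_2) if rid in ids_1}
    for src, dst in ((in_1, out_1), (in_2, out_2)):
        with open(dst, "w") as out:
            for rid, block in records(src):
                if rid in shared:
                    out.writelines(block)
    return len(shared)
```

Each output keeps the read order of its input, so the two outputs line up pair by pair only if the shared reads appear in the same relative order in both inputs.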

I assume that this is because bowtie2 runs in unpaired mode within kneaddata, but it is only a guess.

I believe it would not be a significant issue for bowtie2 to run in unpaired mode, since my data set is large enough, but I am unsure why such a small number of reads are missed during pair matching.

In this case, the number of unpaired reads was minimal, but I am concerned that it might be too large to ignore in my other samples.

I would be grateful if you could give me your opinion on this.

I have attached the kneaddata log for this sample and the unique headers reported when I tried running mOTUs4.

stool_SG_001_1_kneaddata.log.txt (28.2 KB)
unique_headers_detected.txt (6.7 KB)

Best regards,
Kirby

I found that this problem was caused by mOTUs4, not kneaddata.
I checked the kneaddata results for the “differing read headers” that mOTUs4 pointed to, both manually and with BBMap.
I could not find any mismatched headers.
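
For anyone who wants to repeat a similar check without BBMap, here is a minimal Python sketch of a position-by-position header comparison. It is not the BBMap procedure I used. The function names are hypothetical, and it assumes uncompressed FASTQ files with strict four-line records whose mates differ only by an optional /1 or /2 suffix or a trailing comment.

```python
import itertools


def read_id(header):
    """Read name without the leading '@', any comment, or a /1 or /2 suffix."""
    fields = header.strip()[1:].split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def fastq_headers(path):
    """Yield every fourth line (the header) of a four-line-record FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0 and line.strip():
                yield line


def mismatched_pairs(path_1, path_2):
    """Return (record_index, header_1, header_2) where the read names differ.

    A header of None means that one file ended before the other.
    """
    mismatches = []
    pairs = itertools.zip_longest(fastq_headers(path_1), fastq_headers(path_2))
    for index, (h1, h2) in enumerate(pairs):
        if h1 is None or h2 is None or read_id(h1) != read_id(h2):
            mismatches.append((index, h1, h2))
    return mismatches
```

An empty list from `mismatched_pairs` means the two files have the same number of records and the read names agree at every position.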

I have flagged this post to the admin.

Thanks,
Kirby