Dear bioBakery team,
I am using kneaddata v0.12.0 from bioconda for QC of my raw metagenomic data.
To date, kneaddata has worked perfectly for all my purposes.
However, when I recently tried profiling my QCed data using mOTUs4, I got an error for [kneaddata_result]_paired_1.fastq and [kneaddata_result]_paired_2.fastq, indicating that they are not paired.
This error is due to a mismatch in the header between forward and reverse reads.
This problem did not occur with MetaPhlAn and Kraken, which I have been using for a long time.
In my case, I worked around the problem by removing the inconsistent reads before proceeding to the mOTUs step. There were only 88 unmatched reads in each of the forward and reverse files, which is a very small number compared to the total of 31,276,232 reads.
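For anyone hitting the same error, here is a minimal sketch of how such unmatched reads could be identified and removed. It is not part of kneaddata or mOTUs, and all function names are hypothetical. It assumes plain four-line FASTQ records, and that the two mates of a pair differ only in a trailing `/1` or `/2` on the first whitespace-delimited token of the header. If your headers are formatted differently, `pair_key` would need to be adapted.

```python
def read_fastq(handle):
    """Yield (header, sequence, separator, quality) from a 4-line FASTQ stream."""
    while True:
        line = handle.readline()
        if not line:  # end of file
            return
        header = line.rstrip("\n")
        seq = handle.readline().rstrip("\n")
        sep = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, sep, qual


def pair_key(header):
    """Reduce a header to a mate-independent key.

    Assumption: mates differ only in a trailing /1 or /2 suffix.
    """
    token = header[1:].split()[0]  # drop leading '@', keep first token
    if token.endswith(("/1", "/2")):
        token = token[:-2]
    return token


def unmatched_keys(fwd_handle, rev_handle):
    """Return (keys only in forward file, keys only in reverse file)."""
    fwd = {pair_key(rec[0]) for rec in read_fastq(fwd_handle)}
    rev = {pair_key(rec[0]) for rec in read_fastq(rev_handle)}
    return fwd - rev, rev - fwd


def keep_shared(handle, shared, out):
    """Write only records whose key is in `shared`; return the number kept."""
    kept = 0
    for rec in read_fastq(handle):
        if pair_key(rec[0]) in shared:
            out.write("\n".join(rec) + "\n")
            kept += 1
    return kept
```

In use, one would open the two paired FASTQ files, call `unmatched_keys` to list the orphaned headers, and then call `keep_shared` on each file with the set of keys present in both. This holds all read identifiers in memory, which may be heavy for tens of millions of reads. It also keeps the original record order, so the filtered files stay in sync only if the shared reads were already in the same order in both inputs.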
I assume that this is because bowtie2 runs in unpaired mode within kneaddata, but this is only a guess.
I believe that running bowtie2 in unpaired mode is not a significant issue since my data set is large enough, but I am unsure why such a small number of reads are left out when the pairs are matched.
In this case, the number of unpaired reads was minimal, but I am concerned that it might be too large to ignore for other samples in my data.
I would be grateful if you could give me your opinion on this.
I have attached the kneaddata log for the data I used and the unique headers detected when I tried running mOTUs4.
stool_SG_001_1_kneaddata.log.txt (28.2 KB)
unique_headers_detected.txt (6.7 KB)
Best regards,
Kirby