Hello,
I’ve been using kneaddata v0.7.10 to discard human and host (other animal) DNA from metagenomic data. I tested a paired-end dataset from pigs with the human database from kneaddata, and I got around 9 times more reads in one FASTQ file than in its pair. I also tried --bypass-trf, but I got the same problem. When I ran kneaddata without a database for decontamination (so only trimming and trf were applied), I got the same number of reads for both members of each pair, as expected, so the problem seems to arise only when a database is used.
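For reference, the read counts of a pair can be compared with something like the sketch below. It assumes standard four-line FASTQ records (no wrapped sequences); the function name count_reads and the file names in the comments are placeholders, not actual kneaddata output names.

```shell
# Count the reads in a FASTQ file, assuming exactly 4 lines per record.
# Handles plain and gzip-compressed files.
count_reads() {
    case "$1" in
        *.gz) gzip -dc "$1" | awk 'END { print int(NR / 4) }' ;;
        *)    awk 'END { print int(NR / 4) }' "$1" ;;
    esac
}

# Hypothetical usage; substitute the real output files of the run:
# count_reads sample_paired_1.fastq
# count_reads sample_paired_2.fastq
```

If both mates survive together, the two numbers should be identical.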
The IDs in the input files look like this:
@ERR1855536.1 NS500633:37:H3VL5BGXY:1:11101:10093:1034/2
@ERR1855536.2 NS500633:37:H3VL5BGXY:1:11101:16963:1035/2
@ERR1855536.3 NS500633:37:H3VL5BGXY:1:11101:17200:1037/2
@ERR1855536.4 NS500633:37:H3VL5BGXY:1:11101:5816:1038/2
The IDs in the result files look like this:
@ERR1855536.22
@ERR1855536.32
@ERR1855536.52
@ERR1855536.62
@ERR1855536.72
@ERR1855536.82
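The headers above can be pulled out of the input and output files for a side-by-side look with a small helper like this one. It is only a sketch that assumes four-line FASTQ records; read_ids and the file names in the comments are hypothetical.

```shell
# Print the first N read headers of an uncompressed FASTQ file.
# $1 = FASTQ file, $2 = number of headers to print (default 4).
read_ids() {
    awk -v n="${2:-4}" 'NR % 4 == 1 { print; if (++c >= n) exit }' "$1"
}

# Hypothetical usage; substitute the real input and output files:
# read_ids input_2.fastq 4
# read_ids output_paired_2.fastq 4
```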
The command used:
$ kneaddata --remove-intermediate-output --threads 32 --input {2} --input {3} \
--output $out_folder --reference-db $ref --sequencer-source NexteraPE \
--trimmomatic-options "SLIDINGWINDOW:4:20 MINLEN:50" --trimmomatic \
$trimmo_path --bowtie2-options "--very-sensitive --dovetail"
I’m attaching a MultiQC report of these runs, all of which show the same problem:
ERR1855535
ERR1855536
ERR1855537
ERR1855538
Edit: I had the latest version, 0.10, installed on the cluster, but for some reason the environment activation isn’t working. Anyway, I’m now making sure I’m running the latest version, and I got the following error (which I posted in a different thread):
kneaddata_bowtie2_discordant_pairs: error: unrecognized arguments: --mode strict
I’ll update this post when I can run the latest version.