Kneaddata outputs

Joao_Gatica · July 4, 2021, 8:09am

Hi
I am trying to understand the output files that kneaddata generates for a paired-end sample.

Files:
Sample_1 (Forward) (11G)
Sample_2 (Reverse) (9.7G)

I obtain:
Sample_kneaddata.log
Sample_1.kneaddata.repeats.removed.1.fastq (8.6G)
Sample_2.kneaddata.repeats.removed.2.fastq (7.2G)
Sample_1.kneaddata.repeats.removed.unmatched.1.fastq (1.2G)
Sample_2.kneaddata.repeats.removed.unmatched.2.fastq (85.9MB)
Sample_1.kneaddata.trimmed.1.fastq (8.7G)
Sample_2.kneaddata.trimmed.2.fastq (7.3G)
Sample_1.kneaddata.trimmed.single.1.fastq (1.2G)
Sample_2.kneaddata.trimmed.single.2.fastq (86.3 MB)

I suppose that Sample_1.kneaddata.repeats.removed.1.fastq (and _2) are the files I need for the next step of the analysis, since they are smaller than the trimmed files. Is that right?
Also, I don’t understand what “single” means in “Sample_X.kneaddata.trimmed.single.X.fastq”, or why there is such a large difference in file size between _1 (1.2G) and _2 (86.3MB).

I appreciate your help
All the best,
Joao

sagunmaharjann · July 9, 2021, 5:09pm

Hi @Joao_Gatica,

Yes, you are correct that Sample_1.kneaddata.repeats.removed.1.fastq (and _2) are the files that you need to continue to the next step in the analysis. The latest version of Kneaddata runs Trimmomatic → TRF → Bowtie2 in that order, so Sample_1.kneaddata.repeats.removed.1.fastq (and _2) are the results of the TRF step.
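File size is only a rough proxy for how many reads each step kept. As a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record, the reads in each output can be counted directly. The function name `count_fastq_reads` is hypothetical and not part of Kneaddata:

```python
import io


def count_fastq_reads(handle):
    """Count records in an uncompressed FASTQ stream (assumes 4 lines per record)."""
    # Ignore blank lines so a trailing newline at the end of the file is harmless.
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n_lines // 4


# Tiny in-memory example standing in for a real file.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n")
n_reads = count_fastq_reads(example)
print(n_reads)
```

For a real file, the same function can be called on an open handle, for example `with open("Sample_1.kneaddata.repeats.removed.1.fastq") as fh: count_fastq_reads(fh)`. The paired _1 and _2 files should then report the same number of reads.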

“Sample_X.kneaddata.trimmed.single.X.fastq” contain the reads left unpaired by the Trimmomatic step, that is, reads that passed trimming while their mate was discarded. Please see the default Trimmomatic settings that we currently use for Kneaddata here (kneaddata · biobakery/biobakery Wiki · GitHub).

[ DEFAULT : ILLUMINACLIP:/TruSeq3-SE.fa:2:30:10 SLIDINGWINDOW:4:20 MINLEN:50 ]

We use adapter trimming, a sliding-window quality filter, and a minimum read length. I assume that a difference in read length between _R1 and _R2 is causing the difference in file size between _1 (1.2G) and _2 (86.3MB).
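To illustrate how a read ends up in a “single” file, here is a simplified sketch of the SLIDINGWINDOW:4:20 and MINLEN:50 settings above. It is an approximation for illustration, not Trimmomatic's actual implementation: it skips adapter clipping, and the function names and quality values are made up.

```python
def sliding_window_trim(quals, window=4, min_mean=20):
    """Return how many leading bases to keep (simplified sliding-window cut)."""
    for start in range(0, len(quals) - window + 1):
        # Cut at the start of the first window whose mean quality is too low.
        if sum(quals[start:start + window]) / window < min_mean:
            return start
    return len(quals)


def survives(quals, window=4, min_mean=20, min_len=50):
    """A read survives if it is still at least min_len bases long after trimming."""
    return sliding_window_trim(quals, window, min_mean) >= min_len


def classify_pair(quals_r1, quals_r2):
    """Decide which kind of output a read pair would go to."""
    keep1, keep2 = survives(quals_r1), survives(quals_r2)
    if keep1 and keep2:
        return "paired"
    if keep1:
        return "single.1"  # forward read kept, reverse mate discarded
    if keep2:
        return "single.2"  # reverse read kept, forward mate discarded
    return "dropped"


# Hypothetical per-base quality scores for 100 bp reads.
good = [35] * 100
early_drop = [35] * 30 + [10] * 70  # quality collapses after base 30

result_both = classify_pair(good, good)
result_r1_only = classify_pair(good, early_drop)
print(result_both, result_r1_only)
```

Under this sketch, whenever the reverse read of a pair fails the filters and the forward read passes, the forward read goes to the single.1 file. A much larger single.1 file than single.2 file would therefore mean that _2 reads were discarded far more often than _1 reads.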

Regards,
Sagun
