Kneaddata final pair1 : Total reads after merging results from multiple databases

Banseok212 · August 22, 2023, 9:11am

Hi, I have preprocessed raw data with KneadData for a whole metatranscriptome analysis. Afterwards, I went through the log file to check the read counts at each step.

Here is where I found something I don't understand. The description of the resulting final pair file reads “Total reads after merging results from multiple databases”.

Does this mean that only the reads that remain in the results from both databases are kept in the final file?

For example, in the table below, with 24801812 reads remaining after the human mRNA DB and 840881 reads remaining after the rRNA DB, would the overlapping read count be 840505?

Below are the files I identified and their read counts:

Step	Description (output file)	ReadCount
raw pair1	Initial number of reads	34378835
trimmed pair1	Total reads after trimming (meta_test_R1_kneaddata.trimmed.1.fastq)	28814562
decontaminated human_hg38_refMrna pair1	Total reads after removing those found in reference database (meta_test_R1_kneaddata_human_hg38_refMrna_bowtie2_paired_clean_1.fastq)	24801812
decontaminated SILVA_128_LSUParc_SSUParc_ribosomal_RNA pair1	Total reads after removing those found in reference database (meta_test_R1_kneaddata_SILVA_128_LSUParc_SSUParc_ribosomal_RNA_bowtie2_paired_clean_1.fastq)	840881
final pair1	Total reads after merging results from multiple databases (meta_test_R1_kneaddata_paired_1.fastq)	840505
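
One way I could check the overlap idea myself is a minimal sketch like the one below. It assumes that "merging" means keeping only reads present in every per-database clean file, which is my guess and not something the log confirms. The helper names `iter_read_ids` and `shared_read_count` are my own, not part of KneadData, and the sketch assumes standard 4-line FASTQ records with the same read ID format in both files.

```python
import gzip


def iter_read_ids(fastq_path):
    """Yield the read ID of each record in a 4-line-per-record FASTQ file.

    The ID is the header line without the leading '@' and without
    anything after the first whitespace.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Header lines are lines 0, 4, 8, ... of the file.
            if line_number % 4 == 0 and line.strip():
                yield line[1:].split()[0]


def shared_read_count(smallest_fastq, *other_fastqs):
    """Count read IDs that occur in every given FASTQ file.

    Pass the smallest file first: only its IDs are loaded into memory
    up front, and the other files are streamed against that set.
    """
    shared = set(iter_read_ids(smallest_fastq))
    for path in other_fastqs:
        shared = {read_id for read_id in iter_read_ids(path) if read_id in shared}
    return len(shared)


# Example call with the file names from my table (paths are relative to
# the KneadData output directory):
# shared_read_count(
#     "meta_test_R1_kneaddata_SILVA_128_LSUParc_SSUParc_ribosomal_RNA_bowtie2_paired_clean_1.fastq",
#     "meta_test_R1_kneaddata_human_hg38_refMrna_bowtie2_paired_clean_1.fastq",
# )
```

If the count returned for the two clean files matched the 840505 reported for final pair1, that would support the overlap interpretation. At the very least, 840505 is no larger than the smaller of the two per-database counts (840881), which is what an intersection requires.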

I used two reference databases in total, as shown in the log below:
06/20/2023 02:23:37 PM - kneaddata.knead_data - INFO: Running kneaddata v0.10.0
06/20/2023 02:23:37 PM - kneaddata.knead_data - INFO: Output files will be written to: /data/test/bstest/metatrans_test/kneaddata
06/20/2023 02:23:37 PM - kneaddata.knead_data - DEBUG: Running with the following arguments:
verbose = False
input = /data/test/bstest/metatrans_test/meta_test_R1.fastq.gz /data/test/bstest/metatrans_test/meta_test_R2.fastq.gz
output_dir = /data/test/bstest/metatrans_test/kneaddata
reference_db = /data/References/bowtie_human_transcriptome/human_hg38_refMrna /data/References/kneaddata_db_ribosomal_RNA/SILVA_128_LSUParc_SSUParc_ribosomal_RNA
