Hi Biobakery team,
I am using KneadData to process dual-transcriptome RNA-seq data (host transcriptome and microbial metatranscriptome). My data come from a paired-end (PE) sequencing run, so each sample has two FASTQ files, one forward and one reverse. Host and microbial reads are mixed together in the same FASTQ files. What is the best way to process such mixed data and extract only the microbial part?
I am using the following command and have two additional questions:
kneaddata --input sample1.R1.fastq --input sample1.R2.fastq -db human_rna_db --output seq_out.
In this command, I followed the user manual and passed two --input options, one for the forward reads and one for the reverse reads. Are these the same as the "first mate" and "second mate" described in the user manual? If not, how should I supply the forward and reverse FASTQ files?
As the command shows, I only used the human transcriptome database. Since human_rna_db is the only database I set up, does that mean the reads that do not map to human_rna_db are the bacterial reads, i.e., should I use files (c) and (d) from the description below? Or is there another way to separate human and microbial sequences from the same FASTQ file?
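To make clear what I mean by forward/reverse files being mates, here is a minimal Python sketch of the check I have in mind. It is not part of KneadData; the function names read_ids and mates_are_paired are my own, and it assumes uncompressed FASTQ files with exactly four lines per record and read IDs that differ at most by a trailing /1 or /2.

```python
import itertools


def read_ids(path):
    """Yield the read ID of every FASTQ record (assumes 4 lines per record)."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0 or not line.strip():
                continue  # only header lines carry the read ID
            # Drop the leading '@' and anything after the first whitespace.
            name = line[1:].split()[0]
            # Drop a trailing /1 or /2 mate suffix, if present.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def mates_are_paired(r1_path, r2_path):
    """Return True if both files list the same read IDs in the same order."""
    pairs = itertools.zip_longest(read_ids(r1_path), read_ids(r2_path))
    return all(id1 == id2 for id1, id2 in pairs)
```

With my files, I would call mates_are_paired("sample1.R1.fastq", "sample1.R2.fastq") and expect True if R1 and R2 really are the first and second mates of the same fragments.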
Files for just the human_rna_db database, as described in the user manual:
(a). seq_kneaddata_paired_human_rna_db_bowtie2_contam_1.fastq: Reads from the first mate in situation (1) above that were identified as belonging to the human_rna_db database.
(b). seq_kneaddata_paired_human_rna_db_bowtie2_contam_2.fastq: Reads from the second mate in situation (1) above that were identified as belonging to the human_rna_db database.
(c). seq_kneaddata_paired_human_rna_db_bowtie2_clean_1.fastq: Reads from the first mate in situation (1) above that were identified as NOT belonging to the human_rna_db database.
(d). seq_kneaddata_paired_human_rna_db_bowtie2_clean_2.fastq: Reads from the second mate in situation (1) above that were identified as NOT belonging to the human_rna_db database.
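In case it helps, this is a rough Python sketch of how I would compare the "contam" and "clean" files listed above. It only counts reads; whether the "clean" reads can be treated as microbial is exactly what I am asking. The helper names count_reads and non_host_fraction are my own, and the sketch assumes uncompressed FASTQ with four lines per record.

```python
def count_reads(path):
    """Count FASTQ records, assuming exactly four lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def non_host_fraction(clean_path, contam_path):
    """Fraction of reads that were NOT assigned to the reference database."""
    clean = count_reads(clean_path)
    contam = count_reads(contam_path)
    total = clean + contam
    # Avoid dividing by zero when both files are empty.
    return clean / total if total else 0.0
```

For the first mate, I would pass file (c) as clean_path and file (a) as contam_path; for the second mate, files (d) and (b).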
Thank you so much!