Kneaddata output

balamurugan_Sadaiapp · January 19, 2022, 5:34am

Hi
I am analyzing my shotgun metagenome sample with KneadData. I ran the command “kneaddata --input AST2R1.fastq --input AST2R1.fastq -db $/home/plankton/Kneadata_DIR --output /home/plankton/Metagenomics_AST_Thatha/CG_DN_935 --trimmomatic /home/plankton/anaconda3/share/trimmomatic-0.39-2 --cat-final-output” and got the following output:

Final output files created:
/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata_paired_1.fastq
/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata_paired_2.fastq
/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata_unmatched_1.fastq
/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata_unmatched_2.fastq
/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata.fastq

So which file should I use for the next step (HUMAnN 3)?

Thanks for your valuable time

sagunmaharjann · January 28, 2022, 4:44pm

Hi @balamurugan_Sadaiapp ,

Thank you for reaching out to the bioBakery Lab.

Merge the following two output files and use the merged file as the input for the HUMAnN 3 step.

/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata_paired_1.fastq
/home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/AST2R1_kneaddata_paired_2.fastq

Regards,
Sagun

balamurugan_Sadaiapp · January 29, 2022, 1:21pm

Hi Sagun,
Based on the Kneaddata outputs discussion, I merged AST6_R1_kneaddata.repeats.removed.1.fastq and AST6_R1_kneaddata.repeats.removed.1.fastq and used the result for HUMAnN 3.
Please confirm whether this is correct.

Thank you for your reply

jorondo1 · March 31, 2022, 2:54pm

Hi @balamurugan_Sadaiapp
Personally, I did not use the output concatenation option (--cat-final-output) from KneadData, simply because I didn’t know whether the original files would be kept. I just added cat *paired_1.fastq *paired_2.fastq > cat-paired.fastq at the end of my script. But from what I see in your file list, both the concatenated and the non-concatenated files are being kept.

In any case, if you provided a reference genome for decontamination, the files of interest are *_paired_?.fastq. If not, they will end in *repeats.removed.?.fastq, because that is how the files are named after running through the tandem repeat finder (it’s the last step before decontamination).

TL;DR: if you did provide a reference genome for decontamination, you want to merge the *paired_?.fastq files, which is what --cat-final-output appears to do; in your case the merged output seems to be AST2R1_kneaddata.fastq.
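For anyone who prefers to merge by hand, here is a minimal sketch of the concatenation step. It runs on two toy FASTQ files in a temporary directory so it can be tried safely; the sample name "demo" and the output name cat-paired.fastq are placeholders of mine, not names KneadData produces.

```shell
# Toy stand-ins for KneadData paired outputs (hypothetical sample name "demo").
workdir=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n' > "$workdir/demo_kneaddata_paired_1.fastq"
printf '@r1/2\nTGCA\n+\nIIII\n' > "$workdir/demo_kneaddata_paired_2.fastq"

# Concatenate forward reads, then reverse reads, into one file.
# "cat-paired.fastq" is an arbitrary output name chosen for this sketch.
cat "$workdir"/*_paired_1.fastq "$workdir"/*_paired_2.fastq > "$workdir/cat-paired.fastq"

# Count lines in the merged file; the arithmetic expansion strips any
# padding that some wc implementations add.
merged_lines=$(( $(wc -l < "$workdir/cat-paired.fastq") ))
echo "merged lines: $merged_lines"
```

On real data you would drop the toy files and point the cat command at your own KneadData output directory.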

You could simply run ls -lha /home/plankton/Metagenomics_AST_Thatha/CG_DN_935/AST2.1/* to check whether the size of the concatenated file is roughly the combined size of the two paired files.
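If you would rather not eyeball the listing, the same check can be scripted. This is only a sketch: check_merged is a helper I made up for illustration (it is not part of KneadData), and it is demonstrated on toy files rather than on your data.

```shell
# check_merged FWD REV MERGED
# Hypothetical helper: succeeds when MERGED is exactly as large (in bytes)
# as FWD and REV combined, and prints both numbers either way.
check_merged() {
  local fwd="$1" rev="$2" merged="$3"
  local expected=$(( $(wc -c < "$fwd") + $(wc -c < "$rev") ))
  local actual=$(( $(wc -c < "$merged") ))
  echo "expected=$expected bytes, actual=$actual bytes"
  [ "$expected" -eq "$actual" ]
}

# Toy demonstration with placeholder file names.
demo=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n' > "$demo/s_paired_1.fastq"
printf '@r1/2\nTGCA\n+\nIIII\n' > "$demo/s_paired_2.fastq"
cat "$demo/s_paired_1.fastq" "$demo/s_paired_2.fastq" > "$demo/s.fastq"

if check_merged "$demo/s_paired_1.fastq" "$demo/s_paired_2.fastq" "$demo/s.fastq"; then
  echo "merged file matches the two paired files"
else
  echo "sizes differ; the merged file may contain other reads as well"
fi
```

If the sizes differ on your real output, one possibility worth checking is that the concatenated file also includes the unmatched reads; I have not verified that, so treat it as a guess.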

Hope this helps!
cheers

balamurugan_Sadaiapp · May 13, 2022, 10:08am

Thank you Jorondo1 for the detailed clarification.
Thanks
