Strange output from paired end kneaddata input

DEEPCHANDA7 · August 27, 2020, 10:51pm

Hi @lauren.j.mciver
I ran kneaddata on paired-end data, but I am getting some strange output. I have a forward and a reverse read file, each 780 MB in size. When I run kneaddata, it gives two output files of 1.3 GB and 148 MB: one file grew drastically and the other shrank drastically.
I checked the read counts of the two input files and of the concatenated output file. The total read count in the concatenated output is only a few thousand lower (out of millions) than the combined read count of the forward and reverse files.
So, should I trust the data and concatenate the outputs before running HUMAnN, or should I not rely on these outputs?

Thanks,
DC7

lauren.j.mciver · August 28, 2020, 7:27pm

Hi DC7, I think it would be best to double-check what is going on with the kneaddata runs and get the outputs looking as expected before moving on to HUMAnN. From what you describe, I think kneaddata is not tracking the read pairs correctly. This is usually because the sequence identifier (the first line of each read record in FASTQ format) is in an unexpected format (e.g. it contains spaces or lacks a pair identifier). Would you check the sequence identifiers (just the first line of a few of the FASTQ input files) and see whether they fall into one of these cases?
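
A quick way to do that check is sketched below in Python. It is only an illustration: the function names are made up for this example, it assumes plain four-line FASTQ records, and it tests only for the two cases mentioned above (whitespace in the identifier, and no `/1` or `/2` suffix on the read name). The full set of identifier formats kneaddata accepts is not covered here, so treat a flagged header as something to look at, not as proof of a problem.

```python
import gzip
import itertools
import re
import sys

# Assumption for this sketch: a "pair identifier" means a trailing /1 or /2
# on the read name (the part of the header before any whitespace).
PAIR_SUFFIX = re.compile(r"/[12]$")


def classify_header(header):
    """Return a list of possible problems with one FASTQ identifier line."""
    problems = []
    if not header.startswith("@"):
        problems.append("does not start with '@'")
    if " " in header or "\t" in header:
        problems.append("contains whitespace")
    fields = header.split()
    read_name = fields[0] if fields else ""
    if not PAIR_SUFFIX.search(read_name):
        problems.append("no /1 or /2 pair suffix")
    return problems


def sample_headers(path, n=5):
    """Return the identifier lines of the first n records in a FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # With four lines per record, identifiers are lines 0, 4, 8, ...
        records = itertools.islice(handle, 0, 4 * n, 4)
        return [line.rstrip("\n") for line in records]


if __name__ == "__main__":
    # Pass the forward and reverse FASTQ files as command-line arguments.
    for fastq_path in sys.argv[1:]:
        for header in sample_headers(fastq_path):
            issues = classify_header(header) or ["looks fine"]
            print(fastq_path, header, "; ".join(issues), sep="\t")
```

Saved under any name, the script takes the forward and reverse files as arguments and prints the first five identifiers of each, with whatever it flagged next to them.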

Thank you,
Lauren

DEEPCHANDA7 · August 28, 2020, 7:35pm

please refer to this query

many thanks
