Query regarding HMP2 Metatranscriptomics raw files

PRATIK_SHINDE · January 27, 2021, 1:07pm

Hello Sir,
I am reanalyzing your metatranscriptomics data. The raw metatranscriptomics files were passed through the quality control pipeline using KneadData, and the output is concatenated. I want to separate the forward reads (#0/1) and the reverse reads (#0/2). I could not work out how to classify some reads, as they only have #0 at the end of the read ID.

Also, please tell me whether my interpretation of the forward and reverse read IDs is correct.
Thank you

sagunmaharjann · January 29, 2021, 4:09pm

Hi Pratik,

Yes, you are right about the forward and reverse read IDs. Can you point me (with a link) to the raw metatranscriptomics files that have #0 at the end?

Thanks

PRATIK_SHINDE · February 5, 2021, 2:42pm

Hello Sir,
Thank you for the reply. One example is the metatranscriptomics sample with ID CSM7KOUN.

Ish · April 1, 2021, 9:01pm

Hi @PRATIK_SHINDE and @sagunmaharjann,
I also downloaded these data from ibdmdb. I understand the fastq files are interleaved paired-end. Did you manage to work out what the #0 reads are?
Additionally, I’m wondering which preprocessing steps the reads were subjected to.
Thanks

PRATIK_SHINDE · April 2, 2021, 6:54am

Hello @Ish
The preprocessing steps used on the raw reads are described on the bioBakery GitHub wiki, on the biobakery_workflows page (biobakery/biobakery Wiki).

It’s mentioned in detail here.
As for the #0 reads, I am still not sure what they are and am waiting for a reply from @sagunmaharjann.

Ish · April 5, 2021, 1:39pm

Thanks @PRATIK_SHINDE
Just making sure: the preprocessing is detailed under "2.3.1 Quality control data" on the biobakery_workflows page of the biobakery/biobakery wiki on GitHub, and is actually done in KneadData?

PRATIK_SHINDE · April 5, 2021, 2:01pm

Hello @Ish ,
You can check the respective sample's log file, available at ibdmdb, for the detailed preprocessing. For details of the commands, check the KneadData manual.

PRATIK_SHINDE · May 4, 2021, 12:47pm

Hi @sagunmaharjann ,
Could you please reply to my query? It has been 3 months since we started the conversation.
Thank you

sagunmaharjann · May 4, 2021, 5:29pm

Hi @PRATIK_SHINDE ,

Apologies for the delay. I compared and tested a couple of the MTX samples that you pointed out, and the reads ending in #0 are forward reads (“#0/1”). The old version of KneadData was somehow removing the “/1” from the forward reads during the merge.
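Under that interpretation, separating the concatenated file comes down to looking at the suffix of each read ID. Below is a minimal sketch, not part of KneadData itself: the function names (`parse_fastq`, `read_direction`, `split_reads`) are hypothetical, and it assumes standard 4-line FASTQ records and that a bare `#0` suffix means a forward read that lost its `/1`.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, quality


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records (assumes sequences are not line-wrapped)."""
    buf: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line and not buf:
            continue  # skip stray blank lines between records
        buf.append(line)
        if len(buf) == 4:
            yield (buf[0], buf[1], buf[2], buf[3])
            buf = []


def read_direction(header: str) -> str:
    """Classify a read by the suffix of its ID (first whitespace-delimited token)."""
    read_id = header.split()[0]
    if read_id.endswith("/2"):
        return "reverse"
    # "/1" is an explicit forward read; a bare "#0" is ASSUMED to be a forward
    # read whose "/1" was dropped, as described in this thread.
    if read_id.endswith("/1") or read_id.endswith("#0"):
        return "forward"
    return "unknown"


def split_reads(lines: Iterable[str]) -> Dict[str, List[Record]]:
    """Group records into forward, reverse, and unknown by read ID suffix."""
    groups: Dict[str, List[Record]] = {"forward": [], "reverse": [], "unknown": []}
    for record in parse_fastq(lines):
        groups[read_direction(record[0])].append(record)
    return groups
```

For a real sample, the lines could come from an opened file handle, for example `split_reads(open(path))`, where `path` is whatever the downloaded FASTQ is called locally. Holding every record in memory is only reasonable for small files; for full samples, writing each record straight to a forward or reverse output file would be the safer choice.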

Regards,
Sagun

PRATIK_SHINDE · May 4, 2021, 5:57pm

Hi @sagunmaharjann,
Thanks for the reply. So does that mean almost 80% of the reads are forward reads only? I see the same issue in other sample files.
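One way to check a figure like this for a given sample is to tally the read ID suffixes directly. The sketch below is only an illustration with hypothetical function names (`tally_id_suffixes`, `as_fractions`); it assumes the caller passes the FASTQ header lines (every fourth line, starting with the first) and makes no claim about what the real proportions are.

```python
from collections import Counter
from typing import Dict, Iterable


def tally_id_suffixes(headers: Iterable[str]) -> Counter:
    """Count read IDs by suffix: "/1", "/2", bare "#0", or anything else."""
    counts: Counter = Counter()
    for header in headers:
        read_id = header.split()[0]  # ignore any description after the ID
        if read_id.endswith("/1"):
            counts["/1"] += 1
        elif read_id.endswith("/2"):
            counts["/2"] += 1
        elif read_id.endswith("#0"):
            counts["#0 only"] += 1
        else:
            counts["other"] += 1
    return counts


def as_fractions(counts: Counter) -> Dict[str, float]:
    """Convert suffix counts to fractions of all reads (empty if no reads)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {suffix: n / total for suffix, n in counts.items()}
```

A large "#0 only" fraction alongside few "/1" and "/2" reads would be consistent with most reads in the file being unpaired forward reads, though the tally alone cannot say why their mates are missing.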
