The bioBakery help forum

Query regarding HMP2 Metatranscriptomics raw files

Hello Sir,
I am reanalysing your metatranscriptomics data. The raw metatranscriptomics files were passed through the quality control pipeline using KneadData, and the output is concatenated. I want to separate the forward reads (#0/1) and the reverse reads (#0/2). However, I could not work out how to classify some reads, because they only have #0 at the end of the read ID.

Also, please tell me whether my interpretation of the forward and reverse read IDs is correct.
Thank you

Hi Pratik,

Yes, you are right about the forward and reverse read IDs. Can you point me (with a link) to the raw metatranscriptomics files that have #0 at the end?


Hello Sir,
Thank you for your reply. One example is the metatranscriptomics sample with ID CSM7KOUN.

Hi @PRATIK_SHINDE and @sagunmaharjann,
I also downloaded these data from ibdmdb. I understand the FASTQ files are interleaved paired-end. Did you manage to work out what the #0 reads are?
Additionally, I’m wondering which preprocessing steps the reads were subjected to.

Hello @Ish
The preprocessing steps used on the raw reads are described in detail on the bioBakery GitHub wiki (biobakery_workflows · biobakery/biobakery Wiki · GitHub).

As for the #0 reads, I am still not sure what they are and am waiting for a reply from @sagunmaharjann.

Just making sure: the preprocessing is detailed under “2.3.1 Quality control data” (and actually in KneadData?) at biobakery_workflows · biobakery/biobakery Wiki · GitHub.

Hello @Ish ,
You can check each sample's log file, available at ibdmdb, for the detailed preprocessing steps, and the KneadData manual for details of the commands.


Hi @sagunmaharjann ,
Could you please reply to my query? It has been 3 months since we started this conversation.
Thank you


Apologies for the delay. I compared and tested a couple of the MTX samples that you pointed out, and the #0 reads are the forward reads (“#0/1”). The old version of KneadData was somehow removing the “/1” from the forward reads during the merge.
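For anyone who wants to split a concatenated file on this basis, here is a minimal Python sketch. It assumes plain four-line FASTQ records and treats an ID ending in a bare #0 as a forward read, as described above. The function names `classify_read` and `split_interleaved` are hypothetical helpers for illustration, not part of KneadData.

```python
import io


def classify_read(header: str) -> str:
    """Classify a FASTQ header line by the suffix of its read ID."""
    # The read ID is the first whitespace-delimited token of the header.
    read_id = header.strip().split()[0]
    if read_id.endswith("/1"):
        return "forward"
    if read_id.endswith("/2"):
        return "reverse"
    if read_id.endswith("#0"):
        # Assumption taken from the reply above: a bare "#0" is a
        # forward read whose "/1" was dropped during the merge.
        return "forward"
    return "unknown"


def split_interleaved(handle):
    """Group four-line FASTQ records from a text handle by read direction."""
    groups = {"forward": [], "reverse": [], "unknown": []}
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0].strip():
            break  # end of file (or trailing blank line)
        groups[classify_read(record[0])].append("".join(record))
    return groups


if __name__ == "__main__":
    # Made-up read IDs, for illustration only.
    example = (
        "@read1#0/1\nACGT\n+\nIIII\n"
        "@read1#0/2\nTGCA\n+\nIIII\n"
        "@read2#0\nGGGG\n+\nIIII\n"
    )
    groups = split_interleaved(io.StringIO(example))
    for direction, records in groups.items():
        print(direction, len(records))
```

For a real file you would pass an opened file handle (or `gzip.open(path, "rt")` for compressed input) instead of the `io.StringIO` example, and write each group to its own output file.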


Hi @sagunmaharjann,
Thanks for the reply. So does that mean almost 80% of the reads are forward reads only? The same issue occurs in the other sample files.
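
One way to check that proportion for a given file is to tally the read-ID suffixes. The following is a rough Python sketch, assuming strict four-line FASTQ records; `count_suffixes` is a hypothetical helper name and the read IDs in the example are made up.

```python
from collections import Counter
import io


def count_suffixes(handle):
    """Count read IDs ending in /1, /2, a bare #0, or anything else."""
    counts = Counter()
    for line_number, line in enumerate(handle):
        # Header lines are every fourth line, starting with the first.
        if line_number % 4 != 0 or not line.strip():
            continue
        read_id = line.split()[0]
        if read_id.endswith("/1"):
            counts["/1"] += 1
        elif read_id.endswith("/2"):
            counts["/2"] += 1
        elif read_id.endswith("#0"):
            counts["#0"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    example = (
        "@read1#0/1\nACGT\n+\nIIII\n"
        "@read1#0/2\nTGCA\n+\nIIII\n"
        "@read2#0\nGGGG\n+\nIIII\n"
    )
    counts = count_suffixes(io.StringIO(example))
    total = sum(counts.values())
    for suffix, n in sorted(counts.items()):
        print(f"{suffix}: {n} of {total} reads")
```

Comparing the /1, /2 and #0 counts shows how many reads still have a mate in the file and how many appear only as a bare #0 ID.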