I am new to shotgun metagenomics and I want to use your bioBakery toolbox, starting with KneadData, MetaPhlAn2 and HUMAnN2, to automate the analysis of shotgun data. I am wondering what the best way to handle paired-end sequencing data is.
Correct me if I’m wrong, but from what I understand, there is no benefit in providing paired-end files, as MetaPhlAn2 and HUMAnN2 will basically treat them as if they were two single-end files.
With that in mind, I am thinking about concatenating the forward and reverse files into a single file before performing any analysis. That way, I would have the same workflow whether my data is single-end or paired-end, and it would be easier to handle technically (no paired.1, paired.2, single.1, single.2 files to deal with). Is there any drawback to this approach?
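To make the idea concrete, here is a minimal sketch of the concatenation step I have in mind. It is not taken from any bioBakery tool: the function name `concat_paired_fastq`, the file names, and the choice to append `/1` and `/2` to the read IDs (so that the two mates keep distinct names in the merged file) are all my own assumptions, and it assumes plain four-line FASTQ records.

```python
import gzip
from pathlib import Path


def _open(path, mode):
    """Open a plain or gzip-compressed file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode + "t")
    return open(path, mode)


def concat_paired_fastq(forward, reverse, merged):
    """Write all forward reads, then all reverse reads, into one FASTQ file.

    Assumptions (mine, not from the bioBakery docs): every record is exactly
    four lines, and tagging read IDs with /1 or /2 is enough to keep the two
    mates distinguishable. Returns the number of reads written.
    """
    n_reads = 0
    with _open(merged, "w") as out:
        for mate, src in (("1", forward), ("2", reverse)):
            with _open(src, "r") as handle:
                for i, line in enumerate(handle):
                    if i % 4 == 0:  # header line of a FASTQ record
                        read_id, _, rest = line.rstrip("\n").partition(" ")
                        if not read_id.endswith("/" + mate):
                            read_id += "/" + mate
                        line = read_id + (" " + rest if rest else "")
                        n_reads += 1
                    if not line.endswith("\n"):
                        # Guard against a missing final newline so the two
                        # input files do not run together.
                        line += "\n"
                    out.write(line)
    return n_reads


# Hypothetical usage:
# concat_paired_fastq("sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample.fastq.gz")
```

The merged file would then be passed to the downstream tools as if it were single-end data, which is exactly the part I would like confirmed.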
Some other questions (whose answers depend on the yes/no answer to the question above):
- When dealing with overlapping paired-end reads, do you merge them at the beginning of the process?
- Is there any bioBakery tool that uses the paired-end information?