Kneaddata output as input for metaphlan

Hello, I appreciate anyone’s input on this. Kneaddata yields 4 major output files (paired and unmatched) that I would like to use for downstream analysis (i.e., metaphlan). Can I concatenate them into one fastq file? I understand that paired-end reads come from the same DNA fragment, so how would this approach (cat) avoid biases in counts or abundances?

Hi,
Yes, you should cat your kneaddata output files into one fastq file for metaphlan. This is the standard procedure used in the biobakery workflows (GitHub: biobakery/biobakery_workflows). Biases in counts and abundances are accounted for by how metaphlan calculates relative abundance, which uses information such as coverage and species genome length.
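
In case it helps, here is a minimal sketch of the concatenation step in Python, equivalent to running cat on the files. The helper name `concat_fastq` and the file names in the usage note are my own placeholders, not something kneaddata or metaphlan provides, and the sketch assumes uncompressed fastq files that each end with a newline.

```python
import shutil
from pathlib import Path


def concat_fastq(inputs, output):
    """Append each input FASTQ file to `output`, in order, byte for byte.

    Hypothetical helper: does the same thing as `cat in1 in2 ... > output`.
    Assumes every input is uncompressed and ends with a newline.
    """
    output = Path(output)
    with output.open("wb") as out:
        for path in inputs:
            with Path(path).open("rb") as src:
                # Stream the file in chunks so large FASTQ files are not
                # loaded into memory all at once.
                shutil.copyfileobj(src, out)
    return output
```

You would call it with your four files, for example `concat_fastq(["sample_paired_1.fastq", "sample_paired_2.fastq", "sample_unmatched_1.fastq", "sample_unmatched_2.fastq"], "sample_merged.fastq")`, swapping in the actual names from your kneaddata output folder, and then give the merged file to metaphlan.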