Kneaddata output as input for metaphlan

Jose_Freixas · September 24, 2025, 12:53pm

Hello, I appreciate anyone’s input on this. The kneaddata output yields 4 major files (paired and unmatched) that I would like to use for downstream analysis (i.e., metaphlan). Can I concatenate them into one fastq file? I understand that paired-end reads come from the same DNA fragment, so how does this approach (cat) avoid biasing counts or abundances?

tkuntz-hsph · September 30, 2025, 2:42pm

Hi,
Yes, you should cat your kneaddata output files into one fastq file for metaphlan. This is the standard procedure used in the biobakery workflows (GitHub: biobakery/biobakery_workflows). Biases in counts and abundances are accounted for by how metaphlan calculates relative abundance, which includes information such as coverage and species genome length.
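
In case it helps, here is a minimal Python sketch of the concatenation step, equivalent to a plain `cat` of the four files. The helper names `kneaddata_outputs` and `concatenate_fastq` are made up for illustration, and the `<sample>_kneaddata_paired_1.fastq` style file names are an assumption about your kneaddata output, so check them against what is actually in your output folder.

```python
import shutil
from pathlib import Path


def kneaddata_outputs(directory, sample):
    """Build the four expected kneaddata output paths for one sample.

    The naming pattern below is an assumption; adjust it to match
    the files kneaddata actually wrote for your run.
    """
    suffixes = ["paired_1", "paired_2", "unmatched_1", "unmatched_2"]
    return [Path(directory) / f"{sample}_kneaddata_{s}.fastq" for s in suffixes]


def concatenate_fastq(inputs, output):
    """Append each input FASTQ to `output`, in order, byte for byte.

    This does the same thing as `cat in1 in2 ... > output`: reads are
    not interleaved, re-paired, or filtered in any way.
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(output)
```

For a hypothetical sample called `sampleA`, usage would look like `concatenate_fastq(kneaddata_outputs("kneaddata_out", "sampleA"), "sampleA_merged.fastq")`. The merged file can then be given to metaphlan as a single fastq input, for example `metaphlan sampleA_merged.fastq --input_type fastq -o sampleA_profile.txt`; the available options can differ between metaphlan versions, so check `metaphlan --help` for your install.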
