MetaPhlAn 4 for paired end reads of multiple samples

Hi,
I am new to microbial community profiling with MetaPhlAn. I have a dataset of raw shotgun metagenomic reads for 62 samples, i.e., 124 paired-end read files. I would like to run all the samples together in MetaPhlAn 4 on a remote server. Please help me with this.

Hi @anbu_arun
Please have a look at the MetaPhlAn 4 tutorial (MetaPhlAn 4 · biobakery/MetaPhlAn Wiki · GitHub) and documentation (MetaPhlAn 4 · biobakery/MetaPhlAn Wiki · GitHub).

Hi

Does anyone know how to loop this for paired-end reads? I can run them using

for i in *.fastq.gz ; do metaphlan "$i" --input_type fastq --nproc 14 -s "sams/${i%.fastq.gz}.sam.bz2" --bowtie2out "bowtie2/${i%.fastq.gz}.bowtie2.bz2" -o "profiles/${i%.fastq.gz}_profiled.tsv" ; done

but this does not take into account that I have paired-end reads; it treats each fastq.gz file as a single sample.

If your paired-end files follow a “_1.fastq.gz” / “_2.fastq.gz” naming scheme, you could run the following:

for i in *_1.fastq.gz ; do metaphlan "$i,${i/_1.fastq.gz/_2.fastq.gz}" --input_type fastq --nproc 14 -s "sams/${i%_1.fastq.gz}.sam.bz2" --bowtie2out "bowtie2/${i%_1.fastq.gz}.bowtie2.bz2" -o "profiles/${i%_1.fastq.gz}_profiled.tsv" ; done
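
If you want to check the pairing before launching 62 runs, a dry-run version of the same loop may help. This is only a sketch: it assumes the “_1.fastq.gz” / “_2.fastq.gz” naming above, reuses the flags and output folders from the one-liner, and the helper names (sample_of, mate_of, metaphlan_cmd) are made up for illustration. It prints one metaphlan command per pair instead of running it, and skips any forward file whose mate is missing.

```shell
# Dry run: print one metaphlan command per read pair instead of executing it.
# Assumption: mates are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.

# Sample name = forward-read file name without the _1.fastq.gz suffix.
sample_of() {
    printf '%s\n' "${1%_1.fastq.gz}"
}

# Reverse-read file name that belongs to a forward-read file name.
mate_of() {
    printf '%s\n' "$(sample_of "$1")_2.fastq.gz"
}

# Build the metaphlan command line for one forward-read file.
metaphlan_cmd() {
    s=$(sample_of "$1")
    printf 'metaphlan %s,%s --input_type fastq --nproc 14 -s sams/%s.sam.bz2 --bowtie2out bowtie2/%s.bowtie2.bz2 -o profiles/%s_profiled.tsv\n' \
        "$1" "$(mate_of "$1")" "$s" "$s" "$s"
}

for r1 in *_1.fastq.gz; do
    [ -e "$r1" ] || continue    # the glob matched nothing
    if [ ! -e "$(mate_of "$r1")" ]; then
        echo "skipping $r1: mate $(mate_of "$r1") not found" >&2
        continue
    fi
    metaphlan_cmd "$r1"
done
```

Once the printed commands look right, you could save the script (say, as a hypothetical pairs.sh) and pipe its output to sh. Creating the sams, bowtie2 and profiles folders beforehand with mkdir -p is probably a good idea. Note that file names containing spaces would need extra quoting in the printed commands.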

many thanks - Julian M