Hi,
I would like to ask a few questions regarding the execution of the tool:
1- First, I am using paired-end reads (L1 and L2 fastq.gz files) for each sample. However, each patient (patient ID) has multiple samples, one per timepoint, e.g. 100-3mo-L1.fq.gz, 100-3mo-L2.fq.gz, 100-5mo-L1.fq.gz, 100-5mo-L2.fq.gz, etc.
I know that, even though MetaPhlAn does not use paired-end information, both paired-end files can still be given as input. However, is it possible to pass multiple pairs of paired-end files (multiple samples) as input? I am trying to run the following command:
metaphlan {subset_reads_dir}/mo3_100_EKDN34700-1A_{sample_id}_L1_1.fq.gz,
{subset_reads_dir}/mo3_100_EKDN34700-1A_{sample_id}_L1_2.fq.gz
{subset_reads_dir}/mo5_100_EKDN34701-1A_HGD53_L1_1.fq.gz,
{subset_reads_dir}/mo5_100_EKDN34701-1A_HGD53_L1_2.fq.gz
--bowtie2out metagenome.bowtie2.bz2 --bowtie2db {metaphlan_db_dir} -x
mpa_v31_CHOCOPhlAn_201901 -t rel_ab_w_read_stats --nproc 24
--input_type fastq > {metaphlan_profile_dir}/{sample_id}.profile.txt;
Unfortunately, I get an error stating that {subset_reads_dir}/mo5_100_EKDN34701-1A_HGD53_L1_1.fq.gz,\ and
{subset_reads_dir}/mo5_100_EKDN34701-1A_HGD53_L1_2.fq.gz\ were received as unexpected arguments. However, when I remove these files and pass only the two mo3 files (L1 and L2) as input, the run works. Is this expected behavior?
2- Is it recommended to use a single MetaPhlAn run for all the samples of each patient (multiple paired-end files) and then merge the abundance output tables together? Or should we use one MetaPhlAn run per pair of paired-end files (one L1 file and one L2 file)?
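To make the second option concrete, here is a minimal sketch of what I mean by one run per sample. It only builds the commands and does not execute them. The helper names (group_pairs, build_command), the regular expression, the "reads" and "profiles" directories, and the HGD53 part of the mo3 file names are my own assumptions for illustration; the database options from my command above are left out for brevity.

```python
import re
from collections import defaultdict

# Assumed naming scheme: everything before the trailing _1/_2 identifies
# the sample, and the final digit identifies the mate.
PAIR_RE = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fq\.gz$")


def group_pairs(filenames):
    """Group mate files by sample key (the name without the _1/_2 suffix)."""
    groups = defaultdict(dict)
    for name in filenames:
        match = PAIR_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        groups[match.group("sample")][match.group("mate")] = name
    return dict(groups)


def build_command(sample, mates, reads_dir, out_dir, nproc=24):
    """Build one metaphlan call for a single sample as an argument list."""
    # Both mates go into ONE argument, joined by a comma with no spaces,
    # so the shell cannot split them into separate arguments.
    inputs = ",".join(f"{reads_dir}/{mates[m]}" for m in sorted(mates))
    return [
        "metaphlan", inputs,
        "--input_type", "fastq",
        "--bowtie2out", f"{out_dir}/{sample}.bowtie2.bz2",
        "--nproc", str(nproc),
        "-o", f"{out_dir}/{sample}.profile.txt",
    ]


# Hypothetical file list; the mo3 names assume sample_id is HGD53.
files = [
    "mo3_100_EKDN34700-1A_HGD53_L1_1.fq.gz",
    "mo3_100_EKDN34700-1A_HGD53_L1_2.fq.gz",
    "mo5_100_EKDN34701-1A_HGD53_L1_1.fq.gz",
    "mo5_100_EKDN34701-1A_HGD53_L1_2.fq.gz",
]

for sample, mates in sorted(group_pairs(files).items()):
    print(" ".join(build_command(sample, mates, "reads", "profiles")))
```

With this layout each timepoint would get its own profile file, and I assume the per-sample profiles could then be combined with merge_metaphlan_tables.py, if that is the recommended route.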
Many thanks in advance.
Best,
Michael