Metaphlan 3 runs, but does not produce output files

Hi biob help team,

I am getting a weird issue that I am not sure how to solve. I installed MetaPhlAn 3 with conda. The tool appears to run normally (it takes time to process and produces the bowtie2 intermediate files), but no output file is created. I am also unable to get the taxonomic profile on stdout if I don’t specify the -o flag.

Here is the command:
metaphlan /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq.bowtie2out.txt --input_type fastq --add_viruses -o /home/ewissel/proj_angst/check_proc_qual/meta3_out/E0683-1-VagM1_meta3_out.txt --nproc 3 --bowtie2db /home/kjijakl/biobakery_db/metaphlan_db/ --force

Other users on our server who are using the same conda-installed version of metaphlan do not have this issue, and there are no obvious differences in how we are running the command. Note that the pipeline also does not create the output directory meta3_out, and I have full read/write permissions on all the files and directories involved. Any clues?

Hi @ewissel
From what you are reporting, I think your problem is due to the missing /home/ewissel/proj_angst/check_proc_qual/meta3_out directory. MetaPhlAn expects that directory to already exist and won’t create it if it is not present. Try creating the output directory manually first and then run metaphlan.
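
For anyone scripting this, here is a minimal sketch of creating the directory before the run. The path below is a hypothetical placeholder, not the one from this thread; substitute the directory of your own -o destination.

```shell
# Hypothetical output location for illustration only; replace with your own -o path.
out_file="${TMPDIR:-/tmp}/meta3_out_demo/sample_profile.txt"
out_dir="$(dirname "$out_file")"

# Create the directory and any missing parents; does nothing if it already exists.
mkdir -p "$out_dir"

# Check that the directory exists and is writable before starting a long run.
if [ -d "$out_dir" ] && [ -w "$out_dir" ]; then
    echo "output directory ready: $out_dir"
else
    echo "cannot write to: $out_dir" >&2
fi
```

Running a check like this first means a missing or read-only directory is caught before the alignment step rather than after it.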

Thanks for the suggestion!

When I manually create the directory and run metaphlan, I still don’t get any output. The bowtie2out files are created, and the following warning is printed to the console, but no *_profile.txt files are created: "WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant. An additional column listing the merged species is added to the MetaPhlAn output."

I rechecked the command you are using and I think I found the problem.

metaphlan /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq.bowtie2out.txt --input_type fastq --add_viruses -o /home/ewissel/proj_angst/check_proc_qual/meta3_out/E0683-1-VagM1_meta3_out.txt --nproc 3 --bowtie2db /home/kjijakl/biobakery_db/metaphlan_db/ --force

You are passing the *.bowtie2out.txt file as the second positional argument instead of through the --bowtie2out parameter. Because a second positional argument is present, metaphlan treats it as the output file and stores the results there. Try changing the command to:
metaphlan /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq --bowtie2out /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq.bowtie2out.txt --input_type fastq --add_viruses -o /home/ewissel/proj_angst/check_proc_qual/meta3_out/E0683-1-VagM1_meta3_out.txt --nproc 3 --bowtie2db /home/kjijakl/biobakery_db/metaphlan_db/ --force
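
To see why the extra path ends up as the output file, here is a small sketch that counts the arguments appearing before the first option. It only illustrates how the shell splits the words on the command line. The function name count_leading_positionals and the file names are made up for this example, and it does not reproduce MetaPhlAn's actual argument parser.

```shell
# Hypothetical helper: count the arguments that come before the first option.
count_leading_positionals() {
    n=0
    for arg in "$@"; do
        case "$arg" in
            -*) break ;;          # stop at the first option (-o, --bowtie2out, ...)
            *)  n=$((n + 1)) ;;   # anything earlier is a positional argument
        esac
    done
    echo "$n"
}

# Two space-separated paths before any option: prints 2.
count_leading_positionals sample.fastq sample.bowtie2out.txt --input_type fastq

# One path, with the second passed through --bowtie2out: prints 1.
count_leading_positionals sample.fastq --bowtie2out sample.bowtie2out.txt --input_type fastq
```

When the count is 2, the second path is the one taken as the output file, as described above.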

Thanks! I fixed the command and still had the same issue. Upon closer inspection, I found that metaphlan3 was overwriting the R2 fastq with the results. I’m not sure why, because I always specify the output name precisely to avoid this type of problem. I have the fastqs backed up so it’s not an issue for me, but I’m not sure how this happened!

Hi @ewissel
In the command you shared, you only ran metaphlan on R1. How are you adding R2 to the command? Is it like this:
metaphlan /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_2.fastq --bowtie2out /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1.fastq.bowtie2out.txt --input_type fastq --add_viruses -o /home/ewissel/proj_angst/check_proc_qual/meta3_out/E0683-1-VagM1_meta3_out.txt --nproc 3 --bowtie2db /home/kjijakl/biobakery_db/metaphlan_db/ --force
In that case, you should change it to:
metaphlan /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_1.fastq,/home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1_2.fastq --bowtie2out /home/ewissel/proj_angst/check_proc_qual/inputs/E0683-1-VagM1.fastq.bowtie2out.txt --input_type fastq --add_viruses -o /home/ewissel/proj_angst/check_proc_qual/meta3_out/E0683-1-VagM1_meta3_out.txt --nproc 3 --bowtie2db /home/kjijakl/biobakery_db/metaphlan_db/ --force
For the same reason as with the bowtie2out file, R1 and R2 should be separated by a comma (with no space); otherwise the second file is treated as the output file.
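
As a rough sketch of the comma-separated form in a script, the snippet below joins two read files into a single argument and prints the resulting command for review instead of running it. The file names are hypothetical placeholders; the flags are the ones already used in this thread.

```shell
# Hypothetical file names for illustration; replace with your own reads.
r1="sample_1.fastq"
r2="sample_2.fastq"

# Join the two read files with a comma and no spaces so they form one argument.
inputs="${r1},${r2}"

# Print the command for review instead of running it.
# All flags are taken from the commands earlier in this thread.
echo metaphlan "$inputs" --bowtie2out sample.bowtie2out.txt --input_type fastq -o sample_profile.txt --nproc 3
```

Quoting "$inputs" keeps the pair as one word even if a path contains spaces, so nothing after the first positional argument can be mistaken for the output file.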

Ah I see what the error is. I didn’t use a comma to separate R1 and R2, I used a space so it thought the R2 file was the output. Good to know!! Did I miss this in the documentation?

You might have missed it: MetaPhlAn 3.0 · biobakery/MetaPhlAn Wiki · GitHub

MetaPhlAn can also natively handle paired-end metagenomes (but does not use the paired-end information), and, more generally, metagenomes stored in multiple files (but you need to specify the --bowtie2out parameter):
$ metaphlan metagenome_1.fastq,metagenome_2.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq -o profiled_metagenome.txt

thank you for linking, i see it now!