I am getting intermittent failures running MetaPhlAn 3.1. I suspect there is sometimes a problem streaming data from read_fastx to the bowtie2 process.
In metaphlan.py, run_bowtie2():
# read the fastx file ...
readin = subp.Popen([read_fastx, '-l', str(read_min_len), fna_in], stdout=subp.PIPE, stderr=subp.PIPE)
...
p = subp.Popen(bowtie2_cmd, stdout=subp.PIPE, stdin=readin.stdout)
readin.stdout.close()
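For diagnosis, here is a minimal sketch (not MetaPhlAn's actual code; the commands, paths and arguments are illustrative placeholders) of the same read_fastx | bowtie2 pipe wired so that both exit codes and both stderr streams are checked. When bowtie2 dies early and closes the read end of the pipe, read_fastx only ever sees EPIPE, so reporting bowtie2's own failure first makes the real cause visible instead of just the downstream BrokenPipeError:

import subprocess as subp
import sys

# Sketch only, not the MetaPhlAn source: same pipeline shape, but both
# processes' stderr and return codes are checked afterwards.
read_fastx_cmd = ["read_fastx.py", "-l", "70", "reads.fastq"]          # illustrative
bowtie2_cmd = ["bowtie2", "--no-unal", "-x", "mpa_index", "-U", "-"]   # illustrative

readin = subp.Popen(read_fastx_cmd, stdout=subp.PIPE, stderr=subp.PIPE)
mapper = subp.Popen(bowtie2_cmd, stdin=readin.stdout, stdout=subp.PIPE, stderr=subp.PIPE)
readin.stdout.close()  # let read_fastx get EPIPE if bowtie2 exits first

# Buffering bowtie2's whole stdout like this is only suitable for a small test
# input; in the real pipeline the SAM stream is consumed as it is produced.
map_out, map_err = mapper.communicate()
fastx_err = readin.stderr.read()  # read_fastx is expected to write little to stderr
readin.wait()

if mapper.returncode != 0:
    sys.exit("bowtie2 failed (exit {}): {}".format(mapper.returncode, map_err.decode()))
if readin.returncode != 0:
    sys.exit("read_fastx failed (exit {}): {}".format(readin.returncode, fastx_err.decode()))

On a healthy run both return codes are zero; on the failing runs this should show whether bowtie2 is aborting first (as the signal 6 message suggests) and what it printed to stderr before read_fastx hit the broken pipe.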
The failure is on the write side, in read_fastx (bowtie2's error message about the FASTQ file is misleading):
Error: reads file does not look like a FASTQ file
terminate called after throwing an instance of 'int'
(ERR): bowtie2-align died with signal 6 (ABRT) (core dumped)
Traceback (most recent call last):
  File "/home/cfjell/.conda/envs/metaphlanenv/bin/read_fastx.py", line 10, in <module>
    sys.exit(main())
  File "/home/cfjell/.conda/envs/metaphlanenv/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py", line 167, in main
    f_nreads, f_avg_read_length = read_and_write_raw(f, opened=False, min_len=min_len, prefix_id=prefix_id)
  File "/home/cfjell/.conda/envs/metaphlanenv/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py", line 129, in read_and_write_raw
    nreads, avg_read_length = read_and_write_raw_int(inf, min_len=min_len, prefix_id=prefix_id)
  File "/home/cfjell/.conda/envs/metaphlanenv/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py", line 108, in read_and_write_raw_int
    print_record(description + "__{}{}{}".format(prefix_id, '.' if prefix_id else '', idx), sequence, qual, fmt))
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
We are running an HPC Nextflow + Slurm system on RHEL.
Has anyone else seen this? It's frustratingly difficult to reproduce the conditions reliably; it might be load on the server or something similar.
Since we are using Nextflow for workflow management, I'm interested in pulling this (possibly) problematic piece of pipe communication out and feeding MetaPhlAn pre-computed bowtie2 mapping results via "--input_type bowtie2out" (we can better manage the bowtie2 load at the Nextflow level); see the sketch below.
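For concreteness, this is roughly the split I have in mind, sketched as plain subprocess calls (in practice each step would be its own Nextflow process). The sample name, database path and thread count are placeholders, and the flags are as I understand them from the MetaPhlAn docs:

import subprocess

sample = "sample1"
db = "/path/to/metaphlan_databases"  # placeholder

# Step 1 (mapping task): run MetaPhlAn once, saving the intermediate mapping
# with --bowtie2out. This step still drives bowtie2 internally, but it is now
# an isolated task whose load and retries Nextflow can manage.
subprocess.run(
    ["metaphlan", sample + ".fastq",
     "--input_type", "fastq",
     "--bowtie2db", db,
     "--bowtie2out", sample + ".bowtie2out.bz2",
     "--nproc", "4",
     "-o", sample + ".profile.txt"],
    check=True)

# Step 2 (profiling task, e.g. on reruns or with different options): profile
# directly from the saved mapping, with no read_fastx | bowtie2 pipe involved.
subprocess.run(
    ["metaphlan", sample + ".bowtie2out.bz2",
     "--input_type", "bowtie2out",
     "--bowtie2db", db,
     "-o", sample + ".profile.txt"],
    check=True)

The mapping becomes a single retryable task, and any downstream profiling works from the saved bowtie2out file without re-running the pipe.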
Sorry if I've posted to the wrong forum; we are happy to provide a pull request if we can resolve the issue.
Cheers,
Chris Fjell
BC CDC