Error - invalid start byte - Metaphlan3

Hello,
We have been processing through the steps in metaphlan 3 and most of our samples produce results that we understand. But sometimes we get an error stream (in bowtie) that looks like this (see below). It ends in the traceback with an invalid start byte error. There wasn’t a corresponding error encountered when we ran through the kneaddata steps. And a visual inspection of the kneaddata files doesn’t raise any red flags.

  1. is this at all familiar?
  2. what do we need to do to avoid this error in our processing of metaphlan 3?

Thanks so much
Scot Zens
— start of Error message -----------------------------------------------------------------
Use of uninitialized value $bt2_args[2] in join or string at /HOME/miniconda3/bin/bowtie2 line 423.
Use of uninitialized value $bt2_args[3] in join or string at /HOME/miniconda3/bin/bowtie2 line 423.
Use of uninitialized value [2] in string eq at /HOME/miniconda3/bin/bowtie2 line 360.
Use of uninitialized value $[3] in string eq at /HOME/miniconda3/bin/bowtie2 line 360.
Use of uninitialized value in exists at /HOME/miniconda3/bin/bowtie2 line 81.
Use of uninitialized value in exists at /HOME/miniconda3/bin/bowtie2 line 81.
Use of uninitialized value $bt2_args[2] in join or string at /HOME/miniconda3/bin/bowtie2 line 459.
Use of uninitialized value $bt2_args[3] in join or string at /HOME/miniconda3/bin/bowtie2 line 459.
Traceback (most recent call last):
  File "/HOME/miniconda3/bin/read_fastx.py", line 10, in <module>
    sys.exit(main())
  File "/HOME/miniconda3/lib/python3.8/site-packages/metaphlan/utils/read_fastx.py", line 155, in main
    nreads += read_and_write_raw(f, opened=False, min_len=min_len)
  File "/HOME/miniconda3/lib/python3.8/site-packages/metaphlan/utils/read_fastx.py", line 119, in read_and_write_raw
    nreads = read_and_write_raw_int(inf, min_len=min_len)
  File "/HOME/miniconda3/lib/python3.8/site-packages/metaphlan/utils/read_fastx.py", line 64, in read_and_write_raw_int
    l = fd.readline()
  File "/HOME/miniconda3/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
— end of Error message -----------------------------------------------------------------

Regarding the “Use of uninitialized value” warnings, see [Warnings] MetaPhlAn version 3.0 · Issue #101 · biobakery/MetaPhlAn (github.com); they do not affect the analysis.

For the UnicodeDecodeError, can you paste the command line you used here?
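
As a side note on the error itself: byte 0x8b at position 1 is what you would see if a gzip-compressed file were read as plain text, since every gzip stream starts with the two bytes 0x1f 0x8b. The sketch below is plain Python, not part of MetaPhlAn; the helper names `looks_gzipped` and `decode_error_for` and the FASTQ record are made up for illustration. It reproduces the message and shows a quick way to check a file. Whether this is what happened here is only an assumption until we see the command and the input file.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def decode_error_for(raw):
    """Try to decode bytes as UTF-8; return the error, or None on success."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return err
    return None


# Made-up FASTQ record, compressed in memory for the demonstration.
compressed = gzip.compress(b"@read1\nACGT\n+\nIIII\n")

# 0x1f is a valid single-byte character, so decoding fails on the
# second byte (0x8b), which cannot start a UTF-8 sequence.
err = decode_error_for(compressed)
print(err)
```

Running this should print the same message as the last line of the traceback. Calling `looks_gzipped(...)` on one of the failing input files is a quick way to see whether it is compressed regardless of its file name.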

I also had this problem. I was trying to run the command
metaphlan ${readall} --bowtie2out ${bowres} --nproc 5 --input_type fastq -o ${res} --bowtie2db /beegfs/home/syl/database_db/metaphlan_database
but it throws the following output:

Traceback (most recent call last):
  File "/beegfs/home/syl/anaconda3/envs/kofamscan/bin/read_fastx.py", line 8, in <module>
    sys.exit(main())
  File "/beegfs/home/syl/anaconda3/envs/kofamscan/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py", line 168, in main
    f_nreads, f_avg_read_length = read_and_write_raw(f, opened=False, min_len=min_len, prefix_id=prefix_id)
  File "/beegfs/home/syl/anaconda3/envs/kofamscan/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py", line 130, in read_and_write_raw
    nreads, avg_read_length = read_and_write_raw_int(inf, min_len=min_len, prefix_id=prefix_id)
  File "/beegfs/home/syl/anaconda3/envs/kofamscan/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py", line 70, in read_and_write_raw_int
    l = fd.readline()
  File "/beegfs/home/syl/anaconda3/envs/kofamscan/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Can you help me with this problem? I don’t know whether it is a problem with the installation or with the server from which the data is being requested.

Thanks for your help!


Same issue, but with a different command:
$ sample2markers.py -i bam.bz2 -o ../strainphlan/ -d ../mpa/

I got the following error message:

Tue Aug 15 13:14:06 2023: Start samples to markers execution
Tue Aug 15 13:14:06 2023: Creating temporary directory...
Tue Aug 15 13:14:06 2023: Done.
Tue Aug 15 13:14:06 2023: Filtering SAM files...
Tue Aug 15 13:14:06 2023: [Error] Parallel execution fails: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Tue Aug 15 13:14:06 2023: Stop StrainPhlAn execution.

Hi @Day
This looks like an error with the input format. Is the input in compressed BAM or SAM format? If the input is a BAM file, it cannot be compressed, and you must specify the parameter --input_format bam. If it is in SAM format but NOT compressed, it should be --input_format sam.
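
If you are unsure what the file really is, the first few bytes usually tell you more than the extension does. Below is a minimal sketch in plain Python, not part of StrainPhlAn. The function name `sniff_compression` and the sample file names are hypothetical. It relies on two facts: bzip2 files start with `BZh`, and gzip files start with 0x1f 0x8b. BAM files are BGZF-compressed, which is gzip-compatible, so they start with 0x1f 0x8b as well. Seeing byte 0x8b at position 1 therefore suggests, as a guess, that the file is gzip or BAM rather than bzip2, whatever its name says.

```python
import bz2
import gzip
import os
import tempfile


def sniff_compression(path):
    """Guess the compression of a file from its leading magic bytes."""
    with open(path, "rb") as handle:
        head = handle.read(3)
    if head[:2] == b"\x1f\x8b":
        return "gzip"  # also matches BGZF, and therefore BAM files
    if head == b"BZh":
        return "bzip2"
    return "none"


# Demonstration on throwaway files holding a made-up SAM header line.
sam_text = b"@HD\tVN:1.6\tSO:coordinate\n"
samples = {
    "plain.sam": sam_text,
    "reads.sam.bz2": bz2.compress(sam_text),
    "reads.sam.gz": gzip.compress(sam_text),
}

with tempfile.TemporaryDirectory() as tmp:
    for name, payload in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as out:
            out.write(payload)
        print(name, "->", sniff_compression(path))
```

The result only says how the bytes are packed; which --input_format value to pass still follows from the advice above.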