I am trying to run MetaPhlAn so that I can then run StrainPhlAn, but I am having issues. I did a clean install of MetaPhlAn through conda (version 4.0.6) and am following the StrainPhlAn tutorial. However, when I run this loop:
for f1 in fastq/*_R1.fastq.gz; do
    # Get the corresponding R2 file
    f2="${f1/_R1/_R2}"
    # Check if the R2 file exists for the current R1 file
    if [ -e "$f2" ]; then
        echo "Running MetaPhlAn on ${f1} and ${f2}"
        bn1=$(basename "${f1}")
        bn2=$(basename "${f2}")
        # Remove the _R1 or _R2 suffix from the base names
        bn1_no_suffix="${bn1/_R1/}"
        bn2_no_suffix="${bn2/_R2/}"
        metaphlan "${f1}","${f2}" \
            --bowtie2out "bowtie2/${bn1_no_suffix}.bowtie2.bz2" \
            -s "sams/${bn1_no_suffix}.sam.bz2" \
            -o "profiles/${bn1_no_suffix}_profiled.tsv" \
            --input_type fastq \
            --bowtie2db ../metaphlan_db \
            --nproc 8
    else
        echo "Error: Paired-end file not found for ${f1}"
    fi
done
I get the error below. No profile file is generated, although the bowtie2 and SAM output files are created.
Traceback (most recent call last):
  File "/kusers/ancillary/anaconda3/envs/mpa/bin/read_fastx.py", line 10, in <module>
    sys.exit(main())
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/site-packages/metaphlan/utils/read_fastx.py", line 168, in main
    f_nreads, f_avg_read_length = read_and_write_raw(f, opened=False, min_len=min_len, prefix_id=prefix_id)
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/site-packages/metaphlan/utils/read_fastx.py", line 130, in read_and_write_raw
    nreads, avg_read_length = read_and_write_raw_int(inf, min_len=min_len, prefix_id=prefix_id)
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/site-packages/metaphlan/utils/read_fastx.py", line 70, in read_and_write_raw_int
    l = fd.readline()
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/gzip.py", line 314, in read1
    return self._buffer.read1(size)
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/gzip.py", line 488, in read
    if not self._read_gzip_header():
  File "/kusers/ancillary/anaconda3/envs/mpa/lib/python3.10/gzip.py", line 436, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'#m')
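
The last line of the traceback says that Python's gzip reader expected a gzip header (magic bytes 1f 8b) and found the two bytes `#m` instead. This can happen at the very start of a file or at the start of a later gzip member inside it. A minimal sketch of one way to check the inputs, assuming the same fastq/ layout as the loop above; the helper name `is_valid_gzip` is hypothetical and not part of MetaPhlAn:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of MetaPhlAn): succeeds only if the whole
# file passes gzip's own integrity test.
is_valid_gzip() {
    gzip -t "$1" 2>/dev/null
}

# Report every input that fails the check. The fastq/ directory is assumed
# to be the same one used by the MetaPhlAn loop.
for f in fastq/*.fastq.gz; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    if ! is_valid_gzip "$f"; then
        echo "Not a valid gzip file: $f"
        head -c 2 "$f" | od -c | head -n 1   # show the first two bytes
    fi
done
```

If every file passes, the inputs themselves are probably fine and the `#m` bytes are more likely coming from some other file that read_fastx.py ends up opening.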