Dear PhyloPhlAn team and community,
I ran the following command in PhyloPhlAn (version 3.0.60 (27 November 2020), installed through conda):
(myenv) usr@srvr:~$ phylophlan -i /home/usr/data/input/11assemblies -d /home/usr/phylophlan_databases phylophlan_databases --diversity medium -f supermatrix_nt.cfg --nproc 8
It resulted in the following error:
Traceback (most recent call last):
  File "/home/usr/miniconda3/envs/myenv/bin/phylophlan", line 10, in
    sys.exit(phylophlan_main())
  File "/home/usr/miniconda3/envs/myenv/lib/python3.7/site-packages/phylophlan/phylophlan.py", line 3227, in phylophlan_main
    verbose=args.verbose)
  File "/home/usr/miniconda3/envs/myenv/lib/python3.7/site-packages/phylophlan/phylophlan.py", line 818, in init_database
    for f in glob.iglob(os.path.join(folder, '*'))
  File "/home/usr/miniconda3/envs/myenv/lib/python3.7/site-packages/phylophlan/phylophlan.py", line 819, in
    for _, seq in SimpleFastaParser(bz2.open(f, 'rt') if f.endswith('.bz2') else open(f))])
  File "/home/usr/miniconda3/envs/myenv/lib/python3.7/site-packages/Bio/SeqIO/FastaIO.py", line 47, in SimpleFastaParser
    for line in handle:
  File "/home/usr/miniconda3/envs/myenv/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xba in position 1035: invalid start byte
(myenv) usr@srvr:~$
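In case it helps with the diagnosis: the traceback suggests that every file in some folder is opened as plain text (unless its name ends in .bz2), and that one of those files is not valid UTF-8. Below is a minimal sketch for listing such files. The function name find_non_utf8_files is my own and not part of PhyloPhlAn, and which folder PhyloPhlAn scans at that point is an assumption on my side.

```python
import os
import tempfile


def find_non_utf8_files(folder, chunk_size=1 << 20):
    """Return the paths of regular files in `folder` that cannot be read as UTF-8 text.

    Files ending in '.bz2' are skipped, because the traceback shows that
    those are opened with bz2.open() instead of the plain open().
    """
    bad = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path) or name.endswith(".bz2"):
            continue
        try:
            # Read in chunks so that large files are not loaded into memory at once.
            with open(path, "r", encoding="utf-8") as handle:
                while handle.read(chunk_size):
                    pass
        except UnicodeDecodeError:
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Small self-contained demo with one text file and one binary file.
    with tempfile.TemporaryDirectory() as demo:
        with open(os.path.join(demo, "good.fna"), "w", encoding="utf-8") as fh:
            fh.write(">seq1\nACGT\n")
        with open(os.path.join(demo, "binary.tar"), "wb") as fh:
            fh.write(b"abc\xba\xbadef")
        print(find_non_utf8_files(demo))
```

On my system this could be pointed at /home/usr/phylophlan_databases or /home/usr/data/input/11assemblies to see which file triggers the error.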
I would greatly appreciate any feedback on how to fix this issue, so that I can run PhyloPhlAn to construct a phylogeny of 11 whole-genome assemblies from pure cultured strains (no metagenomic data).
These assemblies are in this folder:
/home/usr/data/input/11assemblies
The database is in this folder:
/home/usr/phylophlan_databases
containing the following 2 files (i) and (ii), both manually downloaded via http://cmprod1.cibio.unitn.it/databases/PhyloPhlAn/phylophlan_databases.txt:
(i) phylophlan.tar
(downloaded from https://zenodo.org/record/4005620/files/phylophlan.tar?download=1)
(ii) phylophlan.md5
(downloaded from https://zenodo.org/record/4005620/files/phylophlan.md5?download=1)
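Since the download was manual, the archive can be checked against the checksum file before anything else. This is only a sketch: the helper names are mine, and I am assuming that the first whitespace-separated token in phylophlan.md5 is the hexadecimal MD5 digest of phylophlan.tar.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(data_path, md5_path):
    """Compare a file's MD5 digest with the digest stored in a checksum file.

    Assumption: the first whitespace-separated token of the checksum file
    is the expected hex digest.
    """
    with open(md5_path, "r", encoding="utf-8") as handle:
        tokens = handle.read().split()
    if not tokens:
        return False
    return md5_of_file(data_path) == tokens[0].lower()
```

With the paths above, the call would be matches_md5_file("/home/usr/phylophlan_databases/phylophlan.tar", "/home/usr/phylophlan_databases/phylophlan.md5").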
Please note that I had to download the database manually, following the suggestions in GitHub issue #18 of biobakery/phylophlan ("Using manually downloaded database"), because of limitations with the internet connection/firewall on my system.
I have also come across GitHub issue #9 of biobakery/phylophlan ("local variable 'input_faa_clean' referenced before assignment"), which suggests that getting PhyloPhlAn directly from the repository would fix an issue that I guess is similar (if not identical) to mine. Unfortunately, the same connection/firewall limitations prevent me from installing PhyloPhlAn directly from the repository.
I would be very grateful for any suggestions on how to get PhyloPhlAn running.
Thanks in advance,
Michael