Strainphlan.py error

Hello,

I am testing StrainPhlAn in the conda biobakery workflows environment (CentOS 7). While following the steps described at https://bitbucket.org/biobakery/biobakery/wiki/strainphlan, I came across an error:

$ strainphlan.py --ifn_samples *.markers --output_dir . --print_clades_only > clades.txt --use_threads
2020-03-02 13:49:06,413 | INFO | main | strainer | 1339 | Load mpa_pkl
2020-03-02 13:49:26,997 | INFO | main | strainer | 1355 | Get clades from db
2020-03-02 13:49:31,723 | INFO | main | strainer | 1399 | Get clades from samples
2020-03-02 13:49:31,723 | DEBUG | main | load_sample | 1142 | load 13530241_SF05.markers
Traceback (most recent call last):
  File "/opt/anaconda2/envs/biobakery/bin/strainphlan.py", line 1556, in <module>
    strainphlan()
  File "/opt/anaconda2/envs/biobakery/bin/strainphlan.py", line 1552, in strainphlan
    strainer(args)
  File "/opt/anaconda2/envs/biobakery/bin/strainphlan.py", line 1403, in strainer
    kept_markers=kept_markers)
  File "/opt/anaconda2/envs/biobakery/bin/strainphlan.py", line 1247, in load_all_samples
    use_threads=args['use_threads'])
  File "/opt/anaconda2/envs/biobakery/bin/ooSubprocess.py", line 262, in parallelize
    results = serialize(func, args)
  File "/opt/anaconda2/envs/biobakery/bin/ooSubprocess.py", line 286, in serialize
    results.append(func(arg))
  File "/opt/anaconda2/envs/biobakery/bin/ooSubprocess.py", line 248, in wrapper
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/opt/anaconda2/envs/biobakery/bin/ooSubprocess.py", line 244, in wrapper
    return f(*args, **kwargs)
  File "/opt/anaconda2/envs/biobakery/bin/strainphlan.py", line 1154, in load_sample
    marker2seq = msgpack.load(ifile, use_list=False)
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/msgpack/__init__.py", line 46, in unpack
    return unpackb(data, **kwargs)
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/msgpack/fallback.py", line 129, in unpackb
    ret = unpacker._unpack()
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/msgpack/fallback.py", line 670, in _unpack
    ret[key] = self._unpack(EX_CONSTRUCT)
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/msgpack/fallback.py", line 670, in _unpack
    ret[key] = self._unpack(EX_CONSTRUCT)
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/msgpack/fallback.py", line 666, in _unpack
    "%s is not allowed for map key" % str(type(key))
ValueError: <type 'int'> is not allowed for map key

I also tried biobakery_workflows wmgx with the demo dataset. With --bypass-strain-profiling there was no problem, but with strain profiling enabled the same error occurred again at the strainphlan.py step. This is the anadama.log:

2020-03-02 11:39:18,547 LoggerReporter task_command INFO: Executing with shell: strainphlan.py --ifn_samples /nas/project/MT21_bioBakery_workflow/out/strainphlan/.markers --output_dir /nas/project/MT21_bioBakery_workflow/out/strainphlan --print_clades_only > /nas/project/MT21_bioBakery_workflow/out/strainphlan/clades_list.txt
2020-03-02 11:39:44,762 LoggerReporter task_failed ERROR: task 71, strainphlan_print_clades : Failed! Error message : Error executing action 0. Original Exception:
Traceback (most recent call last):
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/anadama2/runners.py", line 201, in _run_task_locally
    action_func(task)
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/anadama2/helpers.py", line 84, in actually_sh
    ret = _sh(s, **kwargs)
  File "/opt/anaconda2/envs/biobakery/lib/python2.7/site-packages/anadama2/util/__init__.py", line 320, in sh
    raise ShellException(proc.returncode, msg.format(cmd, ret[0], ret[1]))
ShellException: [Errno 1] Command `strainphlan.py --ifn_samples /nas/project/MT21_bioBakery_workflow/out/strainphlan/
.markers --output_dir /nas/project/MT21_bioBakery_workflow/out/strainphlan --print_clades_only > /nas/project/MT21_bioBakery_workflow/out/strainphlan/clades_list.txt' failed.
Out:
Err: 2020-03-02 11:39:19,015 | INFO | main | strainer | 1339 | Load mpa_pkl
2020-03-02 11:39:39,631 | INFO | main | strainer | 1355 | Get clades from db
2020-03-02 11:39:44,521 | INFO | main | strainer | 1399 | Get clades from samples
2020-03-02 11:39:44,521 | DEBUG | main | load_sample | 1142 | load /nas/project/MT21_bioBakery_workflow/out/strainphlan/HD32R1_subsample_bowtie2.markers
Traceback (most recent call last):

How can I fix this? Samtools and bcftools 0.1.19-44428cd were used.

Hi hyjeong,
It looks like you are experiencing a problem in the marker reconstruction step. Could you share one of the markers files with us?

Best,
Aitor

Dear Aitor,

Thanks for your suggestion. Unfortunately, I could not upload a *.markers file to this message because I am a new user. When I used the six *.markers files downloaded from the StrainPhlAn tutorial site (https://bitbucket.org/biobakery/biobakery/wiki/strainphlan), I got the same error messages, so it may be a configuration problem.

The Amazon Web Services EC2 AMI (bioBakery v1.1), however, worked correctly!

Regards,
Haeyoung
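
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the final ValueError is raised inside msgpack, not inside StrainPhlAn itself. Starting with msgpack 1.0, the unpacker defaults to strict_map_key=True and rejects maps whose keys are not str or bytes, which would match "<type 'int'> is not allowed for map key" and would also fit a configuration difference between this conda environment and the AMI. The sketch below checks the installed msgpack version and, if it is 1.0 or newer, reproduces the behaviour on a tiny integer-keyed map. The helper name rejects_int_map_keys_by_default is hypothetical and only for illustration; msgpack.version, msgpack.packb, msgpack.unpackb and the strict_map_key option are part of msgpack.

```python
def rejects_int_map_keys_by_default(version):
    """Return True if a msgpack release with this version tuple defaults to
    strict_map_key=True (the default from msgpack 1.0 onward).

    `version` is a tuple such as msgpack.version, e.g. (1, 0, 0).
    Hypothetical helper, not part of msgpack or StrainPhlAn.
    """
    return tuple(version)[:1] >= (1,)


if __name__ == "__main__":
    try:
        import msgpack
    except ImportError:
        print("msgpack is not installed in this environment")
    else:
        print("msgpack version: %s" % (msgpack.version,))
        if rejects_int_map_keys_by_default(msgpack.version):
            packed = msgpack.packb({1: 2})  # a map with an integer key
            try:
                msgpack.unpackb(packed)  # default here: strict_map_key=True
            except ValueError as exc:
                print("default unpack failed: %s" % exc)
            # Relaxing the key check lets the same bytes load.
            print(msgpack.unpackb(packed, strict_map_key=False))
```

If the environment does report msgpack 1.0 or newer, two things that might be worth trying (both untested against this setup) are installing a msgpack release below 1.0 in the biobakery environment, or adding strict_map_key=False to the msgpack.load call shown at line 1154 of strainphlan.py in the traceback.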