Phylophlan mafft problem (implemented in strainphlan)

Hi, I’m trying to run strainphlan 3. The consensus_markers and db_markers steps went fine, but strainphlan threw an error. The problem is clearly with mafft (which I’ve confirmed is installed), but I’m not quite able to glean solid leads from the error. Any pointers would be greatly appreciated!

Installed with

conda create --name mpa -c bioconda metaphlan

Installations:

$ metaphlan -v
MetaPhlAn version 3.0 (20 Mar 2020)
$ phylophlan -v
PhyloPhlAn version 0.43 (2 March 2020)
$ mafft

---------------------------------------------------------------------

   MAFFT v7.464 (2020/Apr/21)

        MBE 30:772-780 (2013), NAR 30:3059-3066 (2002)
        https://mafft.cbrc.jp/alignment/software/
$ python --version
Python 3.7.6

Command (although I used a different species and reference):

strainphlan -s consensus_markers/*.pkl -m db_markers/s__Bacteroides_caccae.fna -r /ag1/ncbi/Bacteroides_caccae/fna/G000273725.fna.bz2 -o output -n 28 -c s__Bacteroides_caccae --phylophlan_mode accurate --mutation_rates

Error:

Tue May  5 01:37:21 2020: Start StrainPhlAn 3.0 execution
Tue May  5 01:37:21 2020: Creating temporary directory...
Tue May  5 01:37:21 2020: Done.
Tue May  5 01:37:21 2020: Getting markers from main sample files...
Tue May  5 01:37:22 2020: Done.
Tue May  5 01:37:22 2020: Getting markers from main reference files...Warning: [blastn] Examining 5 or more matches is recommended

Tue May  5 01:37:22 2020: Done.
Tue May  5 01:37:22 2020: Removing bad markers / samples...
Tue May  5 01:37:22 2020: Done.
Tue May  5 01:37:22 2020: Writing samples as markers' FASTA files...
Tue May  5 01:37:22 2020: Done.
Tue May  5 01:37:22 2020: Writing filtered clade markers as FASTA file...
Tue May  5 01:37:22 2020: Done.
Tue May  5 01:37:22 2020: Calculating polymorphic rates...
Tue May  5 01:37:22 2020: Done.
Tue May  5 01:37:22 2020: Executing PhyloPhlAn 3.0...
Tue May  5 01:37:22 2020:       Creating PhyloPhlAn 3.0 database...
Tue May  5 01:37:22 2020:       Done.
Tue May  5 01:37:22 2020:       Generating PhyloPhlAn 3.0 configuration file...
Tue May  5 01:37:23 2020:       Done.
Tue May  5 01:37:23 2020:       Processing samples...
[e] Command '['mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/./tmp/markers/100599946840.fna']' returned non-zero exit status 1.

[e] error while aligning
    command_line: mafft --quiet --anysymbol --thread 1 --auto output/./tmp/markers/100599946840.fna
           stdin: None
          stdout: /nnn2/testdrive/strainphlan/output/tmp/msas/100599946840.aln
             env: {'LESS_TERMCAP_mb': '\x1b[01;31m', 'HOSTNAME': 'ip-172-31-8-161', 'LESS_TERMCAP_md': '\x1b[01;38;5;208m', 'LESS_TERMCAP_me': '\x1b[0m', 'SHELL': '/bin/bash', 'TERM': 'screen', 'HISTSIZE': '1000', 'SSH_CLIENT': '99.124.158.179 49948 22', 'EC2_AMITOOL_HOME': '/opt/aws/amitools/ec2', 'CONDA_SHLVL': '2', 'CONDA_PROMPT_MODIFIER': '(mpa) ', 'PYTHON_INSTALL_LAYOUT': 'amzn', 'LESS_TERMCAP_ue': '\x1b[0m', 'SSH_TTY': '/dev/pts/0', 'USER': 'ec2-user', 'LS_COLORS': 'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:', 'EC2_HOME': '/opt/aws/apitools/ec2', 'CONDA_EXE': '/nnn1/bin/anaconda3/bin/conda', 'TERMCAP': 
'SC|screen|VT 100/ANSI X3.64 virtual terminal:\\\n\t:DO=\\E[%dB:LE=\\E[%dD:RI=\\E[%dC:UP=\\E[%dA:bs:bt=\\E[Z:\\\n\t:cd=\\E[J:ce=\\E[K:cl=\\E[H\\E[J:cm=\\E[%i%d;%dH:ct=\\E[3g:\\\n\t:do=^J:nd=\\E[C:pt:rc=\\E8:rs=\\Ec:sc=\\E7:st=\\EH:up=\\EM:\\\n\t:le=^H:bl=^G:cr=^M:it#8:ho=\\E[H:nw=\\EE:ta=^I:is=\\E)0:\\\n\t:li#34:co#97:am:xn:xv:LP:sr=\\EM:al=\\E[L:AL=\\E[%dL:\\\n\t:cs=\\E[%i%d;%dr:dl=\\E[M:DL=\\E[%dM:dc=\\E[P:DC=\\E[%dP:\\\n\t:im=\\E[4h:ei=\\E[4l:mi:IC=\\E[%d@:ks=\\E[?1h\\E=:\\\n\t:ke=\\E[?1l\\E>:vi=\\E[?25l:ve=\\E[34h\\E[?25h:vs=\\E[34l:\\\n\t:ti=\\E[?1049h:te=\\E[?1049l:us=\\E[4m:ue=\\E[24m:so=\\E[3m:\\\n\t:se=\\E[23m:mb=\\E[5m:md=\\E[1m:mr=\\E[7m:me=\\E[m:ms:\\\n\t:Co#8:pa#64:AF=\\E[3%dm:AB=\\E[4%dm:op=\\E[39;49m:AX:\\\n\t:vb=\\Eg:G0:as=\\E(0:ae=\\E(B:\\\n\t:ac=\\140\\140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~..--++,,hhII00:\\\n\t:po=\\E[5i:pf=\\E[4i:k0=\\E[10~:k1=\\EOP:k2=\\EOQ:k3=\\EOR:\\\n\t:k4=\\EOS:k5=\\E[15~:k6=\\E[17~:k7=\\E[18~:k8=\\E[19~:\\\n\t:k9=\\E[20~:k;=\\E[21~:F1=\\E[23~:F2=\\E[24~:F3=\\E[1;2P:\\\n\t:F4=\\E[1;2Q:F5=\\E[1;2R:F6=\\E[1;2S:F7=\\E[15;2~:\\\n\t:F8=\\E[17;2~:F9=\\E[18;2~:FA=\\E[19;2~:kb=\x7f:K2=\\EOE:\\\n\t:kB=\\E[Z:kF=\\E[1;2B:kR=\\E[1;2A:*4=\\E[3;2~:*7=\\E[1;2F:\\\n\t:#2=\\E[1;2H:#3=\\E[2;2~:#4=\\E[1;2D:%c=\\E[6;2~:%e=\\E[5;2~:\\\n\t:%i=\\E[1;2C:kh=\\E[1~:@1=\\E[1~:kH=\\E[4~:@7=\\E[4~:\\\n\t:kN=\\E[6~:kP=\\E[5~:kI=\\E[2~:kD=\\E[3~:ku=\\EOA:kd=\\EOB:\\\n\t:kr=\\EOC:kl=\\EOD:km:', 'LESS_TERMCAP_us': '\x1b[04;38;5;111m', '_CE_CONDA': '', 'CONDA_PREFIX_1': '/nnn1/bin/anaconda3', 'PATH': 
'/home/ec2-user/anaconda3/bin:/home/ec2-user/miniconda3/bin:/nnn1/bin/anaconda3/envs/mpa/bin:/nnn1/bin/anaconda3/condabin:/home/ec2-user/anaconda3/bin:/home/ec2-user/miniconda3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/aws/bin:/nnn1/bin/bwa-0.7.17:/nnn1/bin/MUMmer3.23:/nnn1/bin/phylip-3.697/exe:/nnn1/bin:/home/linuxbrew/.linuxbrew/bin:/nnn1/bin/SPAdes-3.14.0-Linux/bin:/nnn1/bin/pplacer-Linux-v1.1.alpha19:/nnn1/bin/ncbi-blast-2.9.0+/bin:/home/ec2-user/.local/bin:/home/ec2-user/bin:/opt/aws/bin:/nnn1/bin/bwa-0.7.17:/nnn1/bin/MUMmer3.23:/nnn1/bin/phylip-3.697/exe:/nnn1/bin:/home/linuxbrew/.linuxbrew/bin:/nnn1/bin/SPAdes-3.14.0-Linux/bin:/nnn1/bin/pplacer-Linux-v1.1.alpha19:/nnn1/bin/ncbi-blast-2.9.0+/bin', 'MAIL': '/var/spool/mail/ec2-user', 'STY': '11057.pts-0.ip-172-31-8-161', '_': '/usr/bin/nohup', 'CONDA_PREFIX': '/nnn1/bin/anaconda3/envs/mpa', 'PWD': '/nnn2/testdrive/strainphlan', 'JAVA_HOME': '/usr/lib/jvm/java', 'LANG': 'en_US.UTF-8', 'AWS_CLOUDWATCH_HOME': '/opt/aws/apitools/mon', 'HISTCONTROL': 'ignoredups', '_CE_M': '', 'HOME': '/home/ec2-user', 'SHLVL': '2', 'AWS_PATH': '/opt/aws', 'AWS_AUTO_SCALING_HOME': '/opt/aws/apitools/as', 'LOGNAME': 'ec2-user', 'CONDA_PYTHON_EXE': '/nnn1/bin/anaconda3/bin/python', 'CVS_RSH': 'ssh', 'WINDOW': '0', 'SSH_CONNECTION': '99.124.158.179 49948 172.31.8.161 22', 'AWS_ELB_HOME': '/opt/aws/apitools/elb', 'LESSOPEN': '||/usr/bin/lesspipe.sh %s', 'CONDA_DEFAULT_ENV': 'mpa', 'LESS_TERMCAP_se': '\x1b[0m', 'TMPDIR': '/tmp'}

[e] Command '['mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/./tmp/markers/100599946840.fna']' returned non-zero exit status 1.

[e] error while aligning
    {'program_name': 'mafft', 'params': '--quiet --anysymbol --thread 1 --auto', 'version': '--version', 'command_line': '#program_name# #params# #input# > #output#', 'environment': 'TMPDIR=/tmp'}
    output/./tmp/markers/100599946840.fna
    /nnn2/testdrive/strainphlan/output/tmp/msas
    100599946840.aln

[e] Command '['mafft', '--quiet', '--anysymbol', '--thread', '1', '--auto', 'output/./tmp/markers/594436349257.fna']' returned non-zero exit status 1.

Hi,

Can you run the command giving the error:

mafft --anysymbol --thread 1 --auto output/./tmp/markers/100599946840.fna

(I just removed the --quiet param)
And report here the error message from mafft?

Many thanks,
Francesco

Hi, thanks for the reply. Looks like I’m having permissions issues.

When I run that command I get a disk-space error. But there’s plenty of disk space, and if I run it with sudo it works. I’m not very familiar with working in conda environments: when I run “sudo strainphlan …” the strainphlan command isn’t recognized, and if I log in as root, the system doesn’t find conda. I know it’s not a strainphlan issue anymore, but any suggestions on how to use strainphlan given these permissions issues? Will update if I find a solution.

$ mafft --anysymbol --thread 1 --auto output/./tmp/markers/100599946840.fna
mktemp: failed to create directory via template '/tmp/mafft.XXXXXXXXXX': No space left on device
mktemp seems to be obsolete. Re-trying without -t
mktemp: failed to create directory via template '/tmp/mafft.XXXXXXXXXX': No space left on device
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1066: /infile: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1067: /infile: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1068: /infile: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1069: /_addfile: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1070: /_aamtx: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1071: /_subalignmentstable: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1072: /_guidetree: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1073: /_seedtablefile: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1074: /_lara.params: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1075: /pdblist: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1076: /ownlist: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1077: /_externalanchors: Permission denied
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1203: /infile: No such file or directory
awk: cmd. line:1: fatal: cannot open file `/size' for reading (No such file or directory)
awk: cmd. line:1: fatal: cannot open file `/size' for reading (No such file or directory)
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1207: [: too many arguments
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1212: [: too many arguments
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1217: [: too many arguments
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1222: [: -lt: unary operator expected
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1227: [: -lt: unary operator expected
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1234: [: -lt: unary operator expected
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1241: [: -lt: unary operator expected
grep: /infile: No such file or directory
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1727: [: -gt: unary operator expected
grep: /infile: No such file or directory
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1736: [: -eq: unary operator expected
/nnn1/bin/anaconda3/envs/mpa/bin/mafft: line 1743: [: too many arguments
inputfile = orig
0 x 0 - 99999999 p
picksize = 50
groupsize = -1
sueff_global = 0.100000
inputfile = infile
At least 2 sequences should be input!
Only 0 sequence found.

Hi, the disk space issue depends on the size of /tmp, which in turn depends on your system configuration: it can be on the same partition as /, or it can be a separate mount point.
You can verify it using the df command.
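
A minimal sketch of that check, assuming a Linux system with the standard df and mktemp utilities. The variable name probe is only illustrative. Checking inodes as well as bytes is worthwhile because an exhausted inode table produces the same "No space left on device" message even when df shows free space. The later "Permission denied" lines for /infile look like a follow-on effect: with no temporary directory created, the mafft script appears to fall back to writing in the filesystem root.

```shell
# Show free space on the filesystem that holds /tmp
df -h /tmp

# Show free inodes on the same filesystem
df -i /tmp

# Try the same kind of directory creation that mafft attempted,
# then clean up the probe directory again
probe=$(mktemp -d /tmp/mafft.XXXXXXXXXX) \
  && echo "created $probe" \
  && rmdir "$probe" \
  || echo "could not create a directory under /tmp"
```

If either df call reports 100% use, or the mktemp probe fails, /tmp itself is the bottleneck rather than the disk that holds the analysis folder.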

This is not related to the conda installation, though.

According to the mafft manual, you can specify a different folder for the temporary files that mafft creates while running by exporting the variable:

export TMPDIR=/path/to/my/temp/dir

and then running mafft.
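
As a concrete sketch of the step above: the directory name mafft_tmp is hypothetical, and the example assumes the current working directory sits on a disk with free space. The mafft call is left as a comment because it only makes sense from the folder that contains the strainphlan output.

```shell
# Hypothetical location: any directory on a filesystem with free space works
export TMPDIR="$PWD/mafft_tmp"
mkdir -p "$TMPDIR"

# Confirm the directory is writable before launching the alignment
[ -w "$TMPDIR" ] && echo "TMPDIR is writable: $TMPDIR"

# Then re-run the failing command from the same shell:
# mafft --anysymbol --thread 1 --auto output/./tmp/markers/100599946840.fna
```

One caveat, inferred from the log rather than confirmed in this thread: the PhyloPhlAn configuration echoed in the error output lists 'environment': 'TMPDIR=/tmp', so an exported value may not reach mafft when it is launched through strainphlan, even if it works when mafft is run by hand.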

Please, let us know if this solved your issue.

Many thanks,
Francesco

Dear Francesco,

I have the same problem; unfortunately, “export TMPDIR=/path/to/my/temp/dir” did not help.
Any further assistance would be greatly appreciated.

Best regards,
-Mike

Dear Mike, thank you for reporting this.
It’s strange that setting TMPDIR didn’t work; the issue still seems to be related to disk space/permissions. My suggestion would be to try re-running the analysis on a different disk, or to ‘open’ the permission settings for the analysis folder you’re currently using.
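
One possible way to inspect and open up the folder permissions, as a sketch only. The path in analysis_dir is an assumption (the output folder passed to strainphlan with -o in the original command); adjust it to the folder actually in use.

```shell
# Assumed analysis folder: the directory given to strainphlan via -o
analysis_dir="$PWD/output"
mkdir -p "$analysis_dir"   # no-op if the folder already exists

# Show the current owner and permission bits of the folder
ls -ld "$analysis_dir"

# Give the owning user read/write access, plus execute on directories
chmod -R u+rwX "$analysis_dir"

# Report whether the current user can now write there
[ -w "$analysis_dir" ] && echo "writable: $analysis_dir"
```

If ls -ld shows that the folder belongs to another user (for example root, after an earlier sudo run), chmod will fail for a regular user and the ownership has to be fixed first.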

Thanks,
Francesco