The bioBakery help forum

This sequence may be truncated, or another sequence may be too long?

Hi Francesco,

I get the following error in my log:

[e] Command '['/public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/FastTreeMP', '-quiet', '-mlacc', '2', '-slownni', '-spr', '4', '-fastest', '-mlnni', '4', '-no2nd', '-lg', '-out', '/public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales/select_bin.tre', 'output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln']' returned non-zero exit status 1.

[e] error while executing
    command_line: /public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/FastTreeMP -quiet -mlacc 2 -slownni -spr 4 -fastest -mlnni 4 -no2nd -lg -out /public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales/select_bin.tre output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln
           stdin: None
          stdout: None
             env: {'MKLROOT': '/public/software/compiler/intel/intel-compiler-2017.5.239/mkl', 'MANPATH': '/public/software/compiler/intel/intel-compiler-2017.5.239/man/en_US:/opt/gridview//pbs/dispatcher/share/man:', 'HOSTNAME': 'fat2', 'PBS_VERSION': 'TORQUE-4.2.7', 'INTEL_LICENSE_FILE': '/public/software/compiler/intel/intel-compiler-2017.5.239/licenses', 'SHELL': '/bin/bash', 'HISTSIZE': '1000', 'PBS_JOBNAME': 'job_Name', 'CONDA_SHLVL': '1', 'PERL5LIB': '/public/home/sample_lib/ckzhu/perl5/lib/perl5::/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src', 'LIBRARY_PATH': '/public/software/compiler/intel/intel-compiler-2017.5.239/compiler/lib/intel64:/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/lib/intel64:/public/software/compiler/intel/intel-compiler-2017.5.239/tbb/lib/intel64', 'CONDA_PROMPT_MODIFIER': '(phylophlan_3) ', 'PBS_ENVIRONMENT': 'PBS_BATCH', 'FPATH': '/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/include:', 'OLDPWD': '/public/home/sample_lib/ckzhu', 'QTDIR': '/usr/lib64/qt-3.3', 'PERL_MB_OPT': '--install_base /public/home/sample_lib/ckzhu/perl5', 'QTINC': '/usr/lib64/qt-3.3/include', 'PBS_O_WORKDIR': '/public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter', 'MIC_LD_LIBRARY_PATH': '/public/software/compiler/intel/intel-compiler-2017.5.239/compiler/lib/mic:/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/lib/mic:', 'CLUSCONF_HOME': '/opt/clusconf', 'QT_GRAPHICSSYSTEM_CHECKED': '1', 'USER': 'ckzhu', 'PBS_TASKNUM': '1', 'LD_LIBRARY_PATH': 
'/public/software/compiler/intel/intel-compiler-2017.5.239/compiler/lib/intel64:/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/lib/intel64:/public/software/compiler/intel/intel-compiler-2017.5.239/tbb/lib/intel64:/opt/gridview//pbs/dispatcher/lib::/usr/local/lib64:/usr/local/lib', 'PBS_O_HOME': '/public/home/sample_lib/ckzhu', 'CONDA_EXE': '/public/home/sample_lib/ckzhu/miniconda3/bin/conda', 'CPATH': '/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/include:', 'OFFLOAD_DEVICES': '', 'PBS_WALLTIME': '432000', 'PBS_MOMPORT': '15003', 'PBS_GPUFILE': '/opt/gridview//pbs/dispatcher/aux//1938920.admin1gpu', '_CE_CONDA': '', 'PBS_O_QUEUE': 'fat', 'NLSPATH': '/public/software/compiler/intel/intel-compiler-2017.5.239compiler/lib/intel64/locale/%l_%t/%N:/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/lib/intel64/locale/%l_%t/%N:', 'PATH': '/public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin:/public/home/sample_lib/ckzhu/miniconda3/condabin:/public/software/apps/singularity/3.5.2/bin:/public/home/sample_lib/ckzhu/software/drep/Prodigal-GoogleImport:/public/home/sample_lib/ckzhu/software/drep/MUMmer3.23:/public/home/sample_lib/ckzhu/s
oftware/mash/mash-Linux64-v2.2:/public/home/sample_lib/ckzhu/software/SGVFinder/GEM-binaries-Linux-x86_64-core_2-20121106-022124:/public/home/sample_lib/ckzhu/software/panphlan:/public/home/sample_lib/ckzhu/software/tree/tree-1.8.0:/public/software/compiler/intel/intel-compiler-2017.5.239/bin/intel64:/usr/lib64/qt-3.3/bin:/public/home/sample_lib/ckzhu/perl5/bin:/opt/gridview/pbs/dispatcher-sched/bin:/opt/gridview/pbs/dispatcher-sched/sbin:/opt/gridview/pbs/dispatcher/bin/lsf_cmd:/opt/gridview/pbs/dispatcher/bin:/opt/gridview/pbs/dispatcher/sbin:/opt/clusconf/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/public/home/sample_lib/ckzhu/miniconda3/bin:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/.local/bin:/public/home/sample_lib/ckzhu/bin:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src', 'PBS_O_LOGNAME': 'ckzhu', 'MAIL': '/var/spool/mail/ckzhu', 'NFSCONF': '/opt/clusconf/etc/nfs.cfg', 'PBS_O_LANG': 'en_US.UTF-8', 'PBS_JOBCOOKIE': 'FFF93C284D681067D7CCEABD76C831BD', 'CONDA_PREFIX': '/public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3', 'TBBROOT': '/public/software/compiler/intel/intel-compiler-2017.5.239/tbb', 'PWD': '/public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter', 'CUDA_VISIBLE_DEVICES': '', 'IPMICONF': '/opt/clusconf/etc/ipmi.cfg', 'LANG': 'en_US.UTF-8', 'PBS_NODENUM': '0', 'MODULEPATH': '/usr/share/Modules/modulefiles:/etc/modulefiles', 'PBS_NUM_NODES': '1', 'AUTOCLUSCONF': '/opt/clusconf/etc/autoconf.cfg', 'KDEDIRS': '/usr', 'LOADEDMODULES': '', 'GRIDVIEW_HOME': '/opt/gridview/', 'PBS_O_SHELL': '/bin/bash', 'PBS_JOBID': '1938920.admin1', '_CE_M': '', 'ENVIRONMENT': 'BATCH', 'HISTCONTROL': 'ignoredups', 'SSH_ASKPASS': '/usr/libexec/openssh/gnome-ssh-askpass', 'HOME': '/public/home/sample_lib/ckzhu', 'SHLVL': '2', 'PBS_O_HOST': 
'login01', 'PBS_VNODENUM': '0', 'PERL_LOCAL_LIB_ROOT': ':/public/home/sample_lib/ckzhu/perl5', 'CONDA_PYTHON_EXE': '/public/home/sample_lib/ckzhu/miniconda3/bin/python', 'LOGNAME': 'ckzhu', 'STARTWAITTIME': '300', 'CVS_RSH': 'ssh', 'QTLIB': '/usr/lib64/qt-3.3/lib', 'PBS_QUEUE': 'fat', 'GPU_DEVICE_ORDINAL': '', 'XDG_DATA_DIRS': '/public/home/sample_lib/ckzhu/.local/share/flatpak/exports/share/:/var/lib/flatpak/exports/share/:/usr/local/share/:/usr/share/', 'MODULESHOME': '/usr/share/Modules', 'CONDA_DEFAULT_ENV': 'phylophlan_3', 'PBS_O_MAIL': '/var/spool/mail/ckzhu', 'PBS_MICFILE': '/opt/gridview//pbs/dispatcher/aux//1938920.admin1mic', 'LESSOPEN': '||/usr/bin/lesspipe.sh %s', 'PBS_NP': '1', 'PBS_O_SERVER': 'admin1', 'PBS_NUM_PPN': '1', 'QT_PLUGIN_PATH': '/usr/lib64/kde4/plugins:/usr/lib/kde4/plugins', 'INCLUDE': '/public/software/compiler/intel/intel-compiler-2017.5.239/mkl/include:', 'PBS_NODEFILE': '/opt/gridview//pbs/dispatcher/aux//1938920.admin1', 'PERL_MM_OPT': 'INSTALL_BASE=/public/home/sample_lib/ckzhu/perl5', 'PBS_O_PATH': 
'/public/software/apps/singularity/3.5.2/bin/:/public/home/sample_lib/ckzhu/software/drep/Prodigal-GoogleImport:/public/home/sample_lib/ckzhu/software/drep/MUMmer3.23:/public/home/sample_lib/ckzhu/software/mash/mash-Linux64-v2.2:/public/home/sample_lib/ckzhu/software/SGVFinder/GEM-binaries-Linux-x86_64-core_2-20121106-022124:/public/home/sample_lib/ckzhu/software/panphlan/:/public/home/sample_lib/ckzhu/software/tree/tree-1.8.0:/public/software/compiler/intel/intel-compiler-2017.5.239/bin/intel64:/usr/lib64/qt-3.3/bin:/public/home/sample_lib/ckzhu/perl5/bin:/opt/gridview//pbs/dispatcher-sched/bin:/opt/gridview//pbs/dispatcher-sched/sbin:/opt/gridview//pbs/dispatcher/bin/lsf_cmd:/opt/gridview//pbs/dispatcher/bin:/opt/gridview//pbs/dispatcher/sbin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/public/home/sample_lib/ckzhu/miniconda3/bin:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/.local/bin:/public/home/sample_lib/ckzhu/bin:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src:/public/home/sample_lib/ckzhu/software/MOCAT2/MOCAT/src', 'BASH_FUNC_module()': '() {  eval `/usr/bin/modulecmd bash $*`\n}', '_': '../../phylophlan.py', 'OMP_NUM_THREADS': '3'}

Then I ran the command manually:

(phylophlan_3) [ckzhu@vm-login02 05_campylobacter]$ /public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/FastTreeMP -quiet -mlacc 2 -slownni -spr 4 -fastest -mlnni 4 -no2nd -lg -out /public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales/select_bin.tre output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln
Wrong number of characters for ERR589575_bin_10: expected 44987 but have 34559 instead.
This sequence may be truncated, or another sequence may be too long.

The issue seems to come from FastTree. I searched for this error on Google but found no answer, so could you give me some advice?

I changed the config file from fasttree to iqtree and that step worked, but then I hit another issue:

[e] Command '['/public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/raxmlHPC-PTHREADS-SSE3', '-p', '1989', '-m', 'PROTCATLG', '#threads#', '-t', 'output_Clostridiales/output_Clostridiales/select_bin_resolved.tre', '-w', '/public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales', '-s', 'output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln', '-n', 'select_bin_refined.tre']' returned non-zero exit status 255.

[e] error while executing
    command_line: /public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/raxmlHPC-PTHREADS-SSE3 -p 1989 -m PROTCATLG #threads# -t output_Clostridiales/output_Clostridiales/select_bin_resolved.tre -w /public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales -s output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln -n select_bin_refined.tre
           stdin: None
          stdout: None

When I run the command manually:

(phylophlan_3) [ckzhu@vm-login01 05_campylobacter]$ /public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/raxmlHPC-PTHREADS-SSE3 -p 1989 -m PROTCATLG #threads# -t output_Clostridiales/output_Clostridiales/select_bin_resolved.tre -w /public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales -s output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln -n select_bin_refined.tre

WARNING: The number of threads is currently set to 0
You can specify the number of threads to run via -T numberOfThreads
NumberOfThreads must be set to an integer value greater than 1

RAxML, will now set the number of threads automatically to 2 !


 Error: please specify a name for this run with -n
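A likely reason for the `-n` error above: with bash defaults, an unquoted word that begins with `#` starts a comment, so when the command is pasted into an interactive shell everything from `#threads#` onward is dropped and RAxML never receives `-t`, `-w`, `-s`, or `-n`. The sketch below reproduces that tokenization with Python's `shlex`. The shortened command string is illustrative, not the exact command from the log.

```python
import shlex

# Shortened, illustrative version of the command from the log above.
cmd = (
    "raxmlHPC-PTHREADS-SSE3 -p 1989 -m PROTCATLG #threads# "
    "-t select_bin_resolved.tre -s select_bin_concatenated.aln "
    "-n select_bin_refined.tre"
)

# comments=True makes shlex treat '#' as a comment starter, similar to how
# an interactive shell handles a word that begins with '#'.
seen_by_program = shlex.split(cmd, comments=True)

# Without comment handling, every token survives; the difference is what
# the shell silently discarded.
dropped = shlex.split(cmd)[len(seen_by_program):]

print("arguments RAxML receives:", seen_by_program[1:])
print("arguments lost to the comment:", dropped)
```

This only explains the manual run. Why the literal `#threads#` placeholder appears in the logged command in the first place is a separate question.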

Then I removed #threads#:

(phylophlan_3) [ckzhu@vm-login01 05_campylobacter]$ /public/home/sample_lib/ckzhu/miniconda3/envs/phylophlan_3/bin/raxmlHPC-PTHREADS-SSE3 -p 1989 -m PROTCATLG -t output_Clostridiales/output_Clostridiales/select_bin_resolved.tre -w /public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales -s output_Clostridiales/output_Clostridiales/select_bin_concatenated.aln -n select_bin_refined.tre

WARNING: The number of threads is currently set to 0
You can specify the number of threads to run via -T numberOfThreads
NumberOfThreads must be set to an integer value greater than 1

RAxML, will now set the number of threads automatically to 2 !

Warning, you specified a working directory via "-w"
Keep in mind that RAxML only accepts absolute path names, not relative ones!

RAxML can't, parse the alignment file as phylip file 
it will now try to parse it as FASTA file

RAxML output files with the run ID <select_bin_refined.tre> already exist 
in directory /public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/05_campylobacter/output_Clostridiales/output_Clostridiales/ ...... exiting

  • The error says that select_bin_refined.tre already exists in my directory, but I can’t find it:
[ckzhu@vm-login02 output_Clostridiales]$ tree -L 1
.
├── RAxML_info.select_bin_refined.tre
├── select_bin_concatenated.aln
├── select_bin.tre.bionj
├── select_bin.tre.ckp.gz
├── select_bin.tre.iqtree
├── select_bin.tre.log
├── select_bin.tre.mldist
├── select_bin.tre.treefile
└── tmp

here is my input file select_bin_concatenated.aln

I did not find this model (PROTCATLG) in the RAxML help below. Is this the problem?

-m
Model of Binary (Morphological), Nucleotide, Multi-State, or Amino Acid Substitution:
BINARY:
"-m BINCAT[X]"
: Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under BINGAMMA, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m BINCATI[X]"
: Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under BINGAMMAI, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_BINCAT[X]"
: Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under BINGAMMA, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies. The ASC prefix willl correct the likelihood for ascertainment bias.
"-m BINGAMMA[X]"
: GAMMA model of rate heterogeneity (alpha parameter will be estimated).
With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_BINGAMMA[X]" : GAMMA model of rate heterogeneity (alpha parameter will be estimated).
The ASC prefix willl correct the likelihood for ascertainment bias. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m BINGAMMAI[X]"
: Same as BINGAMMA, but with estimate of proportion of invariable sites.
With the optional "X" appendix you can specify a ML estimate of base frequencies.
NUCLEOTIDES:
"-m GTRCAT[X]"
: GTR + Optimization of substitution rates + Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated under GTRGAMMA, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m GTRCATI[X]"
: GTR + Optimization of substitution rates + Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated under GTRGAMMAI, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_GTRCAT[X]"
: GTR + Optimization of substitution rates + Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated under GTRGAMMA, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies. The ASC prefix willl correct the likelihood for ascertainment bias.
"-m GTRGAMMA[X]"
: GTR + Optimization of substitution rates + GAMMA model of rate
heterogeneity (alpha parameter will be estimated).
With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_GTRGAMMA[X]" : GTR + Optimization of substitution rates + GAMMA model of rate
heterogeneity (alpha parameter will be estimated). The ASC prefix willl correct the likelihood for ascertainment bias. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m GTRGAMMAI[X]"
: Same as GTRGAMMA, but with estimate of proportion of invariable sites.
With the optional "X" appendix you can specify a ML estimate of base frequencies.
MULTI-STATE:
"-m MULTICAT[X]"
: Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under MULTIGAMMA, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m MULTICATI[X]"
: Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under MULTIGAMMAI, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_MULTICAT[X]"
: Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under MULTIGAMMA, depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies. The ASC prefix willl correct the likelihood for ascertainment bias.
"-m MULTIGAMMA[X]"
: GAMMA model of rate heterogeneity (alpha parameter will be estimated).
With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_MULTIGAMMA[X]" : GAMMA model of rate heterogeneity (alpha parameter will be estimated).
The ASC prefix willl correct the likelihood for ascertainment bias. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m MULTIGAMMAI[X]"
: Same as MULTIGAMMA, but with estimate of proportion of invariable sites.
With the optional "X" appendix you can specify a ML estimate of base frequencies.
You can use up to 32 distinct character states to encode multi-state regions, they must be used in the following order: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V i.e., if you have 6 distinct character states you would use 0, 1, 2, 3, 4, 5 to encode these. The substitution model for the multi-state regions can be selected via the "-K" option
AMINO ACIDS:
"-m PROTCATmatrixName[F|X]"
: specified AA matrix + Optimization of substitution rates + Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under PROTGAMMAmatrixName[F|X], depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m PROTCATImatrixName[F|X]"
: specified AA matrix + Optimization of substitution rates + Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under PROTGAMMAImatrixName[F|X], depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_PROTCATmatrixName[F|X]"
: specified AA matrix + Optimization of substitution rates + Optimization of site-specific
evolutionary rates which are categorized into numberOfCategories distinct
rate categories for greater computational efficiency. Final tree might be evaluated automatically under PROTGAMMAmatrixName[F|X], depending on the tree search option. With the optional "X" appendix you can specify a ML estimate of base frequencies. The ASC prefix willl correct the likelihood for ascertainment bias.
"-m PROTGAMMAmatrixName[F|X]"
: specified AA matrix + Optimization of substitution rates + GAMMA model of rate
heterogeneity (alpha parameter will be estimated).
With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m ASC_PROTGAMMAmatrixName[F|X]" : specified AA matrix + Optimization of substitution rates + GAMMA model of rate
heterogeneity (alpha parameter will be estimated). The ASC prefix willl correct the likelihood for ascertainment bias. With the optional "X" appendix you can specify a ML estimate of base frequencies.
"-m PROTGAMMAImatrixName[F|X]"
: Same as PROTGAMMAmatrixName[F|X], but with estimate of proportion of invariable sites.
With the optional "X" appendix you can specify a ML estimate of base frequencies.
Available AA substitution models: DAYHOFF, DCMUT, JTT, MTREV, WAG, RTREV, CPREV, VT, BLOSUM62, MTMAM, LG, MTART, MTZOA, PMB, HIVB, HIVW, JTTDCMUT, FLU, STMTREV, DUMMY, DUMMY2, AUTO, LG4M, LG4X, PROT_FILE, GTR_UNLINKED, GTR With the optional "F" appendix you can specify if you want to use empirical base frequencies. AUTOF and AUTOX are not supported any more, if you specify AUTO it will test prot subst. models with and without empirical base frequencies now! Please note that for partitioned models you can in addition specify the per-gene AA model in the partition file (see manual for details). Also note that if you estimate AA GTR parameters on a partitioned dataset, they will be linked (estimated jointly) across all partitions to avoid over-parametrization

Can you please provide the full log (with --verbose) from PhyloPhlAn? The truncated sequence might be due to some identical IDs.
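One way to test the identical-IDs hypothesis is to count how often each header occurs in the FASTA file. This is a minimal sketch, not part of PhyloPhlAn: the function names are made up for illustration and the example records are invented, apart from the ID taken from the error message.

```python
from collections import Counter


def fasta_ids(lines):
    """Return the ID (first word after '>') of every record, in order."""
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    ]


def duplicated_ids(lines):
    """Map each ID that occurs more than once to its number of occurrences."""
    counts = Counter(fasta_ids(lines))
    return {seq_id: n for seq_id, n in counts.items() if n > 1}


# Tiny made-up example; "other_bin_1" and the sequences are hypothetical.
example = """>ERR589575_bin_10
MKT--LLA
>other_bin_1
MKTA-LLA
>ERR589575_bin_10
MK
""".splitlines()

print(duplicated_ids(example))
```

To run it on a real file, pass an open file handle instead of the example list, for instance `duplicated_ids(open("select_bin_concatenated.aln"))`. An empty result means every ID is unique.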

It is this one: RAxML_info.select_bin_refined.tre. RAxML uses RAxML_<something>. as a prefix, and if it finds an existing file for the same run ID it stops the execution.
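A small helper can list the leftover files that trigger this check before a rerun. The sketch below assumes the `RAxML_<something>.<run ID>` naming described above. The function name is hypothetical and the demo uses a temporary directory rather than the real output folder.

```python
from pathlib import Path
import tempfile


def stale_raxml_files(workdir, run_id):
    """List files named RAxML_<something>.<run_id> inside workdir."""
    return sorted(Path(workdir).glob(f"RAxML_*.{run_id}"))


# Demo in a throwaway directory that mimics the listing from the thread.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("RAxML_info.select_bin_refined.tre", "select_bin_concatenated.aln"):
        Path(tmp, name).touch()
    found = [p.name for p in stale_raxml_files(tmp, "select_bin_refined.tre")]

print(found)
```

Moving or deleting the listed files, or choosing a different name with `-n`, lets RAxML start again.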

You do have it, it is:

"-m PROTCATmatrixName[F|X]"

where matrixName is LG and neither F nor X is used.
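To make the naming scheme concrete, the sketch below splits a protein model string such as PROTCATLG into the parts described in the help text. It is an illustration only, not RAxML's own parser: the function name is hypothetical and the matrix list is copied from the help output quoted earlier in the thread.

```python
import re

# Matrix names copied from the RAxML help text quoted in this thread.
MATRICES = {
    "DAYHOFF", "DCMUT", "JTT", "MTREV", "WAG", "RTREV", "CPREV", "VT",
    "BLOSUM62", "MTMAM", "LG", "MTART", "MTZOA", "PMB", "HIVB", "HIVW",
    "JTTDCMUT", "FLU", "STMTREV", "DUMMY", "DUMMY2", "AUTO", "LG4M",
    "LG4X", "PROT_FILE", "GTR_UNLINKED", "GTR",
}


def parse_protein_model(model):
    """Split e.g. 'PROTCATLG' into rate model, matrix name and F/X suffix."""
    match = re.fullmatch(r"(ASC_)?PROT(CAT|GAMMA)(I?)(.+)", model)
    if match is None:
        raise ValueError(f"not a protein model name: {model}")
    asc, rate, invariant, rest = match.groups()
    if rest in MATRICES:
        matrix, suffix = rest, ""
    elif rest[-1] in "FX" and rest[:-1] in MATRICES:
        matrix, suffix = rest[:-1], rest[-1]
    else:
        raise ValueError(f"unknown matrix name in: {model}")
    return {
        "ascertainment": asc is not None,
        "rate_model": rate,
        "invariant_sites": invariant == "I",
        "matrix": matrix,
        "suffix": suffix,
    }


print(parse_protein_model("PROTCATLG"))
```

For PROTCATLG this reports the CAT rate model with the LG matrix and an empty suffix, matching the explanation above.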

I am sure the problem comes from FastTree, but it is very strange. I picked out this bin (ERR589575_bin_10.fna) and ran FastTree on it directly:

(phylophlan_3) [ckzhu@vm-login01 temp]$ fasttree -nt ERR589575_bin_10.fna >tree
FastTree Version 2.1.10 Double precision (No SSE3)
Alignment: ERR589575_bin_10.fna
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Jukes-Cantor, CAT approximation with 20 rate categories
Wrong number of characters for NODE_121_length_61670_cov_6.599792: expected 66715 but have 61670 instead.
This sequence may be truncated, or another sequence may be too long.

I don’t know why FastTree reports this length mismatch.

ERR589575_bin_10.txt (2.2 MB)
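FastTree expects a multiple sequence alignment, in which every record has the same number of columns. The file name and the `NODE_..._length_...` header suggest that ERR589575_bin_10.fna holds unaligned assembled contigs. If so, this message would be expected for that file and would not reproduce the problem seen with select_bin_concatenated.aln. The sketch below checks whether all records in a FASTA file have equal length. The function names and example records are hypothetical.

```python
def record_lengths(lines):
    """Return {header: sequence length} for FASTA lines (multi-line records supported)."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            current = line[1:].strip()
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def is_alignment(lines):
    """True when every record has the same length, as an alignment must."""
    return len(set(record_lengths(lines).values())) <= 1


# Made-up examples: unaligned contigs versus a gapped alignment.
contigs = [">NODE_1", "ACGTACGT", "ACGT", ">NODE_2", "ACGTA"]
aligned = [">a", "AC-T", ">b", "ACGT"]

print(record_lengths(contigs), is_alignment(contigs))
print(record_lengths(aligned), is_alignment(aligned))
```

Running the same check on select_bin_concatenated.aln would show which records deviate from the expected 44987 columns.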

  • iqtree config file
phylophlan_write_config_file  \
    -d a \
    -o Clostridiales_config.cfg \
    --db_aa diamond \
    --map_dna diamond \
    --map_aa diamond \
    --msa mafft \
    --trim trimal \
    --tree1 iqtree \
    --tree2 raxml \
    --verbose >Clostridiales_config.log 2>&1
phylophlan.py -i select_bin \
    -d phylophlan \
    --diversity medium \
    --accurate \
    -f Clostridiales_config.cfg \
    -o output_Clostridiales \
    --output_folder output_Clostridiales \
    --nproc 60 \
    -t a \
    --verbose >Clostridiales_bin.log  2>&1
  • result of iqtree
.
├── RAxML_info.select_bin_refined.tre
├── select_bin_concatenated.aln
├── select_bin.tre.bionj
├── select_bin.tre.ckp.gz
├── select_bin.tre.iqtree
├── select_bin.tre.log
├── select_bin.tre.mldist
├── select_bin.tre.treefile
└── tmp
  • fasttree config file
phylophlan_write_config_file  \
    -d a \
    -o Clostridiales_config.cfg \
    --db_aa diamond \
    --map_dna diamond \
    --map_aa diamond \
    --msa mafft \
    --trim trimal \
    --tree1 fasttree \
    --tree2 raxml \
    --verbose >Clostridiales_config.log 2>&1
  • result of fasttree
├── select_bin_concatenated.aln
├── select_bin_concatenated.aln.reduced
├── select_bin_resolved.tre
├── select_bin.tre
├── RAxML_info.input_bins_refined.tre
├── RAxML_log.input_bins_refined.tre
├── RAxML_result.input_bins_refined.tre
└── tmp

The results of iqtree and fasttree are different. RAxML needs select_bin_resolved.tre, but this file is missing from the iqtree output, so RAxML cannot run.

Thanks for reporting this. Indeed, there was a bug when IQ-TREE is used as tree1 and RAxML as tree2. This should now be fixed in commit d614c02.
These changes are not yet available through Bioconda, so, will you be able to pull the latest code from the GitHub repository to test your execution?

Many thanks,
Francesco

Hi Francesco
I ran the new version and it completed without any errors!
Thanks very much!!!

Building phylogeny "output_Clostridiales/iqtree/select_bin_all_concatenated.aln"
Phylogeny "select_bin_all.tre" built in 8917s
Resolving 1 polytomies
Resolving polytomies for "output_Clostridiales/iqtree/select_bin_all.tre.treefile"
"output_Clostridiales/iqtree/select_bin_all_resolved.tre" generated in 0s
Refining phylogeny "output_Clostridiales/iqtree/select_bin_all_resolved.tre"
Reducing number of RAxML threads to 20, as it appears to underperform with more threads
Phylogeny "select_bin_all_refined.tre" refined in 1988s

Total elapsed time 15883s
