Running time during mapping with diamond

Hi developers,

I would like to know whether I am making a mistake that makes the mapping step very slow. After 24 h, I killed the process because only one genome had been mapped.

My code:
phylophlan_write_config_file -o supermatrix_aa.cfg -d a --force_nucleotides --db_aa diamond --map_dna diamond --map_aa diamond --msa muscle --trim trimal --tree1 fasttree --tree2 raxml

phylophlan -i Syn_genomes -o output_Syn_genomes -d phylophlan -t a -f supermatrix_aa.cfg --diversity low --force_nucleotides --nproc 4 --verbose 2>&1 | tee phylophlan_output/phylophlan_out.log

Log_output_info::
Mapping "output_Syn_genomes/tmp/clean_dna/LA31-GCA_018502385.1.fna"
"CB0205-GCA_000179255.1.b6o.bkp" generated in 1248s
Removing "output_Syn_genomes/tmp/map_dna/LA31-GCA_018502385.1.b6o.bkp"

[e] Command '['/opt/anaconda3/envs/phylophlan/bin/diamond', 'blastx', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query-gencode', '11', '--query', 'output_Syn_genomes/tmp/clean_dna/LA31-GCA_018502385.1.fna', '--db', 'phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', 'output_Syn_genomes/tmp/map_dna/LA31-GCA_018502385.1.b6o.bkp']' died with <Signals.SIGKILL: 9>.

[e] cannot execute command
command_line: /opt/anaconda3/envs/phylophlan/bin/diamond blastx --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0 --query-gencode 11 --query output_Syn_genomes/tmp/clean_dna/LA31-GCA_018502385.1.fna --db phylophlan_databases/phylophlan/phylophlan.dmnd --out output_Syn_genomes/tmp/map_dna/LA31-GCA_018502385.1.b6o.bkp
stdin: None
stdout: None
env: {'TERM_PROGRAM': 'Apple_Terminal', 'SHELL': '/bin/zsh', 'TERM': 'xterm-256color', 'TMPDIR': '/var/folders/z6/2b8qw16n2fd5hg1jrhf358z80000gn/T/', 'CONDA_SHLVL': '2', 'CONDA_PROMPT_MODIFIER': '(phylophlan) ', 'TERM_PROGRAM_VERSION': '444', 'TERM_SESSION_ID': '77A99F7E-2B2E-40EB-A839-24F84EC424BC', 'USER': 'joelsanchez', 'CONDA_EXE': '/opt/anaconda3/bin/conda', 'SSH_AUTH_SOCK': '/private/tmp/com.apple.launchd.P2Ejd0CSjE/Listeners', '_CE_CONDA': '', 'CONDA_PREFIX_1': '/opt/anaconda3', 'PATH': '/opt/anaconda3/envs/phylophlan/bin:/opt/anaconda3/condabin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/ncbi/blast/bin:/opt/X11/bin', '_': '/opt/anaconda3/envs/phylophlan/bin/phylophlan', 'CONDA_PREFIX': '/opt/anaconda3/envs/phylophlan', '__CFBundleIdentifier': 'com.apple.Terminal', 'PWD': '/Users/joelsanchez/phylophlan/phylophlan', 'LANG': 'en_US.UTF-8', 'XPC_FLAGS': '0x0', '_CE_M': '', 'XPC_SERVICE_NAME': '0', 'SHLVL': '1', 'HOME': '/Users/joelsanchez', 'CONDA_PYTHON_EXE': '/opt/anaconda3/bin/python', 'LOGNAME': 'joelsanchez', 'CONDA_DEFAULT_ENV': 'phylophlan', '__CF_USER_TEXT_ENCODING': '0x1F5:0x0:0x0'}

[e] Command '['/opt/anaconda3/envs/phylophlan/bin/diamond', 'blastx', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query-gencode', '11', '--query', 'output_Syn_genomes/tmp/clean_dna/LA31-GCA_018502385.1.fna', '--db', 'phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', 'output_Syn_genomes/tmp/map_dna/LA31-GCA_018502385.1.b6o.bkp']' died with <Signals.SIGKILL: 9>.

[e] error while mapping
{'program_name': '/opt/anaconda3/envs/phylophlan/bin/diamond', 'params': 'blastx --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0 --query-gencode 11', 'input': '--query', 'database': '--db', 'output': '--out', 'version': 'version', 'command_line': '#program_name# #params# #input# #database# #output#'}
output_Syn_genomes/tmp/clean_dna/LA31-GCA_018502385.1.fna
phylophlan_databases/phylophlan/phylophlan.dmnd
output_Syn_genomes/tmp/map_dna
LA31-GCA_018502385.1.b6o.bkp
True

[e] Command '['/opt/anaconda3/envs/phylophlan/bin/diamond', 'blastx', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query-gencode', '11', '--query', 'output_Syn_genomes/tmp/clean_dna/LA31-GCA_018502385.1.fna', '--db', 'phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', 'output_Syn_genomes/tmp/map_dna/LA31-GCA_018502385.1.b6o.bkp']' died with <Signals.SIGKILL: 9>.

[e] gene_markers_identification crashed

Thanks,

Joel

Hi Joel,

Thanks for writing to us. The slowness could be due to the diamond version: version 2 is much slower than version 1 because a bug was fixed. You can read more about it here: Announcing HUMAnN 3.6 (Critical Update).
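
To see which diamond version is installed in the active conda environment, something like the following should work. It is only a minimal sketch: the helper name diamond_major is made up for illustration, and it assumes that "diamond version" prints a line of the form "diamond version X.Y.Z".

```shell
# Extract the major version number from a version line.
# Assumption: the line looks like "diamond version 2.0.15".
diamond_major() {
    # $1: the version line printed by `diamond version`
    printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+).*/\1/'
}

if command -v diamond >/dev/null 2>&1; then
    version_line=$(diamond version)
    echo "Installed: ${version_line} (major version $(diamond_major "$version_line"))"
else
    echo "diamond not found on PATH; activate the phylophlan conda environment first"
fi
```

If the major version reported is 2, the slower mapping described above is the likely explanation.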

From the log, it seems that one genome was mapped in ~20 minutes (1248 s), right? And the error after that is due to you killing PhyloPhlAn because it was too slow?
Also, you're specifying 4 cores, which means 4 input genomes are mapped at the same time. To speed up the process, you can specify more (if your machine has them), so that more genomes are mapped in the same amount of time.
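
As a rough illustration of how --nproc affects the total mapping time, here is a small sketch. It assumes every genome takes about as long as the one in the log (1248 s) and that each genome is mapped by a single diamond thread, as the "--threads 1" in the log suggests. The function name estimate_mapping_seconds and the figure of 40 genomes are hypothetical, chosen only for the example.

```shell
# Number of logical cores available (getconf works on both macOS and Linux).
cores=$(getconf _NPROCESSORS_ONLN)
echo "Logical cores: ${cores}"

# Rough wall-clock estimate: genomes are mapped in batches of <parallel> at a time.
# Usage: estimate_mapping_seconds <n_genomes> <seconds_per_genome> <parallel>
estimate_mapping_seconds() {
    n=$1
    secs=$2
    par=$3
    batches=$(( (n + par - 1) / par ))   # ceiling division
    echo $(( batches * secs ))
}

# Hypothetical example: 40 genomes at ~1248 s each.
echo "--nproc 4: $(estimate_mapping_seconds 40 1248 4) s"
echo "--nproc 8: $(estimate_mapping_seconds 40 1248 8) s"
```

If the machine reports more than 4 logical cores, the same phylophlan command can be rerun with a larger value, for example --nproc 8, leaving the other options unchanged.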

Thanks a lot,
Francesco