Error message returned from diamond

Hello,

When trying to run HUMAnN3, I get the following error message from DIAMOND:

Error message returned from diamond :
diamond v0.9.24.125 | by Benjamin Buchfink buchfink@gmail.com
Licensed under the GNU GPL https://www.gnu.org/licenses/gpl.txt
Check http://github.com/bbuchfink/diamond for updates.

#CPU threads: 4
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: Text file busy
Error: Error calling unlink.

06/29/2020 03:37:26 PM - humann.utilities - CRITICAL: TRACEBACK:
Traceback (most recent call last):
File "/home/plicht/anaconda3/envs/metaphlan/lib/python3.7/site-packages/humann/utilities.py", line 744, in execute_command
p_out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
File "/home/plicht/anaconda3/envs/metaphlan/lib/python3.7/subprocess.py", line 411, in check_output
**kwargs).stdout
File "/home/plicht/anaconda3/envs/metaphlan/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/home/plicht/anaconda3/envs/metaphlan/bin/diamond', 'blastx', '--query', '/media/sf_projects/microbiome/Analysis_of_microbiome/WiP/KneadData/firsttry/HUMAnN3/output/TRIAL_PL018_1_Novogenea1_1/TRIAL_PL018_1_Novogenea1_1_humann_temp/TRIAL_PL018_1_Novogenea1_1_bowtie2_unaligned.fa', '--evalue', '1.0', '--threads', '4', '--top', '1', '--outfmt', '6', '--db', '/media/sf_projects/microbiome/Analysis_of_microbiome/BioBakery-Tools/Databases/HUMAnN_db/uniref/uniref90_201901', '--out', '/media/sf_projects/microbiome/Analysis_of_microbiome/WiP/KneadData/firsttry/HUMAnN3/output/TRIAL_PL018_1_Novogenea1_1/TRIAL_PL018_1_Novogenea1_1_humann_temp/tmpln4cfsa3/diamond_m8_18tgcgi4', '--tmpdir', '/media/sf_projects/microbiome/Analysis_of_microbiome/WiP/KneadData/firsttry/HUMAnN3/output/TRIAL_PL018_1_Novogenea1_1/TRIAL_PL018_1_Novogenea1_1_humann_temp/tmpln4cfsa3']' returned non-zero exit status 1.

I use HUMAnN3 alpha3 with DIAMOND v0.9.24.125 installed via conda, and the following databases:

  • UniRef full ($ humann_databases --download uniref uniref90_diamond )
  • ChocoPhlAn full v296_201901 ($ humann_databases --download chocophlan full )
  • Utility mapping full ($ humann_databases --download utility_mapping full )

The functional tests with $ humann_test --run-functional-tests-tools --run-functional-tests-end-to-end pass. HUMAnN3 also works properly when running demo.fastq against the demo databases. Can you help me out?

Sorry for the delay here. Everything about the installation seems fine. This might be an issue with the I/O in your computer environment (per the “file busy” error). You could try re-running, or using another location to store the outputs?
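To make the second suggestion concrete, a re-run could look like the sketch below. The input/output paths are placeholders, and the point is simply that the output directory (which also holds DIAMOND's temp files) lives on a local disk rather than a shared or virtual-machine mount; `--resume` lets HUMAnN reuse the steps that already completed.

```shell
# Sketch: re-run with outputs (and hence DIAMOND temp files) on a local disk.
# "sample.fastq" and OUT_DIR are placeholders for your actual paths.
OUT_DIR=/tmp/humann_out
mkdir -p "$OUT_DIR"
if command -v humann >/dev/null 2>&1; then
  # --resume skips pipeline steps whose output files already exist
  humann -i sample.fastq -o "$OUT_DIR" --resume
else
  echo "humann not installed; skipping"
fi
```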

Hi Eric,

thanks for your suggestions. Indeed, I suspect the I/O directory is the problem, since I am currently running a virtual Ubuntu machine that is running out of disk space. That's why I located the databases as well as the I/O files on an external HDD.
However, in the meantime I was able to get the task running after updating DIAMOND from v0.9.24.125 to v0.9.36.137. The command works well with the full ChocoPhlAn database and the demo protein database uniref90_demo_prots_v201901. However, when using the full uniref90 database, the command dies with <Signals.SIGKILL: 9>. I googled and watched my RAM, and I guess it's caused by running out of memory. The virtual machine is allocated 8 GB of RAM. Is there a general formula for the minimum requirements of HUMAnN3?

In our evaluations on metagenomes with 30M reads, peak RAM usage varied from 16GB to 24GB depending on the balance between nucleotide and translated search. The memory ceiling is typically smaller outside of translated search, though this can also depend on the complexity of the microbial community under study.
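As a back-of-envelope sketch (not an official formula), the numbers above can be turned into a rough estimator: 16 to 24 GB at 30M reads works out to very roughly 0.5 to 0.8 GB per million reads on top of some baseline overhead. The coefficients below are illustrative assumptions, not measured constants, and real usage also depends on community complexity as noted.

```python
# Rough, unofficial rule of thumb derived from the figures quoted above:
# ~16-24 GB peak RAM at 30M reads => ~0.5-0.8 GB per million reads,
# plus a few GB of baseline overhead. All coefficients are assumptions.
def estimate_peak_ram_gb(million_reads, gb_per_m_reads=0.7, baseline_gb=3.0):
    """Back-of-envelope peak RAM estimate for a HUMAnN run (illustrative only)."""
    return baseline_gb + gb_per_m_reads * million_reads

print(estimate_peak_ram_gb(30))  # -> 24.0 (upper end of the quoted range)
```

By this sketch an 8 GB virtual machine would be expected to hit the OOM killer well before a full-database translated search completes.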

Hi @franzosa ,

I have the same problem with DIAMOND, with the error "Signals.SIGKILL: 9".

But it looks like I don’t have a problem with the memory:

Processing query block 1, reference block 6/15, shape 2/2, index chunk 3/4.

Building reference seed array… [8.452s]
Building query seed array… [4.357s]
Computing hash join… [1.761s]
Building seed filter… [0.071s]
Searching alignments… [5.834s]
Processing query block 1, reference block 6/15, shape 2/2, index chunk 4/4.
Building reference seed array… [7.515s]
Building query seed array… [3.126s]
Computing hash join… [1.935s]
Building seed filter… [0.072s]
Searching alignments… [5.277s]
Deallocating buffers… [0.018s]
Clearing query masking… [0.888s]
Opening temporary output file… [0.048s]
Computing alignments…
12/13/2020 12:10:58 AM - humann.utilities - CRITICAL: TRACEBACK:
Traceback (most recent call last):
File "/home/daia1/anaconda3/envs/py37/lib/python3.7/site-packages/humann/utilities.py", line 756, in execute_command
p_out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
File "/home/daia1/anaconda3/envs/py37/lib/python3.7/subprocess.py", line 411, in check_output
**kwargs).stdout
File "/home/daia1/anaconda3/envs/py37/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/home/daia1/anaconda3/envs/py37/bin/diamond', 'blastx', '--query', '/home/daia1/my_workdir/samples/CART_2050A_humann3_humann_temp_rgp4wrgk/CART_2050A_humann3_bowtie2_unaligned.fa', '--evalue', '1.0', '--threads', '16', '--top', '1', '--outfmt', '6', '--db', '/home/daia1/my_workdir/ref_db/uniref/uniref/uniref/uniref90_201901', '--out', '/home/daia1/my_workdir/samples/CART_2050A_humann3_humann_temp_rgp4wrgk/tmpqzcl2fvw/diamond_m8_pgyuivvi', '--tmpdir', '/home/daia1/my_workdir/samples/CART_2050A_humann3_humann_temp_rgp4wrgk/tmpqzcl2fvw']' died with <Signals.SIGKILL: 9>.

12/13/2020 12:10:58 AM - humann.utilities - INFO: Total memory = 503.5974884033203 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Available memory = 366.03315353393555 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Free memory = 364.29954528808594 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Percent memory used = 27.3 %
12/13/2020 12:10:58 AM - humann.utilities - INFO: CPU percent = 46.2 %
12/13/2020 12:10:58 AM - humann.utilities - INFO: Total cores count = 72
12/13/2020 12:10:58 AM - humann.utilities - INFO: Total disk = 159.56462860107422 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Used disk = 31.365806579589844 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Percent disk used = 19.7 %
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process create time = 2020-12-12 22:16:46
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process user time = 2442.87 seconds
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process system time = 83.03 seconds
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process CPU percent = 0.0 %
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process memory RSS = 14.47745132446289 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process memory VMS = 14.610565185546875 GB
12/13/2020 12:10:58 AM - humann.utilities - INFO: Process memory percent = 2.8748061016674917 %

I have a total of 67 samples; 54 finished successfully with the same code, but the remaining 13 just stalled and never finished. I also have enough disk space.
Do you have any idea what could be causing the problem?

Anqi

That error suggests that the system told the process to stop. It could be something like running out of time (in a cluster environment) or the system restarting?
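If this is a Slurm cluster, the scheduler's accounting records usually say why a job was killed. The sketch below is a generic example (the job ID is a placeholder): a State of OUT_OF_MEMORY or TIMEOUT would explain an externally delivered SIGKILL even when the node as a whole had free memory, since cgroup limits apply per job, not per machine.

```shell
# Sketch: ask Slurm why a job ended. JOBID is a placeholder; the guard
# skips cleanly on machines without Slurm installed.
JOBID=12345678
if command -v sacct >/dev/null 2>&1; then
  # OUT_OF_MEMORY / TIMEOUT states explain a scheduler-sent SIGKILL
  sacct -j "$JOBID" --format=JobID,State,ExitCode,Elapsed,MaxRSS
else
  echo "sacct not available (not a Slurm submit/compute node); skipping"
fi
```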

I can reproduce this error with 10% of my files.
I tried to rerun the failing ones with the same settings (in case the filesystem just had a bad day), but that did not change the outcome.
For me, the pipeline dies at the DIAMOND search step.

CRITICAL ERROR: Error executing: anaconda3/envs/biobakery3/bin/diamond blastx --query [...]

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: [...]/tmp5bljfj5d
Percentage range of top alignment score to report hits: 1
Opening the database...  [0.117s]
Database: [...]/biobakery3/uniref/uniref90_201901b_full.dmnd (type: Diamond database, sequences: 87296736, letters: 29247941583)
Block size = 2000000000
Opening the input file...  [0.06s]
Opening the output file...  [0s]
Loading query sequences...  [30.558s]
Masking queries...  [39.426s]
Algorithm: Double-indexed
Building query histograms...  [8.431s]
Allocating buffers...  [0s]
Loading reference sequences...  [6.507s]
Masking reference...  [31.677s]
Initializing dictionary...  [0.011s]
Initializing temporary storage...  [0.008s]
Building reference histograms...  [12.965s]
Allocating buffers...  [0s]
Processing query block 1, reference block 1/15, shape 1/2, index chunk 1/4.
Building reference seed array...  [9.151s]
Building query seed array...  [6.171s]
Computing hash join...  [5.577s]
Building seed filter...  [0.173s]
Searching alignments... 
[Dies here]

I'm running this on a cluster environment (Slurm), which, however, is neither running out of time nor reporting an out-of-memory error. It's possible that I'm missing a file-system error, but it definitely can't be due to lack of storage space.
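For completeness, this is the kind of check I can run on the compute node to rule out a kernel OOM kill that the scheduler didn't report (a sketch; `dmesg` may require elevated privileges on some systems):

```shell
# Look for kernel OOM-killer messages mentioning killed processes.
# PATTERN is a case-insensitive extended regex over typical kernel log lines.
PATTERN="killed process|out of memory"
if command -v dmesg >/dev/null 2>&1; then
  dmesg 2>/dev/null | grep -i -E "$PATTERN" | tail -n 5 || true
else
  echo "dmesg not available; skipping"
fi
```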

Any new ideas on that issue?

Best,
Len

Which version of DIAMOND are you running?

Err: b’CRITICAL ERROR: Error executing: /home/microviable/workflows/bin/diamond blastx --query /media/microviable/e/workflows_output/humann/main/BIKE012A_humann_temp/BIKE012A_bowtie2_unaligned.fa --evalue 1.0 --threads 2 --top 1 --outfmt 6 --db /home/microviable/biobakery_workflows_databases/humann/uniref/uniref90_201901 --out /media/microviable/e/workflows_output/humann/main/BIKE012A_humann_temp/tmpcww9gqwq/diamond_m8_f94rntm6 --tmpdir /media/microviable/e/workflows_output/humann/main/BIKE012A_humann_temp/tmpcww9gqwq

Error message returned from diamond :
diamond v0.9.24.125 | by Benjamin Buchfink buchfink@gmail.com
Licensed under the GNU GPL https://www.gnu.org/licenses/gpl.txt
Check http://github.com/bbuchfink/diamond for updates.

#CPU threads: 2
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /media/microviable/e/workflows_output/humann/main/BIKE012A_humann_temp/tmpcww9gqwq
Opening the database… [0.000261s]
Percentage range of top alignment score to report hits: 1
Opening the input file… [5.6e-05s]
Opening the output file… [9.5e-05s]
Loading query sequences… [13.4343s]
Masking queries… [186.538s]
Building query seed set… [0.049379s]
Algorithm: Double-indexed
Building query histograms… [8.47118s]
Allocating buffers… [5.4e-05s]
Loading reference sequences… [2.24639s]
Building reference histograms… [11.7816s]
Allocating buffers… [1.8e-05s]
Initializing temporary storage… [0.060473s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 0.
Building reference index… [22.3248s]
Building query index… [12.4941s]
Building seed filter… [2.11717s]
Searching alignments… [80.3787s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 1.
Building reference index… [24.3983s]
Building query index… [13.8459s]
Building seed filter… [2.11284s]
Searching alignments… [74.069s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 2.
Building reference index… [25.6765s]
Building query index… [14.534s]
Building seed filter… [2.11479s]
Searching alignments… [76.5505s]
Processing query chunk 0, reference chunk 0, shape 0, index chunk 3.
Building reference index… [22.6031s]
Building query index… [12.8757s]
Building seed filter… [2.20768s]
Searching alignments… [73.5638s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 0.
Building reference index… [22.2809s]
Building query index… [11.9711s]
Building seed filter… [2.15985s]
Searching alignments… [57.2694s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 1.
Building reference index… [25.3052s]
Building query index… [13.3127s]
Building seed filter… [2.11888s]
Searching alignments… [56.0851s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 2.
Building reference index… [26.0394s]
Building query index… [14.0035s]
Building seed filter… [2.12631s]
Searching alignments… [56.0188s]
Processing query chunk 0, reference chunk 0, shape 1, index chunk 3.
Building reference index… [22.3317s]
Building query index… [11.9944s]
Building seed filter… [2.12719s]
Searching alignments… [55.9832s]
Deallocating buffers… [0.188846s]
Opening temporary output file… [0.018851s]
Computing alignments… [228.529s]
Deallocating reference… [0.072216s]
Loading reference sequences… [2.34065s]
Building reference histograms… [12.3993s]
Allocating buffers… [1.7e-05s]
Initializing temporary storage… [0.033754s]
Processing query chunk 0, reference chunk 1, shape 0, index chunk 0.
Building reference index… [23.3182s]
Building query index… [12.9526s]
Building seed filter… [2.17436s]
Searching alignments… [85.6714s]
Processing query chunk 0, reference chunk 1, shape 0, index chunk 1.
Building reference index… [25.4353s]
Building query index… [14.3335s]
Building seed filter… [2.17088s]
Searching alignments… [81.9938s]
Processing query chunk 0, reference chunk 1, shape 0, index chunk 2.
Building reference index… [26.0562s]
Building query index… [14.6713s]
Building seed filter… [2.145s]
Searching alignments… [78.2425s]
Processing query chunk 0, reference chunk 1, shape 0, index chunk 3.
Building reference index… [22.0205s]
Building query index…

I tried to update diamond and humann in a different conda environment, and I received this:

humann -i BURGOS023A_6_100K.fastq.gz -o /media/microviable/e/gdrive/Microviable/ShotgunMetagenomeRarefaction/humann 
Output files will be written to: /media/microviable/e/gdrive/Microviable/ShotgunMetagenomeRarefaction/humann
Decompressing gzipped file ...


Running metaphlan ........

Found g__Prevotella.s__Prevotella_copri : 37.86% of mapped reads
Found g__Ruminococcus.s__Ruminococcus_bromii : 21.86% of mapped reads
Found g__Butyrivibrio.s__Butyrivibrio_crossotus : 6.18% of mapped reads
Found g__Roseburia.s__Roseburia_faecis : 5.86% of mapped reads
Found g__Faecalibacterium.s__Faecalibacterium_prausnitzii : 4.94% of mapped reads
Found g__Prevotella.s__Prevotella_sp_885 : 4.71% of mapped reads
Found g__Catenibacterium.s__Catenibacterium_mitsuokai : 3.28% of mapped reads
Found g__Collinsella.s__Collinsella_aerofaciens : 3.21% of mapped reads
Found g__Dorea.s__Dorea_longicatena : 2.84% of mapped reads
Found g__Lachnospiraceae_unclassified.s__Eubacterium_rectale : 2.19% of mapped reads
Found g__Blautia.s__Ruminococcus_torques : 1.47% of mapped reads
Found g__Blautia.s__Blautia_wexlerae : 1.18% of mapped reads
Found g__Prevotella.s__Prevotella_sp_CAG_891 : 0.79% of mapped reads
Found g__Prevotella.s__Prevotella_sp_CAG_279 : 0.60% of mapped reads
Found g__Holdemanella.s__Holdemanella_biformis : 0.60% of mapped reads
Found g__Ruminococcus.s__Ruminococcus_lactaris : 0.60% of mapped reads
Found g__Bifidobacterium.s__Bifidobacterium_longum : 0.55% of mapped reads
Found g__Coprococcus.s__Coprococcus_comes : 0.52% of mapped reads
Found g__Phascolarctobacterium.s__Phascolarctobacterium_succinatutens : 0.52% of mapped reads
Found g__Slackia.s__Slackia_isoflavoniconvertens : 0.17% of mapped reads
Found g__Eubacterium.s__Eubacterium_hallii : 0.06% of mapped reads

Total species selected from prescreen: 21

Selected species explain 100.00% of predicted community composition


Creating custom ChocoPhlAn database ........


Running bowtie2-build ........


Running bowtie2 ........

Total bugs from nucleotide alignment: 21
g__Prevotella.s__Prevotella_copri: 19288 hits
g__Roseburia.s__Roseburia_faecis: 538 hits
g__Ruminococcus.s__Ruminococcus_bromii: 5227 hits
g__Catenibacterium.s__Catenibacterium_mitsuokai: 779 hits
g__Prevotella.s__Prevotella_sp_885: 4702 hits
g__Blautia.s__Ruminococcus_torques: 781 hits
g__Prevotella.s__Prevotella_sp_CAG_891: 119 hits
g__Butyrivibrio.s__Butyrivibrio_crossotus: 390 hits
g__Coprococcus.s__Coprococcus_comes: 226 hits
g__Faecalibacterium.s__Faecalibacterium_prausnitzii: 1078 hits
g__Dorea.s__Dorea_longicatena: 394 hits
g__Lachnospiraceae_unclassified.s__Eubacterium_rectale: 508 hits
g__Holdemanella.s__Holdemanella_biformis: 60 hits
g__Blautia.s__Blautia_wexlerae: 311 hits
g__Phascolarctobacterium.s__Phascolarctobacterium_succinatutens: 52 hits
g__Ruminococcus.s__Ruminococcus_lactaris: 137 hits
g__Eubacterium.s__Eubacterium_hallii: 143 hits
g__Collinsella.s__Collinsella_aerofaciens: 118 hits
g__Bifidobacterium.s__Bifidobacterium_longum: 66 hits
g__Prevotella.s__Prevotella_sp_CAG_279: 55 hits
g__Slackia.s__Slackia_isoflavoniconvertens: 1 hits

Total gene families from nucleotide alignment: 4859

Unaligned reads after nucleotide alignment: 82.5135000000 %


Running diamond ........


Aligning to reference database: uniref90_201901b_full.dmnd

CRITICAL ERROR: Error executing: /home/microviable/miniconda3/envs/humann3/bin/diamond blastx --query /media/microviable/e/gdrive/Microviable/ShotgunMetagenomeRarefaction/humann/BURGOS023A_6_100K_humann_temp/BURGOS023A_6_100K_bowtie2_unaligned.fa --evalue 1.0 --threads 1 --top 1 --outfmt 6 --db /home/microviable/biobakery_workflows_databases/humann/uniref/uniref90_201901b_full --out /media/microviable/e/gdrive/Microviable/ShotgunMetagenomeRarefaction/humann/BURGOS023A_6_100K_humann_temp/tmp5u86x702/diamond_m8_8y85sxfh --tmpdir /media/microviable/e/gdrive/Microviable/ShotgunMetagenomeRarefaction/humann/BURGOS023A_6_100K_humann_temp/tmp5u86x702

Error message returned from diamond :
diamond v2.0.15.153 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 1
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: /media/microviable/e/gdrive/Microviable/ShotgunMetagenomeRarefaction/humann/BURGOS023A_6_100K_humann_temp/tmp5u86x702
Percentage range of top alignment score to report hits: 1
Opening the database...  [0.065s]
Database: /home/microviable/biobakery_workflows_databases/humann/uniref/uniref90_201901b_full.dmnd (type: Diamond database, sequences: 87296736, letters: 29247941583)
Block size = 2000000000
Opening the input file...  [0.04s]
Opening the output file...  [0s]
Loading query sequences...  [0.291s]
Masking queries...  [2.62s]
Algorithm: Double-indexed
Building query histograms...  [1.931s]
Allocating buffers...  [0s]
Loading reference sequences...  [0.006s]
Error: Unexpected end of input.

A number of people have reported a similar problem with DIAMOND 2.0.15. Note that HUMAnN 3 was designed against DIAMOND 0.9.36. It appears as though DIAMOND 2.0 was backwards compatible with our databases for a while, but that has either changed OR there is a general bug in 2.0.15 causing problems.
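One workaround, then, is to pin DIAMOND to the 0.9.36 series in the environment HUMAnN runs from. The sketch below uses `--dry-run` so it only resolves the pin without modifying the environment; drop that flag to actually install (exact build availability on bioconda may vary).

```shell
# Sketch: pin DIAMOND to the version HUMAnN 3 was designed against.
# --dry-run resolves without installing; remove it to apply the change.
PINNED_VERSION=0.9.36
if command -v conda >/dev/null 2>&1; then
  conda install --dry-run -c bioconda "diamond=${PINNED_VERSION}"
else
  echo "conda not found; skipping"
fi
```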

I have the same problem with Diamond 0.9.36

some log info

But I should say that my input file is very large (an 82 GB fastq). There is no problem with disk space in my workspace. Maybe the problem is memory?