Kneaddata fails with message: Killed

It’s my first time using Kneaddata and I’m not sure about the memory requirements for paired-end shotgun data. I have tried installing with conda and with pip inside a conda environment; both result in the same error. I may just be missing something obvious.

Here is the full log file:

05/10/2024 02:55:22 PM - kneaddata.knead_data - INFO: Running kneaddata v0.12.0
05/10/2024 02:55:22 PM - kneaddata.knead_data - INFO: Output files will be written to: /home/biomineruser/shotgun_pipeline_test/kneaddata_output/SU03052328
05/10/2024 02:55:22 PM - kneaddata.knead_data - DEBUG: Running with the following arguments: 
verbose = False
input1 = SU03052328/SU03052328_S6_R1_001.fastq.gz
input2 = SU03052328/SU03052328_S6_R2_001.fastq.gz
unpaired = None
output_dir = /home/biomineruser/shotgun_pipeline_test/kneaddata_output/SU03052328
scratch_dir = 
reference_db = /home/biomineruser/shotgun_pipeline_test/kneaddata_humane_genome/hg37dec_v0.1
bypass_trim = False
output_prefix = SU03052328_S6_R1_001_kneaddata
threads = 2
processes = 1
trimmomatic_quality_scores = -phred33
bmtagger = False
bypass_trf = False
run_trf = False
fastqc_start = False
fastqc_end = False
store_temp_output = False
remove_intermediate_output = False
cat_final_output = False
log_level = DEBUG
log = /home/biomineruser/shotgun_pipeline_test/kneaddata_output/SU03052328/SU03052328_S6_R1_001_kneaddata.log
trimmomatic_path = /home/biomineruser/Trimmomatic-0.33/trimmomatic-0.33.jar
run_trim_repetitive = False
max_memory = 500m
trimmomatic_options = None
sequencer_source = NexteraPE
bowtie2_path = /home/biomineruser/bowtie2-2.4.2-sra-linux-x86_64/bowtie2
bowtie2_options = --very-sensitive-local --phred33
decontaminate_pairs = strict
reorder = False
serial = False
bmtagger_path = None
trf_path = /home/biomineruser/miniconda3/envs/shotgun_knead_pipeline/bin/trf
match = 2
mismatch = 7
delta = 7
pm = 80
pi = 10
minscore = 50
maxperiod = 500
fastqc_path = None
remove_temp_output = True
input = /home/biomineruser/shotgun_pipeline_test/SU03052328/SU03052328_S6_R1_001.fastq.gz /home/biomineruser/shotgun_pipeline_test/SU03052328/SU03052328_S6_R2_001.fastq.gz
discordant = True

05/10/2024 02:55:22 PM - kneaddata.utilities - INFO: Decompressing gzipped file ...
05/10/2024 02:58:33 PM - kneaddata.utilities - INFO: Decompressed file created: /home/biomineruser/shotgun_pipeline_test/kneaddata_output/SU03052328/decompressed_2wjvqa1h_SU03052328_S6_R1_001.fastq
05/10/2024 02:58:33 PM - kneaddata.utilities - INFO: Decompressing gzipped file ...
05/10/2024 03:01:41 PM - kneaddata.utilities - INFO: Decompressed file created: /home/biomineruser/shotgun_pipeline_test/kneaddata_output/SU03052328/decompressed_f64tjl_d_SU03052328_S6_R2_001.fastq
05/10/2024 03:01:41 PM - kneaddata.utilities - INFO: Reformatting file sequence identifiers ...
05/10/2024 03:04:51 PM - kneaddata.utilities - INFO: Reformatting file sequence identifiers ...
05/10/2024 03:08:01 PM - kneaddata.utilities - INFO: Reordering read identifiers ...

The only output from the kneaddata command itself is Killed, which suggests the process was terminated with SIGKILL because of a resource availability issue.
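
One way to check the SIGKILL reading is to look at the shell's exit status right after the failure (`echo $?`). By shell convention, a process terminated by a signal reports 128 plus the signal number, so 137 would correspond to signal 9. The sketch below decodes such a status and scans kernel log text (for example, the output of `dmesg`) for lines that look like OOM-killer reports. The function names are illustrative, and the matched phrases are an assumption about typical Linux kernel messages, which vary between kernel versions.

```python
import signal


def signal_from_exit_status(status):
    """Return the signal name for a shell exit status above 128, else None."""
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        # Not a signal number known on this platform.
        return None


def find_oom_lines(kernel_log_text):
    """Return kernel log lines that look like OOM-killer reports.

    The phrases below are an assumption about common kernel wording.
    """
    needles = ("out of memory", "oom-kill", "killed process")
    return [
        line
        for line in kernel_log_text.splitlines()
        if any(needle in line.lower() for needle in needles)
    ]
```

If `find_oom_lines` returns a line naming the kneaddata Python process, that would support the memory explanation; an empty result does not rule it out, since the kernel log may have rotated or require elevated permissions to read.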

Here is the command:

kneaddata --input1 SU03052328/SU03052328_S6_R1_001.fastq.gz --input2 SU03052328/SU03052328_S6_R2_001.fastq.gz -db kneaddata_humane_genome/ --output kneaddata_output/SU03052328 -p 4 -t 8 --trimmomatic ~/Trimmomatic-0.33 --bowtie2 ~/bowtie2-2.4.2-sra-linux-x86_64/

The input FASTQ files are concatenated paired-end files from 2 lanes of an Illumina NovaSeq 6000 run (the Lane 1 and Lane 2 R1 files concatenated, and likewise for the R2 files).
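
Because each input is a concatenation of two lanes, it may be worth confirming that R1 and R2 still contain the same reads in the same order (that is, both were concatenated as Lane 1 then Lane 2). A minimal sketch, assuming standard four-line FASTQ records and Illumina-style headers where the read name is the first whitespace-delimited token. It streams both files, so it should need very little memory. The helper names are illustrative.

```python
import gzip
from itertools import zip_longest


def read_names(path):
    """Yield the read name from each 4-line FASTQ record in a gzipped file."""
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue
            if not line.strip():
                break  # trailing blank line at end of file
            # Drop the leading "@" and anything after the first space.
            name = line[1:].split()[0]
            # Drop a legacy /1 or /2 mate suffix if present.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) at the first disagreement.

    Returns None when both files have the same names in the same order.
    A name of None means that file ran out of records first.
    """
    pairs = zip_longest(read_names(r1_path), read_names(r2_path))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index, name1, name2
    return None
```

A `None` result means the pairs line up; any other result points at the first record where the two files diverge.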

I’ve had to point it to the Trimmomatic and Bowtie2 directories because installing with pip didn’t install those two automatically, which was odd given that the documentation said it would. Each file is approximately 6 GB.

I am running this on an 8-core, 32 GB VM with a 2 TB SSD.

Monitoring resource usage with htop during execution, I can see memory usage hit 100% during the reordering step. I haven’t been able to find much information about the resource requirements of kneaddata other than the “Memory (>= 4 Gb if using Bowtie2, >= 8 Gb if using BMTagger)” line on the install page.
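
To put a number on the htop observation, a small wrapper can run a command and report the peak resident memory of its child processes. This is a sketch under assumptions: it relies on Linux, where `ru_maxrss` is reported in kibibytes, and the function name is illustrative. The figure is the largest single child (or waited-for descendant), not the sum across all of them, and if the process is killed it only reflects what was reached before the kill.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd, wait for it, and return (return_code, peak_rss_mib).

    A negative return code -N means the child was terminated by signal N
    (so -9 would indicate SIGKILL).
    """
    completed = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # Linux reports ru_maxrss in kibibytes; convert to MiB.
    return completed.returncode, usage.ru_maxrss / 1024


def format_report(return_code, peak_rss_mib):
    """Format the result as a one-line summary."""
    return f"exit code {return_code}, peak RSS about {peak_rss_mib:.0f} MiB"
```

For example, passing the kneaddata command as a list of arguments (`["kneaddata", "--input1", ...]`) and printing `format_report(*run_and_report_peak_rss(cmd))` would show how close a run gets to the 32 GB limit.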

Any pointers or additional debugging suggestions are most welcome, even just “use a bigger VM” with a reason.
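
One cheap debugging step is to rerun the same command on a small subset of reads and see whether the failure scales with input size. A minimal sketch, assuming gzipped four-line FASTQ input; the function name and the output file names mentioned afterwards are hypothetical. Taking the same number of records from the start of both R1 and R2 keeps the pairs aligned as long as the original files are in the same order.

```python
import gzip
from itertools import islice


def take_first_reads(src_path, dst_path, n_reads):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Returns the number of complete records actually written, which is
    smaller than n_reads if the source file is shorter.
    """
    lines_written = 0
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        for line in islice(src, 4 * n_reads):
            dst.write(line)
            lines_written += 1
    return lines_written // 4
```

For instance, calling it once per mate with `n_reads=1_000_000` and output names such as `subset_R1.fastq.gz` and `subset_R2.fastq.gz` gives a pair of files small enough to test the whole pipeline quickly.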