Hi, Laurance:
I reinstalled the biobakery3 environment and updated MetaPhlAn to 3.0.2. Following your suggestion, this time I opened only one terminal, running HUMAnN on a single sample. However, the same outcome came out in both the terminal and the sample log.
I have also attached the initial settings:
08/05/2020 12:55:11 PM - humann.humann - INFO: Running humann v3.0.0.alpha.3
08/05/2020 12:55:11 PM - humann.humann - INFO: Output files will be written to: /home/chaozhi/anaconda3/envs/biobakery3/working_file/BSF1_meta
08/05/2020 12:55:11 PM - humann.humann - INFO: Writing temp files to directory: /home/chaozhi/anaconda3/envs/biobakery3/working_file/BSF1_meta/BSF1_nohost_humann_temp
08/05/2020 12:55:11 PM - humann.utilities - INFO: File ( /home/chaozhi/anaconda3/envs/biobakery3/working_file/BSF1_nohost.fastq ) is of format: fastq
08/05/2020 12:55:11 PM - humann.utilities - DEBUG: Check software, metaphlan, for required version, 3.0
08/05/2020 12:55:16 PM - humann.utilities - INFO: Using metaphlan version 3.0
08/05/2020 12:55:16 PM - humann.utilities - DEBUG: Check software, bowtie2, for required version, 2.2
08/05/2020 12:55:17 PM - humann.utilities - WARNING: Can not call software version for bowtie2
08/05/2020 12:55:17 PM - humann.utilities - INFO: Using bowtie2 version UNK
08/05/2020 12:55:17 PM - humann.humann - INFO: Search mode set to uniref90 because a uniref90 translated search database is selected
08/05/2020 12:55:17 PM - humann.utilities - DEBUG: Check software, diamond, for required version, 0.9.24
08/05/2020 12:55:17 PM - humann.utilities - INFO: Using diamond version 2.0.1
08/05/2020 12:55:17 PM - humann.config - INFO:
Run config settings:
DATABASE SETTINGS
nucleotide database folder = /home/chaozhi/anaconda3/envs/biobakery3/chocophlan
protein database folder = /home/chaozhi/anaconda3/envs/biobakery3/uniref
pathways database file 1 = /home/chaozhi/anaconda3/envs/biobakery3/lib/python3.7/site-packages/humann/data/pathways/metacyc_reactions_level4ec_only.uniref.bz2
pathways database file 2 = /home/chaozhi/anaconda3/envs/biobakery3/lib/python3.7/site-packages/humann/data/pathways/metacyc_pathways_structured_filtered
utility mapping database folder = /home/chaozhi/anaconda3/envs/biobakery3/utility_mapping
RUN MODES
resume = False
verbose = False
bypass prescreen = False
bypass nucleotide index = False
bypass nucleotide search = False
bypass translated search = False
translated search = diamond
pick frames = off
threads = 8
SEARCH MODE
search mode = uniref90
nucleotide identity threshold = 0.0
translated identity threshold = 80.0
ALIGNMENT SETTINGS
bowtie2 options = --very-sensitive
diamond options = --top 1 --outfmt 6
evalue threshold = 1.0
prescreen threshold = 0.01
translated subject coverage threshold = 50.0
translated query coverage threshold = 90.0
nucleotide subject coverage threshold = 50.0
nucleotide query coverage threshold = 90.0
PATHWAYS SETTINGS
minpath = on
xipe = off
gap fill = on
INPUT AND OUTPUT FORMATS
input file format = fastq
output file format = tsv
output max decimals = 10
remove stratified output = False
remove column description output = False
log level = DEBUG
After the taxonomic profiling finished, the terminal showed:
Total species selected from prescreen: 101
Selected species explain 99.94% of predicted community composition
Creating custom ChocoPhlAn database …
Running bowtie2-build …
Running bowtie2 …
Killed
From the sample log:
08/05/2020 02:31:45 PM - humann.humann - INFO: TIMESTAMP: Completed nucleotide alignment : 2252 seconds
08/05/2020 02:57:34 PM - humann.utilities - DEBUG: Total alignments where percent identity is not a number: 0
08/05/2020 02:57:34 PM - humann.utilities - DEBUG: Total alignments where alignment length is not a number: 0
08/05/2020 02:57:34 PM - humann.utilities - DEBUG: Total alignments where E-value is not a number: 0
08/05/2020 02:57:34 PM - humann.utilities - DEBUG: Total alignments not included based on large e-value: 0
08/05/2020 02:57:34 PM - humann.utilities - DEBUG: Total alignments not included based on small percent identity: 0
08/05/2020 02:57:34 PM - humann.utilities - DEBUG: Total alignments not included based on small query coverage: 0
08/05/2020 02:57:34 PM - humann.search.blastx_coverage - INFO: Total alignments without coverage information: 0
08/05/2020 02:57:34 PM - humann.search.blastx_coverage - INFO: Total proteins in blastx output: 275978
08/05/2020 02:57:34 PM - humann.search.blastx_coverage - INFO: Total proteins without lengths: 0
08/05/2020 02:57:34 PM - humann.search.blastx_coverage - INFO: Proteins with coverage greater than threshold (50.0): 167876
Since my Linux system runs in a virtual machine, one possible reason is high memory usage by the virtual machine. Beyond that, I cannot figure out a potential cause. I am looking forward to your reply.
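A bare "Killed" message at the bowtie2 step usually means the Linux kernel's OOM killer terminated the process, which would fit the virtual-machine memory theory. A minimal sketch to check this, assuming a standard Linux guest (reading the kernel log may require root on some systems):

```shell
# Show total and available memory in the VM; the custom ChocoPhlAn
# index for ~100 species plus bowtie2 can need a large amount of RAM.
free -h

# Search the kernel log for OOM-killer messages; errors are ignored
# because dmesg can be restricted to root.
dmesg 2>/dev/null | grep -i -E 'out of memory|killed process' | tail -n 5 || true
```

If an OOM line names bowtie2, raising the VM's memory allocation (or lowering `threads` from 8) should be the first thing to try.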
Best Regards,
Chaozhi Pan