The bioBakery help forum

Humann3 output logs: total bugs

Hi! I’m trying to understand the HUMAnN3 printouts.

I see there are now sections, like the portion below, that refer to hits. Are these hits “reads”?
If not, how can I get read counts instead of RPKs from HUMAnN3?

Running metaphlan …
Found g__Parabacteroides.s__Parabacteroides_goldsteinii : 73.88% of mapped reads
Found g__Clostridium.s__Clostridium_sp_ASF356 : 15.14% of mapped reads
Found g__Blautia.s__Blautia_coccoides : 5.54% of mapped reads
Found g__Mucispirillum.s__Mucispirillum_schaedleri : 5.11% of mapped reads
Found g__Lactobacillus.s__Lactobacillus_murinus : 0.33% of mapped reads
Total species selected from prescreen: 5
Selected species explain 100.00% of predicted community composition
Creating custom ChocoPhlAn database …
Running bowtie2-build …
Running bowtie2 …
Total bugs from nucleotide alignment: 5
g__Parabacteroides.s__Parabacteroides_goldsteinii: 733486 hits
g__Clostridium.s__Clostridium_sp_ASF356: 67630 hits
g__Blautia.s__Blautia_coccoides: 66534 hits
g__Mucispirillum.s__Mucispirillum_schaedleri: 9038 hits
g__Lactobacillus.s__Lactobacillus_murinus: 64 hits
Total gene families from nucleotide alignment: 9071
Unaligned reads after nucleotide alignment: 85.0070015157 %
Running diamond …
Aligning to reference database: uniref90_201901.dmnd
Total bugs after translated alignment: 6
g__Parabacteroides.s__Parabacteroides_goldsteinii: 733486 hits
g__Clostridium.s__Clostridium_sp_ASF356: 67630 hits
g__Blautia.s__Blautia_coccoides: 66534 hits
g__Mucispirillum.s__Mucispirillum_schaedleri: 9038 hits
g__Lactobacillus.s__Lactobacillus_murinus: 64 hits
unclassified: 5418 hits
Total gene families after translated alignment: 9187
Unaligned reads after translated alignment: 84.8843132833 %
Computing gene families …
Computing pathways abundance and coverage …

Full log below.
These hit counts don’t add up to a full accounting of the reads. What’s going on?

12/21/2020 04:24:49 PM - humann.humann - INFO: Running humann v3.0.0.alpha.4
12/21/2020 04:24:49 PM - humann.humann - INFO: Output files will be written to: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn
12/21/2020 04:24:49 PM - humann.humann - INFO: Writing temp files to directory: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp
12/21/2020 04:24:49 PM - humann.utilities - INFO: File ( /scratch/j/jparkin/billyc59/kneaddata_run/cleaned_NOD/NOD503CecMN_r1_new_kneaddata_unmatched_1.fastq ) is of format: fastq
12/21/2020 04:24:50 PM - humann.utilities - DEBUG: Check software, metaphlan, for required version, 3.0
12/21/2020 04:24:52 PM - humann.utilities - INFO: Using metaphlan version 3.0
12/21/2020 04:24:52 PM - humann.utilities - DEBUG: Check software, bowtie2, for required version, 2.2
12/21/2020 04:24:54 PM - humann.utilities - INFO: Using bowtie2 version 2.2
12/21/2020 04:24:54 PM - humann.humann - INFO: Search mode set to uniref90 because a uniref90 translated search database is selected
12/21/2020 04:24:54 PM - humann.utilities - DEBUG: Check software, diamond, for required version, 0.9.24
12/21/2020 04:24:54 PM - humann.utilities - INFO: Using diamond version 0.9.24
12/21/2020 04:24:54 PM - humann.config - INFO:
Run config settings:

DATABASE SETTINGS
nucleotide database folder = /scratch/j/jparkin/billyc59/h3_db/chocophlan
protein database folder = /scratch/j/jparkin/billyc59/h3_db/uniref
pathways database file 1 = /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/data/pathways/metacyc_reactions_level4ec_only.uniref.bz2
pathways database file 2 = /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/data/pathways/metacyc_pathways_structured_filtered
utility mapping database folder = /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/data/misc

RUN MODES
resume = False
verbose = False
bypass prescreen = False
bypass nucleotide index = False
bypass nucleotide search = False
bypass translated search = False
translated search = diamond
pick frames = off
threads = 40

SEARCH MODE
search mode = uniref90
nucleotide identity threshold = 0.0
translated identity threshold = 80.0

ALIGNMENT SETTINGS
bowtie2 options = --very-sensitive
diamond options = --top 1 --outfmt 6
evalue threshold = 1.0
prescreen threshold = 0.01
translated subject coverage threshold = 50.0
translated query coverage threshold = 90.0
nucleotide subject coverage threshold = 50.0
nucleotide query coverage threshold = 90.0

PATHWAYS SETTINGS
minpath = on
xipe = off
gap fill = on

INPUT AND OUTPUT FORMATS
input file format = fastq
output file format = tsv
output max decimals = 10
remove stratified output = False
remove column description output = False
log level = DEBUG

12/21/2020 04:24:54 PM - humann.store - DEBUG: Initialize Alignments class instance to minimize memory use
12/21/2020 04:24:54 PM - humann.store - DEBUG: Initialize Reads class instance to minimize memory use
12/21/2020 04:25:12 PM - humann.humann - INFO: Load pathways database part 1: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/data/pathways/metacyc_reactions_level4ec_only.uniref.bz2
12/21/2020 04:25:12 PM - humann.humann - INFO: Load pathways database part 2: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/data/pathways/metacyc_pathways_structured_filtered
12/21/2020 04:25:12 PM - humann.search.prescreen - INFO: Running metaphlan …
12/21/2020 04:25:12 PM - humann.utilities - DEBUG: Using software: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/metaphlan
12/21/2020 04:25:12 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_metaphlan_bugs_list.tsv
12/21/2020 04:25:12 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_metaphlan_bowtie2.txt
12/21/2020 04:25:12 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/metaphlan /scratch/j/jparkin/billyc59/kneaddata_run/cleaned_NOD/NOD503CecMN_r1_new_kneaddata_unmatched_1.fastq -t rel_ab -o /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_metaphlan_bugs_list.tsv --input_type fastq --bowtie2out /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_metaphlan_bowtie2.txt --nproc 40
12/21/2020 04:28:54 PM - humann.utilities - DEBUG: b’Warning! Biom python library not detected!\n Exporting to biom format will not work!\nWARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.\nAn additional column listing the merged species is added to the MetaPhlAn output.\nWARNING: It seems that you do not have Internet access.\nWARNING: Cannot connect to the database server. The latest available local database will be used.\n’
12/21/2020 04:28:54 PM - humann.humann - INFO: TIMESTAMP: Completed prescreen : 222 seconds
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Found g__Parabacteroides.s__Parabacteroides_goldsteinii : 73.88% of mapped reads
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Found g__Clostridium.s__Clostridium_sp_ASF356 : 15.14% of mapped reads
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Found g__Blautia.s__Blautia_coccoides : 5.54% of mapped reads
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Found g__Mucispirillum.s__Mucispirillum_schaedleri : 5.11% of mapped reads
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Found g__Lactobacillus.s__Lactobacillus_murinus : 0.33% of mapped reads
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Total species selected from prescreen: 5
12/21/2020 04:28:54 PM - humann.search.prescreen - DEBUG: Adding file to database: g__Clostridium.s__Clostridium_sp_ASF356.centroids.v296_201901.ffn.gz
12/21/2020 04:28:54 PM - humann.search.prescreen - DEBUG: Adding file to database: g__Lactobacillus.s__Lactobacillus_murinus.centroids.v296_201901.ffn.gz
12/21/2020 04:28:54 PM - humann.search.prescreen - DEBUG: Adding file to database: g__Mucispirillum.s__Mucispirillum_schaedleri.centroids.v296_201901.ffn.gz
12/21/2020 04:28:54 PM - humann.search.prescreen - DEBUG: Adding file to database: g__Blautia.s__Blautia_coccoides.centroids.v296_201901.ffn.gz
12/21/2020 04:28:54 PM - humann.search.prescreen - DEBUG: Adding file to database: g__Parabacteroides.s__Parabacteroides_goldsteinii.centroids.v296_201901.ffn.gz
12/21/2020 04:28:54 PM - humann.search.prescreen - INFO: Creating custom ChocoPhlAn database …
12/21/2020 04:28:54 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_custom_chocophlan_database.ffn
12/21/2020 04:28:54 PM - humann.utilities - DEBUG: Using software: /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin/gunzip
12/21/2020 04:28:54 PM - humann.utilities - INFO: Execute command: /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin/gunzip -c /scratch/j/jparkin/billyc59/h3_db/chocophlan/g__Clostridium.s__Clostridium_sp_ASF356.centroids.v296_201901.ffn.gz /scratch/j/jparkin/billyc59/h3_db/chocophlan/g__Lactobacillus.s__Lactobacillus_murinus.centroids.v296_201901.ffn.gz /scratch/j/jparkin/billyc59/h3_db/chocophlan/g__Mucispirillum.s__Mucispirillum_schaedleri.centroids.v296_201901.ffn.gz /scratch/j/jparkin/billyc59/h3_db/chocophlan/g__Blautia.s__Blautia_coccoides.centroids.v296_201901.ffn.gz /scratch/j/jparkin/billyc59/h3_db/chocophlan/g__Parabacteroides.s__Parabacteroides_goldsteinii.centroids.v296_201901.ffn.gz
12/21/2020 04:28:55 PM - humann.humann - INFO: TIMESTAMP: Completed custom database creation : 1 seconds
12/21/2020 04:28:55 PM - humann.search.nucleotide - INFO: Running bowtie2-build …
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Using software: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/bowtie2-build
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index.1.bt2
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index.2.bt2
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index.3.bt2
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index.4.bt2
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index.rev.1.bt2
12/21/2020 04:28:55 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index.rev.2.bt2
12/21/2020 04:28:55 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/bowtie2-build -f /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_custom_chocophlan_database.ffn /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index
12/21/2020 04:29:19 PM - humann.humann - INFO: TIMESTAMP: Completed database index : 24 seconds
12/21/2020 04:29:19 PM - humann.search.nucleotide - DEBUG: Nucleotide input file is of type: fastq
12/21/2020 04:29:19 PM - humann.utilities - DEBUG: Using software: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/bowtie2
12/21/2020 04:29:19 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_aligned.sam
12/21/2020 04:29:19 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/bowtie2 -q -x /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_index -U /scratch/j/jparkin/billyc59/kneaddata_run/cleaned_NOD/NOD503CecMN_r1_new_kneaddata_unmatched_1.fastq -S /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_aligned.sam -p 40 --very-sensitive
12/21/2020 04:29:40 PM - humann.utilities - DEBUG: b’5237875 reads; of these:\n 5237875 (100.00%) were unpaired; of these:\n 4296874 (82.03%) aligned 0 times\n 885884 (16.91%) aligned exactly 1 time\n 55117 (1.05%) aligned >1 times\n17.97% overall alignment rate\n’
12/21/2020 04:29:40 PM - humann.humann - INFO: TIMESTAMP: Completed nucleotide alignment : 21 seconds
12/21/2020 04:30:43 PM - humann.utilities - DEBUG: Total alignments where percent identity is not a number: 0
12/21/2020 04:30:43 PM - humann.utilities - DEBUG: Total alignments where alignment length is not a number: 0
12/21/2020 04:30:43 PM - humann.utilities - DEBUG: Total alignments where E-value is not a number: 0
12/21/2020 04:30:43 PM - humann.utilities - DEBUG: Total alignments not included based on large e-value: 0
12/21/2020 04:30:43 PM - humann.utilities - DEBUG: Total alignments not included based on small percent identity: 0
12/21/2020 04:30:43 PM - humann.utilities - DEBUG: Total alignments not included based on small query coverage: 0
12/21/2020 04:30:43 PM - humann.search.blastx_coverage - INFO: Total alignments without coverage information: 0
12/21/2020 04:30:43 PM - humann.search.blastx_coverage - INFO: Total proteins in blastx output: 14716
12/21/2020 04:30:43 PM - humann.search.blastx_coverage - INFO: Total proteins without lengths: 0
12/21/2020 04:30:43 PM - humann.search.blastx_coverage - INFO: Proteins with coverage greater than threshold (50.0): 9071
12/21/2020 04:32:00 PM - humann.search.nucleotide - DEBUG: Total nucleotide alignments not included based on filtered genes: 64249
12/21/2020 04:32:00 PM - humann.search.nucleotide - DEBUG: Total nucleotide alignments not included based on small percent identities: 0
12/21/2020 04:32:00 PM - humann.search.nucleotide - DEBUG: Total nucleotide alignments not included based on query coverage threshold: 0
12/21/2020 04:32:00 PM - humann.search.nucleotide - DEBUG: Keeping sam file
12/21/2020 04:32:00 PM - humann.humann - INFO: TIMESTAMP: Completed nucleotide alignment post-processing : 140 seconds
12/21/2020 04:32:00 PM - humann.humann - INFO: Total bugs from nucleotide alignment: 5
12/21/2020 04:32:00 PM - humann.humann - INFO:
g__Parabacteroides.s__Parabacteroides_goldsteinii: 733486 hits
g__Clostridium.s__Clostridium_sp_ASF356: 67630 hits
g__Blautia.s__Blautia_coccoides: 66534 hits
g__Mucispirillum.s__Mucispirillum_schaedleri: 9038 hits
g__Lactobacillus.s__Lactobacillus_murinus: 64 hits
12/21/2020 04:32:00 PM - humann.humann - INFO: Total gene families from nucleotide alignment: 9071
12/21/2020 04:32:00 PM - humann.humann - INFO: Unaligned reads after nucleotide alignment: 85.0070015157 %
12/21/2020 04:32:00 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_diamond_aligned.tsv
12/21/2020 04:32:00 PM - humann.search.translated - INFO: Running diamond …
12/21/2020 04:32:00 PM - humann.search.translated - INFO: Aligning to reference database: uniref90_201901.dmnd
12/21/2020 04:32:00 PM - humann.utilities - DEBUG: Remove file: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/diamond_m8_9emeeup5
12/21/2020 04:32:00 PM - humann.utilities - DEBUG: Using software: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/diamond
12/21/2020 04:32:00 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/diamond blastx --query /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/NOD503CecMN_r1_new_kneaddata_unmatched_1_bowtie2_unaligned.fa --evalue 1.0 --threads 40 --top 1 --outfmt 6 --db /scratch/j/jparkin/billyc59/h3_db/uniref/uniref90_201901 --out /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/diamond_m8_9emeeup5 --tmpdir /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g
12/21/2020 04:33:19 PM - humann.utilities - DEBUG: b’diamond v0.9.24.125 | by Benjamin Buchfink buchfink@gmail.com\nLicensed under the GNU GPL https://www.gnu.org/licenses/gpl.txt\nCheck http://github.com/bbuchfink/diamond for updates.\n\n#CPU threads: 40\nScoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)\nTemporary directory: /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g\nOpening the database… [0.048518s]\nPercentage range of top alignment score to report hits: 1\nOpening the input file… [4.8e-05s]\nOpening the output file… [0.00016s]\nLoading query sequences… [6.07824s]\nMasking queries… [9.25263s]\nBuilding query seed set… [0.096469s]\nAlgorithm: Double-indexed\nBuilding query histograms… [0.332385s]\nAllocating buffers… [0.00065s]\nLoading reference sequences… [6.78216s]\nBuilding reference histograms… [0.93041s]\nAllocating buffers… [0.000143s]\nInitializing temporary storage… [0.30558s]\nProcessing query chunk 0, reference chunk 0, shape 0, index chunk 0.\nBuilding reference index… [1.32846s]\nBuilding query index… [0.228905s]\nBuilding seed filter… [0.078927s]\nSearching alignments… [0.976783s]\nProcessing query chunk 0, reference chunk 0, shape 0, index chunk 1.\nBuilding reference index… [1.3904s]\nBuilding query index… [0.288476s]\nBuilding seed filter… [0.078824s]\nSearching alignments… [0.207263s]\nProcessing query chunk 0, reference chunk 0, shape 0, index chunk 2.\nBuilding reference index… [1.47163s]\nBuilding query index… [0.303348s]\nBuilding seed filter… [0.079271s]\nSearching alignments… [0.212712s]\nProcessing query chunk 0, reference chunk 0, shape 0, index chunk 3.\nBuilding reference index… [1.28992s]\nBuilding query index… [0.222088s]\nBuilding seed filter… [0.078896s]\nSearching alignments… [0.222324s]\nProcessing query chunk 0, reference chunk 0, shape 1, index chunk 0.\nBuilding reference index… [1.24914s]\nBuilding query index… 
[0.233302s]\nBuilding seed filter… [0.077608s]\nSearching alignments… [0.146752s]\nProcessing query chunk 0, reference chunk 0, shape 1, index chunk 1.\nBuilding reference index… [1.39514s]\nBuilding query index… [0.250647s]\nBuilding seed filter… [0.077068s]\nSearching alignments… [0.149504s]\nProcessing query chunk 0, reference chunk 0, shape 1, index chunk 2.\nBuilding reference index… [1.46274s]\nBuilding query index… [0.24851s]\nBuilding seed filter… [0.077118s]\nSearching alignments… [0.142223s]\nProcessing query chunk 0, reference chunk 0, shape 1, index chunk 3.\nBuilding reference index… [1.35164s]\nBuilding query index… [0.233226s]\nBuilding seed filter… [0.077712s]\nSearching alignments… [0.15188s]\nDeallocating buffers… [0.007416s]\nComputing alignments… [39.8387s]\nDeallocating reference… [0.004772s]\nLoading reference sequences… [0.000397s]\nDeallocating buffers… [0.001428s]\nDeallocating queries… [0.004989s]\nLoading query sequences… [2.5e-05s]\nClosing the input file… [2.1e-05s]\nClosing the output file… [0.000155s]\nClosing the database file… [2e-05s]\nDeallocating taxonomy… [3e-06s]\nTotal time = 79.4698s\nReported 214621 pairwise alignments, 214659 HSPs.\n99556 queries aligned.\n’
12/21/2020 04:33:19 PM - humann.utilities - DEBUG: Using software: /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin/cat
12/21/2020 04:33:19 PM - humann.utilities - INFO: Execute command: /cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin/cat /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/diamond_m8_9emeeup5
12/21/2020 04:33:20 PM - humann.humann - INFO: TIMESTAMP: Completed translated alignment : 80 seconds
12/21/2020 04:33:22 PM - humann.utilities - DEBUG: Total alignments where percent identity is not a number: 0
12/21/2020 04:33:22 PM - humann.utilities - DEBUG: Total alignments where alignment length is not a number: 0
12/21/2020 04:33:22 PM - humann.utilities - DEBUG: Total alignments where E-value is not a number: 0
12/21/2020 04:33:22 PM - humann.utilities - DEBUG: Total alignments not included based on large e-value: 17
12/21/2020 04:33:22 PM - humann.utilities - DEBUG: Total alignments not included based on small percent identity: 30213
12/21/2020 04:33:22 PM - humann.utilities - DEBUG: Total alignments not included based on small query coverage: 126996
12/21/2020 04:33:22 PM - humann.search.blastx_coverage - INFO: Total alignments without coverage information: 0
12/21/2020 04:33:22 PM - humann.search.blastx_coverage - INFO: Total proteins in blastx output: 18349
12/21/2020 04:33:22 PM - humann.search.blastx_coverage - INFO: Total proteins without lengths: 0
12/21/2020 04:33:22 PM - humann.search.blastx_coverage - INFO: Proteins with coverage greater than threshold (50.0): 120
12/21/2020 04:33:24 PM - humann.utilities - DEBUG: Total alignments where percent identity is not a number: 0
12/21/2020 04:33:24 PM - humann.utilities - DEBUG: Total alignments where alignment length is not a number: 0
12/21/2020 04:33:24 PM - humann.utilities - DEBUG: Total alignments where E-value is not a number: 0
12/21/2020 04:33:24 PM - humann.utilities - DEBUG: Total alignments not included based on large e-value: 17
12/21/2020 04:33:24 PM - humann.utilities - DEBUG: Total alignments not included based on small percent identity: 30213
12/21/2020 04:33:24 PM - humann.utilities - DEBUG: Total alignments not included based on small query coverage: 126996
12/21/2020 04:33:24 PM - humann.search.translated - DEBUG: Total translated alignments not included based on small subject coverage value: 61831
12/21/2020 04:33:43 PM - humann.humann - INFO: TIMESTAMP: Completed translated alignment post-processing : 23 seconds
12/21/2020 04:33:43 PM - humann.humann - INFO: Total bugs after translated alignment: 6
12/21/2020 04:33:43 PM - humann.humann - INFO:
g__Parabacteroides.s__Parabacteroides_goldsteinii: 733486 hits
g__Clostridium.s__Clostridium_sp_ASF356: 67630 hits
g__Blautia.s__Blautia_coccoides: 66534 hits
g__Mucispirillum.s__Mucispirillum_schaedleri: 9038 hits
g__Lactobacillus.s__Lactobacillus_murinus: 64 hits
unclassified: 5418 hits
12/21/2020 04:33:43 PM - humann.humann - INFO: Total gene families after translated alignment: 9187
12/21/2020 04:33:43 PM - humann.humann - INFO: Unaligned reads after translated alignment: 84.8843132833 %
12/21/2020 04:33:43 PM - humann.humann - INFO: Computing gene families …
12/21/2020 04:33:43 PM - humann.quantify.families - DEBUG: Compute gene families
12/21/2020 04:33:45 PM - humann.store - INFO:
Total gene families : 9187
g__Parabacteroides.s__Parabacteroides_goldsteinii : 5368 gene families
g__Clostridium.s__Clostridium_sp_ASF356 : 1429 gene families
g__Blautia.s__Blautia_coccoides : 1941 gene families
g__Mucispirillum.s__Mucispirillum_schaedleri : 324 gene families
g__Lactobacillus.s__Lactobacillus_murinus : 12 gene families
unclassified : 120 gene families
12/21/2020 04:33:59 PM - humann.humann - INFO: TIMESTAMP: Completed computing gene families : 16 seconds
12/21/2020 04:33:59 PM - humann.humann - INFO: Computing pathways abundance and coverage …
12/21/2020 04:33:59 PM - humann.quantify.modules - DEBUG: Write flat reactions to pathways file for Minpath
12/21/2020 04:33:59 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: g__Parabacteroides.s__Parabacteroides_goldsteinii
12/21/2020 04:34:01 PM - humann.quantify.modules - INFO: Run MinPath on g__Parabacteroides.s__Parabacteroides_goldsteinii
12/21/2020 04:34:01 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: g__Clostridium.s__Clostridium_sp_ASF356
12/21/2020 04:34:03 PM - humann.quantify.modules - INFO: Run MinPath on g__Clostridium.s__Clostridium_sp_ASF356
12/21/2020 04:34:03 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: g__Blautia.s__Blautia_coccoides
12/21/2020 04:34:04 PM - humann.quantify.modules - INFO: Run MinPath on g__Blautia.s__Blautia_coccoides
12/21/2020 04:34:04 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: g__Mucispirillum.s__Mucispirillum_schaedleri
12/21/2020 04:34:06 PM - humann.quantify.modules - INFO: Run MinPath on g__Mucispirillum.s__Mucispirillum_schaedleri
12/21/2020 04:34:06 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: g__Lactobacillus.s__Lactobacillus_murinus
12/21/2020 04:34:07 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: unclassified
12/21/2020 04:34:08 PM - humann.quantify.modules - INFO: Run MinPath on unclassified
12/21/2020 04:34:08 PM - humann.quantify.modules - INFO: Compute reaction scores for bug: all
12/21/2020 04:34:10 PM - humann.quantify.modules - INFO: Run MinPath on all
12/21/2020 04:34:10 PM - humann.utilities - DEBUG: Using python module : /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py
12/21/2020 04:34:10 PM - humann.utilities - DEBUG: Using python module : /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py
12/21/2020 04:34:10 PM - humann.utilities - DEBUG: Using python module : /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py
12/21/2020 04:34:10 PM - humann.utilities - DEBUG: Using python module : /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py
12/21/2020 04:34:10 PM - humann.utilities - DEBUG: Using python module : /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py
12/21/2020 04:34:10 PM - humann.utilities - DEBUG: Using python module : /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py
12/21/2020 04:34:10 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/python /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py -any /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmp_9s0t4me -map /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmp1foe8dg8 -report /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmp959ocj0u -details /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpg12rjf5n -mps /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpqohag_dn
12/21/2020 04:34:10 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/python /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py -any /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmp4uu1sp8b -map /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmp1foe8dg8 -report /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmps025ntok -details /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpdr9f345o -mps /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmplj5lc9dg
12/21/2020 04:34:10 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/python /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/lib/python3.8/site-packages/humann/quantify/MinPath12hmp.py -any /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpp5qlh8xm -map /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmp1foe8dg8 -report /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpxeis4tvo -details /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpopjin027 -mps /scratch/j/jparkin/billyc59/kneaddata_run/h3/NOD_no_bypass/503_cmn/NOD503CecMN_r1_new_kneaddata_unmatched_1_humann_temp/tmpc5enxr3g/tmpztghywjr
12/21/2020 04:34:10 PM - humann.utilities - INFO: Execute command: /gpfs/fs1/home/j/jparkin/billyc59/.virtualenvs/h3/bin/python

HUMAnN’s rawest output is in RPK (reads per kilobase) units. There is no mode that outputs integer counts, because we allow a read to map to multiple genes and then divide that read’s weight over those genes. The “hits” you are seeing in the log files are raw counts of read alignments from bowtie2 and diamond, not counts of distinct reads. A read can in principle participate in multiple hits that HUMAnN counts, especially in the translated search phase (diamond).
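
As a rough illustration of both points, here is a minimal Python sketch. The first part only re-adds figures copied from the log in this thread: the per-species hit counts sum to 876752, which equals the bowtie2 alignments (885884 + 55117) minus the 64249 alignments dropped for filtered genes, so the numbers appear consistent with hits being alignment counts. The second part shows how splitting a read's weight over several genes gives non-integer values once divided by gene length. The function names, the toy reads and gene lengths, and the equal split are all assumptions made for illustration; this is not HUMAnN's actual code, and its real weighting may differ.

```python
from collections import defaultdict

# Figures copied from the log posted in this thread.
BOWTIE2_ALIGNED_ONCE = 885_884
BOWTIE2_ALIGNED_MULTI = 55_117
FILTERED_ALIGNMENTS = 64_249  # "not included based on filtered genes"
HITS_PER_BUG = {
    "s__Parabacteroides_goldsteinii": 733_486,
    "s__Clostridium_sp_ASF356": 67_630,
    "s__Blautia_coccoides": 66_534,
    "s__Mucispirillum_schaedleri": 9_038,
    "s__Lactobacillus_murinus": 64,
}


def tally_hits(hits_per_bug):
    """Sum the per-species hit counts reported after nucleotide alignment."""
    return sum(hits_per_bug.values())


def rpk_from_hits(read_hits, gene_lengths_nt):
    """Toy RPK calculation (hypothetical helper, not part of HUMAnN).

    read_hits maps a read id to the list of genes it aligned to.
    Each read carries a total weight of 1, split equally over its
    genes (equal split is an assumption). A gene's summed weight is
    then divided by its length in kilobases.
    """
    weights = defaultdict(float)
    for genes in read_hits.values():
        share = 1.0 / len(genes)
        for gene in genes:
            weights[gene] += share
    return {
        gene: weight / (gene_lengths_nt[gene] / 1000.0)
        for gene, weight in weights.items()
    }


if __name__ == "__main__":
    surviving = BOWTIE2_ALIGNED_ONCE + BOWTIE2_ALIGNED_MULTI - FILTERED_ALIGNMENTS
    print("sum of hits:", tally_hits(HITS_PER_BUG))
    print("bowtie2 alignments minus filtered:", surviving)

    # Two made-up reads: r2 aligns to two genes, so it counts as two
    # hits but contributes only half a read of weight to each gene.
    toy_hits = {"r1": ["geneA"], "r2": ["geneA", "geneB"]}
    toy_lengths = {"geneA": 2000, "geneB": 500}
    print(rpk_from_hits(toy_hits, toy_lengths))
```

In the toy input, two reads produce three hits, and geneA ends up with a weight of 1.5 reads before length normalization, which is why a per-gene integer read count is not a natural output here.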
