Hello,
I have some mouse stool metagenome samples (~10M reads per sample). I preprocessed the samples with kneaddata and then ran them through humann3. Unaligned reads were 88.7% after nucleotide alignment and 80.2% after translated alignment. The log output from the translated alignment step is posted below.
I went through the bowtie2_unaligned.fa file and BLASTed a handful of sequences. Many of them returned microbial hits (e.g., Lachnospiraceae, Muribaculum).
Is there a way to verify that I am doing this correctly? And if so, are there any other ways to classify the remaining ~80% of the reads in these samples?
Best,
Jacob
08/05/2021 06:51:32 PM - humann.utilities - INFO: Execute command: /bin/cat /home/lab_user/microbiome_files/SE7259/FT-SA93359/kneaddata_output/humann/SE7259_SA93359_S9_kneaddata_combined_humann_temp/tmp_qipdi2t/diamond_m8_jfq0xudm
08/05/2021 06:51:35 PM - humann.humann - INFO: TIMESTAMP: Completed 	translated alignment 	:	 8487	 seconds
08/05/2021 06:56:02 PM - humann.utilities - DEBUG: Total alignments where percent identity is not a number: 0
08/05/2021 06:56:02 PM - humann.utilities - DEBUG: Total alignments where alignment length is not a number: 0
08/05/2021 06:56:02 PM - humann.utilities - DEBUG: Total alignments where E-value is not a number: 0
08/05/2021 06:56:02 PM - humann.utilities - DEBUG: Total alignments not included based on large e-value: 0
08/05/2021 06:56:02 PM - humann.utilities - DEBUG: Total alignments not included based on small percent identity: 2748634
08/05/2021 06:56:02 PM - humann.utilities - DEBUG: Total alignments not included based on small query coverage: 1407002
08/05/2021 06:56:23 PM - humann.search.blastx_coverage - INFO: Total alignments without coverage information: 0
08/05/2021 06:56:23 PM - humann.search.blastx_coverage - INFO: Total proteins in blastx output: 719474
08/05/2021 06:56:23 PM - humann.search.blastx_coverage - INFO: Total proteins without lengths: 0
08/05/2021 06:56:23 PM - humann.search.blastx_coverage - INFO: Proteins with coverage greater than threshold (50.0): 47960
08/05/2021 07:00:57 PM - humann.utilities - DEBUG: Total alignments where percent identity is not a number: 0
08/05/2021 07:00:57 PM - humann.utilities - DEBUG: Total alignments where alignment length is not a number: 0
08/05/2021 07:00:57 PM - humann.utilities - DEBUG: Total alignments where E-value is not a number: 0
08/05/2021 07:00:57 PM - humann.utilities - DEBUG: Total alignments not included based on large e-value: 0
08/05/2021 07:00:57 PM - humann.utilities - DEBUG: Total alignments not included based on small percent identity: 2748634
08/05/2021 07:00:57 PM - humann.utilities - DEBUG: Total alignments not included based on small query coverage: 1407002
08/05/2021 07:00:57 PM - humann.search.translated - DEBUG: Total translated alignments not included based on small subject coverage value: 4617279
08/05/2021 07:05:14 PM - humann.humann - INFO: TIMESTAMP: Completed 	translated alignment post-processing 	:	 820	 seconds
08/05/2021 07:05:14 PM - humann.humann - INFO: Total bugs after translated alignment: 27
08/05/2021 07:05:14 PM - humann.humann - INFO:
g__Lachnospiraceae_unclassified.s__Lachnospiraceae_bacterium_10_1: 188407 hits
g__Helicobacter.s__Helicobacter_typhlonius: 193124 hits
g__Bacteroides.s__Bacteroides_caecimuris: 117577 hits
g__Lachnospiraceae_unclassified.s__Lachnospiraceae_bacterium_COE1: 119192 hits
g__Muribaculaceae_unclassified.s__Muribaculaceae_bacterium_DSM_103720: 280656 hits
g__Muribaculum.s__Muribaculum_intestinale: 194922 hits
g__Lachnospiraceae_unclassified.s__Lachnospiraceae_bacterium_A2: 538415 hits
g__Firmicutes_unclassified.s__Firmicutes_bacterium_ASF500: 97123 hits
g__Bacteroides.s__Bacteroides_vulgatus: 99958 hits
g__Oscillibacter.s__Oscillibacter_sp_1_3: 123789 hits
g__Bacteroides.s__Bacteroides_sartorii: 181472 hits
g__Lachnospiraceae_unclassified.s__Lachnospiraceae_bacterium_3_1: 87535 hits
g__Bacteroides.s__Bacteroides_uniformis: 154944 hits
g__Dorea.s__Dorea_sp_5_2: 80392 hits
g__Parabacteroides.s__Parabacteroides_distasonis: 61290 hits
g__Lactobacillus.s__Lactobacillus_reuteri: 31306 hits
g__Anaerotruncus.s__Anaerotruncus_sp_G3_2012: 38340 hits
g__Clostridium.s__Clostridium_sp_ASF502: 63716 hits
g__Acutalibacter.s__Acutalibacter_muris: 20694 hits
g__Clostridium.s__Clostridium_sp_ASF356: 22462 hits
g__Lactobacillus.s__Lactobacillus_murinus: 8669 hits
g__Helicobacter.s__Helicobacter_apodemus: 12369 hits
g__Mucispirillum.s__Mucispirillum_schaedleri: 5446 hits
g__Lactobacillus.s__Lactobacillus_intestinalis: 16865 hits
g__Lactobacillus.s__Lactobacillus_johnsonii: 5246 hits
g__Enterorhabdus.s__Enterorhabdus_caecimuris: 388 hits
unclassified: 2421004 hits
08/05/2021 07:05:14 PM - humann.humann - INFO: Total gene families after translated alignment: 91351
08/05/2021 07:05:14 PM - humann.humann - INFO: Unaligned reads after translated alignment: 80.2395493667 %
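
In case it helps with checking the numbers, here is a minimal sketch of how the per-species hit counts in the log above could be tallied. It is not part of humann; the function names `parse_hits` and `summarize` are made up for illustration, and it assumes every species line has the form `<name>: <count> hits`, as in the output above. The sample text is three lines copied from that log.

```python
import re

# Matches lines such as "g__Dorea.s__Dorea_sp_5_2: 80392 hits"
# or "unclassified: 2421004 hits". Timestamped log lines do not match
# because the taxon name must be followed directly by a colon.
HIT_LINE = re.compile(r"^(?P<taxon>\S+):\s+(?P<hits>\d+)\s+hits\s*$")


def parse_hits(log_text):
    """Return {taxon: hit_count} for every '<taxon>: <n> hits' line."""
    hits = {}
    for line in log_text.splitlines():
        match = HIT_LINE.match(line.strip())
        if match:
            hits[match.group("taxon")] = int(match.group("hits"))
    return hits


def summarize(hits):
    """Total the hits and report the share labeled 'unclassified'."""
    total = sum(hits.values())
    unclassified = hits.get("unclassified", 0)
    return {
        "total_hits": total,
        "unclassified_hits": unclassified,
        "unclassified_fraction": unclassified / total if total else 0.0,
    }


# Three lines copied from the log above, used only as a small demonstration.
sample = """\
g__Muribaculum.s__Muribaculum_intestinale: 194922 hits
g__Enterorhabdus.s__Enterorhabdus_caecimuris: 388 hits
unclassified: 2421004 hits
"""

hits = parse_hits(sample)
summary = summarize(hits)
print(summary)
```

Passing the full log text instead of `sample` would tally all 27 entries. Note that the fraction computed here is the share of reported hits labeled unclassified, which is a different quantity from the unaligned-read percentage on the last log line.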