HUMAnN 3 running indefinitely!

Hi,
I am using HUMAnN 3 on a Linux computer to analyze metagenomic sequences. It keeps running MetaPhlAn indefinitely and produces no output.
I am not sure if there was something wrong with the settings or database folder.
I would appreciate your suggestions on this issue.

Thanks in advance
Mashuk

I have copied the command, database folder details, HUMAnN configuration, and the log of the running command below:

  1. Here is the command:
    (mpa)msiddiq7@enggpz1p23:/media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq$ humann --input 86-ELS_SME_L001_R1_001.fastq.gz --output MG24_ur50/USACE_86-ELS_R1_uniref50 --bypass-translated-search
    Creating output directory: /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50
    Output files will be written to: /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50
    Decompressing gzipped file …

Removing spaces from identifiers in input file …

Running metaphlan …


  2. Database folder details:

(base) msiddiq7@enggpz1p23:~/humann_dbs$ ls
chocophlan uniref50_201901b_ec_filtered.dmnd
uniref uniref90_201901b_full.dmnd

(base) msiddiq7@enggpz1p23:~/humann_dbs/uniref$ ls
uniref50_201901b_full.dmnd
(base) msiddiq7@enggpz1p23:~/humann_dbs/uniref$

(base) msiddiq7@enggpz1p23:~/utility_mapping$ ls
map_ec_name.txt.gz map_level4ec_uniref50.txt.gz
map_eggnog_name.txt.gz map_level4ec_uniref90.txt.gz
map_eggnog_uniref50.txt.gz map_pfam_name.txt.gz
map_eggnog_uniref90.txt.gz map_pfam_uniref50.txt.gz
map_go_name.txt.gz map_pfam_uniref90.txt.gz
map_go_uniref50.txt.gz map_uniref50_name.txt.bz2
map_go_uniref90.txt.gz map_uniref50_uniref90.txt.gz
map_ko_name.txt.gz map_uniref90_name.txt.bz2
map_ko_uniref50.txt.gz uniref50-tol-lca.dat.bz2
map_ko_uniref90.txt.gz uniref90-tol-lca.dat.bz2
(base) msiddiq7@enggpz1p23:~/utility_mapping$

  3. HUMAnN 3 settings (configuration) detail:
    (mpa) msiddiq7@enggpz1p23:~$ humann_config

HUMAnN Configuration ( Section : Name = Value )

database_folders : nucleotide = /home/utad/msiddiq7/humann_dbs/chocophlan

database_folders : protein = /home/utad/msiddiq7/humann_dbs/uniref

database_folders : utility_mapping = /home/utad/msiddiq7/utility_mapping

run_modes : resume = False

run_modes : verbose = False

run_modes : bypass_prescreen = False

run_modes : bypass_nucleotide_index = False

run_modes : bypass_nucleotide_search = False

run_modes : bypass_translated_search = False

run_modes : threads = 1

alignment_settings : evalue_threshold = 1.0

alignment_settings : prescreen_threshold = 0.01

alignment_settings : translated_subject_coverage_threshold = 50.0

alignment_settings : translated_query_coverage_threshold = 90.0

alignment_settings : nucleotide_subject_coverage_threshold = 50.0

alignment_settings : nucleotide_query_coverage_threshold = 90.0

output_format : output_max_decimals = 10

output_format : remove_stratified_output = False

output_format : remove_column_description_output = False

(mpa) msiddiq7@enggpz1p23:~$

  4. Log file of the running sequence:

06/23/2024 11:29:33 PM - humann.humann - INFO: Running humann v3.0.1
06/23/2024 11:29:33 PM - humann.humann - INFO: Output files will be written to: /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50
06/23/2024 11:29:33 PM - humann.humann - INFO: Writing temp files to directory: /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50/86-ELS_SME_L001_R1_001_humann_temp
06/23/2024 11:29:33 PM - humann.utilities - INFO: File ( /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/86-ELS_SME_L001_R1_001.fastq.gz ) is of format: fastq.gz
06/23/2024 11:29:33 PM - humann.utilities - INFO: Decompressing gzipped file …
06/23/2024 11:37:46 PM - humann.humann - INFO: Removing spaces from identifiers in input file
06/23/2024 11:46:33 PM - humann.utilities - DEBUG: Check software, metaphlan, for required version, 3.0
06/23/2024 11:46:35 PM - humann.utilities - INFO: Using metaphlan version 3.0
06/23/2024 11:46:35 PM - humann.utilities - DEBUG: Check software, bowtie2, for required version, 2.2
06/23/2024 11:46:35 PM - humann.utilities - INFO: Using bowtie2 version 2.3
06/23/2024 11:46:35 PM - humann.config - INFO:
Run config settings:

DATABASE SETTINGS
nucleotide database folder = /home/utad/msiddiq7/humann_dbs/chocophlan
protein database folder = /home/utad/msiddiq7/humann_dbs/uniref
pathways database file 1 = /home/utad/msiddiq7/.local/lib/python3.7/site-packages/humann/data/pathways/metacyc_reactions_level4ec_only.uniref.bz2
pathways database file 2 = /home/utad/msiddiq7/.local/lib/python3.7/site-packages/humann/data/pathways/metacyc_pathways_structured_filtered_v24
utility mapping database folder = /home/utad/msiddiq7/utility_mapping

RUN MODES
resume = False
verbose = False
bypass prescreen = False
bypass nucleotide index = False
bypass nucleotide search = False
bypass translated search = True
translated search = diamond
threads = 1

SEARCH MODE
search mode = uniref90
nucleotide identity threshold = 0.0
translated identity threshold = 80.0

ALIGNMENT SETTINGS
bowtie2 options = --very-sensitive
diamond options = --top 1 --outfmt 6
evalue threshold = 1.0
prescreen threshold = 0.01
translated subject coverage threshold = 50.0
translated query coverage threshold = 90.0
nucleotide subject coverage threshold = 50.0
nucleotide query coverage threshold = 90.0

PATHWAYS SETTINGS
minpath = on
xipe = off
gap fill = on

INPUT AND OUTPUT FORMATS
input file format = fastq.gz
output file format = tsv
output max decimals = 10
remove stratified output = False
remove column description output = False
log level = DEBUG

06/23/2024 11:46:35 PM - humann.store - DEBUG: Initialize Alignments class instance to minimize memory use
06/23/2024 11:46:35 PM - humann.store - DEBUG: Initialize Reads class instance to minimize memory use
06/23/2024 11:46:53 PM - humann.humann - INFO: Load pathways database part 1: /home/utad/msiddiq7/.local/lib/python3.7/site-packages/humann/data/pathways/metacyc_reactions_level4ec_only.uniref.bz2
06/23/2024 11:46:53 PM - humann.humann - INFO: Load pathways database part 2: /home/utad/msiddiq7/.local/lib/python3.7/site-packages/humann/data/pathways/metacyc_pathways_structured_filtered_v24
06/23/2024 11:46:53 PM - humann.search.prescreen - INFO: Running metaphlan …
06/23/2024 11:46:53 PM - humann.utilities - DEBUG: Using software: /opt/miniconda3/envs/mpa/bin/metaphlan
06/23/2024 11:46:53 PM - humann.utilities - INFO: Execute command: /opt/miniconda3/envs/mpa/bin/metaphlan /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50/86-ELS_SME_L001_R1_001_humann_temp/tmpo6zw2o4m/tmp884uw5gi -t rel_ab -o /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50/86-ELS_SME_L001_R1_001_humann_temp/86-ELS_SME_L001_R1_001_metaphlan_bugs_list.tsv --input_type fastq --bowtie2out /media/msiddiq7/KangLab1/ASU_FTP_Server_Kang/MyKL2023/fastq/MG24_ur50/USACE_86-ELS_R1_uniref50/86-ELS_SME_L001_R1_001_humann_temp/86-ELS_SME_L001_R1_001_metaphlan_bowtie2.txt


First, I would definitely upgrade to a newer HUMAnN 3.x, as we’ve made some important bugfixes along the 3.0 branch. Second, if it feels like you’re getting stuck running MetaPhlAn, I would try a MetaPhlAn run on a sample outside of HUMAnN and see how that goes. The first time you run MetaPhlAn it does some downloading / indexing of its database. It’s possible that something is getting stuck in that process.

Thank you a lot!
Is it possible to suggest some commands that could work?

I’ve been dealing with this problem for the past week or so and found the solution.

The problem is that metaphlan checks online to see if you have the most recent version of the database installed. If you do not, it asks if you want to install it. When running within humann3, it does not prompt the user to respond to this question, so it just sits there unanswered and nothing happens.

In your case, the solution is to run metaphlan by itself, using basically the same command found in your log:

/opt/miniconda3/envs/mpa/bin/metaphlan /path/to/my/input.fastq -t rel_ab -o /path/to/my/output/metaphlan_bugs_list.tsv --input_type fastq --bowtie2out /path/to/my/output/metaphlan_bowtie2.txt

You will be prompted to download the new database. Respond Y. Then retry running humann3 and it should work fine. If you want to continue using this version of the database (even if new ones come out), you want to include the following argument in the humann command:

--metaphlan-options "-x mpa_vJun23_CHOCOPhlAnSGB_202403"

For everyone else, you can repeat the commands above to get the new database. If you want to intentionally use an older database version, add the following argument to humann:

--metaphlan-options "--bowtie2db /path/to/metaphlan/db -x MY-INDEX", where MY-INDEX will be something like mpa_vOct22_CHOCOPhlAnSGB_202212.

That should do the trick.
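
For reference, here is a minimal sketch of how the pinned-database options might fit into a complete humann call. The input file, output folder, and database path are hypothetical placeholders, and the index name is only the example mentioned above; substitute whatever is actually installed on your system. The script prints the command by default so it can be reviewed before anything runs.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own files and folders.
INPUT="sample.fastq.gz"
OUTPUT_DIR="humann_out"
MPA_DB_DIR="/path/to/metaphlan/db"
MPA_INDEX="mpa_vOct22_CHOCOPhlAnSGB_202212"

# Everything in this string is handed through to MetaPhlAn unchanged.
MPA_OPTS="--bowtie2db ${MPA_DB_DIR} -x ${MPA_INDEX}"

# Dry run by default: print the command. Set DRY_RUN=0 to actually run humann.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "humann --input $INPUT --output $OUTPUT_DIR --metaphlan-options \"$MPA_OPTS\""
else
    humann --input "$INPUT" --output "$OUTPUT_DIR" --metaphlan-options "$MPA_OPTS"
fi
```

Keeping the MetaPhlAn options in one quoted variable makes it less likely that the shell splits them into separate humann arguments.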

I checked in with the MetaPhlAn developers about this and they informed me that MetaPhlAn has an --offline flag that will prevent this behavior when running in a grid context (for example). Adding that to your --metaphlan-options should solve this problem when MetaPhlAn is called inside of HUMAnN. We’ll likely make this a default in a future HUMAnN release to avoid users needing to know about it ahead of time.
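
As a rough sketch, passing the flag through HUMAnN might look like the following. The file names are hypothetical placeholders. The `=` form of `--metaphlan-options` is used here as a precaution, on the assumption that a bare value starting with `--` could otherwise be mistaken for a separate humann option. Whether `--offline` alone is enough may depend on the installed MetaPhlAn version and on the database already being present locally.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own files and folders.
INPUT="sample.fastq.gz"
OUTPUT_DIR="humann_out"

# Ask MetaPhlAn not to check online for a newer database.
MPA_OPTS="--offline"

# Dry run by default: print the command. Set DRY_RUN=0 to actually run humann.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "humann --input $INPUT --output $OUTPUT_DIR --metaphlan-options=\"$MPA_OPTS\""
else
    humann --input "$INPUT" --output "$OUTPUT_DIR" --metaphlan-options="$MPA_OPTS"
fi
```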

Thank you both @Alex_Grier and @franzosa !

I’m not sure if this is universally the case, but I did try passing "--offline" to --metaphlan-options and I got an error that humann couldn’t find the metaphlan_bugs_list.tsv file after the metaphlan step. Specifying the location of my installed metaphlan database and the index with --metaphlan-options --bowtie2db and -x fixed everything though.
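
In case it helps anyone looking for the right value to give `-x`, here is a small sketch for listing the index names present in a MetaPhlAn database folder. It assumes each installed index has a matching `<index>.pkl` file in that folder, which may not hold for every MetaPhlAn version. The function name and the default path are hypothetical.

```shell
#!/bin/sh
# Print candidate MetaPhlAn index names found in a database folder.
# Assumption: every installed index has a "<index>.pkl" file there.
list_mpa_indexes() {
    db_dir="$1"
    for pkl in "$db_dir"/*.pkl; do
        # If nothing matches, the glob stays literal, so skip missing files.
        [ -e "$pkl" ] || continue
        basename "$pkl" .pkl
    done
}

# Hypothetical location: override by setting MPA_DB_DIR before running.
MPA_DB_DIR="${MPA_DB_DIR:-/path/to/metaphlan/db}"
list_mpa_indexes "$MPA_DB_DIR"
```

Each printed name can then be used as the `-x` value, with the same folder given to `--bowtie2db`.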