HUMAnN 3 without a prior MetaPhlAn run

Hi all,

Thanks for creating this forum for troubleshooting! I am struggling to learn how to use HUMAnN 3.9 on a basic merged fastq.gz file. I am running it inside a conda environment on a Linux cluster. I followed the instructions in the biobakery/humann GitHub repository to install HUMAnN 3.9 and download the associated ChocoPhlAn and UniRef databases. The issue seems to arise with MetaPhlAn. I have already confirmed via humann_config that HUMAnN 3.9 can access the correct ChocoPhlAn and UniRef databases.

The command I am running is:

humann3 --input humann3_samplev4/merged_reads/0167968221_400_reads_merged.fastq.gz --output humann_samplev4_manual --metaphlan-options "--bowtie2db ~/jjawahar/databases/metaphlan/vOct22/mpa_vOct22_CHOCOPhlAnSGB_202403_bt2.tar" --threads 4

However, I received an error indicating that MetaPhlAn is looking for a database file: "WARNING: It seems that you do not have Internet access. ERROR: Cannot find a local database. Please run MetaPhlAn using option "-x <database_name>". You can download the MetaPhlAn database from Index of /biobakery4/metaphlan_databases". This suggests that a separate MetaPhlAn database has to be downloaded from Index of /biobakery4/metaphlan_databases/bowtie2_indexes.

I downloaded the linked file "mpa_vOct22_CHOCOPhlAnSGB_202403_bt2.tar" from the bowtie2_indexes folder at the link above.

However, even with the bowtie2db location specified via --metaphlan-options, as in the command above, MetaPhlAn still fails to run correctly.

Does MetaPhlAn need to be run separately before running HUMAnN 3.9?
This was unclear to me from the tutorial page.

Additionally, do I need to provide the path to the MetaPhlAn Bowtie2 database, or to some other type of file?
I am unsure whether the error occurs because I need to unpack the .tar file first. Thanks so much for your help!

I’ve pasted the relevant part of the HUMAnN 3 log below.

03/31/2025 01:32:30 PM - humann.store - DEBUG: Initialize Alignments class instance to minimize memory use
03/31/2025 01:32:30 PM - humann.store - DEBUG: Initialize Reads class instance to minimize memory use
03/31/2025 01:32:49 PM - humann.humann - INFO: Load pathways database part 1: /nfs/jjawahar/miniforge3/envs/humann3.9/lib/python3.7/site-packages/humann/data/pathways/metacyc_reactions_level4ec_only.uniref.bz2
03/31/2025 01:32:49 PM - humann.humann - INFO: Load pathways database part 2: /nfs/jjawahar/miniforge3/envs/humann3.9/lib/python3.7/site-packages/humann/data/pathways/metacyc_pathways_structured_filtered_v24_subreactions
03/31/2025 01:32:49 PM - humann.search.prescreen - INFO: Running metaphlan …
03/31/2025 01:32:49 PM - humann.utilities - DEBUG: Using software: /nfs/jjawahar/miniforge3/envs/humann3.9/bin/metaphlan
03/31/2025 01:32:49 PM - humann.utilities - INFO: Execute command: /nfs/jjawahar/miniforge3/envs/humann3.9/bin/metaphlan /nfs/jjawahar/scripts/humann3_samplev2/0167968221_400_reads_merged_humann_temp/tmp4wbrblor/tmpl433ubd7 -t rel_ab -o /nfs/jjawahar/scripts/humann3_samplev2/0167968221_400_reads_merged_humann_temp/0167968221_400_reads_merged_metaphlan_bugs_list.tsv --input_type fastq --bowtie2out /nfs/jjawahar/scripts/humann3_samplev2/0167968221_400_reads_merged_humann_temp/0167968221_400_reads_merged_metaphlan_bowtie2.txt --nproc 4
03/31/2025 01:32:51 PM - humann.utilities - DEBUG: b'WARNING: It seems that you do not have Internet access.\nERROR: Cannot find a local database. Please run MetaPhlAn using option "-x <database_name>".\n You can download the MetaPhlAn database from \n Index of /biobakery4/metaphlan_databases \n \n'
03/31/2025 01:32:51 PM - humann.utilities - CRITICAL: Can not find file /nfs/jjawahar/scripts/humann3_samplev2/0167968221_400_reads_merged_humann_temp/0167968221_400_reads_merged_metaphlan_bugs_list.tsv

You should definitely be able to run MetaPhlAn within HUMAnN. It can be helpful to first run a single sample with MetaPhlAn outside of HUMAnN to make sure its databases get configured properly. After that, you can just specify the database version with the -x flag and it will work as expected. Note that MetaPhlAn needs two files: 1) a Bowtie2 database of the markers and 2) a PKL file associating markers with taxa. It looks like you are only specifying the first one manually in your command? But again, if the database is installed properly you shouldn’t need to touch any of that; you only need the -x option if you want to use a particular database.
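
To make the two-file point concrete, here is a minimal sketch of a check that a database directory holds both pieces for a given index name. The function name mpa_db_ready is hypothetical, and the file naming (an <index>.pkl file plus Bowtie2 index files that start with the index name and end in .bt2 or .bt2l) is an assumption about how the unpacked downloads are laid out, so adjust the patterns if your directory looks different.

```shell
#!/bin/sh
# Hypothetical helper: report whether a MetaPhlAn database directory
# contains both the PKL file and at least one Bowtie2 index file.
# Assumption: files are named <index>.pkl and <index>.*.bt2 / <index>.*.bt2l.
mpa_db_ready() {
    db_dir=$1
    index=$2

    # The PKL file associates markers with taxa.
    if [ ! -f "$db_dir/$index.pkl" ]; then
        echo "missing $index.pkl"
        return 1
    fi

    # Bowtie2 index files share the index name as a prefix.
    for f in "$db_dir/$index".*.bt2*; do
        if [ -f "$f" ]; then
            echo "ok"
            return 0
        fi
    done

    echo "missing Bowtie2 index files for $index"
    return 1
}

# Example call (paths are placeholders):
#   mpa_db_ready "$HOME/databases/metaphlan/vOct22" mpa_vOct22_CHOCOPhlAnSGB_202403
```

If the check reports a missing file, the directory is probably still holding a packed .tar or only one of the two downloads.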

Let me know if you’re still struggling after this new information and I can potentially bump this over to the MetaPhlAn channel for extra help there.

Can you give an example?

I am new to MetaPhlAn and HUMAnN. I installed both via pip (MetaPhlAn 4.1 and HUMAnN 3.9). When I ran a test sample, I got a MetaPhlAn database error. How can I configure the MetaPhlAn database for a HUMAnN run?

Thanks so much for your attention

Are you able to run MetaPhlAn outside of HUMAnN? What was the error you saw?
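
A standalone test along those lines might look like the sketch below. It reuses flags that appear earlier in this thread (--input_type, --nproc, -o, --bowtie2db, -x); the sample name, output name, database directory, and index name are placeholders, and the run wrapper is a hypothetical dry-run helper that only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command by default,
# execute it only when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Placeholder values; replace with your own sample, directory, and index name.
SAMPLE=sample.fastq.gz
DB_DIR="$HOME/jjawahar/databases/metaphlan/vOct22"
INDEX=mpa_vOct22_CHOCOPhlAnSGB_202403

# --bowtie2db points at the directory holding the unpacked database files,
# and -x names the database version inside that directory.
run metaphlan "$SAMPLE" --input_type fastq \
    --bowtie2db "$DB_DIR" -x "$INDEX" \
    --nproc 4 -o sample_profile.tsv
```

If this runs cleanly with DRY_RUN=0, the same --bowtie2db and -x values should be usable inside HUMAnN through --metaphlan-options.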

Hi Dr. Franzosa! Thanks for your help. I was able to get the script to work on a single sample by specifying
--metaphlan-options "--bowtie2db /databases/metaphlan/vJun23/ -x mpa_vJun23_CHOCOPhlAnSGB_202307"

The MetaPhlAn error message was useful for debugging this, but it would be helpful to mention this in the HUMAnN 3 installation instructions and tutorial, since it required separately downloading and installing the correct files from Index of /biobakery4 first (and unpacking the .tar files, as you had mentioned; sorry about that!).
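
For anyone landing here later, the steps that worked can be sketched roughly as follows. Both function names are hypothetical, the paths are placeholders, and humann_cmd only prints the command so it can be inspected before running. It does not handle paths containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: unpack a downloaded MetaPhlAn .tar archive
# into the directory that will be passed to --bowtie2db.
unpack_mpa_tar() {
    tarfile=$1
    db_dir=$2
    mkdir -p "$db_dir" && tar -xf "$tarfile" -C "$db_dir"
}

# Hypothetical helper: print (not run) a humann3 command that forwards
# the database directory and index name to MetaPhlAn.
humann_cmd() {
    input=$1
    output=$2
    db_dir=$3
    index=$4
    threads=${5:-4}
    printf 'humann3 --input %s --output %s --metaphlan-options "--bowtie2db %s -x %s" --threads %s\n' \
        "$input" "$output" "$db_dir" "$index" "$threads"
}

# Example (placeholders):
#   unpack_mpa_tar /path/to/downloaded.tar /databases/metaphlan/vJun23
#   humann_cmd sample.fastq.gz humann_out /databases/metaphlan/vJun23 mpa_vJun23_CHOCOPhlAnSGB_202307
```

The key difference from the original failing command is that --bowtie2db receives the directory of unpacked files rather than the .tar itself, and -x names the database version.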