Humann3 Database errors

Hi there, I am trying to run Humann3 and keep getting errors that contradict each other.

First I ran Humann3 in a conda environment with the v201901_v31 chocophlan database (mpa_vJun23_CHOCOPhlAnSGB_202403) and was running this as a submitted job on an HPC, not as an interactive job. I kept receiving this error:

Error message returned from metaphlan :
A newer version of the database (mpa_vJan25_CHOCOPhlAnSGB_202503) is available. Do you want to download it and replace the current one (mpa_vJun23_CHOCOPhlAnSGB_202403)?	[Y/N]Traceback (most recent call last):
  File "/tscc/nfs/home/lfreund/.conda/envs/RawMetagenomeWorkflow/bin/metaphlan", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/tscc/nfs/home/lfreund/.conda/envs/RawMetagenomeWorkflow/lib/python3.12/site-packages/metaphlan/metaphlan.py", line 1303, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'], pars['offline'])
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tscc/nfs/home/lfreund/.conda/envs/RawMetagenomeWorkflow/lib/python3.12/site-packages/metaphlan/__init__.py", line 312, in check_and_install_database
    choice = input('A newer version of the database ({}) is available. Do you want to download it and replace the current one ({})?\t[Y/N]'.format(index, previous_db_version))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
EOFError: EOF when reading a line

I then downloaded the updated Chocophlan database, used humann_config --update database_folders nucleotide to update the Chocophlan database, then tried to rerun Humann3.

I then got this error:

CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( mpa_vJan25_CHOCOPhlAnSGB_202503_bt2.tar ) that are not of the expected version. Please install the latest version of the database: v201901_v31

I have to process over 100 samples and would rather not run Humann3 in an interactive job, where I would have to enter “N” to decline the database update every single time Humann3 runs. I am not sure how to get past these issues; could someone give me guidance, please? Is there a Globus link for the updated Chocophlan vJan25 database that I should be using rather than the links available here? Thank you.

I believe this has been fixed in other versions of the software (or will be soon), but you can run MetaPhlAn with the --offline flag to suppress that warning message. For the second error, that just means you have a non-pangenome file (looks like a MetaPhlAn download?) in HUMAnN’s pangenome folder, which HUMAnN is unhappy about (we have this check as a safety measure mostly to avoid people accidentally mixing two different pangenome versions together and getting a weird hybrid output).
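
One way to spot stray files like this is to list everything in the folder whose name lacks the expected version string. The sketch below is only an illustration: the function name list_unexpected_files is hypothetical, and it assumes that valid pangenome files carry the version string (here v201901_v31, taken from the error message) somewhere in their file names, which may not be exactly how HUMAnN performs its own check.

```shell
#!/bin/sh
# Hypothetical helper: print every file in a folder whose name does NOT
# contain the expected version string.
# Assumption: valid pangenome files include the version in their file name.
list_unexpected_files() {
    dir=$1
    version=$2
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # skip the literal glob if the folder is empty
        name=${path##*/}                # strip the directory part
        case "$name" in
            *"$version"*) ;;            # name carries the expected version: keep quiet
            *) printf '%s\n' "$name" ;; # anything else is reported
        esac
    done
}

# Example call (the path is a placeholder, not a real location):
# list_unexpected_files /path/to/humann/chocophlan v201901_v31
```

Any file it prints (for example a MetaPhlAn tar archive) could then be moved into a separate folder that is used only for the MetaPhlAn database.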

Great, I will try that, thank you! I get this metaphlan error while running Humann, so would I add --metaphlan-options "--offline" to my humann code? Or is there another way to format this that is compatible with humann? Thank you so much for your help!

Update: I cannot get this “offline” option for metaphlan to work within Humann. I have tried the following formats:
--metaphlan-options "--offline"
--metaphlan-options "offline"
--metaphlan-options "-offline"
--metaphlan-options --offline
--offline

All of these cause Humann to stop running. Is there some way around this within humann?

Additionally, I was able to fix this error for metaphlan by downloading the vJan25 Chocophlan database, but these are not the same files used by Humann. Could you please provide the link for the updated chocophlan database for humann? I work on an HPCC, so the humann_databases --download function does not work for me. Thank you!

Hi there,

I still have not found a solution to this issue when running Humann. When metaphlan is run through Humann, I get this error and cannot seem to get around it no matter what I try (--offline, specifying the nucleotide database, etc.).

When running Humann interactively, I cannot get past this step and the program crashes, which is not ideal when trying to run Humann on over 100 samples.

Again, the options I described above trying to specify the “offline” option for Metaphlan did not work in Humann. Please let me know how I can get past this error, or where I can find the updated Chocophlan database tar file that will get me beyond this error. Thanks!

Sorry for all the posts, I have been combing the forum and still cannot find an answer to my issue.

I tried the following code with a sample to test it:
humann -i /tscc/nfs/home/lfreund/scratch/IBD_ERP121770_MGMs/Kneaddata_Results/11546.stool.100001.baseline_paired_12.fastq -o ./Humann3_Results --metaphlan-options "--bowtie2db /tscc/nfs/home/lfreund/ps-hsulab/ProgramResources/Humann_Resources/chocophlan --offline" --threads 6 --input-format fastq --remove-temp-output --output-basename test_sample

When I run this, I get an error that the sample_bugs_list.tsv cannot be found.

Please, help me fix this error. I was able to run Metaphlan separately on the updated Chocophlan database, but Humann is giving me issues using the v201901_v31 chocophlan database.

Try to make sure that the humann, metaphlan, and database versions are all compatible with each other. I think that “v201901_v31 chocophlan” and “mpa_vJun23_CHOCOPhlAnSGB_202403” (both in your environment) and “mpa_vJan25_CHOCOPhlAnSGB_202503” (in the error message) all refer to different versions of the database.

Default installs of the programs, together with the database install and config update utilities, do not always resolve the conflict.

I am unsure what kind of humann install you are using, but if it is Docker, I believe the config will not actually update when you use the utility, because the install is “containerized” and its source is intentionally shielded from changes.

Thank you for your reply.

I think I was getting confused between the database used by Humann for alignment versus the chocophlan database that MetaPhlan is using, and didn’t realize that they required different files.

When running MetaPhlan (v4.1.1) separately, I received an error that I needed to update the database to the vJan25 version - so I downloaded the vJan25 database and used that for Metaphlan without issue.

I didn’t realize that this database was not compatible with Humann v3.9 and the v201901_v31 chocophlan database used for alignment, which was causing me problems. On top of that, --metaphlan-options "--offline" wasn’t working for me (probably because I only had the vJan25 chocophlan database downloaded at that time).

I then redownloaded the mpa_vJun23_CHOCOPhlAnSGB_202403 database, and reran Humann with the following code:
humann -i ${workplace}/Kneaddata_Results/${SAMPLE}_paired_12.fastq -o ${workplace}/Humann3_Results --metaphlan-options "--bowtie2db ${metaphlan_db_path} -x mpa_vJun23_CHOCOPhlAnSGB_202403 --offline" --threads 6 --input-format fastq --remove-temp-output --output-basename ${SAMPLE}

${metaphlan_db_path} is a variable holding the path to the vJun23 chocophlan database, ${workplace} is my main working directory for this project, and ${SAMPLE} is the sample name.

This seems to be working now without issue, thankfully! Thank you and @franzosa for your input with this issue.
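
For anyone who needs to apply the command above to many samples in a submitted job, it could be wrapped in a small loop. The following is a minimal sketch rather than a tested pipeline: the function name build_humann_cmd and the file samples.txt (one sample name per line) are hypothetical, the default paths are placeholders, and the humann flags are copied from the command in this post. It only prints the commands (a dry run), and it assumes the paths contain no spaces.

```shell
#!/bin/sh
# Sketch: print one humann command per sample so that a batch job can run
# them without any interactive prompt. Default paths are placeholders.
workplace=${workplace:-/path/to/project}
metaphlan_db_path=${metaphlan_db_path:-/path/to/metaphlan_db}
index=mpa_vJun23_CHOCOPhlAnSGB_202403   # MetaPhlAn index named in this thread

# Hypothetical helper: build the humann command line for a single sample.
build_humann_cmd() {
    sample=$1
    printf 'humann -i %s -o %s --metaphlan-options "%s" --threads 6 --input-format fastq --remove-temp-output --output-basename %s\n' \
        "${workplace}/Kneaddata_Results/${sample}_paired_12.fastq" \
        "${workplace}/Humann3_Results" \
        "--bowtie2db ${metaphlan_db_path} -x ${index} --offline" \
        "$sample"
}

# samples.txt is a hypothetical file with one sample name per line.
if [ -f samples.txt ]; then
    while IFS= read -r sample; do
        if [ -n "$sample" ]; then       # ignore blank lines
            build_humann_cmd "$sample"
        fi
    done < samples.txt
fi
```

Once the printed commands look right, the output can be piped to sh inside the job script to actually run them.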