HUMAnN database errors (a short novel)

Thank you for your reply and comments.
Please find the outcome of your suggestions below.

I first used pip to install HUMAnN 4.0 alpha:
pip install humann==4.0.0a1 --no-binary :all:

I then confirmed the versions:
humann --version
humann v4.0.0.alpha.1

metaphlan --version
MetaPhlAn version 4.1.1 (11 Mar 2024)

Databases, as reported by humann_config:

HUMAnN Configuration ( Section : Name = Value )

database_folders : nucleotide = /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403/

database_folders : protein = /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/uniref/

database_folders : utility_mapping = /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/utility/utility_mapping/

Notably, the ChocoPhlAn folder looks like this:

-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 3978538203 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.1.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 5311606700 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.2.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers  100610041 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.3.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 2655803348 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.4.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 2951357081 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.fna.bz2
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers   65502658 Apr  4  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.pkl
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 3978538203 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.rev.1.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 5311606700 Aug 29  2024 mpa_vOct22_CHOCOPhlAnSGB_202403.rev.2.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers      44092 Feb 22  2023 mpa_vOct22_CHOCOPhlAnSGB_202403_VINFO.csv
-rw-rw----+ 1 rb979 rds-XUr6B1Jhndg-managers  881484571 Feb 12 15:54 mpa_vOct22_CHOCOPhlAnSGB_202403_VSG.fna

So, humann_test works well; however, it created a file called “_3_reactions”, which is a new development.
Ran 176 tests in 75.820s

# Reaction HUMAnN v HUMAnN_test

UNMAPPED 100.0000000
UNGROUPED 1912.1816208
UNGROUPED|g__Bacteroides.s__Bacteroides_thetaiotaomicron 943.6975327
UNGROUPED|g__Bacteroides.s__Bacteroides_stercoris 794.9003455

You mentioned trying to run MetaPhlAn alone, so I did the following:

metaphlan ../sg_metagenomics_Boston24/sg_raw_data_to_deposit/$1_1.fastq.gz,../sg_metagenomics_Boston24/sg_raw_data_to_deposit/$1_2.fastq.gz \
--input_type fastq \
--unclassified_estimation \
--add_viruses \
--index mpa_vOct22_CHOCOPhlAnSGB_202403 \
--bowtie2db databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403 \
--bowtie2out bowtie_outputs_humann/$1.bt2.bz2 \
-o metaphlan_results/metaphlan_mpa_vOct22_CHOCOPhlAnSGB_202403/$1/april-profiled_metagenome_$1.txt \
--nproc 8

And this worked successfully.

The problems arise when running HUMAnN:

humann \
--input humann/merged_paired_ends/$1.fastq.gz \
--output humann/results/$1/ \
--bowtie-options '--threads 8' \
--metaphlan-options '--bowtie2db databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403 --index mpa_vOct22_CHOCOPhlAnSGB_202403'

Which results in the error:

Output files will be written to: /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/humann/results
Decompressing gzipped file ...
Removing spaces from identifiers in input file ...

CRITICAL ERROR: The directory provided for ChocoPhlAn does not contain files of the expected format (ie '^SGB').
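
For what it is worth, a quick way to compare the nucleotide folder against the pattern quoted in the error could look like the sketch below. It assumes that '^SGB' in the message refers to file names beginning with "SGB", which is my reading and not something I have confirmed. The helper name count_sgb_files is made up for illustration; the path is the nucleotide folder reported by humann_config.

```shell
# Hypothetical helper (not part of HUMAnN): count the entries in a directory
# whose names start with "SGB", the pattern quoted in the error message.
count_sgb_files() {
    dir="$1"
    # ls -1 prints one name per line; grep -c counts the lines matching ^SGB.
    # A missing or unreadable directory simply yields a count of 0.
    ls -1 "$dir" 2>/dev/null | grep -c '^SGB'
}

# Nucleotide folder as reported by humann_config.
NUC_DIR="/rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403/"

echo "Files starting with SGB: $(count_sgb_files "$NUC_DIR")"
```

Going by the listing above, where every file name begins with "mpa_", I would expect this to report 0 for the current nucleotide folder.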

Please could you advise on what the problem may be from here?
