I have run into this as well, and it looks like the bot closed this issue before it was addressed:
opened 07:31AM - 31 Jul 20 UTC
closed 01:10PM - 05 May 21 UTC
humann3 (3.0.0.alpha.3) requires that the `--protein-database` directory contain only the `*.dmnd` file with the name it is looking for (e.g., `uniref90_201901.dmnd`). If anything else is in the same directory, humann3 throws an error:
```
CRITICAL ERROR: The directory provided for the translated database contains files ( XXX ) that are not of the expected version. Please install the latest version of the database: 201901
```
...which seems completely unnecessary. Also, the required file naming (e.g., `201901`) doesn't take custom databases into account, which could have other names.
Output of `conda list`:
```
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_llvm conda-forge
bcbio-gff 0.6.6 py_0 bioconda
biom-format 2.1.8 py37hc1659b7_0 conda-forge
biopython 1.77 py37h8f50634_0 conda-forge
blast 2.9.0 h20b68b9_1 bioconda
boost 1.68.0 py37h8619c78_1001 conda-forge
boost-cpp 1.68.0 h11c811c_1000 conda-forge
bowtie2 2.4.1 py37h4ef193e_2 bioconda
brotlipy 0.7.0 py37h8f50634_1000 conda-forge
bx-python 0.8.9 py37h5266303_0 bioconda
bzip2 1.0.8 h516909a_2 conda-forge
ca-certificates 2020.6.20 hecda079_0 conda-forge
certifi 2020.6.20 py37hc8dfbb8_0 conda-forge
cffi 1.14.0 py37hd463f26_0 conda-forge
chardet 3.0.4 py37hc8dfbb8_1006 conda-forge
click 7.1.2 pyh9f0ad1d_0 conda-forge
cmseq 1.0 pyh5ca1d4c_0 bioconda
cryptography 2.9.2 py37hb09aad4_0 conda-forge
cycler 0.10.0 py_2 conda-forge
dendropy 4.4.0 py_1 bioconda
diamond 0.9.24 ha888412_1 bioconda
fasttree 2.1.10 h516909a_4 bioconda
freetype 2.10.2 he06d7ca_0 conda-forge
future 0.18.2 py37hc8dfbb8_1 conda-forge
glpk 4.65 he80fd80_1002 conda-forge
gmp 6.2.0 he1b5a44_2 conda-forge
gnutls 3.6.13 h79a8f9a_0 conda-forge
h5py 2.10.0 nompi_py37h90cd8ad_103 conda-forge
hdf5 1.10.6 nompi_h3c11f04_100 conda-forge
htslib 1.10.2 hd3b49d5_1 bioconda
humann 3.0.0.alpha.3 py37h83b1523_0 biobakery
icu 58.2 hf484d3e_1000 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
iqtree 2.0.3 h176a8bc_0 bioconda
kiwisolver 1.2.0 py37h99015e2_0 conda-forge
krb5 1.17.1 hfafb76e_1 conda-forge
ld_impl_linux-64 2.34 h53a641e_5 conda-forge
libblas 3.8.0 17_openblas conda-forge
libcblas 3.8.0 17_openblas conda-forge
libcurl 7.71.1 hcdd3856_0 conda-forge
libdeflate 1.6 h516909a_0 conda-forge
libedit 3.1.20191231 h46ee950_0 conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc-ng 9.2.0 h24d8f2e_2 conda-forge
libgfortran-ng 7.5.0 hdf63c60_6 conda-forge
liblapack 3.8.0 17_openblas conda-forge
libopenblas 0.3.10 h5ec1e0e_0 conda-forge
libpng 1.6.37 hed695b0_1 conda-forge
libssh2 1.9.0 hab1572f_2 conda-forge
libstdcxx-ng 9.2.0 hdf63c60_2 conda-forge
llvm-openmp 10.0.0 hc9558a2_0 conda-forge
lzo 2.10 h14c3975_1000 conda-forge
mafft 7.471 h516909a_0 bioconda
matplotlib-base 3.1.1 py37hfd891ef_0 conda-forge
metaphlan 3.0.1 pyh5ca1d4c_0 bioconda
muscle 3.8.1551 hc9558a2_5 bioconda
ncurses 6.1 hf484d3e_1002 conda-forge
nettle 3.4.1 h1bed415_1002 conda-forge
numpy 1.18.5 py37h8960a57_0 conda-forge
openssl 1.1.1g h516909a_0 conda-forge
pandas 1.0.5 py37h0da4684_0 conda-forge
patsy 0.5.1 py_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
perl 5.26.2 h516909a_1006 conda-forge
perl-archive-tar 2.32 pl526_0 bioconda
perl-carp 1.38 pl526_3 bioconda
perl-common-sense 3.74 pl526_2 bioconda
perl-compress-raw-bzip2 2.087 pl526he1b5a44_0 bioconda
perl-compress-raw-zlib 2.087 pl526hc9558a2_0 bioconda
perl-exporter 5.72 pl526_1 bioconda
perl-exporter-tiny 1.002001 pl526_0 bioconda
perl-extutils-makemaker 7.36 pl526_1 bioconda
perl-io-compress 2.087 pl526he1b5a44_0 bioconda
perl-io-zlib 1.10 pl526_2 bioconda
perl-json 4.02 pl526_0 bioconda
perl-json-xs 2.34 pl526h6bb024c_3 bioconda
perl-list-moreutils 0.428 pl526_1 bioconda
perl-list-moreutils-xs 0.428 pl526_0 bioconda
perl-pathtools 3.75 pl526h14c3975_1 bioconda
perl-scalar-list-utils 1.52 pl526h516909a_0 bioconda
perl-types-serialiser 1.0 pl526_2 bioconda
perl-xsloader 0.24 pl526_0 bioconda
phylophlan 3.0 py_5 bioconda
pigz 2.3.4 hed695b0_1 conda-forge
pip 20.1.1 py_1 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pyopenssl 19.1.0 py_1 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pysam 0.16.0.1 py37hc501bad_0 bioconda
pysocks 1.7.1 py37hc8dfbb8_1 conda-forge
python 3.7.6 cpython_h8356626_6 conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-lzo 1.12 py37h81344f2_1001 conda-forge
python_abi 3.7 1_cp37m conda-forge
pytz 2020.1 pyh9f0ad1d_0 conda-forge
raxml 8.2.12 h516909a_2 bioconda
readline 8.0 hf8c457e_0 conda-forge
requests 2.24.0 pyh9f0ad1d_0 conda-forge
samtools 1.10 h9402c20_2 bioconda
scipy 1.5.0 py37ha3d9a3c_0 conda-forge
seaborn 0.10.1 1 conda-forge
seaborn-base 0.10.1 py_1 conda-forge
seqkit 0.13.0 0 bioconda
setuptools 49.1.0 py37hc8dfbb8_0 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
sqlite 3.32.3 hcee41ef_0 conda-forge
statsmodels 0.11.1 py37h8f50634_2 conda-forge
tbb 2020.1 hc9558a2_0 conda-forge
tk 8.6.10 hed695b0_0 conda-forge
tornado 6.0.4 py37h8f50634_1 conda-forge
trimal 1.4.1 h6bb024c_3 bioconda
urllib3 1.25.9 py_0 conda-forge
wheel 0.34.2 py_1 conda-forge
xz 5.2.5 h516909a_0 conda-forge
zlib 1.2.11 h516909a_1006 conda-forge
```
Hi @nickp60 ,
(Edited for clarity and additional details)
Thank you for reaching out. Because the lab handles bug reports and feature requests through the bioBakery forum rather than GitHub, issues opened on GitHub are always auto-closed. You’re welcome (and encouraged) to share them here instead.
HUMAnN intentionally does not read multiple databases from different sources in the same directory, because it uses the directory for another purpose. Specifically, HUMAnN can work with a protein database that has been split into chunks for computational efficiency, mapping the reads to each chunk serially. HUMAnN therefore models a protein database as a folder containing one or more database chunks, all derived from the same input set of protein sequences.
So if you mix unrelated chunks in the same folder, HUMAnN will produce an error. However, if you want to use two separate databases, even if each consists of only a single chunk, they can be used without problems in separate folders.
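As a rough pre-flight check, the sketch below lists the files in a database folder whose names lack the expected version tag, so a mixed folder can be spotted before a run. The function name `unexpected_files` and the plain substring match are assumptions made for illustration, based only on the error message quoted above; this is not HUMAnN's actual implementation.

```python
from pathlib import Path


def unexpected_files(db_dir, version_tag="201901"):
    """Return the names of files in db_dir that lack version_tag.

    Assumption: the version check is a plain substring match on the
    file name. This is an illustration, not HUMAnN's own code.
    """
    return sorted(
        entry.name
        for entry in Path(db_dir).iterdir()
        if entry.is_file() and version_tag not in entry.name
    )
```

An empty list would suggest that every file in the folder carries the tag; any names returned are candidates for moving to a separate folder.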
To avoid confusion for other users in the future, we have now updated the HUMAnN user manual with the same information.
Regards,
Sagun
I also noticed the same issue. It is not caused by having multiple incompatible protein databases in the same directory, because the same error occurs even when my custom protein DB is moved into a new directory where it is the only file. It appears to be purely a file-naming issue: when I add “201901b” to the end of the file name of my DIAMOND database, I do not get the error.
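In case it helps others, here is a minimal sketch of that renaming workaround. It symlinks a custom database into its own folder under a name that carries the tag, leaving the original file untouched. The helper name `stage_custom_db` and the exact naming pattern (tag appended to the file stem with an underscore, before the extension) are my own assumptions; I have only confirmed that adding “201901b” to the name avoided the error in my case.

```python
import os
from pathlib import Path


def stage_custom_db(src, dest_dir, tag="201901b"):
    """Symlink src into dest_dir under a name that includes tag.

    Assumption: appending the tag to the file stem is enough to pass
    the name check. The naming pattern here is hypothetical.
    """
    src = Path(src).resolve()
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    link = dest_dir / f"{src.stem}_{tag}{src.suffix}"
    # Leave an existing entry (including a dangling symlink) alone.
    if not (link.is_symlink() or link.exists()):
        os.symlink(src, link)
    return link
```

The returned path's parent folder is what would then be passed to `--protein-database`.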