The bioBakery help forum

Humann3 will not run demo due to database issue

As far as I can tell, it is trying to run with a database that does not exist.
This is the output:

Error message returned from metaphlan :

Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

Downloading http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901b.tar

Warning: Unable to download http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901b.tar

Downloading http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901b.md5

Warning: Unable to download http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901b.md5
File "miniconda3/envs/humann/lib/python3.7/site-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901b.md5" not found!
File "/miniconda3/envs/humann/lib/python3.7/site-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901b.tar" not found!
MD5 checksums not found, something went wrong!
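
For anyone trying to confirm what this log reports, here is a minimal sketch of a local check. It looks for the .tar and .md5 files of a given index in a directory and, if both are present, compares checksums. The function name check_index_files is hypothetical, and the sketch assumes the .md5 file stores the hex digest as its first whitespace-separated token; neither detail is taken from MetaPhlAn itself.

```python
import hashlib
from pathlib import Path


def check_index_files(db_dir, index_name):
    """Report whether <index_name>.tar and <index_name>.md5 exist in db_dir.

    If both exist, also compare the MD5 of the .tar file with the first
    whitespace-separated token of the .md5 file (an assumed file format).
    """
    db_dir = Path(db_dir)
    tar_path = db_dir / (index_name + ".tar")
    md5_path = db_dir / (index_name + ".md5")

    report = {
        "tar_exists": tar_path.is_file(),
        "md5_exists": md5_path.is_file(),
        "md5_matches": None,  # stays None unless both files are present
    }
    if report["tar_exists"] and report["md5_exists"]:
        digest = hashlib.md5()
        with open(tar_path, "rb") as handle:
            # Read in 1 MiB chunks so a large archive is not loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        tokens = md5_path.read_text().split()
        expected = tokens[0] if tokens else ""
        report["md5_matches"] = digest.hexdigest() == expected.lower()
    return report
```

Calling check_index_files on the metaphlan_databases directory of the conda environment with the index name from the log would show whether either file was ever written there.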

I am trying to run the HUMAnN 3 demo. I have downloaded the most current versions of the databases, and as far as I can tell there is no mpa_v30_CHOCOPhlAn_201901b among them. How do I fix this?

Thank you in advance for your help.

(FYI I have relocated this question from the HUMAnN category to the MetaPhlAn category as it appears more directly related to the latter’s setup.)

I am actually pretty sure the error is being caused by HUMAnN, particularly because of this code in humann.py:

database_files=os.listdir(config.protein_database)
valid_format_database_files=[]
for file in database_files:
    if not config.metaphlan_3p0_db_matching_uniref in file:
        sys.exit("\n\nCRITICAL ERROR: The directory provided for the translated database contains files ( "+file+" )"+
            " that are not of the expected version. Please install the latest version"+
            " of the database: "+config.metaphlan_3p0_db_matching_uniref)

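To make the behaviour of that check easier to see in isolation, here is a minimal, self-contained sketch of the same idea: flag every file name that does not contain the expected version string. The helper name find_mismatched_files and the example file names are hypothetical and are not part of HUMAnN.

```python
def find_mismatched_files(file_names, expected_version):
    """Return the names that do not contain the expected version string."""
    return [name for name in file_names if expected_version not in name]


# Hypothetical file names, for illustration only.
example_files = [
    "uniref90_201901b_full.dmnd",
    "uniref90_201901_full.dmnd",
]

mismatched = find_mismatched_files(example_files, "201901b")
print(mismatched)  # ['uniref90_201901_full.dmnd']
```

Because the test is a plain substring match, a name containing "201901" but not "201901b" is reported as a mismatch.
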
And this which is in the config.py:

# metaphlan options
metaphlan_opts=["-t","rel_ab"]
metaphlan_version={
    "flag" : "--version",
    "major" : 3,
    "minor" : 0,
    "line" : -1,
    "column" : 2}
metaphlan_3p0_db_version="v30"
metaphlan_3p0_db_matching_uniref="201901b"
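
The index name in the error log is consistent with these two values being combined. The sketch below only illustrates that observation: the helper compose_index_name is hypothetical, and the naming pattern is inferred from the log output rather than copied from HUMAnN or MetaPhlAn code.

```python
def compose_index_name(db_version, uniref_tag):
    """Join the two config values using the pattern seen in the error log."""
    # Assumption: the "mpa_<version>_CHOCOPhlAn_<tag>" pattern is inferred
    # from the file names in the log, not taken from either tool's source.
    return "mpa_{}_CHOCOPhlAn_{}".format(db_version, uniref_tag)


requested = compose_index_name("v30", "201901b")
print(requested)  # mpa_v30_CHOCOPhlAn_201901b
```

If that inference holds, the trailing "b" in the second config value would be what makes the requested file name differ from an index tagged 201901.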

Apologies - you are right about this being caused by HUMAnN. I had missed the 201901b in the URLs you initially provided. That database version is HUMAnN-specific (a HUMAnN-specific extension of the MetaPhlAn 201901 database), and so that’s likely why the download is failing. I will look into this - it’s not something that has come up before.

Thanks for looking into it. I think the people in this thread might be experiencing the same issue but posted in the wrong category: "Humann3.0.0 docker not accepting metaphlan database 201901b" (Data resource category, The bioBakery help forum).

Can you clarify what command you’re running when you see this error? It looks as though MetaPhlAn is being told to use an index named “201901b” (which does not exist). This should not be happening automatically, but it could happen if (e.g.) someone specified that index name when calling the software or in a configuration file.

You mentioned that this issue arose when running the demo. If you can clarify which demo you were running / adapting when performing this test, that would also be helpful (in case there’s something unclear in our documentation re: specifying databases).

This is the command I was using:
humann --input lib/python3.7/site-packages/humann/tests/data/demo.fastq --output demo_out --nucleotide-database /users/clairmontl/humanndb/ --metaphlan-options "--index mpa_v296_CHOCOPhlAn_201901"

I also got the same error when running my own fasta files with this command:
humann --input sample.fasta --output demo_out --metaphlan-options "--index mpa_v296_CHOCOPhlAn_201901"

You shouldn’t need to manually specify the MetaPhlAn index when running HUMAnN - MetaPhlAn will automatically fetch an appropriate index the first time it runs and then reuse it thereafter. Was there a specific reason for wanting the custom command?

Yes, I added the --index option after the same command without it returned the error I reported above. I had read in the forums that specifying --index could work around some MetaPhlAn database issues in earlier program versions.
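
To compare the two invocations discussed here (with and without a MetaPhlAn index override), a small wrapper can build either command from the same inputs. This is a minimal sketch: the function build_humann_command is hypothetical, and it uses only the flags that already appear in the commands quoted in this thread.

```python
import shlex


def build_humann_command(input_path, output_dir, metaphlan_options=None):
    """Build a humann argument list, adding --metaphlan-options only if given."""
    command = ["humann", "--input", input_path, "--output", output_dir]
    if metaphlan_options:
        # Passed as one argument, matching the quoted form used on the shell.
        command += ["--metaphlan-options", metaphlan_options]
    return command


default_cmd = build_humann_command("sample.fasta", "demo_out")
override_cmd = build_humann_command(
    "sample.fasta", "demo_out", "--index mpa_v296_CHOCOPhlAn_201901"
)

# Print shell-ready versions; the lists themselves could be handed to
# subprocess.run(command, check=True) to execute them.
print(" ".join(shlex.quote(part) for part in default_cmd))
print(" ".join(shlex.quote(part) for part in override_cmd))
```

Building the command as a list keeps the override string as a single argument, so no extra shell quoting is needed when it is run through subprocess.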

Sorry for the late reply here - did removing the --index flag solve your problem or are you still having difficulties with your analysis?

No, it did not. None of the options I tried allowed me to run the program. I did eventually get it to run, but only after altering the humann.py and config.py scripts.