HUMAnN database download

Hi,
I thought it might be best to open a new issue. Related: Latest database download issue - #5 by gjordaopiedade

Since yesterday I have been trying to download the HUMAnN databases, but I always get a critical error. I was wondering whether this is a known issue or whether I am doing something wrong.

humann_databases --download chocophlan full /projects/0/gusr0506/goncalo/databases/HUMAnN --update-config yes
Creating subdirectory to install database: /projects/0/gusr0506/goncalo/databases/HUMAnN/chocophlan
Download URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz
CRITICAL ERROR: Unable to download and extract from URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz

I also noticed that when I visit the Huttenhower lab page I get an invalid certificate warning:

Your connection is not private

Attackers might be trying to steal your information from huttenhower.sph.harvard.edu (for example, passwords, messages, or credit cards).

NET::ERR_CERT_DATE_INVALID

Thanks in advance!
Best,
Gonçalo

I guess your server is back online.
I still couldn’t use humann_databases --download, but wget now works.

For those out there also struggling to download the database, I did this:

mkdir chocophlan
cd chocophlan
wget --no-check-certificate http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz 
tar -xf full_chocophlan.v201901_v31.tar.gz

humann_config --update database_folders nucleotide /FULL/PATH/chocophlan

And, same for uniref90:

mkdir uniref
cd uniref
wget --no-check-certificate https://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_annotated/uniref90_annotated_v201901b_full.tar.gz
tar -xf uniref90_annotated_v201901b_full.tar.gz

humann_config --update database_folders protein /FULL/PATH/uniref
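A partially downloaded archive is easy to miss when the connection drops, so it can help to check the tarball before extracting it. The following is only a sketch: `extract_db` is a hypothetical helper name, not part of HUMAnN, and it assumes a `tar` that auto-detects compression when reading (GNU tar and bsdtar both do).

```shell
# Hypothetical helper (not part of HUMAnN): extract a downloaded database
# archive only if tar can read it from start to end.
extract_db() {
    archive="$1"   # path to the downloaded .tar.gz
    dest="$2"      # directory to extract into

    # Listing the contents makes tar read the whole archive; a truncated
    # or corrupt download fails here, before anything is extracted.
    if ! tar -tf "$archive" > /dev/null 2>&1; then
        echo "Archive looks incomplete or corrupt: $archive" >&2
        return 1
    fi

    mkdir -p "$dest" && tar -xf "$archive" -C "$dest"
}
```

For example, `extract_db full_chocophlan.v201901_v31.tar.gz /FULL/PATH/chocophlan`, followed by the `humann_config --update` command shown above.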

Thanks for your comments here. We did indeed have some website issues that are now resolved.

I am still getting the error when I try to download:
Unable to download and extract from URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz
The issues do not seem to be resolved!

Had this issue today with:
humann_databases --download chocophlan full $DATABASE_DIR

wget and tar work to get the database.

You can download it manually and extract it into a folder of your choice. (I ran into some security warnings when downloading it manually, and you have to grant permission.)

Following up on this issue. We are experiencing the following behaviors when trying to download the humann3 databases.

HUMAnN3 download methods time out when running the following command:

humann_databases --download chocophlan full <PATH> --update-config yes 

wget returns "Unable to establish SSL connection" errors:

wget --no-check-certificate http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz 

We get a similar timeout with curl (we thought the issue could be with following redirects). We checked that wget and curl worked with a number of other downloads, and they did. We also checked that we could download the database on a local MacBook with:

wget --no-check-certificate http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz 

It works on the MacBook, so I thought it could be a wget version-specific bug. I updated wget on the server (Red Hat) to 1.21.3, but the behavior did not change. As others have pointed out, I can push the local files up to the server, but this is not very practical for larger databases. Any ideas on a more permanent solution?
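When the same command works on one machine and fails on another, it can be useful to see at which step the connection stalls. Here is a rough sketch using standard `curl` and `openssl` options; `url_host` and `probe_url` are hypothetical helper names, and it is an assumption that the server answers HEAD requests the same way it answers downloads.

```shell
# Hypothetical helpers for narrowing down where a download fails.

# Print the host part of a URL (scheme, path and port removed).
url_host() {
    host="${1#*://}"     # drop "http://" or "https://"
    host="${host%%/*}"   # drop the path
    host="${host%%:*}"   # drop an explicit port, if any
    printf '%s\n' "$host"
}

probe_url() {
    url="$1"
    host="$(url_host "$url")"

    # Follow redirects and print only the response headers of each hop,
    # giving up after 30 seconds instead of hanging.
    curl -sS -I -L --max-time 30 "$url"

    # Try a TLS handshake with the host directly. If this also hangs or
    # fails, that suggests something on the network path (firewall,
    # proxy) rather than the wget version.
    openssl s_client -connect "$host:443" -servername "$host" < /dev/null
}
```

Running `probe_url http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz` on both the server and the laptop and comparing the output may show which hop differs.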

Follow-up on this: Since yesterday, I have been getting this error when trying to install the uniref database (humann_databases --download uniref uniref90_diamond /home/rpmbuild)

CRITICAL ERROR: Unable to download and extract from URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz

When I try to use this code: wget --no-check-certificate https://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_annotated/uniref90_annotated_v201901b_full.tar.gz

I still get the following error:
--2024-05-10 13:18:01-- https://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_annotated/uniref90_annotated_v201901b_full.tar.gz
Resolving huttenhower.sph.harvard.edu (huttenhower.sph.harvard.edu)… 199.94.60.28
Connecting to huttenhower.sph.harvard.edu (huttenhower.sph.harvard.edu)|199.94.60.28|:443… connected.
HTTP request sent, awaiting response… 403 Forbidden
2024-05-10 13:18:01 ERROR 403: Forbidden.

Is there another url I can use to access the databases?

Hi Lizzie, Thank you for your post and sorry for the download issues. We had to recently restart our server. The download issues should now be fixed. We are working on an automated way to resolve this issue if it arises again in the future.

Thanks again!
Lauren

Hi, I encountered the same issue using both wget and humann_databases --download chocophlan full, and the issue remains today. Any updates?

These issues should all be resolved. We are working on a new system for downloads that will add robustness to the routine maintenance windows / downtime of our Research Computing infrastructure.

Dear all,

I would like to report the database downloading problem again. When I was using
wget -c -v --no-check-certificate https://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_annotated/uniref90_annotated_v201901b_full.tar.gz
and
wget -c -v --no-check-certificate http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz
to download the database, the error appeared again with the message: Unable to establish SSL connection. In this situation, how can I download the database?

Shuyuan

I am having this issue as well when trying to download chocophlan v31. I have tried all the above steps and the wget command hangs without downloading anything

wget http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz --no-check-certificate

--2024-07-11 09:47:52--  http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz
Resolving huttenhower.sph.harvard.edu (huttenhower.sph.harvard.edu)... 199.94.60.28
Connecting to huttenhower.sph.harvard.edu (huttenhower.sph.harvard.edu)|199.94.60.28|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz [following]
--2024-07-11 09:47:52--  https://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz
Connecting to huttenhower.sph.harvard.edu (huttenhower.sph.harvard.edu)|199.94.60.28|:443... connected.

Hi @Shuyuan_Zhang and @Roshonda_Jones, Thank you both for your detailed posts. I could replicate the issue with the option “--no-check-certificate”. If you remove this option from the wget command, the download should start without error. Please try it out and let me know if you run into any other issues.

Thank you,
Lauren

Dear Lauren,

When I tried to download the two databases without the option --no-check-certificate using the wget -c -v link command, the databases were not downloaded. The error message was Unable to establish SSL connection. Could you provide some additional suggestions for resolving this issue?

Best wishes,
Shuyuan

Hi @Shuyuan_Zhang, Can you try again without the “-c -v link” option? Our hosting method might not support those features. Alternatively, have you tried using the “humann_databases” tool? It will download the database for you if you provide it the name of the database you need, and it will also update the HUMAnN config file.

Thanks!
Lauren

Dear Lauren,

Thanks for your reply. I believe I have identified the cause of the problem. When I tried to download the two databases onto a private laptop with sufficient storage space (the two databases are 35 GB), the process worked using the wget command (I have not yet tried the humann_databases command). However, when I attempted to download the databases onto our university’s Linux server (link: Accessing CREATE HPC - King's College London e-Research), none of the three commands (wget, curl, and humann_databases) worked, returning the error ‘cannot build SSL connection’.

This is somewhat inconvenient as my university laptop does not have enough space. Consequently, I must download the databases onto another private laptop, then submit them to our server. I do not think it is an error, but it is a rather inconvenient disadvantage.

Best wishes,
Shuyuan

Hi Shuyuan, Thank you for the update. We host our files through Globus. Would you mind possibly trying a different location for the wget and let me know if it works okay? If so, please try:

$ wget https://g-227ca.190ebd.75bc.data.globus.org/humann_data/chocophlan/full_chocophlan.v201901_v31.tar.gz

If you would let me know if that works on your university’s servers, that would be great! If not, I can work on figuring out other options on our end. Thanks for all your help and feedback!

Thanks!
Lauren

Dear Lauren,

Thanks for your reply. I tried to download the ChocoPhlAn database using your latest code, and it worked on our university server. Thank you very much. Could you please provide the link to the full UniRef90 database?

I also would like to report a similar issue with Kneaddata: I am unable to download the recommended database (hg37_and_human_contamination) onto our university server. I believe other users in this post have experienced the same issue, where they cannot download the database to their server.

Best wishes,
Shuyuan

Hi Shuyuan, Thank you so much for trying. It is really useful to know this method works for you. I think the issue is with our redirect. I will look into what changes we can make on our end.

Anything available at https://huttenhower.sph.harvard.edu/ should also be available at https://g-227ca.190ebd.75bc.data.globus.org/ for download.

Thank you,
Lauren
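For anyone scripting this, the host swap described above can be sketched as a small shell function. `to_globus_url` is a hypothetical helper name, and the sketch assumes the path after the host is identical on both sites, which this thread only confirms for the ChocoPhlAn archive.

```shell
# Mirror host taken from the reply above; the path after the host is
# assumed to be the same on both sites.
GLOBUS_BASE="https://g-227ca.190ebd.75bc.data.globus.org"

# Hypothetical helper: rewrite a huttenhower.sph.harvard.edu URL so that
# it points at the Globus host instead.
to_globus_url() {
    case "$1" in
        http://huttenhower.sph.harvard.edu/*|https://huttenhower.sph.harvard.edu/*)
            ;;
        *)
            echo "Not a huttenhower.sph.harvard.edu URL: $1" >&2
            return 1
            ;;
    esac
    # Remove the scheme and the original host, keeping only the path.
    path="${1#*://huttenhower.sph.harvard.edu}"
    printf '%s%s\n' "$GLOBUS_BASE" "$path"
}
```

With this, `wget "$(to_globus_url https://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_annotated/uniref90_annotated_v201901b_full.tar.gz)"` would be the equivalent attempt for the full UniRef90 database, provided that file is mirrored as described.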