Hello, fellow bioBakers!
I’m encountering an infrastructure issue that prevents me from downloading HUMAnN 4 databases from institutional networks with certain security policies.
Problem Description
When attempting to download HUMAnN 4 databases using:
humann_databases --download utility_mapping full data/databases/humann/
humann_databases --download chocophlan full data/databases/humann/
humann_databases --download uniref uniref90_ec_filtered_diamond data/databases/humann/
All three downloads fail with the same error pattern:
CRITICAL ERROR: Unable to download and extract from URL: <database_url>
Root Cause
The download URL redirects to a subdomain of data.globus.org, and this domain is blocked by our institutional network security policies. Specifically, our cybersecurity team has classified Globus as a peer-to-peer application and has blocked access to globus.org domains as part of corporate network security measures established a few years ago.
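For anyone who wants to confirm the same root cause on their own network, here is a minimal Python sketch. It reads the first redirect target of a download URL without following it, then checks whether that target falls under a blocked domain. The helper names (`is_blocked`, `first_redirect`) and the `BLOCKED_SUFFIXES` tuple are illustrative and not part of HUMAnN. The sketch only inspects the first hop, so a longer redirect chain would need repeated calls.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse

# Assumption: the domain suffixes the institutional firewall blocks.
BLOCKED_SUFFIXES = ("globus.org",)


def is_blocked(url, blocked_suffixes=BLOCKED_SUFFIXES):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == s or host.endswith("." + s) for s in blocked_suffixes)


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def first_redirect(url, timeout=30):
    """Return the Location header of the first redirect, or None if there is none.

    Needs network access to the original host only, not to the redirect target.
    """
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout):
            return None  # the server answered directly, no redirect
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            return err.headers.get("Location")
        raise
```

Running `first_redirect(...)` on one of the download URLs listed under Environment Details and passing the result to `is_blocked(...)` should show whether the download would be sent to a blocked host.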
Impact
I have confirmed that all three HUMAnN 4 database downloads fail due to this issue:
- utility_mapping database - FAILED
- chocophlan database - FAILED
- uniref database - FAILED
Workaround Attempted
I was able to download the database manually from an external server, but this is not a sustainable solution for:
- Other users in our institution
- Automated workflows
- Future database updates
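For reference, the manual route can be sketched roughly as follows: a tarball that has already been fetched on an unrestricted machine and copied onto the cluster is unpacked into a per-database subfolder. This is only a sketch under assumptions. The function name `install_from_tarball` is hypothetical, and the layout (one subfolder per database under the install directory) is my assumption about what `humann_databases` produces, not something confirmed for v4 alpha.

```python
import tarfile
from pathlib import Path


def install_from_tarball(tarball, install_root, database):
    """Extract a pre-downloaded .tar.gz into <install_root>/<database>.

    Assumption: HUMAnN expects each database in its own subfolder
    (e.g. chocophlan, uniref, utility_mapping) under the install directory.
    """
    target = Path(install_root) / database
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: reject absolute paths and path traversal.
            archive.extractall(target, filter="data")
        else:
            # Older Python versions: only use this on archives you trust.
            archive.extractall(target)
    return target
```

After extraction, HUMAnN still has to be pointed at the folder. In HUMAnN 3 this was done with `humann_config --update database_folders nucleotide <path>` (and the equivalent for `protein` and `utility_mapping`); I am assuming, without confirmation, that v4 alpha behaves the same way.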
Feature Request / Question
Could the bioBakery team consider providing:
- Alternative download mirrors that don’t rely on Globus infrastructure (e.g., direct HTTP/HTTPS from huttenhower.sph.harvard.edu or other academic mirrors)?
- Documentation on how to manually download and install databases when automated downloads are blocked?
- Support for --database-location with local file paths in humann_databases (if not already fully supported)?
This issue affects users in institutions with strict network security policies that block peer-to-peer or file-sharing platforms. Providing alternative download methods would greatly improve accessibility for these users.
Environment Details
- HUMAnN version: v4.0.0.alpha.1
- Network environment: Institutional HPC cluster with corporate network security policies
- Blocked domain: *.globus.org (specifically data.globus.org)
- Affected download URLs (all redirect to Globus):
  - http://huttenhower.sph.harvard.edu/humann_data/full_mapping_v4_alpha.tar.gz
  - http://huttenhower.sph.harvard.edu/humann_data/chocophlan/chocophlan.v4_alpha.tar.gz
  - http://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_ec_filtered/uniref90_annotated_v4_alpha_ec_filtered.tar.gz
Any guidance or alternative solutions would be greatly appreciated!
Best regards,
Fran