These are both databases over 15GB in size. I often have trouble downloading large files, so I’d like to verify the integrity of these files.
Question:
Where would I find the md5 sums or other check sums for these files?
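No published checksum is cited in this thread, so the following is only a sketch of the local half of the check: computing an MD5 digest of a downloaded archive so that it can be compared against a reference value if one becomes available. The function name `file_md5` and the chunk size are illustrative choices, not part of humann_databases.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Chunked reading keeps memory use flat even for a 16 GB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_md5("full_chocophlan.v296_201901b.tar.gz")` would return a 32-character hex string. Without a reference value from the maintainers, it can still be used to compare two independent downloads of the same file.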
Me, too. It’s also disappointing that humann_databases doesn’t support resuming broken downloads. It repeatedly fails after about 80% to 90% of ChocoPhlAn is downloaded.
(Python3) /verona/biostat/databases/bacteria$ humann_databases --download chocophlan full /verona/biostat/databases/bacteria/
Download URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v296_201901b.tar.gz
Downloading file of size: 16.01 GB
CRITICAL ERROR: Unable to download and extract from URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v296_201901b.tar.gz
(Python3) /verona/biostat/databases/bacteria$ ls -lht full_chocophlan.v296_201901b.tar.gz
-rw-r----- 1 biostat biostat 13G May 21 17:06 full_chocophlan.v296_201901b.tar.gz
(Python3) /verona/biostat/databases/bacteria$ humann_databases --download chocophlan full /verona/biostat/databases/bacteria/
Download URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v296_201901b.tar.gz
Downloading file of size: 16.01 GB
CRITICAL ERROR: Unable to download and extract from URL: http://huttenhower.sph.harvard.edu/humann_data/chocophlan/full_chocophlan.v296_201901b.tar.gz
(Python3) /verona/biostat/databases/bacteria$ ls -lht full_chocophlan.v296_201901b.tar.gz
-rwxrwx--- 1 biostat biostat 14G May 22 01:54 full_chocophlan.v296_201901b.tar.gz
I am located in Australia, which is far from anywhere else, and my download speed is about 0.65 MB/s.
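Since the partial archive is left on disk after each failure, one possible workaround is to fetch the file outside humann_databases and continue from the bytes already present. The sketch below uses an HTTP `Range` request from the Python standard library. It assumes the server honours range requests, which has not been confirmed for this host, and it falls back to a full download if it does not. The function name `resume_download` is hypothetical, and the resulting archive would still need to be extracted by hand.

```python
import os
import urllib.error
import urllib.request


def resume_download(url, dest, chunk_size=1024 * 1024):
    """Download url to dest, continuing from any partial file already on disk.

    Returns the number of bytes in dest when the transfer ends.
    """
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url)
    if offset:
        # Ask the server for everything from the current end of the file onward.
        request.add_header("Range", f"bytes={offset}-")
    try:
        response = urllib.request.urlopen(request)
    except urllib.error.HTTPError as err:
        if err.code == 416:
            # Range not satisfiable: usually the local file already covers the whole resource.
            return offset
        raise
    with response:
        if response.status == 206:
            mode = "ab"  # server honoured the range, so append
        else:
            mode = "wb"  # server is sending the whole file, so start over
            offset = 0
        with open(dest, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                offset += len(chunk)
    return offset
```

With the URL from the log above, this would be called as `resume_download(url, "full_chocophlan.v296_201901b.tar.gz")` and simply re-run after each dropped connection. Comparing the returned size with the size the server reports is a cheap first check before any checksum comparison.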