I have a metagenomics dataset from an intervention study and want to do functional annotation and pathway analysis. I’m very new to HUMAnN, so I tried to install it with conda following the instructions at this link: humann3 – The Huttenhower Lab
The installation seems to be OK: when I ran the HUMAnN unit tests, I got the output below.
----------------------------------------------------------------------
Ran 186 tests in 56.759s
OK
However, 2 errors occurred when I ran the demo and tried to upgrade the database.
Error during demo run
I ran the HUMAnN demo with the command:
humann -i demo.fastq -o sample_results
Output
CRITICAL ERROR: Can not find input file selected: /home/pipeline/humann/demo.fastq
I also cannot find any file named “demo.fastq”. Based on the instructions, the demo dataset and databases should be installed together with HUMAnN 3.0. Am I wrong?
Cannot upgrade the database
I ran the command below to upgrade the database, following the instructions.
humann_databases --download chocophlan full /bin/ --update-config yes
Output
Creating subdirectory to install database: /bin/chocophlan
CRITICAL ERROR: Unable to create directory: /bin/chocophlan
It seems that the directory cannot be created. However, I have write permission on bin/, and it’s under my home directory.
drwxrwxr-x 2 claire claire 4 Aug 29 2020 bin/
Any suggestions or comments on how to solve these issues would be highly appreciated.
Dear Claire, concerning your first question, I think that the demo.fastq data has to be downloaded from here: https://github.com/biobakery/humann/raw/master/examples/demo.fastq.gz. The demo databases, on the other hand, should be downloaded and formatted when you first run HUMAnN on the data.
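For anyone following along, fetching the demo reads could look roughly like the sketch below. The URL is the one given in this reply; the function name fetch_demo and the choice to unpack into the current directory are only illustrative assumptions.

```shell
# URL taken from the reply above.
demo_url="https://github.com/biobakery/humann/raw/master/examples/demo.fastq.gz"

# Hypothetical helper: download the compressed demo reads into the
# current directory and unpack them to demo.fastq.
fetch_demo() {
    wget -O demo.fastq.gz "$demo_url" && gunzip -f demo.fastq.gz
}

# Usage (run from the directory where HUMAnN should find the input):
#   fetch_demo
#   humann -i demo.fastq -o sample_results
```

The humann command in the usage comment is the same one from the original post; it only fails there because demo.fastq was not present in the working directory.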
As for the second question, maybe I’m wrong, but from your output it seems that you are trying to write into the /bin/ folder, which contains the system binaries and is owned by root. You probably have to specify it as bin/ (without the leading “/”) or, better, specify the full path (~/bin).
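A quick way to avoid this kind of path mix-up is to create the target directory yourself and confirm it is writable before starting a long download. This is only a sketch: the directory name humann_dbs is a made-up example, and any location you own would work.

```shell
# Hypothetical database location under the home directory.
db_dir="$HOME/humann_dbs"

# Create it if needed, then confirm the current user can write to it.
mkdir -p "$db_dir"
if [ -w "$db_dir" ]; then
    echo "OK: $db_dir is writable"
else
    echo "Cannot write to $db_dir" >&2
fi

# Then pass the absolute path to the command from the original post:
#   humann_databases --download chocophlan full "$db_dir" --update-config yes
```

Using an absolute path built from $HOME removes any ambiguity between /bin/ (the system directory) and bin/ relative to wherever the command happens to be run.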
Hope this helps
Thanks a bunch for your suggestions. I managed to download the demo data and run it.
Regarding the second issue, I changed the path and the download started successfully. However, the download failed for all 3 databases.
After searching this forum, I realized that I’m not the only one who has difficulty downloading the HUMAnN databases. Several people report the same issue when upgrading the humann3 databases in this post: https://forum.biobakery.org/t/difficulty-downloading-databases-in-humann3/1343/32
Now I’m trying to download from a dropbox provided by someone in this forum. http://cmprod1.cibio.unitn.it/databases/HUMAnN/
Just wondering whether you have the md5sums for the 3 files, so that I can check whether I got the complete files.
full_chocophlan.v296_201901b.tar.gz
full_mapping_v201901b.tar.gz
uniref90_annotated_v201901b_full.tar.gz
@lauren.j.mciver I’m also tagging you to see whether you could provide the md5sums of the 3 humann3 databases, so that those who have difficulty upgrading the humann3 databases can download them and check that the files are complete.
I downloaded the 3 databases with wget, decompressed the files, and configured them as you mentioned in another post.
However, when I run the command below, I get a critical error.
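Until official checksums are posted, a partial check is still possible: gzip -t verifies the CRC and length stored inside a .gz file, which catches truncated downloads, and md5sum prints a value that can be compared with other users or with a published checksum later. The helper name check_archive below is only illustrative, and md5sum is assumed to be available (as on most Linux systems).

```shell
# Hypothetical helper: test a .tar.gz for truncation or corruption,
# then print its MD5 so it can be compared against a reference value.
check_archive() {
    if gzip -t "$1" 2>/dev/null; then
        echo "intact: $1"
    else
        echo "truncated or corrupt: $1" >&2
        return 1
    fi
    md5sum "$1"
}

# Usage with the three files listed above:
#   for f in full_chocophlan.v296_201901b.tar.gz \
#            full_mapping_v201901b.tar.gz \
#            uniref90_annotated_v201901b_full.tar.gz; do
#       check_archive "$f"
#   done
```

Passing gzip -t shows the file is a complete, internally consistent gzip stream; it does not prove the file matches what the server intended to publish, which is what a reference md5sum would add.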
humann -i Sample1_merged.fa -o 01_demo/
CRITICAL ERROR: The directory provided for ChocoPhlAn contains files ( map_eggnog_uniref90.txt.gz ) that are not of the expected version. Please install the latest version of the database: 201901b
Hi Claire, sorry, that error message is a bit confusing. HUMAnN checks that 1) the ChocoPhlAn files are of the expected version and 2) the directory contains only files for the ChocoPhlAn database. I will combine the two error messages so the error is clearer. Your error comes from the second check. The ChocoPhlAn directory is expected to contain only the files of the ChocoPhlAn database, that is, only files with species names (i.e. matching '^[g__][s__]'). If you move the mapping files (like “map_eggnog_uniref90.txt.gz”) out of the ChocoPhlAn folder, it should resolve the issue.
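As a rough sketch of that fix, the mapping files can be moved into their own folder. The function name move_mapping_files is made up for illustration, and the map_ prefix is an assumption based only on the file named in the error message; any other non-ChocoPhlAn files in the folder would need to be moved the same way.

```shell
# Hypothetical helper: move mapping files out of the ChocoPhlAn folder.
#   $1: ChocoPhlAn directory   $2: destination for the mapping files
move_mapping_files() {
    mkdir -p "$2"
    for f in "$1"/map_*; do
        [ -e "$f" ] || continue   # nothing matched: skip the literal pattern
        mv "$f" "$2"/
    done
}

# Usage (both paths are placeholders):
#   move_mapping_files /path/to/chocophlan /path/to/utility_mapping
# If the mapping folder changes, the HUMAnN config likely needs updating too:
#   humann_config --update database_folders utility_mapping /path/to/utility_mapping
```

After the move, the ChocoPhlAn directory should hold only the species-named files, which is what the second check expects.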