Thank you for your reply and thoughts/comments.
Please find the outcome of your suggestions below.
I first used pip to install HUMAnN 4.0 alpha:
pip install humann==4.0.0a1 --no-binary :all:
I then confirmed the versions:
humann --version
humann v4.0.0.alpha.1
metaphlan --version
MetaPhlAn version 4.1.1 (11 Mar 2024)
Databases, as reported by humann_config:
HUMAnN Configuration ( Section : Name = Value )
database_folders : nucleotide = /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403/
database_folders : protein = /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/uniref/
database_folders : utility_mapping = /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/databases/utility/utility_mapping/
Notably, the ChocoPhlAn folder looks like this:
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 3978538203 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.1.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 5311606700 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.2.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 100610041 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.3.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 2655803348 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.4.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 2951357081 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.fna.bz2
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 65502658 Apr 4 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.pkl
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 3978538203 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.rev.1.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 5311606700 Aug 29 2024 mpa_vOct22_CHOCOPhlAnSGB_202403.rev.2.bt2l
-rw-rw-r--+ 1 rb979 rds-XUr6B1Jhndg-managers 44092 Feb 22 2023 mpa_vOct22_CHOCOPhlAnSGB_202403_VINFO.csv
-rw-rw----+ 1 rb979 rds-XUr6B1Jhndg-managers 881484571 Feb 12 15:54 mpa_vOct22_CHOCOPhlAnSGB_202403_VSG.fna
So, humann_test works well; however, it created a file called “_3_reactions”, which is a new development.
Ran 176 tests in 75.820s
# Reaction HUMAnN v HUMAnN_test
UNMAPPED 100.0000000
UNGROUPED 1912.1816208
UNGROUPED|g__Bacteroides.s__Bacteroides_thetaiotaomicron 943.6975327
UNGROUPED|g__Bacteroides.s__Bacteroides_stercoris 794.9003455
You mentioned trying to run MetaPhlAn on its own, so I did the following:
metaphlan ../sg_metagenomics_Boston24/sg_raw_data_to_deposit/$1_1.fastq.gz,../sg_metagenomics_Boston24/sg_raw_data_to_deposit/$1_2.fastq.gz \
--input_type fastq \
--unclassified_estimation \
--add_viruses \
--index mpa_vOct22_CHOCOPhlAnSGB_202403 \
--bowtie2db databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403 \
--bowtie2out bowtie_outputs_humann/$1.bt2.bz2 \
-o metaphlan_results/metaphlan_mpa_vOct22_CHOCOPhlAnSGB_202403/$1/april-profiled_metagenome_$1.txt \
--nproc 8
This ran successfully.
The problem arises when running HUMAnN:
humann \
--input humann/merged_paired_ends/$1.fastq.gz \
--output humann/results/$1/ \
--bowtie-options '--threads 8' \
--metaphlan-options '--bowtie2db databases/chocophlan_mpa_vOct22_CHOCOPhlAnSGB_202403 --index mpa_vOct22_CHOCOPhlAnSGB_202403'
This results in the following error:
Output files will be written to: /rds/project/rds-XUr6B1Jhndg/rb979_Microbiome/Metagenomics/metaphlan/humann/results
Decompressing gzipped file ...
Removing spaces from identifiers in input file ...
CRITICAL ERROR: The directory provided for ChocoPhlAn does not contain files of the expected format (ie '^SGB').
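To make the error easier to reproduce, here is a minimal sketch of a check for whether a folder contains any entries matching the pattern quoted in the message. It assumes that '^SGB' refers to file names beginning with SGB, which is my reading of the error rather than something I have confirmed in the HUMAnN source. The function name count_sgb_files is hypothetical.

```shell
# Hypothetical helper: count entries in a directory whose names start
# with "SGB", the pattern quoted in the HUMAnN error message.
# Assumption: '^SGB' in the error refers to file names, not file contents.
count_sgb_files() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # ls -1 prints one name per line; grep -c counts the lines matching ^SGB.
    # grep exits non-zero when the count is 0, so "|| true" keeps the
    # function's exit status clean while still printing the count.
    ls -1 "$dir" | grep -c '^SGB' || true
}
```

Calling it as count_sgb_files followed by the path set for database_folders : nucleotide prints the number of matching entries. Every file in the folder listing above starts with mpa_, so I would expect it to print 0 there.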
Could you please advise on what the problem might be?