I am running a number of shotgun metagenomic samples through a custom pipeline that includes HUMAnN2. The command for the primary humann2 run is:

humann2 --input ${sample}.fastq.gz --taxonomic-profile ${sample}_profile.tsv --output $humann2_output --threads 8 --remove-temp-output --search-mode uniref90 --output-basename $sample
I have other steps in the workflow to regroup and rename, but all of my initial ${sample}_genefamilies.tsv outputs contain quite a few UniRef50 rows, including some that already have names, e.g.:

UniRef50_K1TBF9: Transposase (Fragment)	1059.7162421954

A typical file has ~400k rows, ~50k of which are UniRef50s, and about half of those UniRef50s have names.
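(For what it's worth, this is the kind of quick sanity check I use to quantify the mix; the table below is made-up example rows in the genefamilies format, not real output:)

```shell
# Build a tiny mock genefamilies table (hypothetical IDs/values for illustration)
cat > example_genefamilies.tsv <<'EOF'
# Gene Family	example_Abundance-RPKs
UniRef90_A0A009QSN8	12.5
UniRef50_K1TBF9: Transposase (Fragment)	1059.7162421954
UniRef50_Q9X2H5	3.1
EOF

# Count rows by UniRef namespace
grep -c '^UniRef50_' example_genefamilies.tsv   # prints 2
grep -c '^UniRef90_' example_genefamilies.tsv   # prints 1
```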
These persist through humann2_renorm_table, and then when I run humann2_rename_table (expecting UniRef90s), they are all converted to, e.g., UniRef50_K1TBF9: NO_NAME.
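(For context, the downstream steps are roughly the following; paths and the sample variable are placeholders from my workflow:)

```
humann2_renorm_table --input ${sample}_genefamilies.tsv --output ${sample}_genefamilies_relab.tsv --units relab
humann2_rename_table --input ${sample}_genefamilies_relab.tsv --output ${sample}_genefamilies_named.tsv --names uniref90
```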
Is there something about my database configuration that might be causing this?
$ humann2_config
HUMAnN2 Configuration ( Section : Name = Value )
output_format : remove_stratified_output = False
output_format : output_max_decimals = 10
output_format : remove_column_description_output = False
alignment_settings : prescreen_threshold = 0.01
alignment_settings : translated_query_coverage_threshold = 90.0
alignment_settings : evalue_threshold = 1.0
alignment_settings : translated_subject_coverage_threshold = 50.0
database_folders : utility_mapping = /pool001/vklepacc/databases/utility_mapping/
database_folders : protein = /pool001/vklepacc/databases/uniref/
database_folders : nucleotide = /pool001/vklepacc/databases/chocophlan/
run_modes : bypass_nucleotide_search = False
run_modes : verbose = False
run_modes : resume = False
run_modes : bypass_translated_search = False
run_modes : bypass_nucleotide_index = False
run_modes : threads = 1
run_modes : bypass_prescreen = False
$ humann2_databases
HUMAnN2 Databases ( database : build = location )
utility_mapping : full = http://huttenhower.sph.harvard.edu/humann2_data/full_mapping_1_1.tar.gz
chocophlan : DEMO = http://huttenhower.sph.harvard.edu/humann2_data/chocophlan/DEMO_chocophlan.v0.1.1.tar.gz
chocophlan : full = http://huttenhower.sph.harvard.edu/humann2_data/chocophlan/full_chocophlan_plus_viral.v0.1.1.tar.gz
uniref : DEMO_diamond = http://huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref_annotated/uniref90_DEMO_diamond.tar.gz
uniref : uniref90_diamond = http://huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref_annotated/uniref90_annotated_1_1.tar.gz
uniref : uniref50_ec_filtered_diamond = http://huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref_ec_filtered/uniref50_ec_filtered_1_1.tar.gz
uniref : uniref50_GO_filtered_rapsearch2 = http://huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref50_GO_filtered/uniref50_GO_filtered_rapsearch2.tar.gz
uniref : uniref50_diamond = http://huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref_annotated/uniref50_annotated_1_1.tar.gz
uniref : uniref90_ec_filtered_diamond = http://huttenhower.sph.harvard.edu/humann2_data/uniprot/uniref_ec_filtered/uniref90_ec_filtered_1_1.tar.gz
Thanks!