HUMAnN: which reference database? Why so many unmapped reads?

Hello everyone,

I’m getting started with HUMAnN and, as I understand it, it can be used with different reference databases: UniRef90, UniRef50, and KEGG.

I would like to ask which one is recommended.

So far, I have tried these commands:

humann2_databases --download chocophlan full /home/noe/Desktop/Shotgun_Lucia/humann2database/chocophlan

parallel -j 1 'humann2 --metaphlan-options "--bt2_ps very-sensitive-local --min_alignment_len 50" --eta --threads 12 --input {} --output humann2_out/{/.} --memory-use maximum' ::: fastq

I obtained a very high proportion of unmapped reads (68%); I have attached the results.
humann2_pathabundance_relab_unstratified.txt (15.0 KB)
Is it normal or could it be due to an inappropriate reference database?

Thank you in advance,


I recommend UniRef90 for human-associated communities (or anything else that is comparably well studied) and UniRef50 for everything else.
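
As a rough sketch of how that choice translates into commands, assuming the same humann2 release used in the question: the directory and the sample file name below are hypothetical placeholders, not paths from the original post.

```shell
# Hypothetical download location; adjust to your own setup.
DB_DIR="$HOME/humann2database/uniref"

# Download the DIAMOND-formatted UniRef90 protein database.
# For less well-studied communities, use uniref50_diamond instead.
humann2_databases --download uniref uniref90_diamond "$DB_DIR"

# Run humann2 on one sample (sample.fastq is a placeholder name).
# The download step normally records the database location in the
# humann2 configuration; if it did not, pass the folder that holds the
# downloaded files explicitly with --protein-database.
humann2 --input sample.fastq --output humann2_out/sample --threads 12
```

Only one protein database is used per run, so switching between UniRef90 and UniRef50 means downloading the other one and pointing the run at it.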