Getting 67% unaligned reads with HUMAnN 3.0

Hi @fbeghini, @franzosa!
I am getting 67.78% unaligned reads with HUMAnN 3.0 after the nucleotide search tier. After the translated search (against the uniref90_ec_filtered database), this dropped to 63.05%.
Is this unusually high? If so, what should I do?
What is a normal range of unaligned reads at this step?


What environment are your reads from?

I am working with gut metagenomes.

That is a bit high for the gut in my experience. In a recent analysis of some gut metagenomes with HUMAnN 3, I saw ~55-75% of reads mapping to pangenomes (i.e. 25-45% unmapped; both ranges are IQRs).

What sequencing depth are you working with? If it’s particularly low, it’s possible that many genes are failing to reach the 50% minimum coverage threshold, so their reads are treated as unmapped.
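
To illustrate the coverage point, here is a minimal Python sketch. It is not HUMAnN's actual implementation: the function names and the interval representation of alignments are hypothetical, and only the 50% threshold comes from the reply above. It shows how a gene hit by only a few reads can fall below the threshold, in which case its reads would not count as mapped.

```python
def covered_fraction(gene_length, alignments):
    """Fraction of gene positions covered by at least one alignment.

    alignments: iterable of (start, end) pairs, 0-based and half-open,
    given in gene coordinates (a hypothetical representation).
    """
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(alignments):
        # Clip to the gene and skip positions that were already counted.
        start = max(start, current_end, 0)
        end = min(end, gene_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / gene_length


def passes_coverage(gene_length, alignments, threshold=0.5):
    """True if the covered fraction reaches the threshold (50% here)."""
    return covered_fraction(gene_length, alignments) >= threshold


# A 1000 bp gene hit by two 150 bp reads: 30% covered, below the threshold.
shallow = [(0, 150), (400, 550)]
# The same gene hit by four non-overlapping reads: 60% covered, above it.
deeper = [(0, 150), (200, 350), (400, 550), (600, 750)]

print(passes_coverage(1000, shallow))
print(passes_coverage(1000, deeper))
```

With shallow sequencing, more genes end up in the first situation, which would inflate the unaligned fraction.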

My files are of different sizes. Some are around 2 Gb and some around 4 Gb. Should I use files with the same sequencing depth? Should I rarefy/normalize them before the HUMAnN run? If so, how do I do that?


No need to rarefy. We tend to exclude files with very low sequencing depths (where something appears to have gone wrong, e.g. <1M reads) but otherwise you should be OK with variable depths (downstream normalization will correct for this).
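
As a rough sketch of the two ideas above (flagging very shallow samples and normalizing afterwards), something like the following Python could be used. The function names are hypothetical, the read counter assumes standard four-line FASTQ records, and the 1M cutoff is just the example value mentioned above, not a fixed rule.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming four lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def flag_low_depth(read_counts, min_reads=1_000_000):
    """Return the names of samples with fewer than min_reads reads."""
    return sorted(name for name, n in read_counts.items() if n < min_reads)


def to_relative_abundance(abundances):
    """Sum-normalize one sample's abundances so they add up to 1.

    This removes the effect of sequencing depth when comparing samples.
    """
    total = sum(abundances.values())
    if total == 0:
        return {feature: 0.0 for feature in abundances}
    return {feature: value / total for feature, value in abundances.items()}


# Hypothetical read counts for three samples.
depths = {"sampleA": 12_500_000, "sampleB": 640_000, "sampleC": 25_000_000}
print(flag_low_depth(depths))  # samples to inspect or exclude

# Hypothetical abundances for two features in one sample.
print(to_relative_abundance({"featureX": 30.0, "featureY": 10.0}))
```

In practice, HUMAnN ships its own `humann_renorm_table` script for the normalization step; the function here only illustrates the idea.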

