Very low mapping rate of HUMAnN v4 alpha

Hello, I tried the new HUMAnN 4.0.0a1.

Surprisingly I get a very low mapping rate.

READS_UNMAPPED in the genefamilies output accounts for around 97%.

OK, I have some pig gut samples where I expect a lower mapping rate, but even the human control samples, which are public samples, show a mapping rate below 10%.

Is this to be expected?

This would be a very low mapping rate for human microbiome samples. Are you sure they’ve been properly QCed (adapters removed, quality trimmed, and so forth)? That’s a common cause of low mapping rates across the board.

Hi @SilasK,

Did you check this forum thread? READS_UNMAPPED Is Not Presented in CPM Format - #2 by franzosa

I was also really scared for a second because I'm using HUMAnN v4.0.0.a, and when I initially normalized to relative abundance, the mapping rate in my UniRef genefamilies output looked extremely low. However, that thread explains that, in HUMAnN 4 specifically, the READS_UNMAPPED entry in the genefamilies output is in raw reads, whereas the gene families themselves are normalized to copies per million (CPM). Because the raw read count can be much greater than 1 million, the mapping rate can look much lower than it truly is. You can look in the log file for the true mapping rate of the genefamilies output.
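To make the unit mismatch concrete, here is a minimal sketch of what happens when the raw-read READS_UNMAPPED value is treated as if it were on the same CPM scale as the gene families. The table contents, the sample column name, and the helper `naive_unmapped_fraction` are all made up for illustration; the sketch also assumes a two-column, tab-separated layout in which stratified rows contain a `|` and are skipped to avoid double counting.

```python
import csv
import io

# Hypothetical genefamilies-style table (values invented for illustration):
# READS_UNMAPPED is a raw read count, the gene families are in CPM.
EXAMPLE_TSV = (
    "# Gene Family\tsample_Abundance-CPM\n"
    "READS_UNMAPPED\t30000000\n"
    "UniRef90_A\t600000\n"
    "UniRef90_A|g__X.s__Y\t600000\n"
    "UniRef90_B\t400000\n"
)

def naive_unmapped_fraction(tsv_text, unmapped_label="READS_UNMAPPED"):
    """Return (unmapped, other_total, fraction) as if all rows shared one unit."""
    unmapped = 0.0
    other_total = 0.0
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header line
        feature, value = row[0], float(row[1])
        if feature == unmapped_label:
            unmapped = value
        elif "|" not in feature:
            # assumption: rows containing "|" are per-taxon breakdowns
            other_total += value
    return unmapped, other_total, unmapped / (unmapped + other_total)

unmapped, other_total, fraction = naive_unmapped_fraction(EXAMPLE_TSV)
print(f"gene families sum to {other_total:.0f} CPM")
print(f"naive 'unmapped' share: {fraction:.1%}")  # 96.8%
```

With 30 million raw reads next to gene families that sum to 1 million CPM, the naive share comes out at about 97% no matter how many reads actually mapped, so this number says nothing about the real mapping rate. That still has to come from the log file.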

Hope this helps,

Gillian