Percentage mapped reads in RPK or raw read count?

Jigyasa3 · February 27, 2021, 10:20am

Hey All!

I wanted to ask whether the percentage of mapped reads reported after mapping to a database is in RPK or raw read counts. I went through the tutorial and found the following line:

"The “UNMAPPED” value is the total number of reads which remain unmapped after both alignment steps (nucleotide and translated search). Since other gene features in the table are quantified in RPK units, “UNMAPPED” can be interpreted as a single unknown gene of length 1 kilobase recruiting all reads that failed to map to known sequences."

Does this mean that when I report that 50% of reads mapped to bacteria in my sample, the figure is actually in RPK units and not raw reads?

Looking forward to your reply.
Regards
Jigyasa

franzosa · March 1, 2021, 4:46pm

Before you normalize the genefamilies abundances, the value for “UNMAPPED” is the literal count of reads that HUMAnN didn’t assign to any sequence (note: these reads may have had hits to sequences, but HUMAnN rejected the hits as low-confidence). To convert that number to a percentage of unmapped reads, divide it by the total number of sequencing reads in the sample and multiply by 100.
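As a minimal sketch of that division: the function name and the read counts below are hypothetical, not HUMAnN output, and the total read count is assumed to come from your own count of the input reads.

```python
def percent_unmapped(unmapped_reads: float, total_reads: int) -> float:
    """Return the percentage of sequencing reads left unmapped."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * unmapped_reads / total_reads


# Hypothetical sample: 2,000,000 input reads, UNMAPPED value of 1,000,000
# taken from the un-normalized genefamilies table.
pct_unmapped = percent_unmapped(1_000_000, 2_000_000)
pct_mapped = 100.0 - pct_unmapped
print(f"{pct_unmapped:.1f}% unmapped, {pct_mapped:.1f}% mapped")
```

Both numbers here are raw read counts, so the resulting percentage is in reads, not RPK.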

When you normalize the file, that’s when we’re treating that UNMAPPED count as if it were in RPK units (like the other genes in the file) so that it makes sense to compute a sum over the file. If you normalize to relative abundance units, the new value for UNMAPPED would be close to the true % unmapped value described above, but off by a little bit because the RPK values for the genes in the file are not the same as raw read counts. Put another way, the sum over genes’ RPK values is not the same as the total number of reads in the sample.
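A toy calculation may make the discrepancy concrete. All numbers below are made up, and RPK is simplified to reads divided by gene length in kilobases, which may not match HUMAnN's exact computation.

```python
# Toy sample (all numbers hypothetical): 1,000 reads in total.
total_reads = 1000
unmapped_reads = 400

# Mapped reads per gene and gene length in kilobases.
genes = {
    "geneA": {"reads": 300, "length_kb": 1.2},
    "geneB": {"reads": 300, "length_kb": 0.8},
}

# Simplified RPK: reads divided by gene length in kb.
rpk = {name: g["reads"] / g["length_kb"] for name, g in genes.items()}

# UNMAPPED is treated as one gene of length 1 kb, so its "RPK" equals its read count.
table_sum = unmapped_reads + sum(rpk.values())

relab_unmapped = unmapped_reads / table_sum   # after normalizing the table
true_unmapped = unmapped_reads / total_reads  # from raw read counts

print(f"relative abundance of UNMAPPED: {relab_unmapped:.3f}")
print(f"true fraction of unmapped reads: {true_unmapped:.3f}")
```

In this toy case the two genes recruit 600 reads but their RPK values sum to 625, so the normalized UNMAPPED value (about 0.390) lands near, but not exactly on, the true unmapped fraction (0.400).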
