Are these UniRef90s assigned to the unclassified stratification, meaning that their abundance was identified from translated search? If so, it is possible that they represent residual host contamination. Conversely, UniRef90 abundance assigned to specific species is much less likely to be host-derived.
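One quick way to check this is to compute, for each UniRef90, how much of its stratified abundance sits in the `unclassified` row. The following is only a rough sketch: it assumes a single-sample, tab-separated stratified gene families table in which stratified rows look like `UniRef90_X|g__Genus.s__Species` or `UniRef90_X|unclassified`. The function name `unclassified_fraction` and the example values are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def unclassified_fraction(tsv_text):
    """Map each gene family to the fraction of its stratified
    abundance that is labelled 'unclassified' (first sample column only)."""
    totals = defaultdict(float)
    unclassified = defaultdict(float)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        # Skip blank lines and the '# Gene Family' header line.
        if not row or row[0].startswith("#"):
            continue
        feature, abundance = row[0], float(row[1])
        # Rows without '|' are community totals, not stratifications.
        if "|" not in feature:
            continue
        family, stratum = feature.split("|", 1)
        totals[family] += abundance
        if stratum == "unclassified":
            unclassified[family] += abundance
    return {f: unclassified[f] / totals[f] for f in totals if totals[f] > 0}


# Made-up example table, not real data.
EXAMPLE = (
    "# Gene Family\tsample_Abundance-RPKs\n"
    "UniRef90_A\t10.0\n"
    "UniRef90_A|g__Bacteroides.s__Bacteroides_dorei\t2.0\n"
    "UniRef90_A|unclassified\t8.0\n"
    "UniRef90_B\t5.0\n"
    "UniRef90_B|g__Escherichia.s__Escherichia_coli\t5.0\n"
)

fractions = unclassified_fraction(EXAMPLE)
for family, frac in sorted(fractions.items()):
    print(family, round(frac, 2))
```

UniRef90s whose abundance is mostly or entirely unclassified would be the ones worth checking against host sequences first.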
Since you’re dealing with RNA, you’ll want to host-deplete against a human transcript database in addition to the human genome. Some host reads in your sample may span exon-exon junctions of spliced transcripts, in which case the read might not map to the genome (where the exons are separated by introns and therefore not adjacent).
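To make the junction issue concrete, here is a toy illustration with made-up sequences. It uses exact substring matching as a stand-in for read mapping, which is a deliberate simplification: real aligners tolerate mismatches and may partially align such reads, so this only shows why a junction-spanning read can be missed by a genome-only reference.

```python
# Toy sequences, invented for illustration only.
EXON1 = "ATGGCCAAGT"
INTRON = "GTAAGTCCTTTTTCAG"
EXON2 = "CCTGATTGAA"

genome = EXON1 + INTRON + EXON2   # exons separated by an intron
transcript = EXON1 + EXON2        # spliced mRNA: exons are adjacent

# A read that starts in exon 1 and ends in exon 2 (spans the junction).
junction_read = EXON1[-6:] + EXON2[:6]


def maps_exactly(read, reference):
    """Crude stand-in for alignment: exact substring match."""
    return read in reference


print("maps to genome:    ", maps_exactly(junction_read, genome))
print("maps to transcript:", maps_exactly(junction_read, transcript))
```

In this toy case the read is found only in the spliced transcript, which is why a transcript database can catch host reads that a genome-only depletion step leaves behind.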