HUMAnN3 minor updates

fanhuan · June 29, 2020, 7:47am

Hi there,

Very excited about HUMAnN3!

I hope you could help me understand one of the features updated: “Pangenome sequences must be covered at >50% of sites to be reported (tunable)”. Is it referring to [--translated-identity-threshold <Automatically: 50.0 or 80.0, Custom: 0.0-100.0>]? Does it have any relation with [--identity-threshold <50.0>] option in HUMAnN2? I guess what I am truly asking is if I let both run on default params, is one going to be more stringent than the other?

Thank you in advance for your help!

Best,
Huan

franzosa · June 29, 2020, 1:17pm

We split out the identity and coverage-filter params in v3.0 to make it clearer which phase of the search each applies to (and to make them separately tunable). Translated search in v3.0 against UniRef90 is slightly more permissive (less conservative), with the identity threshold lowered from 90% to 80%.

This also tends to be more biologically realistic, as proteins in the same UniRef90 family won’t be 90% identical in all of their read-length windows; rather, they are 90% identical on average. Hence we expect some reads to align at <90% identity.
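
To make the "90% on average" point concrete, here is a small toy sketch (not HUMAnN code). It assumes two already-aligned sequences of equal length and a made-up window of 30 residues standing in for a read-length window; the sequences, the clustered mismatches, and the helper name `window_identities` are all illustrative assumptions.

```python
def window_identities(a, b, window):
    """Percent identity of every full-length sliding window over two
    equal-length, already-aligned sequences (toy helper, hypothetical)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [x == y for x, y in zip(a, b)]
    return [
        100.0 * sum(matches[i:i + window]) / window
        for i in range(len(matches) - window + 1)
    ]


# Toy pair: 100 residues with 10 mismatches clustered in one region,
# so the pair is 90% identical overall.
ref = "A" * 100
alt = "A" * 40 + "G" * 10 + "A" * 50

overall = 100.0 * sum(x == y for x, y in zip(ref, alt)) / len(ref)
idents = window_identities(ref, alt, window=30)

# Count windows that would fail a 90% cutoff vs. an 80% cutoff.
below_90 = sum(i < 90.0 for i in idents)
below_80 = sum(i < 80.0 for i in idents)

print(f"overall identity: {overall:.1f}%")
print(f"windows below 90%: {below_90} of {len(idents)}")
print(f"windows below 80%: {below_80} of {len(idents)}")
```

In this toy case the pair is 90% identical overall, yet many individual windows fall below 90%, and fewer fall below 80%. Real proteins will not have mismatches clustered this neatly, but uneven conservation along the sequence has the same qualitative effect.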
