More than one reaction missing in pathway but pathway still shows up


Firstly, thanks for this tool - it’s super useful.

I’m a little confused about the way pathways are determined.

I noticed that the pathway “L-ascorbate biosynthesis IV (animals, D-glucuronate pathway)” is being identified in a number of my samples, which I thought was a bit weird, but I assumed this might be due to gap filling (which only allows for one missing reaction, right?).

But on closer inspection of the reactions (using humann_regroup_table) and this file - more than one non-spontaneous reaction is missing from this pathway in all samples (RXN3DJ-64, GLUCURONATE-REDUCTASE, L-GULONOLACTONE-OXIDASE).

However, I then realised it was more complicated than that with some reactions being optional? (how is this determined?)

For example, for this pathway:

So does this mean that only PHOSPHOGLUCMUT-RXN, GLUC1PURIDYLTRANS-RXN, UGD-RXN and RXN-8783 are required?

If so, why does turning gap-filling off then make this pathway disappear, despite a number of samples with all four of these reactions?

Cheers and thanks in advance!

Sorry for the slow reply. Reactions being “optional” is something we infer from the MetaCyc annotations. HUMAnN lets optional reactions contribute to the pathway’s abundance when they are detected, but it does not punish the pathway’s abundance when they are absent.

It looks like there are five non-optional reactions in the pathway:

  3. UGD-RXN
  5. RXN-8783

So if you have gap-filling on, then we will not punish the pathway for missing exactly one of these five. But since it sounds like only four were detected, if you turn off gap-filling then we would no longer report this pathway.
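To make that rule concrete, here is a toy sketch of the presence logic as described in this reply. It is not HUMAnN's actual code, and it only models whether a pathway is reported, not how its abundance is computed. The function name `pathway_reported` is made up for illustration, and `FIFTH-REQUIRED-RXN` is a hypothetical placeholder for the fifth non-optional reaction, which is not named in this thread. Optional reactions are simply left out of `required`, since their absence is never counted against the pathway.

```python
def pathway_reported(required, detected, gap_fill=True):
    """Toy rule: tolerate one missing required reaction only when gap-filling is on."""
    missing = set(required) - set(detected)
    allowed_missing = 1 if gap_fill else 0
    return len(missing) <= allowed_missing


# Non-optional reactions for the pathway discussed above.
# "FIFTH-REQUIRED-RXN" is a hypothetical placeholder, not a real reaction ID.
required = [
    "PHOSPHOGLUCMUT-RXN",
    "GLUC1PURIDYLTRANS-RXN",
    "UGD-RXN",
    "RXN-8783",
    "FIFTH-REQUIRED-RXN",
]

# A sample in which only four of the five required reactions were detected.
detected = {
    "PHOSPHOGLUCMUT-RXN",
    "GLUC1PURIDYLTRANS-RXN",
    "UGD-RXN",
    "RXN-8783",
}

print(pathway_reported(required, detected, gap_fill=True))
print(pathway_reported(required, detected, gap_fill=False))
```

With four of five required reactions detected, the first call reports the pathway and the second does not, which matches what you are seeing when you switch gap-filling off.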