Unfiltered MetaCyc pathway mapping files

I am using HUMAnN v4.0 alpha. I notice that the only pathway file in the utility mapping files is metacyc_pathways_structured_filtered_v24_subreactions. Some pathways listed in map_metacyc-pwy_name.txt.gz seem to be missing from this file.

I believe that when you filter out pathways with fewer than 4 RXNs, you miss some important environmental ones, such as N2 fixation and anammox, which contain 1-3 RXNs. Is the unfiltered file available anywhere? What is the logic behind requiring at least 4 RXNs?

I managed to get it from HUMAnN 3.

Glad you found what you were looking for. I forget exactly how we settled on four RXNs as the filter for MetaCyc in HUMAnN 2, but some issues we considered were 1) having more RXNs provides a more robust average measure of the pathway’s abundance and 2) gap-filling is trickier / riskier with very small pathways. Those are both ways of saying that it’s harder to get a good estimate of pathway coverage and abundance for small pathways.

I think we also figured that, for very small pathways, you could always investigate the abundances of the component RXNs instead. Indeed, in HUMAnN 4 we now provide RXN abundances as a new default output file to aid in this sort of analysis.
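
For anyone who wants to try this, here is a rough Python sketch of looking up component RXN abundances for pathways that fall below the four-RXN cutoff. It is not HUMAnN code. It assumes a flat, tab-separated pathway file (pathway ID followed by its RXN IDs, which is not the structured format of the file named above) and a two-column RXN abundance table with a `#` header line. The IDs (`PWY-SMALL`, `RXN-A`, and so on), the function names, and the choice to report missing RXNs as 0.0 are all hypothetical, so adjust the parsing to match your actual files.

```python
import csv
import io

MIN_RXNS = 4  # the cutoff discussed in this thread


def read_pathway_map(handle):
    """Return {pathway_id: [rxn_id, ...]} from an assumed flat TSV layout."""
    pathways = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and header/comment lines
        pathways[row[0]] = [rxn for rxn in row[1:] if rxn]
    return pathways


def read_abundances(handle):
    """Return {rxn_id: abundance} from an assumed two-column TSV."""
    abundances = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue
        abundances[row[0]] = float(row[1])
    return abundances


def small_pathway_report(pathways, abundances, min_rxns=MIN_RXNS):
    """For each pathway with fewer than min_rxns RXNs, list its RXN abundances.

    RXNs absent from the abundance table are reported as 0.0 (an assumption).
    """
    return {
        pwy: {rxn: abundances.get(rxn, 0.0) for rxn in rxns}
        for pwy, rxns in pathways.items()
        if len(rxns) < min_rxns
    }


# Toy inputs with made-up IDs; replace with open("your_file.tsv") handles.
pathway_text = (
    "PWY-SMALL\tRXN-A\tRXN-B\n"
    "PWY-LARGE\tRXN-C\tRXN-D\tRXN-E\tRXN-F\n"
)
abundance_text = "# Reaction\tsample1\nRXN-A\t12.5\nRXN-C\t3.0\n"

pathways = read_pathway_map(io.StringIO(pathway_text))
abundances = read_abundances(io.StringIO(abundance_text))
report = small_pathway_report(pathways, abundances)

for pwy, rxn_abundances in report.items():
    print(pwy, rxn_abundances)
```

Reporting each RXN separately, rather than averaging them, avoids having to trust a pathway-level estimate built from only one to three reactions.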