Pathway abundances using HUMAaN3

Brandon.Kim · February 14, 2022, 2:29am

Hi,

I am thinking about a passway analysis using HUMAnN3. Would it be possible to get an intermediate file that tells us what and how many gene families are involved in each path of a pathway? It will be like an abundance table of gene families for each path.

lauren.j.mciver · February 14, 2022, 9:45pm

Hello, There is a tool in the HUMAnN package named “humann_unpack_pathways”. If you run this tool it will generate a file of pathways stratified by the genes that are included in each.

Thanks!
Lauren

Brandon.Kim · February 15, 2022, 12:22pm

Thank you, Laren! It helps a lot. May I ask some follow-up questions?

I am not sure if I understand how to get the pathway abundances from the gene abundances.
Here is an example. The following table is what I got for PWY-4203 from humann_unpack_pathways with a provided demo file.

PWY-4203: volatile benzenoid biosynthesis I (ester formation) 22.7546997826
PWY-4203 unclassified 22.7546997826
PWY-4203 unclassified UniRef90_O64988 14.1999417238
PWY-4203 unclassified UniRef90_Q8GT21 21.6993097732
PWY-4203 unclassified UniRef90_Q9SPV4 14.3476630042
PWY-4203 unclassified UniRef90_Q6XMI3 19.8996848181

From the file “metacyc_pathways”, I found that the pathway is associated with RXN-6724, RXN-6723, RXN-6722, and RXN-6762.
And, another file “metacyc_reactions_level4ec_only.uniref” tells me the reaction-gene family connections:
RXN-6724 and UniRef90_Q8GT21;
RXN-6723 and UniRef90_Q6XMI3, UniRef90_Q9SPV4;
RXN-6722 and UniRef90_Q6XMI3;
RXN-6762 and UniRef90_O64988.
So, the abundances of each reaction are [21.6993097732, 19.8996848181+14.3476630042, 19.8996848181, 14.1999417238]
The number of complete copies should be 14.1999417238, which is supposed to be the abundance of PWY-4203. However, it does not match the result on the table. What am I doing wrong?

The tutorial says pathways with zero abundance are not included in the file.
Does that mean, if any of the RXNs in a pathway has zero abundance, the pathway won’t be included ? I guess that is why only a very small number of gene families are detected in the pathway table.
I believe the abundance tables are based on the comparisons between reads and representative genes of each family. Do the tables reflect any exceptional cases like comparisons with non-representative genes? I was curious because I found that A0A174PVG2 (given as UniRef90_A0A174PVG2 in the demo gene family abundance table) is not actually a gene family but an element of the gene family, UniRef90_A0A5P3AUM1, according to the UniRef server online (UniRef - Cluster: Uncharacterized protein (90%)).

Topic		Replies	Views
Humann output files HUMAnN	1	72	August 2, 2024
Is "pathabundance" associated with "genefamilies"? HUMAnN	4	989	March 23, 2021
Calculating pathway abundances from modified RXN/UniRef90 files HUMAnN	1	33	November 8, 2024
Obtain the pathway abundance file for some specific EC HUMAnN	1	355	January 24, 2023
Confusion with HUMAnN 'regroup_table' and higher-level pathway information HUMAnN	1	1196	February 2, 2024

Pathway abundances using HUMAaN3

Related topics