Gene family overlap in path abundance?

NBaileyNCL · November 26, 2020, 11:51am

Hi,

I couldn’t find this explicitly stated in the documentation:

I presume that, for pathway abundances, genes with identified functions can be assigned to more than one pathway, and therefore the quantification is somewhat overlapping?

How could you take this into consideration when analysing results statistically? (I’d love to correlate differences in pathway abundance with a continuous metadata variable I have for the samples.)

I used limma-trend on logCPM values calculated from the RPK values for analysing gene family abundance. Is this suitable, or would a different method be preferable?

All help is much appreciated.

NBaileyNCL · December 29, 2020, 10:11am

Bumping this thread for attention

franzosa · January 4, 2021, 7:29pm

Apologies for missing this question before the bump.

You’re correct about genes potentially contributing to multiple reactions which themselves can contribute to multiple pathways. We typically don’t take any special steps to compensate for these phenomena, instead just sum-normalizing pathways to relative abundance (or CPM) units and then performing metadata associations with MaAsLin 2 or other microbiome-appropriate methods. Sum-normalizing has the effect of adjusting for sequencing depth, but also allows you to focus on relative coverage of pathways within and between samples.
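To make the sum-normalization step concrete, here is a minimal sketch in plain Python. The pathway names and RPK values are made up for illustration, and `sum_normalize` is a hypothetical helper, not part of HUMAnN.

```python
def sum_normalize(abundances, scale=1_000_000):
    """Rescale one sample's abundances so they sum to `scale`.

    scale=1_000_000 gives CPM units; scale=1 gives relative abundance.
    """
    total = sum(abundances.values())
    if total == 0:
        raise ValueError("cannot normalize a sample whose abundances sum to zero")
    return {name: value / total * scale for name, value in abundances.items()}


# Hypothetical pathway abundances (RPK) for a single sample.
sample_rpk = {"PWY-A": 120.0, "PWY-B": 60.0, "PWY-C": 20.0}

sample_cpm = sum_normalize(sample_rpk)
for name, value in sample_cpm.items():
    print(f"{name}\t{value:.1f}")
```

Applied per sample (per column of the pathway table), this removes the dependence on sequencing depth. On real tables, HUMAnN's bundled `humann_renorm_table` script performs this kind of rescaling.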

An alternative approach would be to sum-normalize your genes to CPM units and then run them BACK through HUMAnN to directly compute pathway abundance in CPM units. This approach is a little bit “cleaner” in that it doesn’t count any reads twice when adjusting for compositionality, and may also better handle data sets with large differences in the fraction of read mass assigned to pathways.
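The difference between the two orders of operation can be seen in a toy example. This is only a sketch: the gene names, values, and pathway memberships are invented, and `toy_pathways` simply sums member genes, which is a deliberate simplification and not how HUMAnN actually computes pathway abundance.

```python
def to_cpm(values):
    """Sum-normalize a dict of abundances to copies per million."""
    total = sum(values.values())
    return {k: v / total * 1_000_000 for k, v in values.items()}


def toy_pathways(gene_abund, membership):
    """Toy stand-in for pathway quantification: sum of member genes.

    ASSUMPTION: a simplification for illustration only; HUMAnN's real
    pathway computation is different.
    """
    return {p: sum(gene_abund[g] for g in genes) for p, genes in membership.items()}


# Hypothetical gene abundances (RPK); gene "g2" belongs to both pathways.
genes_rpk = {"g1": 10.0, "g2": 20.0, "g3": 30.0}
membership = {"P1": ["g1", "g2"], "P2": ["g2", "g3"]}

# Approach 1: quantify pathways from RPK, then sum-normalize the pathways.
# The denominator is the pathway total (80), in which g2 is counted twice.
pathways_then_cpm = to_cpm(toy_pathways(genes_rpk, membership))

# Approach 2: sum-normalize genes to CPM first, then quantify pathways.
# The denominator is the gene total (60), so no gene is counted twice.
cpm_then_pathways = toy_pathways(to_cpm(genes_rpk), membership)

print(pathways_then_cpm)
print(cpm_then_pathways)
```

In the first approach the pathway values sum to one million by construction. In the second they need not, because each value is expressed relative to total gene abundance rather than to the total over pathways.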

Both approaches are valid, and we typically use the former for simplicity. As long as you describe your choice clearly (and are aware of the potential pros and cons), you should be in good shape.
