Orthogonality filtering for HUMAnN pathways

Hi, I used OneCodex to do WGS on mouse fecal samples, and processed the data on their platform through the HUMAnN3 program. In comparing the pathways identified between my three study groups (control n=5 independent mice; treatment #1 n=5 independent mice; treatment #2 n=7 independent mice), there are 233 pathways that are detected in all 3 groups. I would like to statistically compare the total abundance (RPK) of each of the 233 pathways between the 3 groups to see which comparisons are significantly different.

Because genes are shared among the 233 pathways (identified by MetaCyc pathway ID), a biostatistician suggested that I first do orthogonality filtering on the HUMAnN pathways, narrowing the list of 233 down to those that contain unique gene sets so that I maintain more power in my analysis. After doing that, I would run a one-way ANOVA on the remaining pathways, with post hoc Bonferroni or FDR correction.

Does a script already exist for orthogonality filtering of HUMAnN pathways? Any advice or help on this would be much appreciated! Thank you in advance.

One caveat first: make sure to normalize your data so you aren’t comparing the raw RPK units (which are sensitive to sequencing depth). If you want to keep these units, you would want to include sequencing depth as a covariate in your model to adjust for it that way.
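To make the normalization concrete, here is a minimal sketch of sum-normalizing each sample to relative abundance. It assumes the pathway table has already been loaded into a plain dict of pathway ID to per-sample RPK values; the function name `to_relative_abundance` and the toy numbers are made up for illustration. HUMAnN also ships a `humann_renorm_table` utility for this, so check its help text before rolling your own.

```python
def to_relative_abundance(table):
    """Scale each sample's values so they sum to 1.

    `table` maps pathway ID -> list of RPK values, one per sample,
    with the samples in the same order in every list (an assumed layout).
    """
    n_samples = len(next(iter(table.values())))
    # Per-sample totals across all pathways in the table
    totals = [sum(values[j] for values in table.values()) for j in range(n_samples)]
    return {
        pathway: [v / t if t > 0 else 0.0 for v, t in zip(values, totals)]
        for pathway, values in table.items()
    }


# Toy RPK values (made up): the second sample was sequenced twice as deeply
rpk = {"PWY-A": [10.0, 20.0], "PWY-B": [30.0, 60.0]}
relab = to_relative_abundance(rpk)
```

After normalization the two toy samples look identical, which is the point: the depth difference is gone. Which rows go into the per-sample total (for example, whether unmapped or unintegrated rows are included) changes the result, so that choice is worth making deliberately.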

For the orthogonality filter: we don’t have a script for that. The way I have done it in the past is to sort pathways by mean abundance, and then kick out any pathway that has a cross-sample Spearman correlation > 0.9 (for example) with a higher-ranked pathway. You can also kick out pathways that correlate strongly with the abundance of an individual species from MetaPhlAn, as such pathways may be acting as proxies for the species abundance rather than reflecting an interesting community-level function. Clustering your features and then picking one representative feature from each cluster would be an equally good approach.
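A rough sketch of the greedy filter described above, in plain Python so it runs without extra packages. This is not an official HUMAnN script. The function names, the dict layout, and the toy values are all made up for illustration. It also makes two assumptions the description leaves open: each pathway is compared only against higher-ranked pathways that were kept, and the signed correlation is used rather than its absolute value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values (1-based), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: treated as uncorrelated (a choice made here)
    return cov / (var_x * var_y) ** 0.5


def orthogonality_filter(table, threshold=0.9):
    """Keep pathways in order of mean abundance, dropping any pathway whose
    Spearman correlation with an already-kept pathway exceeds `threshold`."""
    ranked = sorted(table, key=lambda p: mean(table[p]), reverse=True)
    kept = []
    for pathway in ranked:
        if all(spearman(table[pathway], table[k]) <= threshold for k in kept):
            kept.append(pathway)
    return kept


# Toy normalized abundances (made up), five samples per pathway
table = {
    "PWY-A": [5.0, 9.0, 7.0, 12.0, 10.0],
    "PWY-B": [2.0, 4.0, 3.0, 6.0, 5.0],   # same sample ordering as PWY-A
    "PWY-C": [3.0, 1.0, 2.5, 1.5, 2.0],   # different pattern
}
kept = orthogonality_filter(table)
```

In this toy table PWY-B tracks PWY-A perfectly across samples and is dropped, while PWY-C is kept. The same loop can be reused for the species filter by correlating each pathway against the MetaPhlAn species abundances instead of against other pathways. With only 17 samples the correlation estimates are noisy, so treat the 0.9 cutoff as a tunable example rather than a fixed rule.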