MetaCyc pathway hierarchical structure like KEGG pathway map file?

Chi · April 2, 2021, 3:21pm

Dear bioBakery forum,

First of all, thanks for the awesome tools you are developing.

I want to use the MetaCyc pathway abundance data from HUMAnN for further analysis (e.g. LEfSe) that relies on the pathway hierarchy, such as MetaCyc: DHGLUCONATE-PYR-CAT-PWY: glucose degradation ----> ?, analogous to the KEGG map file (e.g. ko00052:Galactose_metabolism ----> Carbohydrate_metabolism ----> Metabolism). I think enrichment analysis at higher levels is important for quickly capturing the functional information in a data set. However, I cannot find such a file on the MetaCyc website or in other tools. I have integrated the KEGG pathway map file, applied to HUMAnN results, into my R package file2meco (GitHub - ChiLiubio/file2meco: Tranform files to the microtable object in microeco package) and into microeco, to make downstream data analysis easier for more researchers in this field. Do you know whether a similar map file is available for MetaCyc? Sorry if I am missing something important.

Chi · April 8, 2021, 6:29am

Dear guys,

Because there was no ready-made MetaCyc pathway mapping file, I built one by collecting the superclasses from the MetaCyc pathway web pages. This mapping file is now available in the R package file2meco (GitHub - ChiLiubio/file2meco: Tranform files to the microtable object in microeco package) under data/MetaCyc_pathway_map.RData. Additionally, MetaCyc and KEGG pathway enrichment analyses of HUMAnN results are both supported by the file2meco function humann2meco() and by the downstream analysis in the microeco package (GitHub - ChiLiubio/microeco: An R package for data analysis in microbial community ecology). Currently, only the top two superclasses are used, because 524 of the 2703 pathways, such as FERMENTATION-PWY and ENTNER-DOUDOROFF-PWY, have more than one mapping structure. The top two superclasses are generally relatively constant across these structures and are very useful for top-level enrichment analysis, so this approach is conservative. Any suggestions are welcome.
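
As a rough illustration of what such a mapping file makes possible, here is a minimal base-R sketch that rolls pathway abundances up to the top superclass. Everything in it is toy data: the pathway IDs are taken from this thread, but the superclass labels, the column names (Superclass1, Superclass2) and the abundance values are assumptions made for illustration, not the actual contents of MetaCyc_pathway_map.

```r
# Toy mapping table: pathway -> top two superclasses (labels are illustrative only)
pathway_map <- data.frame(
  pathway = c("DHGLUCONATE-PYR-CAT-PWY", "ENTNER-DOUDOROFF-PWY", "GLUTATHIONESYN-PWY"),
  Superclass1 = c("Degradation", "Degradation", "Biosynthesis"),
  Superclass2 = c("Carbohydrate", "Carbohydrate", "Cofactor"),
  stringsAsFactors = FALSE
)

# Toy pathway abundance table (rows = pathways, columns = samples)
abund <- matrix(
  c(10, 20,
     5, 15,
     1,  4),
  nrow = 3, byrow = TRUE,
  dimnames = list(pathway_map$pathway, c("S1", "S2"))
)

# Look up the Superclass1 label of every pathway in the abundance table
superclass1 <- pathway_map$Superclass1[match(rownames(abund), pathway_map$pathway)]

# Sum the abundances of all pathways that share the same Superclass1 label
abund_by_superclass1 <- rowsum(abund, group = superclass1)
print(abund_by_superclass1)
```

The same roll-up can be repeated with Superclass2 as the grouping column to get the second level.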

Chi · April 8, 2021, 7:00am

Actually, I would also like to split the multiple mapping structures and sum the abundance for each of them. For example, GLUTATHIONESYN-PWY has two lines of superclasses on its MetaCyc pathway page (MetaCyc glutathione biosynthesis).

The two routes have different numbers of classes, which makes it extremely difficult to align the superclass levels and calculate the abundance for enrichment. If the numbers were the same, it would be relatively easy to calculate the abundance with a one-to-many mapping, which is already handled by the cal_abund() function in the microeco package.

Chi · April 16, 2021, 3:18pm

A similar topic, whose recent reply also refers to this mapping file: MetaCyc hierarchy to invetigate/identify specific pathways

sarahi · March 15, 2022, 11:20pm

Hi Chi, this is brilliant and almost exactly what I needed. I’m just wondering if it’s possible to create these plots based on relative abundance?

Chi · March 16, 2022, 4:18am

Sure. Please give it a try and feel free to tell me if you run into any problems.

sarahi · March 17, 2022, 12:03am

I think it works. I’m trying to get the percentage output for this data, but I have no idea how to do it. How would I go about this?

Chi · March 20, 2022, 9:56am

How about running the HUMAnN metagenomic example in Chapter 8 file2meco package | Tutorial for R microeco package (v0.7.0)?
I think the cal_abund function may be what you need. Both RPK and relative abundance are supported.
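
For the percentage question itself, the arithmetic behind relative abundance is simply each value divided by its sample total. The base-R sketch below shows only that arithmetic on made-up RPK values; it is not the cal_abund implementation, and the row and column names are hypothetical.

```r
# Toy RPK table (rows = pathway classes, columns = samples); values are made up
rpk <- matrix(
  c(3, 5,
    1, 5,
    6, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ClassA", "ClassB", "ClassC"), c("S1", "S2"))
)

# Divide every column by its total, then scale to percent
percent <- sweep(rpk, 2, colSums(rpk), "/") * 100
print(round(percent, 1))
```

Each column of the result sums to 100, so the values can be read directly as percentages per sample.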

Mohamed_S.AboHoussie · January 15, 2024, 6:43pm

Hi,
I have a question about the microeco R package code below: what do these variables refer to?
I have genefamilies.tsv, pathabundance.tsv and the coverage file, so how can I use them in this tutorial?

sample_file_path <- system.file("extdata", "example_metagenome_sample_info.tsv", package="file2meco")
match_file_path <- system.file("extdata", "example_metagenome_match_table.tsv", package="file2meco")

# MetaCyc pathway database based analysis
# use the raw data files stored inside the package for MetaCyc pathway database based analysis
abund_file_path <- system.file("extdata", "example_HUMAnN_MetaCyc_abund.tsv", package="file2meco")

Chi · January 16, 2024, 1:09am

Hi. It should be “pathabundance.tsv”, which has the pathway abundances.

Mohamed_S.AboHoussie · January 16, 2024, 10:51am

Yes, I know that… I’m asking which variables in your script, in the screenshot I sent before, correspond to genefamilies.tsv and pathabundance.tsv.

Chi · January 16, 2024, 2:20pm

The code in the screenshot only sets the file paths used in the example. You can ignore it and call the humann2meco function directly like this:

d1 <- humann2meco(feature_table = "your_pathabundance.tsv", db = "MetaCyc")

Chi · December 5, 2024, 5:42am

I have recently updated the file2meco package, manually curating the ontology information for all of the more than 3000 MetaCyc metabolic pathways. For metabolic pathways with multiple labels at the Superclass level, I have joined the labels with the characters “&&”. If you need to filter relevant metabolic pathways from the table, please match with regular expressions, because direct (exact) filtering can produce incorrect results. The “&&” separator is recognized automatically by the cal_abund function of the microtable class, which splits the labels and calculates the abundance for each one separately. So, if a metabolic pathway M has Superclass1 as A&&B, then the final RPK or relative abundance of both A and B will include M.
The command to view this updated table in R is file2meco::MetaCyc_pathway_map
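
To make the “&&” behaviour concrete, here is a minimal base-R sketch on a toy table. It only illustrates the idea described above (exact filtering misses multi-label rows, and a split pathway counts toward every label); it is not the cal_abund code, and the table, its column names and the values are hypothetical.

```r
# Toy table: pathway M carries two Superclass1 labels joined by "&&"
map <- data.frame(
  pathway = c("M", "N", "O"),
  Superclass1 = c("A&&B", "A", "B"),
  rpk = c(10, 5, 2),
  stringsAsFactors = FALSE
)

# Exact matching misses the multi-label pathway M
exact_hits <- map$pathway[map$Superclass1 == "A"]

# Regex matching: "A" as a whole label, bounded by "&&" or the ends of the string
regex_hits <- map$pathway[grepl("(^|&&)A(&&|$)", map$Superclass1)]

# Split multi-label rows so that M contributes to both A and B, then sum per label
labels <- strsplit(map$Superclass1, "&&", fixed = TRUE)
long <- data.frame(
  Superclass1 = unlist(labels),
  rpk = rep(map$rpk, times = lengths(labels)),
  stringsAsFactors = FALSE
)
totals <- tapply(long$rpk, long$Superclass1, sum)

print(exact_hits)
print(regex_hits)
print(totals)
```

Anchoring the pattern on “&&” or the string ends keeps a label such as A from also matching a longer label that merely contains it.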
