Differential abundance analysis of genes and pathways in metagenomics

Hello everyone! I am currently working with WGS samples obtained from human skin for my research project, which involves performing a functional analysis and identifying differentially abundant pathways between healthy and non-healthy samples. I used HUMAnN 3.0 to obtain gene family and pathway abundances, but I am having trouble deciding on the appropriate approach for the analysis.

Can you suggest the most suitable tests to identify the differentially abundant pathways? Should I use abundance as RPK or normalize to relative abundance? What are good methods for visualizing my data?

There are gene and pathway models in anpan. For the gene model, the raw abundance values from HUMAnN will be fine. For the pathway model, you’ll need to transform your data to log10 abundance and also provide overall species abundance (from MetaPhlAn or the like). My instinct is that log10 RPK will be preferable to log10 relative abundance, but it should be visually obvious on the plot if the transformation is inappropriate for the model. Both models have visualization tools outlined in the tutorial.
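As a rough illustration of the log10 step only, here is a minimal base R sketch. The table, its column names, and its values are made up for the example, and the exact input layout the anpan pathway model expects is not shown in this thread, so check the tutorial before adapting it.

```r
# Toy pathway table in RPK (hypothetical names and values;
# real input would come from HUMAnN)
path_rpk <- data.frame(
  pathway  = c("PWY-A", "PWY-B", "PWY-C"),
  sample_1 = c(100, 10, 0),
  sample_2 = c(1000, 1, 50)
)

# Every column except the pathway identifier holds abundances
sample_cols <- setdiff(names(path_rpk), "pathway")

# log10-transform the abundance columns only
path_log10 <- path_rpk
path_log10[sample_cols] <- lapply(path_rpk[sample_cols], log10)

# Zeros become -Inf under log10; count them so they can be handled deliberately
n_nonfinite <- sum(!is.finite(as.matrix(path_log10[sample_cols])))

print(path_log10)
print(n_nonfinite)
```

How zero abundances should be treated (filtering, a pseudocount, or something else) is not covered in this thread, so the sketch only flags them instead of picking a strategy.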

Hi @andrewGhazi

I tried to use anpan on my data, following the anpan tutorial, but I got an error with anpan_gene_model:

anpan_res = anpan(bug_file = bug_path,
                  meta_file = meta_path,
                  out_dir = "anpan_output",
                  covariates = NULL,
                  outcome = "Treatment",
                  model_type = "fastglm",
                  discretize_inputs = TRUE)

(1/3) Preparing the mise en place (checking inputs)…
Error in basename(bug_file) : a character vector argument expected

Any suggestions?

Thanks

I can’t tell what you have in your bug_path variable, but it’s not a path to a bug’s gene family file as intended. Read through the package vignette with anpan::anpan_vignette() to see an example.

Thanks for your reply and suggestion.

It’s a gene families file: the HUMAnN 3 output, normalized to relative abundance using biobakery workflows.

Regards,

bug_path needs to be a path to the file, e.g. "/path/to/example_file.tsv". I assume you’re passing it to anpan as a matrix or data frame or something.
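If the table is already loaded in R, one way around this is to write it back to disk and pass the file path instead. The sketch below uses only base R; the data frame, its column names, and the temporary file location are hypothetical stand-ins, and the column layout anpan actually expects should be taken from the vignette.

```r
# Toy stand-in for a gene family table held in memory (hypothetical values)
gene_df <- data.frame(
  gene     = c("UniRef90_A", "UniRef90_B"),
  sample_1 = c(0.25, 0.75),
  sample_2 = c(0.5, 0.5)
)

# Passing the data frame itself to basename() fails, which is what
# happens when a data frame is supplied where a path is expected
err <- tryCatch(basename(gene_df), error = function(e) e)

# Write the table to disk as a tab-separated file
bug_path <- file.path(tempdir(), "example_file.tsv")
write.table(gene_df, bug_path, sep = "\t", quote = FALSE, row.names = FALSE)

# bug_path is now a length-one character vector pointing at an existing file
stopifnot(is.character(bug_path), length(bug_path) == 1, file.exists(bug_path))
print(basename(bug_path))
```

The resulting bug_path string is what would then go into the bug_file argument of the call above.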

Thanks a lot for the clarification.
Yes, it was a data frame.

Regards