I have been working on the development of statistical models for microbiome data analysis. I am currently developing a statistical model for omics data analysis with one of my students.
While looking for interesting datasets to illustrate our model, I found the datasets used in your paper, Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases (Nature, 2019). I am especially interested in the metagenomics and metatranscriptomics datasets, which showed a high association in the paper. I see that merged tables for both datasets are available from https://ibdmdb.org/.
These merged tables contain relative abundances estimated by MetaPhlAn for the metagenomics data and RPKs for the metatranscriptomics data.
I wonder whether you also have the data as estimated or raw counts. In my experience analyzing 16S sequencing data, the way each sample’s sequencing depth is normalized can affect the final inferences. Our method does model-based sample normalization, so normalization prior to analysis is not required, and working with counts gives us more flexibility. Can I find count data on the website?