Missing values in metabolomics data

thibgo · February 9, 2024, 2:02pm

Hello,

I am trying to understand what causes the missing data points in the metabolomics data.

Is it because the value was below the detection threshold of the instrument?
Are there other known reasons?

Here is a plot of the number of missing values for each metabolite in the 4 metabolomics data matrices.
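
For anyone who wants to reproduce this kind of count, here is a minimal sketch in plain Python. It assumes a matrix stored as one list of abundances per metabolite, with None marking a missing value. The metabolite names and numbers are made up for illustration and do not come from the IBDMDB files.

```python
# Hypothetical toy matrix: one list of abundances per metabolite,
# one entry per sample. None marks a missing value. All names and
# numbers are invented for illustration.
abundances = {
    "metabolite_A": [1200.0, None, 980.0, 1100.0],
    "metabolite_B": [None, None, 45.0, None],
    "metabolite_C": [300.0, 310.0, 295.0, 305.0],
}


def count_missing(matrix):
    """Return {metabolite: number of missing values} for a dict of lists."""
    return {
        name: sum(value is None for value in values)
        for name, values in matrix.items()
    }


missing_counts = count_missing(abundances)

# Print metabolites from most to least affected.
for name, n_missing in sorted(
    missing_counts.items(), key=lambda item: item[1], reverse=True
):
    print(f"{name}: {n_missing} missing out of {len(abundances[name])}")
```

The same counts could then be fed to any plotting tool to get one bar per metabolite.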

The authors of the publication
Application of Artificial Intelligence Modeling Technology Based on Multi-Omics in Noninvasive Diagnosis of Inflammatory Bowel Disease (Huang et al. 2021)
decided to replace each missing value with min/2 of the feature (half of that feature's minimum value), but they do not justify this choice.
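
To make the min/2 rule concrete, here is a minimal sketch of how I read it, not the authors' actual code. It assumes the minimum is taken over the observed (non-missing) values of each feature separately, and that a feature with no observed values at all is left untouched. The function name is hypothetical.

```python
def impute_half_min(values):
    """Replace each None with half of the smallest observed value.

    Assumption: the minimum is computed per feature, over the
    non-missing values only. If nothing was observed, the feature
    is returned unchanged because there is no minimum to halve.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    fill_value = min(observed) / 2
    return [fill_value if v is None else v for v in values]


# Hypothetical abundances for one metabolite across four samples.
feature = [1200.0, None, 980.0, 1100.0]
imputed = impute_half_min(feature)
print(imputed)
```

Note that this rule only makes sense if the missing values are believed to be low abundances, for example below the detection limit. If values are missing for another reason, filling them with a small number could bias the data, which is why the cause of the missingness matters here.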

thibgo · February 19, 2024, 11:00am

A recent publication using the same dataset applies a different strategy to missing metabolite abundance data: it removes every metabolite that contains missing data points.

Integration of multiview microbiome data for deciphering microbiome-metabolome-disease pathways. Fang et al., 2024
https://www.semanticscholar.org/paper/Integration-of-multiview-microbiome-data-for-Fang-Wang/ed03fe47f588bd288e8776ac1b9f908e311e6e0b

This choice is described in section 4.2 of the paper, page 17. It is not explicitly justified, and the paper gives no information on why some data points are empty.
As a result, the data go from 81,868 metabolites to 143 (because they also restrict to the metabolites that have an HMDB ID).
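
As a rough illustration of this second strategy, here is a minimal sketch that keeps only features with no missing values and a non-empty HMDB ID. This is my reading of the filtering described in the paper, not the authors' code. The record layout, the function name and the IDs are placeholders invented for the example.

```python
# Hypothetical feature records. The "hmdb_id" strings are placeholders,
# not real HMDB accessions, and the abundances are invented.
features = [
    {"name": "feature_1", "hmdb_id": "HMDB-placeholder-1", "values": [5.0, 6.1, 4.8]},
    {"name": "feature_2", "hmdb_id": "HMDB-placeholder-2", "values": [2.0, None, 2.2]},
    {"name": "feature_3", "hmdb_id": "", "values": [9.0, 8.7, 9.3]},
    {"name": "feature_4", "hmdb_id": "", "values": [None, None, 1.0]},
]


def keep_complete_annotated(records):
    """Keep records that have an HMDB ID and no missing abundance."""
    return [
        record
        for record in records
        if record["hmdb_id"]
        and all(value is not None for value in record["values"])
    ]


kept = keep_complete_annotated(features)
print(f"{len(kept)} of {len(features)} features kept")
```

Both conditions remove features, so a large drop in feature count does not say by itself how much is due to missing values and how much to missing annotations.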

So I am still wondering why some data points are empty, since knowing the reason would help choose an appropriate way to process the metabolomics dataset.
