The authors of the publication "Application of Artificial Intelligence Modeling Technology Based on Multi-Omics in Noninvasive Diagnosis of Inflammatory Bowel Disease" (Huang et al. 2021)
replaced every missing value with half the minimum observed value of the corresponding feature (min/2), but they do not justify this choice.
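To make the min/2 rule concrete, here is a minimal sketch of how I understand it. The table layout (a dict mapping each metabolite to its per-sample abundances, with `None` marking a missing value), the function name `impute_half_min`, and the toy values are my own assumptions for illustration, not taken from the paper.

```python
from typing import Dict, List, Optional

# Assumed layout: metabolite name -> abundance per sample, None = missing value.
Table = Dict[str, List[Optional[float]]]


def impute_half_min(table: Table) -> Dict[str, List[float]]:
    """Replace each missing value with half the minimum observed value of its feature."""
    imputed: Dict[str, List[float]] = {}
    for metabolite, values in table.items():
        observed = [v for v in values if v is not None]
        if not observed:
            # min/2 is undefined when a feature has no observed value at all.
            raise ValueError(f"{metabolite} has no observed values")
        fill = min(observed) / 2
        imputed[metabolite] = [fill if v is None else v for v in values]
    return imputed


# Toy data, invented for illustration only.
toy = {"met_a": [4.0, None, 10.0], "met_b": [None, 6.0, 2.0]}
imputed = impute_half_min(toy)
```

This rule implicitly assumes that a value is missing because the abundance fell below the detection limit, which is exactly the assumption I cannot verify for this dataset.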
A recent publication using the same dataset handles missing metabolite abundance data differently: it removes every metabolite that contains at least one missing data point.
This choice is described in section 4.2 of the paper (page 17). It is not explicitly justified, and the paper gives no information on why the empty data points exist.
As a result, the data go from 81,868 metabolites to 143 (because they also restrict the data to metabolites that have an HMDB ID).
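For comparison, the filtering strategy of the second paper could be sketched as follows. Again, the table layout, the function name `keep_complete_with_hmdb`, and the placeholder IDs are hypothetical and only meant to illustrate the two filters (no missing data point, and an HMDB ID present).

```python
from typing import Dict, List, Optional

# Assumed layout: metabolite name -> abundance per sample, None = missing value.
Table = Dict[str, List[Optional[float]]]


def keep_complete_with_hmdb(table: Table, hmdb_ids: Dict[str, Optional[str]]) -> Table:
    """Keep only metabolites with no missing value and a known HMDB ID."""
    kept: Table = {}
    for metabolite, values in table.items():
        is_complete = all(v is not None for v in values)
        has_hmdb_id = bool(hmdb_ids.get(metabolite))
        if is_complete and has_hmdb_id:
            kept[metabolite] = values
    return kept


# Toy data and placeholder IDs, invented for illustration only.
toy = {
    "met_a": [4.0, 5.0, 10.0],   # complete, has an ID -> kept
    "met_b": [None, 6.0, 2.0],   # one missing value -> dropped
    "met_c": [1.0, 3.0, 7.0],    # complete, but no ID -> dropped
}
toy_ids = {"met_a": "PLACEHOLDER_ID_1", "met_b": "PLACEHOLDER_ID_2", "met_c": None}
filtered = keep_complete_with_hmdb(toy, toy_ids)
```

Unlike min/2 imputation, this approach makes no assumption about why values are missing, but it discards most of the features.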
So I am still wondering why some data points are empty, since knowing the reason would help choose an appropriate way to process the metabolomics dataset.