Intelligent System

Development of an Open-Access and Explainable Machine Learning Prediction System to Assess the Mortality and Recurrence Risk Factors of Clostridioides difficile Infection Patients

Figure 1. Workflow for machine learning model selection and deployment

Identifying Clostridioides difficile infection (CDI) patients at risk of mortality or recurrence facilitates prevention and timely treatment and improves clinical outcomes. This paper establishes an open-access, web-based prediction system that estimates CDI patients’ mortality and recurrence outcomes and explains the machine learning predictions in terms of patient characteristics. Prognostic models were developed using four types of machine learning algorithms and a statistical logistic regression model, with data from over 15,000 CDI patients from 41 hospitals in Hong Kong. The boosting-based Gradient Boosting Machine (mortality AUC: 0.7878; recurrence AUC: 0.7076) outperformed the statistical model (mortality AUC: 0.7573; recurrence AUC: 0.6927) and the other machine learning algorithms. Because the difficulty of interpreting complex machine learning results has limited their use in medicine, we adopted Shapley additive explanations (SHAP) to identify which features are crucial to the machine learning models and to associate them with clinical findings. SHAP analysis showed that older age, reduced albumin levels, higher creatinine levels, and higher white blood cell count are the features most strongly associated with mortality, which is consistent with existing clinical findings. The open-access prediction system, which lets clinicians assess and interpret the risk factors of CDI patients, is now available at https://www.cdiml.care/.
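To make the idea behind SHAP concrete, the following is a minimal sketch of exact Shapley value attribution for a single patient. It is not the published model or the SHAP library: the scoring function `toy_risk`, its coefficients, the baseline patient, and the feature units are all hypothetical and chosen only so that the four features named above push risk in the reported direction. Features left out of a coalition are set to the baseline value, which is a simplifying assumption; SHAP itself averages over a background dataset and uses faster algorithms for tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition take their baseline value (an assumption
    made for this sketch). Cost grows as 2^n, so this suits only small n.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude the target.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                without = {f: (x[f] if f in coalition else baseline[f])
                           for f in names}
                with_target = dict(without)
                with_target[target] = x[target]
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_risk(p):
    """Hypothetical mortality risk score (NOT the model from the paper)."""
    score = (0.02 * (p["age"] - 60)             # older age raises risk
             - 0.05 * (p["albumin"] - 35)       # lower albumin raises risk
             + 0.004 * (p["creatinine"] - 80)   # higher creatinine raises risk
             + 0.03 * (p["wbc"] - 8))           # higher WBC count raises risk
    # Hypothetical interaction: elderly patients with low albumin.
    if p["age"] > 75 and p["albumin"] < 30:
        score += 0.5
    return score


# Hypothetical reference patient and patient of interest.
baseline = {"age": 60, "albumin": 35, "creatinine": 80, "wbc": 8}
patient = {"age": 82, "albumin": 26, "creatinine": 150, "wbc": 16}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>10}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is split evenly between age and albumin. This additivity is the property that lets a per-patient prediction be read as a set of per-feature contributions.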

Open-access prediction system for CDI patients: https://www.cdiml.care/

PUBLICATIONS

[1] Y.L. Ng, C.K. Lo, K.H. Lee, X. Xie, T.N.Y. Kwong, M. Ip, L. Zhang, J. Yu, J.J.Y. Sung, W.K.K. Wu, S.H. Wong, K.W. Kwok, “Development of an open-access and explainable machine learning prediction system to assess the mortality and recurrence risk factors of Clostridioides difficile infection patients,” Advanced Intelligent Systems 3(1): 2000188, 2020