Explainable Machine Learning Model for Predicting Prognosis in Patients with Malignant Tumors Complicated by Acute Respiratory Failure: Based on the eICU Collaborative Research Database in the United States

  • Abstract: Objective To develop and validate an explainable machine learning model for predicting intensive care unit (ICU) mortality risk in patients with malignant tumors complicated by acute respiratory failure (ARF). Methods Clinical data of patients with malignant tumors and ARF were extracted from the US eICU Collaborative Research Database, including demographic characteristics, comorbidities, and vital signs, laboratory test results, and major interventions within the first 24 hours after ICU admission. The study outcome was ICU death. Enrolled patients were randomly divided into a training set and a validation set at a ratio of 7:3. Predictor variables were selected using least absolute shrinkage and selection operator (LASSO) regression. Five machine learning algorithms were used to construct predictive models: extreme gradient boosting (XGBoost), support vector machine (SVM), logistic regression, multilayer perceptron (MLP), and C5.0 decision tree. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and other metrics. The best-performing model was then interpreted using the Shapley additive explanations (SHAP) algorithm. Results A total of 3196 patients with malignant tumors complicated by ARF were included, with 2261 in the training set and 935 in the validation set; 683 patients died during their ICU stay and 2513 survived. LASSO regression selected 12 variables closely associated with ICU outcome: presence of sepsis, use of vasoactive drugs, and the following values within the first 24 hours after ICU admission: minimum mean arterial pressure, maximum heart rate, maximum respiratory rate, minimum oxygen saturation, minimum serum bicarbonate, minimum blood urea nitrogen, maximum white blood cell count, maximum mean corpuscular volume, maximum serum potassium, and maximum blood glucose. Among the five models, XGBoost performed best. In the training and validation sets, respectively, its AUC for predicting ICU mortality risk was 0.940 and 0.763, its accuracy was 88.3% and 81.2%, and its sensitivity was 98.5% and 95.9%. Its predictive performance also remained the best in sensitivity analyses. SHAP analysis indicated that the five variables contributing most to the model's predictions were minimum oxygen saturation, minimum serum bicarbonate, minimum mean arterial pressure, use of vasoactive drugs, and maximum white blood cell count. Conclusions Using a large-scale dataset, this study developed an ICU mortality risk prediction model for patients with malignant tumors complicated by ARF and performed an explainability analysis of it. The model can help clinicians identify high-risk patients early and implement individualized interventions.
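The three metrics named in the abstract (AUC, accuracy, sensitivity) can be illustrated with a minimal sketch. This is not the authors' code: the toy labels and scores are invented, coding ICU death as the positive class (1) is an assumption, and so is the 0.5 decision threshold, since the abstract states neither.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_sensitivity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy = (TP + TN) / N; sensitivity = TP / (TP + FN).
    The threshold of 0.5 is an assumption made for illustration."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive labels")
    return (tp + tn) / len(y_true), tp / (tp + fn)


# Invented toy data: 1 = died in ICU (assumed positive class), 0 = survived.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

auc = roc_auc(y_true, scores)
acc, sens = accuracy_and_sensitivity(y_true, scores)
print(f"AUC={auc:.3f} accuracy={acc:.3f} sensitivity={sens:.3f}")
```

AUC does not depend on the threshold, whereas accuracy and sensitivity do, so the three figures reported for a model need not move together when the cut-off changes.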

     
