Abstract:
Objective To develop and validate a model for predicting intensive care unit (ICU) mortality risk in patients with malignant tumors complicated by acute respiratory failure (ARF) based on an explainable machine learning framework.
Methods Clinical data of patients with malignant tumors and ARF were extracted from the eICU Collaborative Research Database in the United States, including demographic characteristics, comorbidities, vital signs, laboratory test indicators, and major interventions within the first 24 hours after ICU admission. The study outcome was death in the ICU. Enrolled patients were randomly divided into a training set and a validation set at a ratio of 7:3. Predictor variables were selected using least absolute shrinkage and selection operator (LASSO) regression. Five machine learning algorithms, namely extreme gradient boosting (XGBoost), support vector machine (SVM), logistic regression, multilayer perceptron (MLP), and C5.0 decision tree, were used to construct predictive models. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and other metrics. The optimal model was further interpreted using the Shapley additive explanations (SHAP) algorithm.
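The split and the evaluation metrics named above can be illustrated with a minimal, standard-library Python sketch. It is not the study's code: the function names, the fixed seed, the 0.5 decision threshold, and the ten toy labels and scores are all assumptions made for illustration only.

```python
import random

def split_70_30(items, seed=0):
    """Shuffle a copy of items and split it 7:3 into (training, validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative assumption
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def auc(labels, scores):
    """AUC as the probability that a positive case is scored above a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_and_sensitivity(labels, scores, threshold=0.5):
    """Accuracy and sensitivity (true-positive rate) at a fixed threshold."""
    predicted = [int(s >= threshold) for s in scores]
    pairs = list(zip(labels, predicted))
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    false_neg = sum(1 for y, p in pairs if y == 1 and p == 0)
    correct = sum(1 for y, p in pairs if y == p)
    return correct / len(labels), true_pos / (true_pos + false_neg)

# Toy data (hypothetical, not from the study): 1 = died in ICU, 0 = survived.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

train_ids, valid_ids = split_70_30(range(10))
toy_auc = auc(toy_labels, toy_scores)
toy_acc, toy_sens = accuracy_and_sensitivity(toy_labels, toy_scores)
```

The pairwise definition of AUC used here is quadratic in the number of cases; it is chosen because it makes the meaning of the metric explicit, not because it is efficient.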
Results A total of 3196 patients with malignant tumors complicated by ARF were included. The training set comprised 2261 patients and the validation set 935 patients; 683 patients died during their ICU stay, while 2513 survived. LASSO regression selected 12 variables closely associated with ICU outcome: comorbid sepsis, use of vasoactive drugs, and 10 measurements from the first 24 hours after ICU admission (minimum mean arterial pressure, maximum heart rate, maximum respiratory rate, minimum oxygen saturation, minimum serum bicarbonate, minimum blood urea nitrogen, maximum white blood cell count, maximum mean corpuscular volume, maximum serum potassium, and maximum blood glucose). Among the five models, XGBoost performed best. Its AUCs for predicting ICU mortality risk in the training and validation sets were 0.940 and 0.763, respectively; accuracy was 88.3% and 81.2%; sensitivity was 98.5% and 95.9%. Its predictive performance also remained the best in sensitivity analyses. SHAP analysis indicated that the top five variables contributing to the model's predictions were minimum oxygen saturation, minimum serum bicarbonate, minimum mean arterial pressure, use of vasoactive drugs, and maximum white blood cell count.
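The SHAP contributions reported above are based on Shapley values, which average a feature's marginal effect over all coalitions of the other features. The sketch below computes exact Shapley values for a toy scoring function by enumerating those coalitions. It is an illustration of the underlying idea only: the SHAP library uses faster model-specific algorithms, and the three generic features and their weights are invented, not taken from the study.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

def toy_risk(present):
    """Hypothetical risk score: a baseline, two main effects, one interaction."""
    score = 0.10
    if "feature_a" in present:
        score += 0.20
    if "feature_b" in present:
        score += 0.10
    if "feature_a" in present and "feature_c" in present:
        score += 0.10  # interaction is shared equally between a and c
    return score

toy_features = ["feature_a", "feature_b", "feature_c"]
toy_phi = shapley_values(toy_risk, toy_features)
```

By construction the values sum to the difference between the score with all features present and the baseline score, which is the property that lets SHAP attribute an individual prediction to its inputs.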
Conclusions This study developed a mortality risk prediction model for ICU patients with malignant tumors complicated by ARF using a large-scale dataset and performed an explainability analysis. The model can help clinicians identify high-risk patients early and implement individualized interventions.