Abstract:
Objective To build a deep learning-based diagnostic model for gastric cancer and to evaluate its performance.
Methods Pathological sections from patients diagnosed with normal gastric mucosa, chronic gastritis, high-grade intraepithelial neoplasia or gastric adenocarcinoma by endoscopic examination at Zhejiang Provincial People's Hospital from January 2015 to January 2020 were retrospectively selected. The pathology slides were scanned at ×20 magnification to generate whole slide images (WSIs). The WSIs were randomly divided into a patch classification data set, a slide classification training set and a slide classification test set at a ratio of 2:2:1. After the lesion regions in the patch classification data set were annotated and patches were extracted, the patches were randomly divided into a training set, a test set and a validation set at a ratio of 20:1:1. EfficientNet and ResNet architectures were trained to construct a convolutional neural network (CNN) model for cancer versus non-cancer patch classification. The performance of the model was evaluated on the patch classification test set and validation set using patch classification accuracy and the area under the curve (AUC). The patch-level predictions of this model were stitched together to generate a cancer heat map for each WSI, from which slide-level features for cancer versus non-cancer classification were extracted. A LightGBM slide-level classifier was trained and evaluated on these features to identify gastric cancer in WSIs. Its results were evaluated by AUC, accuracy, sensitivity and specificity.
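As an illustration only, the 2:2:1 slide split and the heat-map step described in Methods could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the grid layout of patches, the 0.5 threshold and the three summary features are all hypothetical, and the abstract does not specify which heat-map features were passed to LightGBM.

```python
import random


def split_slides(slide_ids, ratios=(2, 2, 1), seed=0):
    """Randomly partition slide IDs into groups sized in proportion to `ratios`."""
    ids = list(slide_ids)
    random.Random(seed).shuffle(ids)
    total = sum(ratios)
    groups, start = [], 0
    for i, r in enumerate(ratios):
        # The last group takes the remainder so every slide is assigned.
        end = len(ids) if i == len(ratios) - 1 else start + len(ids) * r // total
        groups.append(ids[start:end])
        start = end
    return groups


def stitch_heatmap(patch_probs, n_rows, n_cols):
    """Place per-patch cancer probabilities on a grid (assumed layout).

    `patch_probs` maps (row, col) -> probability; cells without a
    prediction (e.g. background) stay at 0.0.
    """
    heatmap = [[0.0] * n_cols for _ in range(n_rows)]
    for (row, col), p in patch_probs.items():
        heatmap[row][col] = p
    return heatmap


def slide_features(heatmap, threshold=0.5):
    """Hypothetical slide-level features summarising a heat map."""
    values = [p for row in heatmap for p in row]
    return {
        "max_prob": max(values),
        "mean_prob": sum(values) / len(values),
        "frac_above": sum(p >= threshold for p in values) / len(values),
    }


# 1000 slides split 2:2:1 gives groups of 400, 400 and 200.
patch_set, slide_train, slide_test = split_slides(range(1000))
print(len(patch_set), len(slide_train), len(slide_test))

# Toy 2x2 heat map with made-up probabilities.
features = slide_features(stitch_heatmap({(0, 0): 0.9, (1, 1): 0.2}, 2, 2))
print(features)
```

Feature dictionaries of this kind, one per slide, are the sort of tabular input a gradient-boosted classifier such as LightGBM accepts; the actual feature set used in the study may differ.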
Results A total of 500 pathological sections of benign gastric diseases (normal gastric mucosa, chronic gastritis) and 500 pathological sections of gastric cancer (high-grade intraepithelial neoplasia and gastric adenocarcinoma) that met the inclusion and exclusion criteria were selected. The patch classification data set, slide classification training set and slide classification test set contained 400, 400 and 200 WSIs, respectively. The patch classification training set, test set and validation set contained 402 000, 20 000 and 20 000 patches, respectively. The CNN model based on the EfficientNet-B1 network structure achieved the highest patch classification accuracy (test set: 91.3%, 95% CI: 88.2%-95.4%; validation set: 92.5%, 95% CI: 89.0%-95.3%) and the highest AUC (test set: 0.95, 95% CI: 0.93-0.98; validation set: 0.96, 95% CI: 0.92-0.98). The AUC of the slide-level model based on the LightGBM algorithm was 0.98 (95% CI: 0.89-0.98), with an accuracy of 88.0% (95% CI: 81.6%-94.3%), a sensitivity of 100% (95% CI: 88.0%-100%) and a specificity of 67.0% (95% CI: 57.0%-85.0%).
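For reference, the slide-level accuracy, sensitivity and specificity reported above follow the standard confusion-matrix definitions, which can be sketched as below. The counts in the example are purely illustrative and are not taken from the study; the function name is hypothetical.

```python
def slide_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: cancer slides classified as cancer / non-cancer.
    tn/fp: non-cancer slides classified as non-cancer / cancer.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of cancer slides detected
        "specificity": tn / (tn + fp),   # share of benign slides cleared
    }


# Illustrative counts only (not the study's data).
metrics = slide_metrics(tp=90, fp=20, tn=80, fn=10)
print(metrics)
```

A sensitivity of 100% together with a lower specificity, as reported here, corresponds to no missed cancer slides (fn = 0) at the cost of some benign slides being flagged for review.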
Conclusion The CNN diagnostic model based on gastric biopsy pathology slides can locate cancerous tissue, accurately classify lesions at both the patch level and the slide level, and accurately identify gastric cancer; it has the potential to improve diagnostic efficiency.