Gastric Cancer Diagnostic Model for Pathological Biopsies Based on Convolutional Neural Network

  • Abstract:
      Objective  To build a deep learning-based model for diagnosing gastric cancer from gastric biopsy pathology slides and to evaluate its performance.
      Methods  Pathology slides from patients whose gastric biopsies were diagnosed as normal gastric mucosa, chronic gastritis, high-grade intraepithelial neoplasia, or gastric adenocarcinoma at Zhejiang Provincial People's Hospital between January 2015 and January 2020 were retrospectively collected. The slides were scanned at ×20 magnification to generate whole slide images (WSIs), which were randomly divided into a patch classification data set, a slide classification training set, and a slide classification test set at a ratio of 2:2:1. After lesion regions in the patch classification data set were annotated and patches were extracted, the patches were randomly divided into a training set, a test set, and a validation set at a ratio of 20:1:1. Patch-level convolutional neural network (CNN) models for classifying cancer versus non-cancer were built on the EfficientNet and ResNet architectures and evaluated by patch classification accuracy and the area under the receiver operating characteristic curve (AUC). Patch-level predictions from this model were stitched into a cancer heat map for each WSI, slide-level features for cancer versus non-cancer classification were extracted from the heat map, and a LightGBM classifier was trained on these features to diagnose whole gastric biopsy slides. Slide-level results were evaluated by AUC, accuracy, sensitivity, and specificity.
      Results  A total of 500 pathology slides of benign gastric disease (normal gastric mucosa, chronic gastritis) and 500 pathology slides of gastric cancer (high-grade intraepithelial neoplasia, gastric adenocarcinoma) met the inclusion and exclusion criteria. The patch classification data set, slide classification training set, and slide classification test set contained 400, 400, and 200 WSIs, respectively. The patch classification training, test, and validation sets contained 402 000, 20 000, and 20 000 patches, respectively. Among the patch-level models, the CNN built on the EfficientNet-B1 architecture achieved the highest accuracy (test set: 91.3%, 95% CI: 88.2%-95.4%; validation set: 92.5%, 95% CI: 89.0%-95.3%) and the highest AUC (test set: 0.95, 95% CI: 0.93-0.98; validation set: 0.96, 95% CI: 0.92-0.98). The LightGBM-based model identified gastric cancer at the slide level with an AUC of 0.98 (95% CI: 0.89-0.98), an accuracy of 88.0% (95% CI: 81.6%-94.3%), a sensitivity of 100% (95% CI: 88.0%-100%), and a specificity of 67.0% (95% CI: 57.0%-85.0%).
      Conclusion  The CNN diagnostic model built on gastric biopsy pathology slides can localize cancerous tissue, classify lesions accurately at both the patch level and the slide level, and identify gastric cancer accurately. It has the potential to improve the efficiency of pathological diagnosis.
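
The slide-level steps described in the Methods can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the function names, the 0.5 probability threshold, and the three heat map features are hypothetical, since the abstract does not specify which features were extracted or how patches were placed on the grid. The sketch shows a 2:2:1 random split of slides, the stitching of per-patch cancer probabilities into a heat map, a few summary features of that heat map, and the sensitivity and specificity calculation.

```python
import random


def split_slides(slide_ids, ratios=(2, 2, 1), seed=0):
    """Shuffle slide IDs and split them by ratio (2:2:1 in the abstract)."""
    ids = list(slide_ids)
    random.Random(seed).shuffle(ids)
    total = sum(ratios)
    n_first = len(ids) * ratios[0] // total
    n_second = len(ids) * ratios[1] // total
    return (ids[:n_first],
            ids[n_first:n_first + n_second],
            ids[n_first + n_second:])


def stitch_heatmap(patch_probs, n_rows, n_cols):
    """Place per-patch cancer probabilities on a grid; unscored cells stay 0.0."""
    heatmap = [[0.0] * n_cols for _ in range(n_rows)]
    for (row, col), prob in patch_probs.items():
        heatmap[row][col] = prob
    return heatmap


def heatmap_features(heatmap, threshold=0.5):
    """Hypothetical slide-level features; the abstract does not list the real ones."""
    values = [p for row in heatmap for p in row]
    n_positive = sum(1 for p in values if p >= threshold)
    return {
        "max_prob": max(values),
        "mean_prob": sum(values) / len(values),
        "positive_fraction": n_positive / len(values),
    }


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Example: 1000 slides split 2:2:1, and one toy 2x2 heat map.
patch_set, slide_train, slide_test = split_slides(range(1000))
features = heatmap_features(stitch_heatmap({(0, 0): 0.9, (1, 1): 0.2}, 2, 2))
```

In a full pipeline, feature dictionaries like `features` would be collected for every slide in the slide classification training set and passed to a gradient-boosted classifier such as `lightgbm.LGBMClassifier`. That step is omitted here to keep the sketch free of third-party dependencies.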

     
