Zhejiang University, National Institutes for Food and Drug Control, Shanghai Changzheng Hospital. Expert Consensus on General Methods for Performance Evaluation of Artificial Intelligence Medical Devices (2023)[J]. Medical Journal of Peking Union Medical College Hospital, 2023, 14(3): 494-503. DOI: 10.12290/xhyxzz.2023-0137

Expert Consensus on General Methods for Performance Evaluation of Artificial Intelligence Medical Devices (2023)

Funds: National Research and Development Program of China (2019YFC0118800)

More Information
  • Corresponding authors: LIU Shiyuan, E-mail: cjr.liushiyuan@vip.163.com
    LI Jingli, E-mail: lijli@nifdc.org.cn
    WU Jian, E-mail: wujian2000@zju.edu.cn
    1. Department of Radiology, Shanghai Changzheng Hospital, Second Military Medical University, Shanghai 200003, China
    2. Medical Device Inspection Institute, National Institutes for Food and Drug Control, Beijing 102629, China
    3. School of Public Health, Zhejiang University, Hangzhou 310027, China

  • Received Date: March 18, 2023
  • Accepted Date: May 07, 2023
  • Available Online: May 15, 2023
  • Issue Publish Date: May 29, 2023
  • Artificial intelligence medical devices are evolving rapidly, and the methods used to evaluate their performance need to be standardized and updated. To promote the industry, support regulatory supervision, and improve the quality of artificial intelligence medical device products, Zhejiang University, in cooperation with professional institutions including the National Institutes for Food and Drug Control, and relying on the centralized unit for artificial intelligence medical device standardization technology, led an effort to analyze the common problems in performance evaluation and to summarize the related test methods. Based on the consensus of the expert group, this paper describes the test methods and their applications in detail and explains how test data are sampled. The aim is to unify understanding, promote the standardization of performance evaluation methods for artificial intelligence medical devices, and ultimately support the high-quality development of these devices.