LUO Xufei, LYU Han, SONG Zaiwei, LIU Hui, WANG Zhixiang, LI Haodong, WANG Ye, ZHU Di, ZHANG Lu, CHEN Yaolong. The Impact of Generative Artificial Intelligence on the Development, Evaluation, and Application of Clinical Practice Guidelines[J]. Medical Journal of Peking Union Medical College Hospital, 2024, 15(5): 1173-1181. DOI: 10.12290/xhyxzz.2024-0602

The Impact of Generative Artificial Intelligence on the Development, Evaluation, and Application of Clinical Practice Guidelines

Funds: CAMS Innovation Fund for Medical Sciences-Research Unit of Evidence-based Evaluation and Guidelines (2021RU017)

  • Corresponding author:

    CHEN Yaolong, E-mail: chevidence@lzu.edu.cn

    ZHANG Lu, E-mail: ericluzhang@hkbu.edu.hk

  • Received Date: August 05, 2024
  • Accepted Date: September 13, 2024
  • Available Online: September 20, 2024
  • Publish Date: September 19, 2024
  • Issue Publish Date: September 29, 2024
  • Generative artificial intelligence (GAI) refers to AI technology that learns from training data to generate new content such as text, images, or audio. GAI tools show potential for fast, efficient literature screening, data extraction, and literature appraisal in systematic reviews, and they can also support guideline evaluation and dissemination by improving the readability of guidelines and the efficiency with which they are promoted. However, the accuracy of GAI-generated content, the appropriateness of the evidence it cites, the level of that evidence, and the reliability of its data still require verification. Data privacy protection and ethical issues also remain to be addressed. This article reviews the current use of GAI tools in the development, evaluation, dissemination, and implementation of guidelines, and explores the feasibility of, and new models for, applying GAI tools in the guideline field, with the aim of improving the efficiency and quality of guideline development and better serving guideline developers and users.
