Medical Big Data "Deception" and Strategies

JIANG Hui-zhen, MA Lian, ZHU Wei-guo

Citation: Hui-zhen JIANG, Lian MA, Wei-guo ZHU. Medical Big Data "Deception" and Strategies[J]. Medical Journal of Peking Union Medical College Hospital, 2020, 11(5): 542-546. doi: 10.3969/j.issn.1674-9081.2020.05.009

doi: 10.3969/j.issn.1674-9081.2020.05.009
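Funding:

National Key Research and Development Program of China 2018YFC0116905

CAMS Innovation Fund for Medical Sciences (CIFMS) 2016-I2M-2-004

China Medical Board Open Competition Program (CMB-OC) 16-258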

More Information
    Corresponding author: ZHU Wei-guo  Tel: 86-10-69154149, E-mail: zhuwg@pumch.cn

  • CLC number: R19-0; R195.4; C811

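  • Abstract: Research on and applications of medical big data are increasingly widespread, yet medical big data is itself deceptive to some degree and, in certain scenarios, can lead to erroneous conclusions and consequences. This paper analyzes the causes of this deceptiveness from two angles: the deceptiveness of the data itself and the pitfalls that machine learning may present. In response, it explains from a statistical perspective how to avoid big data traps, and from a modeling perspective it examines strategies for responding to attacks on models, as well as the importance of model interpretability in medicine and methods for achieving it.
    Conflict of interest: None.
    Author contributions: JIANG Hui-zhen and ZHU Wei-guo conceived the paper; JIANG Hui-zhen wrote the manuscript; ZHU Wei-guo and MA Lian revised it.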
  • Figure 1  The medical big data research process

Publication history
  • Received:  2019-07-15
  • Published:  2020-09-30
