Volume 11 Issue 5
Sep.  2020
Hui-zhen JIANG, Lian MA, Wei-guo ZHU. Medical Big Data "Deception" and Strategies[J]. Medical Journal of Peking Union Medical College Hospital, 2020, 11(5): 542-546. doi: 10.3969/j.issn.1674-9081.2020.05.009

Medical Big Data "Deception" and Strategies

doi: 10.3969/j.issn.1674-9081.2020.05.009
  • Corresponding author: ZHU Wei-guo  Tel: 86-10-69154149, E-mail: zhuwg@pumch.cn
  • Received Date: 2019-07-15
  • Publish Date: 2020-09-30
  • Research on and application of medical big data are becoming increasingly widespread. Medical big data can nevertheless be deceptive, and in many scenarios it leads to wrong conclusions and adverse effects. This paper first analyzes the causes of this deception from two angles: deception inherent in the data itself and the pitfalls of machine learning. It then describes how to avoid data pitfalls in statistical analysis and examines strategies for defending models against attacks. Finally, it discusses why model interpretability matters in medicine and the methods for achieving it.
