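(Fig. 1) Digital smart healthcare paradigm
(Fig. 2) Machine learning analysis flow
(Fig. 3) (a) Attributes of the Breast Cancer data and (b) their visualization
(Fig. 4) (a) Classification matrix and (b) example analysis of the breast cancer data (XGBoost applied)
(Fig. 5) Model evaluation metrics for the breast cancer data
(Fig. 6) ROC curve of the model evaluation results for the breast cancer data
(Fig. 7) XGBoost analysis of the Breast Cancer data [12] and feature importance (composed from [29])
(Fig. 8) SHAP explanation of the prediction factors in the XGBoost analysis of the breast cancer data (adapted from [33])
<Table 1> Data visualization tools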
References
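- 서경원 et al., "Smart Healthcare Medical Device Technology," Standards Strategy Report, National Institute of Food and Drug Safety Evaluation, Aug. 2018.
- 송영준, "The Fourth Industrial Revolution and Digital Healthcare Policy," Weekly Technology Trends, Feb. 2018.
- 정성원, "Use of Big Data in Healthcare," 5th Clinical Research Methodology Workshop, Institute of Biomedical Industry, College of Medicine, The Catholic University of Korea, Seoul, Nov. 5, 2016, pp. 18-29.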
- IBM, "Bigdata in Healthcare: Tapping New Insight to Save Lives," IBM Big Data & Analytics Hub, 2014. https://www.ibmbigdatahub.com/infographic/big-data-healthcare-tapping-new-insight-save-lives
- Wikipedia, "Machine Learning," https://en.wikipedia.org/wiki/Machine_learning
- 정일영 and 구원모, "Data Integration Approaches for Building a Healthcare Ecosystem," Trends and Issues, no. 46, Jan. 2018, pp. 1-38.
- MIT Critical Data, Secondary Analysis of Electronic Health Records, Springer International Publishing: NY, USA, 2016.
- G. Press, "Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says," Forbes, Mar. 23, 2016.
- S. Christa, V. Suma, and L. Maduri, "An Effective Data Preprocessing Technique for Improved Data Management in a Distributed Environment," ACCTHPCA, vol. 3, July 2012, pp. 25-29.
- SAS, "Data Visualization Techniques: From Basics to Big Data with SAS(R) Visual Analytics," SAS White Paper, 2018.
- P. van der Laken, "Facets," Google, June 2017. https://github.com/PAIR-code/facets
- William H. Wolberg (physician), University of Wisconsin Hospitals, Madison, Wisconsin, USA, Breast Cancer Wisconsin (Original) Data Set, https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)
- Tutorials Point, "Seaborn," TutorialsPoint, 2017. https://www.tutorialspoint.com/seaborn/seaborn_tutorial.pdf
- A. Bilogur, "Missingno: A Missing Data Visualization Suite," J. Open Source Softw., Feb. 27, 2018, doi: 10.21105/joss.00547
- Continuum Analytics, "Blaze Documentation," 2018. https://blaze.readthedocs.io/en/latest/index.html
- G. Csardi and T. Nepusz, igraph Reference Manual, Harvard University: Cambridge, MA, USA, 2013.
- Wikipedia, "Feature Engineering," https://en.wikipedia.org/wiki/Feature_engineering
- A. Zheng, Evaluating Machine Learning Models, O'Reilly: Sebastopol, CA, USA, 2015.
- Medcalc, "ROC Curve Analysis," https://www.medcalc.org/manual/roc-curves.php
- F.Y. Osisanwo et al., "Supervised Machine Learning Algorithms: Classification and Comparison," Int. J. Comput. Trends Technol., vol. 48, no. 3, June 2017, pp. 128-138. https://doi.org/10.14445/22312803/IJCTT-V48P126
- P. Harrington, Machine Learning in Action, Manning Publications Co.: Shelter Island, NY, USA, 2012, pp. 83-100.
- M. Namratha and T.R. Prajwala, "A Comprehensive Overview of Clustering Algorithms in Pattern Recognition," IOSR J. Comput. Eng., vol. 4, no. 6, 2012, pp. 23-30. https://doi.org/10.9790/0661-0462330
- L. Arnold et al., "An Introduction to Deep Learning," in Proc. Eur. Symp. Artif. Neural Netw., Bruges, Belgium, Apr. 27-29, 2011, pp. 477-488.
- Wikipedia, "Random Forest," https://en.wikipedia.org/wiki/Random_forest
- Wikipedia, "Boosting," https://en.wikipedia.org/wiki/Boosting_(machine_learning)
- R.E. Schapire, "The Boosting Approach to Machine Learning, An Overview," in MSRI Workshop on Nonlinear Estimation and Classification, Springer: Heidelberg, Germany, 2002, pp. 3-4.
- A. Natekin and A. Knoll, "Gradient Boosting Machines, a Tutorial," Front. Neurorobot., July 21, 2013, doi: 10.3389/fnbot.2013.00021.
- G. Biau, B. Cadre, and L. Rouviere, "Accelerated Gradient Boosting," arXiv:1803.02042, May 2018.
- J. Brownlee, "XGBoost with Python, Gradient Boosted Trees with XGBoost and Scikit-learn," Machine Learning Mastery, Sept. 19, 2016.
- G. Ke et al., "LightGBM: A Highly Efficient Gradient Boosting Decision Tree," Conf. Neural Inform. Process. Syst., Long Beach, CA, USA, 2017, pp. 1-9.
- A.V. Dorogush, V. Ershov, and A. Gulin, "CatBoost: Gradient Boosting with Categorical Features Support," Yandex, 2017. https://catboost.ai/
- M. Du, N. Liu, and X. Hu, "Techniques for Interpretable Machine Learning," arXiv:1808.00033, July 2018.
- M.T. Ribeiro, S. Singh, and C. Guestrin, "Why Should I Trust You?" Proc. ACM SIGKDD Int. Conf. Knowled. Discovery Data Mining, San Francisco, CA, USA, Aug. 13-17, 2016, pp. 1135-1144.
- S.M. Lundberg and S.-I. Lee, "A Unified Approach to Interpreting Model Predictions," Conf. Neural Inform. Process. Syst., Long Beach, CA, USA, 2017, pp. 1-10.
- A. Saabas, "treeinterpreter," 2015. https://github.com/andosa/treeinterpreter
- D. Foster, "xgboostExplainer," 2017. https://github.com/AppliedDataSciencePartners/xgboostExplainer