Ensemble Classification Method for Efficient Medical Diagnostic

An Ensemble Classification Technique for Efficient Medical Diagnosis

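  • Yong-Gyu Jung (Department of Medical IT Marketing, Eulji University) ;
  • Go-Eun Heo (Medical Computing Major, School of Medical Industry, Eulji University)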
  • Received : 2010.06.05
  • Published : 2010.06.30

Abstract

The purpose of medical data mining is to predict and classify diseases using efficient algorithms and techniques, and to increase the reliability of those predictions. Earlier studies relied on algorithms built on a single model, and more recent work applies ensemble techniques built on multiple models to improve prediction and classification accuracy. In this paper, we propose I-ENSEMBLE, which applies the interquartile range to existing ensemble techniques in order to make predictions on medical data more reliable. In experiments on data for diagnosing hypothyroidism, the representative ensemble techniques Bagging, Boosting, and Stacking all showed markedly higher accuracy than their unmodified counterparts. Compared with conventional single-model techniques, the effect of applying the interquartile range was also more pronounced for the multi-model ensemble techniques.
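
The abstract does not spell out how the interquartile range is combined with the ensemble. One plausible reading is that records with extreme attribute values, as judged by IQR fences, are removed before the ensemble is trained. The sketch below illustrates that reading only and is not the authors' I-ENSEMBLE implementation. The function names, the fence multiplier `k = 1.5`, the use of decision stumps as base learners, and the toy data are all assumptions made for illustration.

```python
import random
import statistics
from collections import Counter


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_iqr_outliers(rows, labels, k=1.5):
    """Keep only rows whose every feature lies inside its IQR fences."""
    n_features = len(rows[0])
    bounds = [iqr_bounds([r[j] for r in rows], k) for j in range(n_features)]
    kept = [
        (r, y) for r, y in zip(rows, labels)
        if all(lo <= r[j] <= hi for j, (lo, hi) in enumerate(bounds))
    ]
    return [r for r, _ in kept], [y for _, y in kept]


def fit_stump(rows, labels):
    """Fit a one-feature threshold classifier by exhaustive search."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            for sign in (1, -1):
                preds = [1 if sign * (r[j] - t) > 0 else 0 for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda r: 1 if sign * (r[j] - t) > 0 else 0


def fit_bagging(rows, labels, n_models=15, seed=0):
    """Train stumps on bootstrap resamples; combine them by majority vote."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return lambda r: Counter(m(r) for m in models).most_common(1)[0][0]


# Synthetic toy data (not from the paper): one feature, two classes,
# plus one extreme record that the IQR fences should discard.
rows = [[1.0], [1.2], [0.9], [1.1], [3.0], [3.2], [2.9], [3.1], [40.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0]

clean_rows, clean_labels = drop_iqr_outliers(rows, labels)
model = fit_bagging(clean_rows, clean_labels)
print(len(rows), "rows before filtering,", len(clean_rows), "after")
print("prediction for 0.5:", model([0.5]), "| prediction for 5.0:", model([5.0]))
```

The same filtering step could precede Boosting or Stacking just as well. Bagging is used here only because it is the shortest of the three to write out in full.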
