
A Hybrid Data Mining Technique Using Error Pattern Modeling  

Hur, Joon (SPSS Korea, Data Solution Co., Ltd.)
Kim, Jong-Woo (Department of Business Administration, College of Business, Hanyang University)
Abstract
This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy when the target variable is binary. The proposed method increases prediction accuracy by combining two different supervised learning methods. Specifically, the algorithm extracts the subset of training cases on which the two methods give inconsistent predictions and models error patterns from those cases. Based on the error pattern model, the predictions of the two methods are merged to generate the final prediction. The proposed method was tested on 10 practical data sets. The results show that the proposed method outperforms existing methods such as artificial neural networks and decision tree induction.
Keywords
Supervised Learning; Hybrid Model; Combined Model; Voting; Error Pattern Modeling
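The procedure outlined in the abstract can be illustrated with a small sketch. This is one possible reading, not the authors' implementation: the abstract does not specify how the error pattern model is built, so the sketch assumes it is a classifier trained on the disagreement cases to predict which base method is correct. The helper names (`fit_stump`, `fit_hybrid`, `accuracy`), the one-split threshold rules standing in for the two supervised learners, and the toy data are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Classifier = Callable[[Row], int]


def fit_stump(X: List[Row], y: List[int], features: Sequence[int]) -> Classifier:
    """Fit a one-split threshold rule restricted to the given feature indices."""
    best: Tuple[int, int, float, int] = (-1, features[0], 0.0, 1)
    for f in features:
        for t in sorted({row[f] for row in X}):
            for hi in (0, 1):  # hi is the label predicted when row[f] >= t
                hits = sum((hi if row[f] >= t else 1 - hi) == label
                           for row, label in zip(X, y))
                if hits > best[0]:
                    best = (hits, f, t, hi)
    _, f, t, hi = best
    return lambda row: hi if row[f] >= t else 1 - hi


def accuracy(model: Classifier, X: List[Row], y: List[int]) -> float:
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def fit_hybrid(X: List[Row], y: List[int],
               model_a: Classifier, model_b: Classifier) -> Classifier:
    """Merge two binary classifiers using a model of their disagreements."""
    # Step 1: extract the training cases on which the two methods disagree.
    idx = [i for i, row in enumerate(X) if model_a(row) != model_b(row)]
    if not idx:
        return model_a  # the methods never disagree, nothing to arbitrate

    # Step 2: model the error pattern. With a binary target, exactly one
    # method is right on each disagreement case, so learn "is A right?".
    X_dis = [X[i] for i in idx]
    a_right = [int(model_a(X[i]) == y[i]) for i in idx]
    arbiter = fit_stump(X_dis, a_right, range(len(X[0])))

    # Step 3: merge predictions, consulting the arbiter only on disagreement.
    def predict(row: Row) -> int:
        pa, pb = model_a(row), model_b(row)
        if pa == pb:
            return pa
        return pa if arbiter(row) == 1 else pb

    return predict


# Toy data (hypothetical): y follows x0 when x2 == 0 and x1 when x2 == 1.
X = [[float(a), float(b), float(c)]
     for c in (0, 1) for a in (0, 1) for b in (0, 1)]
y = [int(row[0]) if row[2] == 0 else int(row[1]) for row in X]

model_a = fit_stump(X, y, [0])  # stand-in for the first supervised learner
model_b = fit_stump(X, y, [1])  # stand-in for the second supervised learner
hybrid = fit_hybrid(X, y, model_a, model_b)

print(accuracy(model_a, X, y))  # 0.75
print(accuracy(model_b, X, y))  # 0.75
print(accuracy(hybrid, X, y))   # 1.0
```

The accuracies printed here are training accuracies on a constructed toy set, chosen so that each base rule is wrong on a different region. They only demonstrate the mechanism and say nothing about the results reported in the paper.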