http://dx.doi.org/10.7465/jkdi.2013.24.6.1341

Comparison of data mining methods with daily lens data  

Seok, Kyungha (Department of Data Science, Inje University)
Lee, Taewoo (Department of Data Science, Inje University)
Publication Information
Journal of the Korean Data and Information Science Society, v.24, no.6, 2013, pp. 1341-1348
Abstract
To solve classification problems, various data mining techniques have been applied to database marketing, credit scoring and market forecasting. In this paper, we compare techniques such as bagging, boosting, the LASSO, the random forest and the support vector machine on daily lens transaction data. The classical techniques, the decision tree and logistic regression, are used as well. The experiment shows that the random forest has a slightly smaller misclassification rate and standard error than the other methods. The SVM performs well in terms of misclassification rate but poorly in terms of standard error. Taking model interpretation and computing time into consideration, we conclude that the LASSO gives the best result.
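As a rough illustration of such a comparison (not the authors' code), the sketch below estimates each classifier's misclassification rate and a crude standard error by 10-fold cross-validation with scikit-learn. The daily lens data are not public, so a synthetic data set stands in for them, an L1-penalized logistic regression plays the role of the LASSO classifier, and all hyperparameter settings are illustrative assumptions.

# Minimal sketch: compare classifiers by misclassification rate and
# a rough standard error across cross-validation folds.
# Synthetic data and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, AdaBoostClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the (non-public) daily lens transaction data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "bagging": BaggingClassifier(n_estimators=100),
    "boosting": AdaBoostClassifier(n_estimators=100),
    "random forest": RandomForestClassifier(n_estimators=100),
    # L1-penalized logistic regression stands in for the LASSO classifier.
    "LASSO": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    "SVM": SVC(kernel="rbf", C=1.0),
}

for name, model in models.items():
    # 10-fold CV accuracy; misclassification rate = 1 - accuracy.
    acc = cross_val_score(model, X, y, cv=10)
    err = 1.0 - acc
    se = err.std(ddof=1) / np.sqrt(len(err))
    print(f"{name:20s} error = {err.mean():.4f}  (SE = {se:.4f})")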
Keywords
Bagging; boosting; data mining; decision tree; LASSO; logistic regression; support vector machine;
References
1 Kim, B., Cho, D., Lee, J., Lee, T., Hyun, J. and Kim, S. (2012). Comparison of two repurchase models using logistic regression and memory based reasoning. Journal of the Korean Data Analysis Society, 14, 1301-1314.
2 Opitz, D. and Maclin, R. (1999). Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11, 169-198.
3 Park, H. (2011). Online abnormal events detection with online support vector machine. Journal of the Korean Data & Information Science Society, 22, 197-206.
4 Pi, S. (2013). Self-diagnostic system for smartphone addiction using multiclass SVM. Journal of the Korean Data & Information Science Society, 24, 13-22.
5 Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society B, 58, 267-288.
6 Vapnik, V. N. (1996). The nature of statistical learning theory, Springer, New York.
7 Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140.
8 Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
9 Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and regression trees, Wadsworth, New York.
10 Freund, Y. and Schapire, R. E. (1996). Experiments with a new boosting algorithm. In Proceedings of The Thirteenth International Conference on Machine Learning, 148-156.
11 Hastie, T., Tibshirani, R. and Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction, Springer, New York.
12 Hwang, J., Lee, J. and Kim, J. (2006). A comparison study of multiclass SVM methods in microarray data. Journal of the Korean Data & Information Science Society, 17, 311-324.
13 Kim, A., Kim, J. and Kim, H. (2012). The guideline for choosing the right-size of tree for boosting algorithm. Journal of the Korean Data & Information Science Society, 23, 949-959.