http://dx.doi.org/10.7465/jkdi.2013.24.5.989

Support vector machines for big data analysis  

Choi, Hosik (Department of Informational Statistics, Hoseo University)
Park, Hye Won (Department of Statistics, University of Seoul)
Park, Changyi (Department of Statistics, University of Seoul)
Publication Information
Journal of the Korean Data and Information Science Society / v.24, no.5, 2013, pp. 989-998
Abstract
Big data, which has recently attracted attention in industry and academia, cannot be analyzed by the batch processing algorithms developed in data mining because, by definition, it cannot be loaded and processed in the memory of a single system. An urgent issue is therefore to develop learning algorithms that can be applied to big data. In this paper, we review algorithms for support vector machines in the literature. In particular, we introduce online and parallel processing algorithms that are expected to be useful in big data classification, and we compare the strengths, weaknesses, and performance of those algorithms through simulations for linear classification.
Keywords
Batch processing; consensus; distributed processing; online algorithm
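
The abstract does not spell out the algorithms it reviews. As a rough illustration of what an online-type method for linear support vector machines can look like, the following is a minimal sketch of stochastic subgradient descent on the regularized hinge loss, in the style of the Pegasos solver. The function names, the synthetic two-class data, and the value of `lam` are assumptions made for this example; they are not the authors' simulation setup.

```python
import random


def train_online_svm(stream, dim, lam=0.01):
    """Stochastic subgradient descent for a linear SVM (Pegasos-style sketch).

    `stream` yields (x, y) pairs, with x a list of floats and y in {-1, +1}.
    Each example is visited once and then discarded, so only the weight
    vector (O(dim) memory) is kept, which is the point of an online method.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrinkage step from the (lam / 2) * ||w||^2 regularizer.
        scale = 1.0 - eta * lam
        w = [scale * wi for wi in w]
        # The hinge-loss subgradient is nonzero only if the margin is violated.
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def make_stream(n, seed=0):
    """Hypothetical data source: two Gaussian blobs plus a constant bias feature."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.choice((-1, 1))
        yield [rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0), 1.0], y


def predict(w, x):
    """Classify by the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


w = train_online_svm(make_stream(5000), dim=3)
held_out = list(make_stream(1000, seed=1))
accuracy = sum(predict(w, x) == y for x, y in held_out) / len(held_out)
print("held-out accuracy:", accuracy)
```

Because the training loop consumes a generator, the full data set never has to reside in memory at once, in contrast to the batch processing setting described in the abstract.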