[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7465/jkdi.2017.28.6.1245

Adaptive stochastic gradient method under two mixing heterogenous models

Moon, Sang Jun (Department of Statistics, University of Seoul)
Jeon, Jong-June (Department of Statistics, University of Seoul)

Publication Information

Journal of the Korean Data and Information Science Society / v.28, no.6, 2017 , pp. 1245-1255 More about this Journal

Abstract

The online learning is a process of obtaining the solution for a given objective function where the data is accumulated in real time or in batch units. The stochastic gradient descent method is one of the most widely used for the online learning. This method is not only easy to implement, but also has good properties of the solution under the assumption that the generating model of data is homogeneous. However, the stochastic gradient method could severely mislead the online-learning when the homogeneity is actually violated. We assume that there are two heterogeneous generating models in the observation, and propose the a new stochastic gradient method that mitigate the problem of the heterogeneous models. We introduce a robust mini-batch optimization method using statistical tests and investigate the convergence radius of the solution in the proposed method. Moreover, the theoretical results are confirmed by the numerical simulations.

Keywords

Mini-batch; on-line learning; robustness; stochastic gradient descent method;

Citations & Related Records

Times Cited By KSCI : 2 (Citation Analysis)

Reference
Cited By KSCI

1	Duchi, J., Hazan, E. and Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121-2159.
2	Boyd, S. and Lieven, V. (2004). Convex optimization. Cambridge university press, 466-468.
3	Bottou, L. (2010). Large-Scale Machine Learning with Stochastic Gradient Descent. Proceedings of COMPSTAT' 2010, 177-186.
4	Dekel, O., Gilad-Bachrach, R., Shamir, O. and Xiao, L. (2012). Optimal distributed online prediction using mini-batches. Journal of Machine Learning Research, 13, 165-202.
5	Hwang, C. and Shim, J. (2016). Deep LS-SVM for regression. Journal of the Korean Data & Information Science Society, 27, 827-833. DOI
6	Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980
7	Konecny, J., Liu, J., Richtarik, P. and Takac, M. (2016). Mini-batch semi-stochastic gradient descent in the proximal setting. IEEE Journal of Selected Topics in Signal Processing, 10, 242-255. DOI
8	LeCun, Y., Bengio, Y. and Hinton, G. (2015). Deep learning. Nature, 521, 436-444. DOI
9	Lee, W. and Chun, H. (2016). A deep learning analysis of the Chinese Yuans volatility in the onshore and offshore markets. Journal of the Korean Data & Information Science Society, 27, 327-335. DOI
10	Li, M., Zhang, T., Chen, Y. and Smola, A. J. (2014). Efficient mini-batch training for stochastic optimization. Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining.
11	Zeiler, M. D. (2012). ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701
12	Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533-538. DOI
13	Shapiro, A. and Wardi, Y. (1996). Convergence analysis of gradient descent stochastic algorithms. Journal of optimization theory and applications, 91, 439-454. DOI
14	Tieleman, T. and Hinton, G. (2012). Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning 4.2, 26-31
15	Yamanishi, K., Takeuchi, J. I., Williams, G. and Milne, P. (2004). On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms. Data Mining and Knowledge Discovery, 8, 275-300. DOI

KSCI

Adaptive stochastic gradient method under two mixing heterogenous models 두 이종 혼합 모형에서의 수정된 경사 하강법

Adaptive stochastic gradient method under two mixing heterogenous models