[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.29220/CSAM.2021.28.3.281

Stable activation-based regression with localizing property

Shin, Jae-Kyung (Department of Statistics, Korea University)
Jhong, Jae-Hwan (Department of Information Statistics, ChungBuk National University)
Koo, Ja-Yong (Department of Statistics, Korea University)

Publication Information

Communications for Statistical Applications and Methods / v.28, no.3, 2021, pp. 281-294

Abstract

In this paper, we propose an adaptive regression method based on a single-layer neural network structure. The units of the structure are symmetric activation functions. The activation function is parametrized, which makes its form flexible, and it has a localizing property that is useful for improving the quality of estimation. To obtain a spatially adaptive estimator, we regularize the coefficients of the activation functions via ℓ₁-penalization, which removes the activation functions regarded as unnecessary. For implementation, an efficient coordinate descent algorithm is applied to compute the proposed estimator. To obtain stable estimation results, we present an initialization scheme suited to our structure. A model selection procedure based on the Akaike information criterion is also described. The simulation results show that the proposed estimator performs favorably in comparison with existing methods and recovers the local structure of the underlying function from the sample.
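
The estimation steps named in the abstract (ℓ₁-penalized least squares over localized units, coordinate descent, AIC-based selection) can be illustrated with a minimal sketch. The abstract does not give the form of the activation function, the initialization scheme, or the exact criterion, so the Gaussian unit `bump`, the fixed grid of centers and width, the simulated data, and the `aic` formula below are all assumptions chosen for illustration, not the authors' method.

```python
import numpy as np

def bump(x, center, width):
    # Assumed stand-in for a symmetric, localized activation (Gaussian bump).
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def soft_threshold(z, t):
    # Scalar soft-thresholding operator used by the l1 coordinate update.
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(B, y, lam, n_sweeps=200):
    # Coordinate descent for (1/(2n)) * ||y - b0 - B @ beta||^2 + lam * ||beta||_1.
    n, p = B.shape
    beta = np.zeros(p)
    b0 = y.mean()
    resid = y - b0
    col_sq = (B ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        # Unpenalized intercept update.
        delta = resid.mean()
        b0 += delta
        resid -= delta
        for j in range(p):
            # Correlation of unit j with the partial residual (unit j excluded).
            rho = B[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                resid -= B[:, j] * (new - beta[j])
                beta[j] = new
    return b0, beta

def aic(B, y, b0, beta):
    # Assumed AIC form: n * log(RSS / n) + 2 * (number of nonzero coefficients + 1).
    n = len(y)
    rss = np.sum((y - b0 - B @ beta) ** 2)
    return n * np.log(rss / n) + 2 * (np.count_nonzero(beta) + 1)

# Hypothetical data: a smooth trend plus one sharp local feature.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + 2.0 * bump(x, 0.7, 0.02) + 0.1 * rng.standard_normal(200)

# One unit per center on a fixed grid (centers and width are assumptions).
centers = np.linspace(0.0, 1.0, 30)
B = bump(x[:, None], centers[None, :], 0.05)

# Fit along a grid of penalties and keep the fit with the smallest AIC.
lam_max = np.abs(B.T @ (y - y.mean())).max() / len(y)
lams = lam_max * np.logspace(0, -3, 10)
fits = [lasso_cd(B, y, lam) for lam in lams]
scores = [aic(B, y, b0, beta) for b0, beta in fits]
best_b0, best_beta = fits[int(np.argmin(scores))]
print("selected units:", np.count_nonzero(best_beta), "of", len(centers))
```

In this sketch the soft-thresholding step sets the coefficients of unneeded units exactly to zero, which is how an ℓ₁ penalty removes units. The paper's parametrized activation and its initialization scheme would take the place of the fixed grid assumed here.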

Keywords

nonparametric regression; penalized least squares; coordinate descent algorithm; adaptive estimation; symmetric activation function
