http://dx.doi.org/10.29220/CSAM.2021.28.3.281

Stable activation-based regression with localizing property  

Shin, Jae-Kyung (Department of Statistics, Korea University)
Jhong, Jae-Hwan (Department of Information Statistics, ChungBuk National University)
Koo, Ja-Yong (Department of Statistics, Korea University)
Publication Information
Communications for Statistical Applications and Methods / v.28, no.3, 2021, pp. 281-294
Abstract
In this paper, we propose an adaptive regression method based on a single-layer neural network structure. We adopt a symmetric activation function as the units of the structure. The activation function is parametrized, which makes its form flexible, and it has a localizing property that is useful for improving the quality of estimation. To obtain a spatially adaptive estimator, we regularize the coefficients of the activation functions via ℓ1-penalization, which removes the activation functions regarded as unnecessary. For implementation, an efficient coordinate descent algorithm is applied to compute the proposed estimator. To obtain stable estimation results, we present an initialization scheme suited to our structure. A model selection procedure based on the Akaike information criterion is described. The simulation results show that the proposed estimator performs favorably relative to existing methods and recovers the local structure of the underlying function from the sample.
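The ℓ1-penalized fit by coordinate descent mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian bump `bump` stands in for the symmetric, localized activation function (the abstract does not give its form), the objective is scaled as (1/2n)·RSS + λ·‖β‖₁, the unit centers and scale are fixed rather than estimated, and the names `design_matrix` and `lasso_cd` are hypothetical. The paper's initialization scheme and AIC-based model selection are not reproduced.

```python
import math


def soft_threshold(z, lam):
    """Soft-thresholding: closed-form minimizer of the one-dimensional lasso problem."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def bump(x, center, scale):
    """ASSUMPTION: a Gaussian bump as a stand-in symmetric, localized activation."""
    return math.exp(-((x - center) / scale) ** 2)


def design_matrix(xs, centers, scale):
    """One column per activation unit, evaluated at every sample point."""
    return [[bump(x, c, scale) for c in centers] for x in xs]


def lasso_cd(B, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - B beta||^2 + lam * ||beta||_1."""
    n, p = len(B), len(B[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - B beta; equals y because beta starts at zero
    col_sq = [sum(B[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot contribute
            # Inner product of column j with the partial residual
            # (the residual with unit j's own contribution added back).
            rho = sum(B[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * B[i][j]
                beta[j] = new
    return beta


# Toy data: the target is a single bump centered at 0.5 (illustrative only).
xs = [i / 49 for i in range(50)]
y = [bump(x, 0.5, 0.1) for x in xs]
centers = [k / 10 for k in range(11)]
B = design_matrix(xs, centers, 0.1)

beta_sparse = lasso_cd(B, y, lam=1e-3)  # light penalty: a few units stay active
beta_empty = lasso_cd(B, y, lam=10.0)   # heavy penalty: every unit is removed
```

In this toy setting a heavy penalty drives every coefficient to exactly zero, while a light penalty keeps the unit centered on the true bump. This mirrors, in a simplified way, how ℓ1-penalization discards units regarded as unnecessary.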
Keywords
nonparametric regression; penalized least squares; coordinate descent algorithm; adaptive estimation; symmetric activation function