Stable activation-based regression with localizing property

  • Received : 2021.01.23
  • Accepted : 2021.02.17
  • Published : 2021.05.31

Abstract

In this paper, we propose an adaptive regression method based on a single-layer neural network structure whose units are symmetric activation functions. The activation function is flexible in form through its parametrization and has a localizing property that helps improve the quality of estimation. To obtain a spatially adaptive estimator, we regularize the coefficients of the activation functions via ℓ1-penalization, which removes activation functions deemed unnecessary. An efficient coordinate descent algorithm is used to compute the proposed estimator. To stabilize the estimation, we present an initialization scheme suited to our structure, and we describe a model selection procedure based on the Akaike information criterion. Simulation results show that the proposed estimator compares favorably with existing methods and recovers the local structure of the underlying function from the sample.
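
As an illustration of the pipeline the abstract describes, the sketch below fits an ℓ1-penalized expansion of symmetric, localized units by coordinate descent and selects the penalty level with an AIC-type criterion. It is a minimal stand-in, not the authors' estimator: the Gaussian bump units, the fixed grid of centers and scale, and the use of scikit-learn's Lasso solver (which is coordinate-descent based) are assumptions made for this example only.

# Illustrative sketch of l1-penalized regression on symmetric, localized units.
# Assumptions (not taken from the paper): Gaussian bumps as the activation
# units, a fixed grid of centers, a fixed scale, and sklearn's Lasso solver.
import numpy as np
from sklearn.linear_model import Lasso

def design_matrix(x, centers, scale):
    """Columns are symmetric, localized activations evaluated at x."""
    return np.exp(-((x[:, None] - centers[None, :]) / scale) ** 2)

def fit_l1_activation_regression(x, y, n_units=30, scale=0.1, alphas=None):
    """Fit y ~ sum_j beta_j * psi((x - c_j) / scale) with an l1 penalty on beta,
    choosing the penalty level by an AIC-type criterion."""
    centers = np.linspace(x.min(), x.max(), n_units)
    X = design_matrix(x, centers, scale)
    if alphas is None:
        alphas = np.logspace(-4, 0, 20)
    n, best = len(y), None
    for alpha in alphas:
        model = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
        resid = y - model.predict(X)
        k = np.count_nonzero(model.coef_)             # active activation units
        aic = n * np.log(np.mean(resid ** 2)) + 2 * k # AIC-type score
        if best is None or aic < best[0]:
            best = (aic, alpha, model)
    return best[2], centers

# Usage: recover a function with local structure from noisy samples.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(8 * np.pi * x ** 2) + 0.1 * rng.normal(size=200)
model, centers = fit_l1_activation_regression(x, y)
print("active units:", np.count_nonzero(model.coef_))

In the proposed method the activation parameters themselves are parametrized, initialized, and tuned; in this sketch they are held fixed so that only the ℓ1 selection step is shown.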

Acknowledgement

The research of Ja-Yong Koo was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2018R1D1A1B07049972). The research of Jae-Hwan Jhong was supported by the NRF (NRF-2020R1G1A1A01100869).
