• Title/Summary/Keyword: generalized lasso

Moderately clipped LASSO for the high-dimensional generalized linear model

  • Lee, Sangin;Ku, Boncho;Kwon, Sunghoon
    • Communications for Statistical Applications and Methods / v.27 no.4 / pp.445-458 / 2020
  • The least absolute shrinkage and selection operator (LASSO) is a popular method for high-dimensional regression models. The LASSO has high prediction accuracy; however, it also selects many irrelevant variables. In this paper, we consider the moderately clipped LASSO (MCL), a hybrid of the LASSO and the minimax concave penalty (MCP), for the high-dimensional generalized linear model. The MCL preserves the advantages of both the LASSO and the MCP, showing high prediction accuracy while successfully selecting the relevant variables. We prove that the MCL achieves the oracle property under some regularity conditions, even when the number of parameters is larger than the sample size. An efficient algorithm is also provided. Various numerical studies confirm that the MCL is a competitive alternative to existing methods.
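
As a rough, hedged sketch of the high-dimensional GLM setting described above, the LASSO baseline (not the MCL itself, which has no standard Python implementation) can be fit as an L1-penalized logistic regression with scikit-learn; the data sizes and penalty level below are illustrative assumptions.

```python
# A minimal LASSO-penalized GLM (logistic regression) baseline for a
# high-dimensional setting (p > n); the MCL itself is not implemented here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 100, 500                      # more parameters than observations
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = 1.0   # only 5 relevant variables
y = (X @ beta + rng.normal(size=n) > 0).astype(int)

# The L1 penalty drives most coefficients exactly to zero (variable selection)
fit = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
selected = np.flatnonzero(fit.coef_.ravel())
print("selected variables:", selected)
```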

A Study on Applying Shrinkage Method in Generalized Additive Model (일반화가법모형에서 축소방법의 적용연구)

  • Ki, Seung-Do;Kang, Kee-Hoon
    • The Korean Journal of Applied Statistics / v.23 no.1 / pp.207-218 / 2010
  • The generalized additive model (GAM) is a statistical model that resolves many of the problems of the traditional linear regression model. However, overfitting can arise unless some method is applied to reduce the number of independent variables. Therefore, variable selection methods for the generalized additive model are needed. Recently, Lasso-related methods have become popular for variable selection in regression analysis. In this research, we consider Group Lasso and Elastic Net models for variable selection in the GAM and propose an algorithm for finding their solutions. We compare the proposed methods via Monte Carlo simulation and an application to auto insurance data from fiscal year 2005. It is shown that the proposed methods give better performance.
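
Neither the authors' algorithm nor their R code is shown here; as a loose illustrative sketch, the elastic-net shrinkage considered above can be mimicked in Python by expanding each predictor into a spline basis (a crude stand-in for GAM smooth terms) and fitting scikit-learn's ElasticNet. Group Lasso, which penalizes whole basis groups, is omitted, and all data and settings below are assumptions.

```python
# Sketch: spline basis expansion (stand-in for GAM smooth terms) + elastic net.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 6))             # 6 candidate predictors
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.3, size=200)

model = make_pipeline(
    SplineTransformer(n_knots=5, degree=3),       # per-feature spline basis
    ElasticNet(alpha=0.05, l1_ratio=0.5),         # mixes L1 and L2 shrinkage
)
model.fit(X, y)
coefs = model.named_steps["elasticnet"].coef_
print("nonzero basis coefficients:", np.count_nonzero(coefs), "of", coefs.size)
```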

Spatial Clustering Method Via Generalized Lasso (Generalized Lasso를 이용한 공간 군집 기법)

  • Song, Eunjung;Choi, Hosik;Hwang, Seungsik;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.561-575 / 2014
  • In this paper, we propose a penalized likelihood method to detect local spatial clusters associated with disease. The key computational algorithm is based on genlasso by Tibshirani and Taylor (2011). The proposed method has two main advantages over Kulldorff's method, which is popular for detecting local spatial clusters. First, it does not require specifying a cluster size a priori. Second, any type of covariate can be incorporated, so local spatial clusters can be found after adjusting for demographic variables. We illustrate the proposed method using tuberculosis data from Seoul.
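
The paper's computation relies on the R package genlasso; a hedged Python sketch of the same generalized-lasso criterion, a loss plus λ‖Dβ‖₁ for a difference matrix D over neighbouring regions, can be written with cvxpy. The squared-error loss and toy chain-graph D below are assumptions for illustration, not the paper's likelihood or the Seoul adjacency structure.

```python
# Generalized lasso sketch: squared-error loss + L1 penalty on D @ beta,
# where D encodes differences between neighbouring spatial units.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
m = 30                                   # number of regions (toy chain graph)
true = np.repeat([0.0, 2.0, 0.0], 10)    # two change points -> three "clusters"
y = true + rng.normal(scale=0.5, size=m)

# Difference matrix over neighbouring regions (here a simple 1-D chain)
D = np.diff(np.eye(m), axis=0)

beta = cp.Variable(m)
lam = 2.0
objective = cp.Minimize(0.5 * cp.sum_squares(y - beta) + lam * cp.norm1(D @ beta))
cp.Problem(objective).solve()

# Regions sharing (almost) the same fitted value form a spatial cluster
print(np.round(beta.value, 2))
```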

Time delay estimation algorithm using Elastic Net (Elastic Net를 이용한 시간 지연 추정 알고리즘)

  • Jun-Seok Lim;Keunwa Lee
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.364-369 / 2023
  • Time-delay estimation between two receivers is a technique that has been applied in a variety of fields, from underwater acoustics to room acoustics and robotics. There are two types of time-delay estimation techniques: one estimates the amount of time delay from the correlation between receivers, and the other parametrically models the time delay between receivers and estimates the parameters by system identification. The latter has the characteristic that only a small fraction of the system's parameters are directly related to the delay. This characteristic can be exploited to improve estimation accuracy by methods such as Lasso regularization. However, Lasso regularization can also discard information needed for the estimate. In this paper, we propose a method using the Elastic Net, which adds Ridge regularization to the Lasso to compensate for this loss. Comparing the proposed method with the conventional Generalized Cross Correlation (GCC) method and the method using Lasso regularization, we show that its estimation variance remains very small for both white Gaussian and colored signal sources.
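
A hedged sketch of the parametric idea above: model the channel between receivers as a sparse FIR filter built from lagged copies of the reference signal, fit the taps with the Elastic Net, and read the delay off the dominant tap. Signal length, noise level, and penalty weights are assumptions, not the paper's settings.

```python
# Sketch: time-delay estimation by fitting a sparse FIR channel with Elastic Net.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(3)
n, true_delay, max_lag = 2000, 17, 64
x = rng.normal(size=n)                                   # reference (white Gaussian) signal
y = np.roll(x, true_delay) + 0.1 * rng.normal(size=n)    # delayed, noisy copy

# Design matrix whose columns are lagged copies of x (candidate delays 0..max_lag-1)
X = np.column_stack([np.roll(x, k) for k in range(max_lag)])

fit = ElasticNet(alpha=0.01, l1_ratio=0.7, fit_intercept=False).fit(X, y)
print("estimated delay:", int(np.argmax(np.abs(fit.coef_))))   # expect 17
```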

Variable selection in Poisson HGLMs using h-likelihood

  • Ha, Il Do;Cho, Geon-Ho
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1513-1521 / 2015
  • Selecting relevant variables for a statistical model is very important in regression analysis. Recently, variable selection methods using a penalized likelihood have been widely studied in various regression models. The main advantage of these methods is that they select important variables and estimate the regression coefficients of the covariates simultaneously. In this paper, we propose a simple procedure based on a penalized h-likelihood (HL) for variable selection in Poisson hierarchical generalized linear models (HGLMs) for correlated count data. To this end, we consider three penalty functions (LASSO, SCAD and HL) and derive the corresponding variable-selection procedures. The proposed method is illustrated using a practical example.
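
The penalized h-likelihood machinery for HGLMs has no standard Python counterpart; as a much-simplified sketch that drops the random effect entirely, a LASSO-penalized Poisson GLM for count data can be fit with statsmodels' elastic-net routine. The data and penalty weight below are assumptions.

```python
# Simplified sketch: LASSO-penalized Poisson regression (no random effect,
# unlike the HGLM in the paper) via statsmodels' elastic-net fitting.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n, p = 300, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[[0, 3, 7]] = [0.6, -0.4, 0.5]    # 3 relevant covariates
y = rng.poisson(np.exp(0.2 + X @ beta))

model = sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson())
fit = model.fit_regularized(method="elastic_net", alpha=0.02, L1_wt=1.0)  # pure L1
print("nonzero coefficients:", np.flatnonzero(np.abs(fit.params[1:]) > 1e-8))
```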

Variable Selection in Frailty Models using FrailtyHL R Package: Breast Cancer Survival Data (frailtyHL 통계패키지를 이용한 프레일티 모형의 변수선택: 유방암 생존자료)

  • Kim, Bohyeon;Ha, Il Do;Noh, Maengseok;Na, Myung Hwan;Song, Ho-Chun;Kim, Jahae
    • The Korean Journal of Applied Statistics / v.28 no.5 / pp.965-976 / 2015
  • Determining relevant variables for a regression model is important in regression analysis. Recently, variable selection methods using a penalized likelihood with various penalty functions (e.g., LASSO and SCAD) have been widely studied for simple statistical models such as linear models and generalized linear models. The advantage of these methods is that they select important variables and estimate regression coefficients simultaneously; therefore, they delete insignificant variables by estimating their coefficients as zero. We study how to select proper variables based on the penalized hierarchical likelihood (HL) in semi-parametric frailty models, allowing three penalty functions: LASSO, SCAD and HL. For the variable selection we develop a new function in the "frailtyHL" R package. Our methods are illustrated with breast cancer survival data from the Medical Center at Chonnam National University in Korea. We compare the results from the three variable-selection methods and discuss their advantages and disadvantages.
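
The frailtyHL package and the HL penalty are specific to R; a loosely analogous Python sketch is an L1-penalized Cox model without the frailty (random-effect) term, using lifelines. The toy survival data and penalty values are assumptions.

```python
# Sketch: L1-penalized Cox regression (no frailty term) with lifelines.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
n = 200
df = pd.DataFrame(rng.normal(size=(n, 5)), columns=[f"x{i}" for i in range(5)])
risk = np.exp(0.8 * df["x0"] - 0.6 * df["x1"])            # only x0, x1 matter
df["time"] = rng.exponential(1.0 / risk)                  # higher risk -> shorter time
df["event"] = (rng.uniform(size=n) < 0.8).astype(int)     # ~20% censoring

cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)            # pure LASSO-type penalty
cph.fit(df, duration_col="time", event_col="event")
print(cph.params_.round(3))                               # near-zero for noise variables
```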

A Study for the Drivers of Movie Box-office Performance (영화흥행 영향요인 선택에 관한 연구)

  • Kim, Yon Hyong;Hong, Jeong Han
    • The Korean Journal of Applied Statistics / v.26 no.3 / pp.441-452 / 2013
  • This study analyzed the relationship between key success factors and box-office performance, based on movies released in Korea in the first quarter of 2013. Over-fitting can occur when too many explanatory variables are included in a regression model, and the estimates become unstable when there is multicollinearity among the explanatory variables. For this reason, selecting the variables with the greatest explanatory power for box-office performance is important. Among the many ways to select variables, LASSO estimation applied to a generalized linear model gave the smallest prediction error and efficiently identified, in order, the variables with the highest explanatory power for box-office performance.
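
The idea of ranking predictors by the order in which they enter the LASSO solution path can be sketched with scikit-learn's LARS/LASSO path; the synthetic data below are placeholders, not the 2013 box-office data.

```python
# Sketch: order in which variables enter the LASSO solution path (LARS).
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(6)
n, p = 150, 10
X = rng.normal(size=(n, p))
y = 3 * X[:, 2] + 2 * X[:, 5] + 1 * X[:, 0] + rng.normal(size=n)

# 'active' lists feature indices in the order they enter the model
_, active, _ = lars_path(X, y, method="lasso")
print("variables by explanatory power (entry order):", list(active))
```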

Efficient Neural Network for Downscaling climate scenarios

  • Moradi, Masha;Lee, Taesam
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.157-157 / 2018
  • A reliable and accurate downscaling model, which can provide climate change information obtained from global climate models (GCMs) at a finer resolution, has always been of great interest to researchers. To build such a model, linear methods have been widely studied over the past decades. However, nonlinear methods can also be beneficial for the downscaling problem. Therefore, this study explored the applicability of nonlinear machine learning techniques such as the neural network (NN), extreme learning machine (ELM), and ELM autoencoder (ELM-AE), as well as a linear method, the least absolute shrinkage and selection operator (LASSO), to build a reliable temperature downscaling model. ELM is an efficient learning algorithm for generalized single layer feed-forward neural networks (SLFNs). Its excellent training speed and good generalization capability make ELM an efficient solution for SLFNs compared to traditional time-consuming learning methods like backpropagation (BP). However, due to its shallow architecture, ELM may not capture all of the nonlinear relationships between input features. To address this issue, ELM-AE was also tested in the current study for temperature downscaling.
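
ELM is simple enough to sketch directly: a random, fixed hidden layer followed by a least-squares solve for the output weights. The toy predictors and target below stand in for coarse GCM predictors and station temperature and are assumptions, not the study's data.

```python
# Minimal extreme learning machine (ELM): random hidden layer + least-squares output.
import numpy as np

rng = np.random.default_rng(7)
n, d, hidden = 500, 8, 100
X = rng.normal(size=(n, d))                       # stand-in for coarse GCM predictors
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)  # stand-in target

W = rng.normal(size=(d, hidden))                  # fixed random input weights
b = rng.normal(size=hidden)
H = np.tanh(X @ W + b)                            # hidden-layer activations

beta, *_ = np.linalg.lstsq(H, y, rcond=None)      # output weights in closed form
pred = H @ beta
print("training RMSE:", np.sqrt(np.mean((pred - y) ** 2)).round(3))
```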

Case study: Selection of the weather variables influencing the number of pneumonia patients in Daegu Fatima Hospital (사례연구: 대구 파티마 병원 폐렴 입원 환자 수에 영향을 미치는 날씨 변수 선택)

  • Choi, Sohyun;Lee, Hag Lae;Park, Chungun;Lee, Kyeong Eun
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.131-142 / 2017
  • The number of hospital admissions for pneumonia tends to increase every year; moreover, pneumonia, the fifth leading cause of death among older adults, is one of the top diseases in terms of hospitalization rate. Although pneumonia is caused mainly by bacteria and viruses, the weather is also related to its occurrence. The candidate weather variables are humidity, amount of sunshine, diurnal temperature range, daily mean temperature, and particle concentration. Because the occurrence of pneumonia is delayed, lagged weather variables are also considered, along with year, holiday, and seasonal effects. We select the weather variables that influence the occurrence of pneumonia using penalized generalized linear models.
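
A hedged sketch of the lagged-variable setup described above: pandas can construct the lagged weather columns that would feed a penalized (e.g., Poisson) GLM. The column names, lag lengths, and synthetic values are illustrative assumptions, not the hospital data.

```python
# Sketch: building lagged weather predictors for a penalized (Poisson) GLM.
import numpy as np
import pandas as pd

rng = np.random.default_rng(8)
days = pd.date_range("2015-01-01", periods=365, freq="D")
df = pd.DataFrame({
    "humidity": rng.uniform(30, 90, size=365),
    "mean_temp": rng.normal(10, 8, size=365),
    "admissions": rng.poisson(5, size=365),        # daily pneumonia admissions
}, index=days)

# Lagged copies of each weather variable (delayed effect on pneumonia)
for col in ["humidity", "mean_temp"]:
    for lag in (1, 3, 7):
        df[f"{col}_lag{lag}"] = df[col].shift(lag)

df = df.dropna()                                   # drop rows lost to lagging
X = df.drop(columns="admissions")                  # candidate predictors for a penalized GLM
print(X.columns.tolist())
```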

Modelling the deflection of reinforced concrete beams using the improved artificial neural network by imperialist competitive optimization

  • Li, Ning;Asteris, Panagiotis G.;Tran, Trung-Tin;Pradhan, Biswajeet;Nguyen, Hoang
    • Steel and Composite Structures / v.42 no.6 / pp.733-745 / 2022
  • This study proposed a robust artificial intelligence (AI) model, abbreviated as the ICA-ANN model, that combines the social behaviour of the imperialist competitive algorithm (ICA) with an artificial neural network (ANN) for modelling the deflection of reinforced concrete beams. Accordingly, the ICA was used to adjust and optimize the parameters of an ANN model (i.e., weights and biases), aiming to improve the accuracy of the ANN model in modelling the deflection of reinforced concrete beams. A total of 120 experimental datasets of reinforced concrete beams were employed for this aim. Therein, the applied load, tensile reinforcement strength and reinforcement percentage were used to simulate the deflection of reinforced concrete beams. In addition, five other AI models, namely ANN, SVM (support vector machine), GLMNET (lasso and elastic-net regularized generalized linear models), CART (classification and regression tree) and KNN (k-nearest neighbours), were also used for a comprehensive assessment of the proposed model (i.e., ICA-ANN). The comparison of the derived results with the experimental findings demonstrates that, among the developed models, the ICA-ANN model approximates the deflection of reinforced concrete beams in the most reliable and robust manner.
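
The imperialist competitive algorithm is not reproduced here; as a heavily simplified stand-in for the idea of tuning ANN weights and biases with a population-based search instead of backpropagation, the sketch below repeatedly moves a population of candidate weight vectors toward the best one for a tiny one-hidden-layer network. Network size, data, and search settings are assumptions.

```python
# Simplified stand-in for ICA-ANN: population-based search over the weights of
# a tiny one-hidden-layer network (not the actual imperialist competitive algorithm).
import numpy as np

rng = np.random.default_rng(9)
n, d, hidden = 120, 3, 8                     # e.g., load, steel strength, reinforcement %
X = rng.normal(size=(n, d))
y = 0.7 * X[:, 0] + 0.3 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=n)  # toy "deflection"

n_w = d * hidden + hidden + hidden + 1       # total number of weights and biases

def predict(w, X):
    W1 = w[:d * hidden].reshape(d, hidden)
    b1 = w[d * hidden:d * hidden + hidden]
    W2 = w[d * hidden + hidden:-1]
    b2 = w[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def mse(w):
    return np.mean((predict(w, X) - y) ** 2)

pop = rng.normal(scale=0.5, size=(40, n_w))  # initial candidate weight vectors
for _ in range(300):
    best = pop[np.argmin([mse(w) for w in pop])]
    # move every candidate toward the current best and add random exploration
    pop = best + 0.5 * (pop - best) + rng.normal(scale=0.05, size=pop.shape)
print("best training MSE:", round(min(mse(w) for w in pop), 4))
```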