• Title/Summary/Keyword: Optimal Variable Selection


Ensemble variable selection using genetic algorithm

  • Seogyoung, Lee;Martin Seunghwan, Yang;Jongkyeong, Kang;Seung Jun, Shin
    • Communications for Statistical Applications and Methods / v.29 no.6 / pp.629-640 / 2022
  • Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. Best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection problem via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve the variable selection performance, we propose running multiple GAs to solve the best subset selection problem and then synthesizing the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially best subset selection and is hence applicable to a variety of models with different selection criteria. We compare the proposed EGA to existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.
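
The idea in this abstract (run a GA several times over variable-inclusion masks, then combine the runs) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tournament selection, uniform crossover, elitism, and the majority-vote rule used to synthesize the runs are all assumptions, since the abstract does not specify them. The function names `run_ga` and `ensemble_ga` and the user-supplied `criterion` callable are hypothetical.

```python
import random

def run_ga(criterion, n_vars, pop_size=30, generations=40,
           mutation_rate=0.05, rng=None):
    """Minimize criterion(mask) over 0/1 inclusion masks with a simple GA."""
    rng = rng or random.Random()
    # Random initial population of inclusion masks.
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    best = min(pop, key=criterion)
    for _ in range(generations):
        def tournament():
            # Binary tournament: the better of two random masks wins.
            a, b = rng.sample(pop, 2)
            return a if criterion(a) <= criterion(b) else b
        children = [best[:]]  # elitism (assumed): keep the best mask
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Uniform crossover, then bit-flip mutation.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = min(pop, key=criterion)
    return best

def ensemble_ga(criterion, n_vars, n_runs=10, threshold=0.5, seed=0):
    """Run the GA n_runs times; keep variables chosen in more than
    `threshold` of the runs (majority vote is an assumed synthesis rule)."""
    rng = random.Random(seed)
    masks = [run_ga(criterion, n_vars, rng=rng) for _ in range(n_runs)]
    freq = [sum(m[j] for m in masks) / n_runs for j in range(n_vars)]
    selected = [j for j, f in enumerate(freq) if f > threshold]
    return selected, freq
```

In practice `criterion` would be a model selection criterion evaluated on the model fitted with the masked predictors (for example an information criterion for a linear, Poisson, or Cox model), which is what would make such a sketch model-agnostic.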

Optimal Variable Selection in a Thermal Error Model for Real Time Error Compensation (실시간 오차 보정을 위한 열변형 오차 모델의 최적 변수 선택)

  • Hwang, Seok-Hyun;Lee, Jin-Hyeon;Yang, Seung-Han
    • Journal of the Korean Society for Precision Engineering / v.16 no.3 s.96 / pp.215-221 / 1999
  • The goal of a thermal error compensation system in machine tools is to improve machine accuracy through real-time error compensation. The accuracy of the machine tool depends entirely on the accuracy of the thermal error model, which is obtained from an appropriate combination of temperature variables. The proposed method for optimal variable selection in the thermal error model is based on correlation grouping and successive regression analysis. Correlation grouping mitigates collinearity, and a judgment function that minimizes the residual mean square is used to select variables. The resulting linear model is more robust against measurement noise than an engineering-judgment model that includes higher-order terms of the variables. The proposed method is well suited to real-time error compensation because it reduces computation time while retaining sufficient model accuracy and robustness.
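
The correlation grouping step mentioned in this abstract might look roughly like the sketch below. The greedy rule, the threshold of 0.9, and the names `pearson` and `correlation_groups` are assumptions made for illustration; the abstract does not describe the actual grouping rule, and the successive regression step is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_groups(series, threshold=0.9):
    """Greedy grouping (assumed rule): a variable joins the first group
    whose first member it correlates with at |r| >= threshold;
    otherwise it starts a new group."""
    groups = []
    for j, s in enumerate(series):
        for g in groups:
            if abs(pearson(series[g[0]], s)) >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])
    return groups
```

Taking one candidate variable per group before regression is one way to keep highly collinear temperature readings out of the same model.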


On an Optimal Bayesian Variable Selection Method for Generalized Logit Model

  • Kim, Hea-Jung;Lee, Ae Kuoung
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.617-631 / 2000
  • This paper proposes a Bayesian method for variable selection in the generalized logit model. It is based on the Laplace-Metropolis algorithm, which provides a simple way to estimate the marginal likelihood of the model. The algorithm leads to a variable selection criterion: find the subset of variables that maximizes the marginal likelihood of the model. This criterion is a Bayes rule in the sense that it minimizes the risk of variable selection under the 0-1 loss function. The proposed method is illustrated with two examples and compared with existing frequentist methods.
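
For orientation, the Laplace-Metropolis estimator of a marginal likelihood is usually written as below. The notation is an assumption and is not taken from the paper: $\theta^{*}$ is the posterior mode and $\hat{\Sigma}$ the posterior covariance matrix, both estimated from Metropolis output, and $d$ is the number of parameters in the candidate model $M$.

```latex
\hat{m}(y \mid M) \;=\; (2\pi)^{d/2}\, \lvert \hat{\Sigma} \rvert^{1/2}\,
  p(y \mid \theta^{*}, M)\, \pi(\theta^{*} \mid M)
```

Under this reading, the selection criterion amounts to choosing the subset of variables whose model $M$ gives the largest $\hat{m}(y \mid M)$.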


Penalized rank regression estimator with the smoothly clipped absolute deviation function

  • Park, Jong-Tae;Jung, Kang-Mo
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.673-683 / 2017
  • The least absolute shrinkage and selection operator (LASSO) has been a popular regression estimator with simultaneous variable selection. However, LASSO does not have the oracle property, and a robust version is needed in the case of heavy-tailed errors or serious outliers. We propose a robust penalized regression estimator that provides simultaneous variable selection and estimation. It is based on rank regression and a non-convex penalty function, the smoothly clipped absolute deviation (SCAD) function, which has the oracle property. The proposed method combines the robustness of rank regression with the oracle property of the SCAD penalty. We develop an efficient algorithm to compute the proposed estimator that includes a SCAD estimate based on the local linear approximation and the tuning parameter of the penalty function. Our estimate can be obtained by the least absolute deviation method. We select the tuning parameter using the Bayesian information criterion and cross-validation. Numerical simulation shows that the proposed estimator is robust and effective for analyzing contaminated data.
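
The SCAD penalty and the weights used by a local linear approximation can be written out as a short sketch. The formulas are the standard SCAD definition with the conventional default a = 3.7; the function names are hypothetical, and the rank-regression loss used in the paper is not shown.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty p_lam(|beta|) for lam > 0 and a > 2."""
    t = abs(beta)
    if t <= lam:
        return lam * t                       # LASSO-like near zero
    if t <= a * lam:
        # Quadratic piece that smoothly flattens the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2           # constant for large |beta|

def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |beta|."""
    t = abs(beta)
    if t <= lam:
        return lam
    if t <= a * lam:
        return (a * lam - t) / (a - 1)
    return 0.0

def lla_weights(beta_init, lam, a=3.7):
    """Local linear approximation: each |beta_j| gets the weight
    p'_lam(|beta_j^(0)|) evaluated at an initial estimate."""
    return [scad_derivative(b, lam, a) for b in beta_init]
```

Because the derivative is zero for large coefficients, the weighted L1 problem leaves large effects unpenalized, which is the intuition behind the oracle property mentioned in the abstract.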

Variable selection in censored kernel regression

  • Choi, Kook-Lyeol;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.201-209 / 2013
  • For censored regression, it is often the case that some input variables are not important, while some input variables are more important than others. We propose a novel algorithm for selecting such important input variables for censored kernel regression. It is based on penalized regression with a weighted quadratic loss function for the censored data, where the weight is computed from the empirical survival function of the censoring variable. We employ the weighted version of ANOVA decomposition kernels to choose the optimal subset of important input variables. Experimental results are then presented that illustrate the performance of the proposed variable selection method.
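
One common way to compute weights from the survival function of the censoring variable is inverse-probability-of-censoring weighting with a Kaplan-Meier estimate. The sketch below assumes that reading; the abstract does not give the exact weight formula, and the function name `censoring_survival_weights` is hypothetical.

```python
def censoring_survival_weights(times, events):
    """Return w_i = events[i] / G(times[i]-), where G is the Kaplan-Meier
    estimate of the censoring survival function.
    events[i] is 1 if the response was observed and 0 if it was censored."""
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    weights = [0.0] * n
    g = 1.0        # G just before the time currently being processed
    at_risk = n    # observations with time >= current time
    k = 0
    while k < n:
        t = times[order[k]]
        tied = []
        while k < n and times[order[k]] == t:
            tied.append(order[k])
            k += 1
        # Observed responses are weighted by 1 / G(t-); censored ones get 0.
        for i in tied:
            if events[i] == 1:
                weights[i] = 1.0 / g
        # Censorings at t act as the "events" of the censoring distribution.
        censored = sum(1 for i in tied if events[i] == 0)
        g *= 1.0 - censored / at_risk
        at_risk -= len(tied)
    return weights
```

These weights would then multiply the squared residuals in the quadratic loss, so that observed responses stand in for censored ones with similar follow-up.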

Optimal Parameter Selection of Power System Stabilizer using Genetic Algorithm (유전 알고리즘을 이용한 전력시스템 안정화 장치의 최적 파라미터 선정)

  • Chung, Hyeng-Hwan;Wang, Yong-Peel;Chung, Dong-Il;Chung, Mun-Kyu
    • The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.6 / pp.683-691 / 1999
  • This paper proposes a method for selecting the optimal parameters of a power system stabilizer (PSS) that is robust to low-frequency oscillation in power systems, using a Real Variable Elitism Genetic Algorithm (RVEGA). The optimal parameters were selected for a PSS with one lead compensator and for a PSS with two lead compensators. The frequency response characteristics of the PSS, the system eigenvalue criterion, and the dynamic characteristics were examined under normal and heavy load, demonstrating the usefulness of RVEGA compared with Yu's compensator design theory.


Laplace-Metropolis Algorithm for Variable Selection in Multinomial Logit Model (Laplace-Metropolis알고리즘에 의한 다항로짓모형의 변수선택에 관한 연구)

  • 김혜중;이애경
    • Journal of Korean Society for Quality Management / v.29 no.1 / pp.11-23 / 2001
  • This paper proposes a Bayesian method for variable selection in the multinomial logit model. It is based on an optimal rule, derived as a Bayes rule, that minimizes the risk induced by selecting the multinomial logit model. The rule is to find the subset of variables that maximizes the marginal likelihood of the model. We also propose a Laplace-Metropolis algorithm that provides a simple method for estimating the marginal likelihood of the model. The Bayesian method is illustrated and its efficiency examined using two examples, one with artificial data and one with empirical data.


Variable selection in the kernel Cox regression

  • Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.795-801 / 2011
  • In machine learning and statistics, it is often the case that some variables are not important, while some variables are more important than others. We propose a novel algorithm for selecting such relevant variables in kernel Cox regression. We employ the weighted version of ANOVA decomposition kernels to choose the optimal subset of relevant variables. Experimental results are then presented that illustrate the performance of the proposed method.

A Study on the Selection of Optimal Spot-weld Pitch for The Stainless Steel Car-body (스텐레스 차체 스폿용접부의 최적 피치 선정에 관한 연구)

  • 서승일;차병우
    • Proceedings of the KSR Conference / 1998.11a / pp.560-567 / 1998
  • The spot-weld pitch is an important variable for both the production cost and the strength of a car body. Pitch selection has been studied from various angles; in this paper, buckling analysis is carried out for two-sheet panel structures. The optimal pitch, which can enhance the buckling strength, is obtained using an optimization program together with FEM.


An Propagation Path Analysis for Optimal Position Selection of Microcell Base Station in the Mobile Communication System (이동통신 마이크로셀 기지국의 최적 위치 선정을 위한 전파경로 해석)

  • 노순국;박창균
    • The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.92-100 / 1999
  • For microcell mobile communication, we propose algorithms that accurately analyze the propagation environment from the base station to mobile stations. The algorithms, developed using the triangle analysis method, can handle variable propagation paths and numbers of reflections. For simulation, we assume that mobile stations are located in the shadow region of the line-of-sight area and in the non-line-of-sight area that slopes away from the line-of-sight area at variable angles. The simulation results obtained with the proposed algorithms can be applied to selecting the optimal position of the base station in microcell mobile communication.
