• Title/Summary/Keyword: Statistical Selection Method

Search results: 494

Simulation Optimization with Statistical Selection Method

  • Kim, Ju-Mi
    • Management Science and Financial Engineering
    • /
    • v.13 no.1
    • /
    • pp.1-24
    • /
    • 2007
  • I propose new combined randomized methods for global optimization problems. These methods are based on the Nested Partitions (NP) method, a useful simulation-optimization method that guarantees a globally optimal solution but has several shortcomings. To overcome these shortcomings, I employed various statistical selection methods and combined them with the NP method. I first explain the NP method and statistical selection methods, then present a detailed description of the proposed combined methods and show the results of an application. I also show how these combined methods can be applied when the computing budget is limited.
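The abstract above combines statistical selection with the Nested Partitions (NP) framework. As a rough illustration of the NP skeleton only (split the promising region, sample each subregion and the surrounding region, move or backtrack), here is a minimal sketch on a finite one-dimensional problem. The binary partitioning scheme, the sample sizes, and the deterministic objective are all simplifying assumptions; the paper's statistical selection refinements are not reproduced.

```python
import random

def nested_partitions(domain, f, iters=200, samples_per_region=5, seed=0):
    """Minimal Nested Partitions sketch for minimizing f over the finite
    list `domain`. Each iteration splits the current most-promising region
    in two, samples points from each subregion and from the surrounding
    region (the complement of the current region), and moves to whichever
    region produced the best sampled value; a win for the surrounding
    region triggers a backtrack to the whole domain."""
    rng = random.Random(seed)
    domain = list(domain)
    region = list(domain)            # current most-promising region
    best_x = rng.choice(region)      # incumbent solution
    for _ in range(iters):
        mid = max(1, len(region) // 2)
        subregions = [region[:mid], region[mid:]] if len(region) > 1 else [region]
        complement = [x for x in domain if x not in region]
        candidates = subregions + ([complement] if complement else [])
        scored = []
        for r in candidates:
            pts = [rng.choice(r) for _ in range(samples_per_region)]
            scored.append((min(f(p) for p in pts), r))
            for p in pts:            # keep the best point seen so far
                if f(p) < f(best_x):
                    best_x = p
        _, winner = min(scored, key=lambda t: t[0])
        region = list(domain) if winner is complement else winner
    return best_x
```

In the full method, comparing candidate regions by a single sample minimum is exactly where a statistical selection procedure would be substituted, since simulated objective values are noisy.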

A study on bandwidth selection based on ASE for nonparametric density estimators

  • Kim, Tae-Yoon
    • Journal of the Korean Statistical Society
    • /
    • v.29 no.3
    • /
    • pp.307-313
    • /
    • 2000
  • Suppose we have a set of data X1, ···, Xn and employ a kernel density estimator to estimate the marginal density of X. In this article the bandwidth selection problem for kernel density estimators is examined closely. In particular, the Kullback-Leibler method (a bandwidth selection method based on average square error (ASE)) is considered.

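Bandwidth selection by minimizing an error criterion, as in the abstract above, can be sketched with least-squares cross-validation for a Gaussian-kernel density estimator. Note this is a generic ASE-style criterion chosen for illustration, not the paper's Kullback-Leibler construction, and all function names are made up.

```python
import numpy as np

def gauss(u, s):
    """Gaussian kernel with standard deviation s."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2 * np.pi))

def kde(x, h, t):
    """Kernel density estimate from data x with bandwidth h, evaluated at t."""
    return gauss(t[:, None] - x[None, :], h).mean(axis=1)

def lscv(x, h):
    """Least-squares cross-validation score: the integral of fhat^2 minus
    twice the average leave-one-out density, both available in closed form
    for the Gaussian kernel."""
    n = len(x)
    d = x[:, None] - x[None, :]
    int_f2 = gauss(d, h * np.sqrt(2)).sum() / n ** 2
    loo = (gauss(d, h).sum(axis=1) - gauss(0.0, h)) / (n - 1)
    return int_f2 - 2 * loo.mean()

def select_bandwidth(x, grid):
    """Pick the bandwidth in `grid` minimizing the LSCV score."""
    return min(grid, key=lambda h: lscv(x, h))
```

The closed form uses the fact that the convolution of two Gaussian kernels with bandwidth h is a Gaussian kernel with bandwidth h times the square root of two.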

Local Bandwidth Selection for Nonparametric Regression

  • Lee, Seong-Woo;Cha, Kyung-Joon
    • Communications for Statistical Applications and Methods
    • /
    • v.4 no.2
    • /
    • pp.453-463
    • /
    • 1997
  • Nonparametric kernel regression has recently gained widespread acceptance as an attractive method for the nonparametric estimation of the mean function from noisy regression data. The practical implementation of the kernel method is also enhanced by the availability of a reliable rule for automatic selection of the bandwidth. In this article, we propose a method for automatic selection of the bandwidth that minimizes the asymptotic mean square error. The bandwidth estimated by the proposed method is then compared with the theoretical optimal bandwidth and a bandwidth obtained by the plug-in method. A simulation study is performed and shows the satisfactory behavior of the proposed method.

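Automatic bandwidth selection for kernel regression, as described above, can be sketched with a Nadaraya-Watson estimator whose bandwidth is chosen from a grid. The criterion below is leave-one-out cross-validation, substituted for the paper's asymptotic mean-square-error rule purely for illustration; all function names are made up.

```python
import numpy as np

def nw(x, y, h, t):
    """Nadaraya-Watson kernel regression estimate of E[y|x] at points t."""
    w = np.exp(-0.5 * ((t[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def loo_cv(x, y, h):
    """Leave-one-out cross-validation score for bandwidth h: each point is
    predicted from a kernel-weighted average of all other points."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)
    yhat = (w * y).sum(axis=1) / w.sum(axis=1)
    return float(np.mean((y - yhat) ** 2))

def select_h(x, y, grid):
    """Pick the bandwidth in `grid` minimizing the CV score."""
    return min(grid, key=lambda h: loo_cv(x, y, h))
```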

On the Bias of Bootstrap Model Selection Criteria

  • Lee, Kee-Won;Sim, Songyong
    • Journal of the Korean Statistical Society
    • /
    • v.25 no.2
    • /
    • pp.195-203
    • /
    • 1996
  • A bootstrap method is used to correct the apparent downward bias of a naive plug-in model selection criterion; the corrected criterion is shown to enjoy a high degree of accuracy. A comparison of the bootstrap method with the asymptotic method is made through an illustrative example.

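The idea of correcting the downward bias of a plug-in criterion with the bootstrap can be illustrated with an Efron-style optimism estimate for polynomial regression. This standard construction is used here only to convey the idea; the paper's exact criterion and accuracy results are not reproduced.

```python
import numpy as np

def fit_poly(x, y, deg):
    """Least-squares polynomial fit of the given degree."""
    return np.polyfit(x, y, deg)

def mse(coef, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coef, x) - y) ** 2))

def corrected_criterion(x, y, deg, B=100, seed=0):
    """Naive plug-in criterion (in-sample MSE) plus a bootstrap estimate of
    its downward bias: for each resample, refit the model and measure how
    much the resample's in-sample MSE understates the error on the
    original data (the optimism), then add the average optimism back."""
    rng = np.random.default_rng(seed)
    n = len(x)
    naive = mse(fit_poly(x, y, deg), x, y)
    opt = 0.0
    for _ in range(B):
        idx = rng.integers(0, n, n)
        c = fit_poly(x[idx], y[idx], deg)
        opt += mse(c, x, y) - mse(c, x[idx], y[idx])
    return naive + opt / B
```

Minimizing the corrected criterion over candidate degrees avoids the naive criterion's automatic preference for the most complex model.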

Discretization Method Based on Quantiles for Variable Selection Using Mutual Information

  • Cha, Woon-Ock;Huh, Moon-Yul
    • Communications for Statistical Applications and Methods
    • /
    • v.12 no.3
    • /
    • pp.659-672
    • /
    • 2005
  • This paper evaluates the discretization of continuous variables for selecting relevant variables for supervised learning using mutual information. Three discretization methods, MDL, Histogram, and 4-Intervals, are considered. The discretization and variable subset selection process is evaluated according to classification accuracies on 6 real data sets from the UCI databases. The results show that the 4-Intervals discretization method, based on quantiles, is robust and efficient for the variable selection process. We also visually evaluate the appropriateness of the selected subset of variables.
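The quantile-based discretization plus mutual-information ranking described above can be sketched directly: cut each continuous variable into equal-frequency intervals (four, for the 4-Intervals method) and score it by its empirical mutual information with the class label. The exact MDL and Histogram alternatives are not reproduced.

```python
import numpy as np

def quantile_discretize(x, k=4):
    """Cut x into k equal-frequency intervals; returns bin labels 0..k-1."""
    qs = np.quantile(x, np.linspace(0, 1, k + 1)[1:-1])
    return np.searchsorted(qs, x)

def mutual_information(a, b):
    """Empirical mutual information (in nats) between two discrete label
    arrays, computed from the joint frequency table."""
    a_vals, a_idx = np.unique(a, return_inverse=True)
    b_vals, b_idx = np.unique(b, return_inverse=True)
    joint = np.zeros((len(a_vals), len(b_vals)))
    np.add.at(joint, (a_idx, b_idx), 1.0)
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())
```

Variable selection then amounts to keeping the columns whose discretized values carry the most mutual information with the class.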

Bandwidth Selection for Local Smoothing Jump Detector

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.6
    • /
    • pp.1047-1054
    • /
    • 2009
  • The local smoothing jump detection procedure is a popular method for detecting jump locations, and the performance of the jump detector depends heavily on the choice of bandwidth. However, little work has been done on this issue. In this paper, we propose a bootstrap bandwidth selection method that can be used with any kernel-based or local polynomial-based jump detector. The proposed bandwidth selection method is fully data-adaptive, and its performance is evaluated through a simulation study and a real data example.
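The bootstrap bandwidth selection procedure itself is beyond a short sketch, but the underlying local smoothing jump detector is simple: at each candidate location, compare kernel-weighted averages of the response taken just to the right and just to the left. The Gaussian one-sided weights and the fixed bandwidth below are illustrative assumptions.

```python
import numpy as np

def jump_statistic(x, y, t, h):
    """Difference between the local average of y just right of t and just
    left of t, using one-sided Gaussian kernel weights with bandwidth h."""
    d = x - t
    k = np.exp(-0.5 * (d / h) ** 2)
    wr, wl = k * (d > 0), k * (d < 0)
    return (wr * y).sum() / wr.sum() - (wl * y).sum() / wl.sum()

def detect_jump(x, y, h, grid):
    """Return the candidate location where the jump statistic is largest
    in absolute value."""
    stats = np.array([jump_statistic(x, y, t, h) for t in grid])
    return grid[np.argmax(np.abs(stats))]
```

The abstract's point is that the quality of this detector hinges on h, which is exactly what their bootstrap procedure selects.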

Ensemble variable selection using genetic algorithm

  • Lee, Seogyoung;Yang, Martin Seunghwan;Kang, Jongkyeong;Shin, Seung Jun
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.6
    • /
    • pp.629-640
    • /
    • 2022
  • Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. Best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve variable selection performance, we propose running multiple GAs to solve the best subset selection and then synthesizing the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially best subset selection and hence applicable to a variety of models with different selection criteria. We compare the proposed EGA to existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.
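A GA over binary inclusion masks, run several times and synthesized by majority vote, can be sketched for linear regression with BIC as the selection criterion. The operators below (truncation selection, uniform crossover, bitwise mutation, majority-vote synthesis) are common GA choices made for illustration and are not claimed to match the paper's configuration.

```python
import numpy as np

def bic(X, y, mask):
    """BIC of the least-squares fit using the columns selected by mask."""
    n = len(y)
    if mask.sum() == 0:
        rss, k = ((y - y.mean()) ** 2).sum(), 1
    else:
        Xs = np.column_stack([np.ones(n), X[:, mask]])
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss, k = ((y - Xs @ beta) ** 2).sum(), Xs.shape[1]
    return n * np.log(rss / n) + k * np.log(n)

def ga_subset(X, y, pop=40, gens=60, pmut=0.1, seed=0):
    """One GA run over binary inclusion masks; fitness is -BIC. The top
    half of each generation survives, and children come from uniform
    crossover of random parents plus bitwise mutation."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    P = rng.random((pop, p)) < 0.5
    for _ in range(gens):
        fit = np.array([-bic(X, y, m) for m in P])
        parents = P[np.argsort(-fit)[: pop // 2]]
        children = []
        while len(children) < pop - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(p) < 0.5, a, b)
            child ^= rng.random(p) < pmut
            children.append(child)
        P = np.vstack([parents, children])
    fit = np.array([-bic(X, y, m) for m in P])
    return P[np.argmax(fit)]

def ensemble_ga(X, y, runs=5):
    """EGA-style synthesis sketch: run the GA several times with different
    seeds and keep the variables selected by a majority of runs."""
    masks = np.array([ga_subset(X, y, seed=s) for s in range(runs)])
    return masks.mean(axis=0) >= 0.5
```

Because the search operates directly on subsets, swapping BIC for another criterion (e.g., a Poisson or Cox partial-likelihood analogue) changes only the fitness function.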

Robust Variable Selection in Classification Tree

  • Jang Jeong Yee;Jeong Kwang Mo
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2001.11a
    • /
    • pp.89-94
    • /
    • 2001
  • In this study we focus on variable selection in the decision tree growing process. Some splitting rules and variable selection algorithms are discussed. We propose a competitive variable selection method based on the Kruskal-Wallis test, a nonparametric version of the ANOVA F-test. Through a Monte Carlo study we note that CART has a serious variable selection bias toward categorical variables with many values, and that QUEST using the F-test is not powerful enough to select informative variables under heavy-tailed distributions.

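Ranking candidate split variables by the Kruskal-Wallis statistic, as proposed above, can be sketched directly. The H statistic below omits the tie correction (fine for continuous data), and the ranking function is a made-up name for illustration.

```python
import numpy as np

def kruskal_wallis(groups):
    """Kruskal-Wallis H statistic for a list of 1-D sample arrays
    (no tie correction; adequate for continuous measurements)."""
    pooled = np.concatenate(groups)
    ranks = np.empty(len(pooled))
    ranks[np.argsort(pooled)] = np.arange(1, len(pooled) + 1)
    N = len(pooled)
    h, start = 0.0, 0
    for g in groups:
        r = ranks[start:start + len(g)]
        h += len(g) * (r.mean() - (N + 1) / 2) ** 2
        start += len(g)
    return 12.0 / (N * (N + 1)) * h

def kw_rank_variables(X, y):
    """Rank the columns of X by class separation under the Kruskal-Wallis
    statistic (best first), a rank-based and hence heavy-tail-robust
    alternative to the ANOVA F-test for choosing a split variable."""
    classes = np.unique(y)
    stats = [kruskal_wallis([X[y == c, j] for c in classes])
             for j in range(X.shape[1])]
    return np.argsort(stats)[::-1]
```

Because the statistic depends only on ranks, a few extreme observations cannot dominate the comparison, which is the robustness property the abstract appeals to.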

A Study on the Bias Reduction in Split Variable Selection in CART

  • Song, Hyo-Im;Song, Eun-Tae;Song, Moon Sup
    • Communications for Statistical Applications and Methods
    • /
    • v.11 no.3
    • /
    • pp.553-562
    • /
    • 2004
  • In this short communication we discuss the split variable selection bias of CART and suggest a method to reduce it. Penalties proportional to the number of categories or distinct values are applied to the splitting criteria of CART. Empirical comparisons show that the proposed modification of CART reduces the bias in variable selection.
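The penalization idea above can be sketched for binary Gini splits: a variable with many distinct values gets many chances to find a spuriously good split, so its gain is discounted by a term proportional to its number of distinct values. The penalty form and the constant `lam` below are illustrative, not the paper's calibrated penalty.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_split_gain(x, y):
    """Best Gini reduction over binary splits x <= t, with t ranging over
    the variable's distinct values."""
    base, best = gini(y), 0.0
    for t in np.unique(x)[:-1]:
        left = x <= t
        g = base - (left.mean() * gini(y[left]) + (~left).mean() * gini(y[~left]))
        best = max(best, g)
    return best

def select_split_variable(X, y, lam=0.5):
    """Pick the column maximizing penalized gain; the penalty, proportional
    to the number of distinct values, counteracts the selection bias toward
    many-valued variables."""
    n = len(y)
    scores = [best_split_gain(X[:, j], y)
              - lam * (len(np.unique(X[:, j])) - 1) / n
              for j in range(X.shape[1])]
    return int(np.argmax(scores))
```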

A Novel Statistical Feature Selection Approach for Text Categorization

  • Fattah, Mohamed Abdel
    • Journal of Information Processing Systems
    • /
    • v.13 no.5
    • /
    • pp.1397-1409
    • /
    • 2017
  • For the text categorization task, selecting distinctive text features is important due to the high dimensionality of the feature space. Reducing the feature space dimension decreases processing time and increases accuracy. In this study, we introduce a novel statistical feature selection approach for text categorization. This approach measures the term distribution across all documents in the collection, the term distribution within a certain category, and the term distribution in a certain class relative to other classes. The results show the proposed method's superiority over traditional feature selection methods.
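Scoring terms by comparing their in-class and out-of-class document distributions, in the spirit of the abstract above, can be sketched as follows. The smoothed-frequency score used here is a generic stand-in; the paper's three specific distribution measures and their combination are not reproduced, and `term_scores` is a made-up name.

```python
import numpy as np

def term_scores(docs, labels, target):
    """Score each term for class `target`: high when the term appears in
    many target-class documents but few documents of other classes.
    Document frequencies are smoothed to avoid division by zero."""
    vocab = sorted({w for d in docs for w in set(d.split())})
    idx = {w: i for i, w in enumerate(vocab)}
    n_in = sum(1 for l in labels if l == target)
    n_out = len(docs) - n_in
    df_in, df_out = np.zeros(len(vocab)), np.zeros(len(vocab))
    for d, l in zip(docs, labels):
        for w in set(d.split()):
            (df_in if l == target else df_out)[idx[w]] += 1
    p_in = (df_in + 0.5) / (n_in + 1)    # smoothed in-class doc frequency
    p_out = (df_out + 0.5) / (n_out + 1)
    scores = p_in * np.log(p_in / p_out)
    return dict(zip(vocab, scores))
```

Keeping only the top-scoring terms per category shrinks the feature space before training the classifier, which is the dimensionality reduction the abstract motivates.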