• Title/Abstract/Keywords: Selection Methods

Search results: 4,056 items (processing time: 0.032 s)

Variable Selection in Sliced Inverse Regression Using Generalized Eigenvalue Problem with Penalties

  • Park, Chong-Sun
    • Communications for Statistical Applications and Methods / Vol. 14, No. 1 / pp. 215-227 / 2007
  • A variable selection algorithm for sliced inverse regression (SIR) using penalty functions is proposed. We note that SIR models can be expressed as generalized eigenvalue decompositions and incorporate penalty functions into them. A small simulation suggests that the HARD penalty function preserves the original directions better than other well-known penalty functions. It also proved effective in forcing the coefficient estimates of irrelevant predictors to zero in regression analysis. Results from illustrative examples with simulated and real data sets are provided.

Local Bandwidth Selection for Nonparametric Regression

  • Lee, Seong-Woo;Cha, Kyung-Joon
    • Communications for Statistical Applications and Methods / Vol. 4, No. 2 / pp. 453-463 / 1997
  • Nonparametric kernel regression has recently gained widespread acceptance as an attractive method for estimating the mean function from noisy regression data. The practical implementation of the kernel method is also enhanced by the availability of a reliable rule for automatic bandwidth selection. In this article, we propose a method for automatic selection of the bandwidth that minimizes the asymptotic mean square error. The bandwidth estimated by the proposed method is then compared with the theoretical optimal bandwidth and with a bandwidth obtained by the plug-in method. A simulation study shows satisfactory behavior of the proposed method.
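
To make the role of the bandwidth concrete, the following is a minimal sketch of a Nadaraya-Watson kernel regression estimate with a Gaussian kernel. It does not implement the bandwidth selection rule proposed in the article, whose details are not given in the abstract; the function name `nadaraya_watson`, the Gaussian kernel, and the toy data are assumptions made for illustration only.

```python
import math


def nadaraya_watson(x0, xs, ys, h):
    """Gaussian-kernel Nadaraya-Watson estimate of the mean function at x0.

    h is the bandwidth: a small h averages only nearby points,
    a large h averages almost all points with equal weight.
    """
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase h")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical noise-free toy data on a grid, with true mean function m(x) = x^2.
xs = [i / 10 for i in range(11)]
ys = [x * x for x in xs]

narrow = nadaraya_watson(0.5, xs, ys, h=0.05)   # local average near x = 0.5
wide = nadaraya_watson(0.5, xs, ys, h=100.0)    # close to the global mean of ys
```

With the narrow bandwidth the estimate stays close to the true value 0.25, while the very wide bandwidth collapses toward the overall mean of the responses. This is the bias side of the trade-off that any bandwidth selector has to balance against variance.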


Efficient Controlled Selection

  • Ryu, Jea-Bok;Lee, Seung-Joo
    • Communications for Statistical Applications and Methods / Vol. 4, No. 1 / pp. 151-159 / 1997
  • In sample surveys, we prefer to select samples that reduce the survey cost and increase the precision of the estimators. Goodman and Kish (1950) introduced controlled selection as a method of sample selection that increases the probability of drawing preferred samples while decreasing the probability of drawing nonpreferred samples. In this paper, we obtain controlled plans using the maximum entropy principle and, for the case in which the order of the nonpreferred samples is considered, propose an algorithm to obtain a controlled plan.


Bias Reduction in Split Variable Selection in C4.5

  • Shin, Sung-Chul;Jeong, Yeon-Joo;Song, Moon Sup
    • Communications for Statistical Applications and Methods / Vol. 10, No. 3 / pp. 627-635 / 2003
  • In this short communication we discuss the bias problem of C4.5 in split variable selection and suggest a method to reduce the variable selection bias among categorical predictor variables. A penalty proportional to the number of categories is applied to the splitting criterion gain of C4.5. The results of empirical comparisons show that the proposed modification of C4.5 reduces the size of classification trees.
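
The idea of penalizing the gain by the number of categories can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact criterion: the abstract does not give the form of the penalty, so the sketch uses plain information gain minus `lam` times the number of categories, where `lam`, `penalized_gain`, and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain from splitting on a categorical predictor."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def penalized_gain(values, labels, lam):
    """Assumed form: gain minus a penalty proportional to the category count."""
    return information_gain(values, labels) - lam * len(set(values))


# Hypothetical toy data: x_id has one category per row (an ID-like variable),
# x_bin is a two-category predictor that is informative but imperfect.
y = [0, 0, 0, 0, 1, 1, 1, 1]
x_id = list(range(8))
x_bin = ["a", "a", "a", "a", "b", "b", "b", "a"]

raw_id, raw_bin = information_gain(x_id, y), information_gain(x_bin, y)
pen_id, pen_bin = penalized_gain(x_id, y, 0.1), penalized_gain(x_bin, y, 0.1)
```

On this toy data the unpenalized gain favors the ID-like variable, which is the selection bias toward many-category predictors; with the assumed penalty the two-category predictor is preferred instead.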

Selection of Data-adaptive Polynomial Order in Local Polynomial Nonparametric Regression

  • Jo, Jae-Keun
    • Communications for Statistical Applications and Methods / Vol. 4, No. 1 / pp. 177-183 / 1997
  • A data-adaptive order selection procedure is proposed for local polynomial nonparametric regression. For each given polynomial order, bias and variance are estimated, and the polynomial order with the smallest estimated mean squared error is selected locally at each location point. To estimate the mean squared error, the empirical bias estimate of Ruppert (1995) and the local polynomial variance estimate of Ruppert, Wand, Holst and Hossjer (1995) are used. Since the proposed method does not require fitting a polynomial model of order higher than the model order, it is simpler than the order selection method proposed by Fan and Gijbels (1995b).


Application of Genetic Algorithms to a Job Scheduling Problem

  • 김석준;이채영
    • Journal of the Korean Operations Research and Management Science Society / Vol. 17, No. 3 / pp. 1-12 / 1992
  • Parallel genetic algorithms (GAs) are developed to solve a single-machine n-job scheduling problem whose objective is to minimize the sum of absolute deviations of completion times from a common due date. A (0, 1) binary scheme is employed to represent the n-job schedule. Two selection methods, best individual selection and simple selection, are examined. The effects of the crossover operator, due date adjustment mutation, and due date adjustment reordering are discussed. The performance of the parallel genetic algorithm is illustrated with some example problems.
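
The objective described above can be sketched as follows. This is only the fitness evaluation, not the genetic algorithm or its binary encoding, neither of which is specified in the abstract; the function names, the assumption that the machine starts at time 0 with no idle time, and the three-job instance are hypothetical, and brute-force enumeration stands in for the search only because the example is tiny.

```python
from itertools import permutations


def total_absolute_deviation(processing_times, order, due_date):
    """Sum of |completion time - due date| over all jobs in the given order.

    Assumes the machine starts at time 0 and runs jobs back to back.
    """
    t = 0
    total = 0
    for job in order:
        t += processing_times[job]      # completion time of this job
        total += abs(t - due_date)
    return total


def best_order_brute_force(processing_times, due_date):
    """Exhaustive search over all job orders; feasible only for very small n."""
    jobs = range(len(processing_times))
    return min(
        permutations(jobs),
        key=lambda order: total_absolute_deviation(processing_times, order, due_date),
    )


# Hypothetical instance: three jobs with a common due date of 5.
p = [2, 3, 4]
d = 5
naive_cost = total_absolute_deviation(p, (0, 1, 2), d)   # completions 2, 5, 9
best = best_order_brute_force(p, d)
best_cost = total_absolute_deviation(p, best, d)
```

A genetic algorithm would use a function like `total_absolute_deviation` as the fitness to be minimized, replacing the exhaustive search with selection, crossover, and mutation over encoded schedules.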


Bandwidth Selection for Local Smoothing Jump Detector

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods / Vol. 16, No. 6 / pp. 1047-1054 / 2009
  • The local smoothing jump detection procedure is a popular method for detecting jump locations, and the performance of the jump detector depends heavily on the choice of bandwidth. However, little work has been done on this issue. In this paper, we propose a bootstrap bandwidth selection method that can be used with any kernel-based or local polynomial-based jump detector. The proposed bandwidth selection method is fully data-adaptive, and its performance is evaluated through a simulation study and a real data example.

Language-Independent Sentence Boundary Detection with Automatic Feature Selection

  • Lee, Do-Gil
    • Journal of the Korean Data and Information Science Society / Vol. 19, No. 4 / pp. 1297-1304 / 2008
  • This paper proposes a machine learning approach to language-independent sentence boundary detection. The proposed method requires neither heuristic rules nor language-specific features such as part-of-speech information or lists of abbreviations or proper names. Using only language-independent features, we perform experiments on both an inflectional language and an agglutinative language with fairly different characteristics (in this paper, English and Korean, respectively), and obtain good performance in both languages. We have also experimented with the methods under a wide range of experimental conditions, especially for the selection of useful features.


Nonparametric Selection Procedures and Their Efficiency Comparisons

  • Sohn, Joong-K.;Gupta, Shanti S.;Kim, Heon-Joo
    • Communications for Statistical Applications and Methods / Vol. 1, No. 1 / pp. 41-51 / 1994
  • We consider nonparametric procedures for selection and ranking problems. Tukey's generalized lambda distribution is considered as the distribution for the score function because it can approximate many well-known continuous distributions. We also compare these procedures in terms of efficiency, defined as the ratio of the probability of a correct selection to the expected size of the selected subset.
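
The efficiency measure described in words above can be written compactly as follows. The symbols are assumptions introduced here for illustration (the abstract gives no notation): $R$ denotes a selection procedure, $\mathrm{CS}$ the event of a correct selection, and $S$ the size of the selected subset.

```latex
\mathrm{EFF}(R) \;=\; \frac{P(\mathrm{CS} \mid R)}{E(S \mid R)}
```

Under this reading, a procedure is more efficient when it attains a high probability of correct selection while keeping the selected subset small.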


Selection on milk production and conformation traits during the last two decades in Japan

  • Togashi, Kenji;Osawa, Takefumi;Adachi, Kazunori;Kurogi, Kazuhito;Tokunaka, Kota;Yasumori, Takanori;Takahashi, Tsutomu;Moribe, Kimihiro
    • Asian-Australasian Journal of Animal Sciences / Vol. 32, No. 2 / pp. 183-191 / 2019
  • Objective: The purpose of this study was to compare intended and actual yearly genetic gains for milk production and conformation traits and to investigate the simple selection criterion practiced among milk production and conformation traits during the last two decades in Japan. Learning how to utilize the information on intended and actual genetic gains during the last two decades into the genomic era is vital. Methods: Genetic superiority for each trait for four paths of selection (sires to breed bulls [SB], sires to breed cows [SC], dams to breed bulls [DB], and dams to breed cows [DC]) was estimated. Actual practiced simple selection criteria were investigated among milk production and conformation traits and relative emphasis on milk production and conformation traits was compared. Results: Selection differentials in milk production traits were greater than those of conformation traits in all four paths of selection. Realized yearly genetic gain was less than that intended for milk production traits. Actual annual genetic gain for conformation traits was equivalent to or greater than intended. Retrospective selection weights of milk production and conformation traits were 0.73:0.27 and 0.56:0.44 for intended and realized genetic gains, respectively. Conclusion: Selection was aimed more toward increasing genetic gain in milk production than toward conformation traits over the past two decades in Japan. In contrast, actual annual genetic gain for conformation traits was equivalent to or greater than intended. Balanced selection between milk production and conformation traits tended to be favored during actual selection. Each of four paths of selection (SB, SC, DB, and DC) has played an individual and important role. With shortening generation interval in the genomic era, a young sire arises before the completion of sire's daughters' milk production records. How to integrate these four paths of selection in the genomic era is vital.