• Title/Summary/Keyword: Subset selection

Subset Selection Procedures for Weibull Populations

  • Kim, U-Cheol; Choe, Ji-Hun; Kim, Dong-Gi
    • Journal of Korean Society for Quality Management / v.11 no.2 / pp.18-24 / 1983
  • In this paper, subset selection procedures are proposed for selecting the Weibull population with the smallest scale parameter out of k Weibull populations with a common shape parameter. The proposed procedures are based on the maximum likelihood estimators. The constants to implement the procedures are tabulated using Monte Carlo methods. Also, the results of a comparison study are given.

Detection and quantification of structural damage under ambient vibration environment

  • Yun, Gun Jin
    • Structural Engineering and Mechanics / v.42 no.3 / pp.425-448 / 2012
  • This paper presents a new method for detecting and quantifying structural damage under ambient vibration loading. To extract the modal properties of the structural system under ambient excitation, the natural excitation technique (NExT) and the eigensystem realization algorithm (ERA) are employed. Sensitivity matrices of the dynamic residual force vector are derived and used in the parameter subset selection method to identify multiple damaged locations. A steady state genetic algorithm (SSGA) is then used to quantify the identified damage by minimizing errors in the modal flexibility matrix. The performance of the proposed methodology is evaluated using a finite element model of a truss structure, taking possible experimental errors and noise into account. A series of numerical examples with five damage scenarios, including one with a very small damage level, demonstrates that the proposed methodology can effectively detect and quantify damage under noisy ambient vibrations.

Nonparametric Selection Procedures and Their Efficiency Comparisons

  • Sohn, Joong-K.; Gupta, Shanti S.; Kim, Heon-Joo
    • Communications for Statistical Applications and Methods / v.1 no.1 / pp.41-51 / 1994
  • We consider nonparametric procedures for selection and ranking problems. Tukey's generalized lambda distribution is considered as the distribution for the score function because it can approximate many well-known continuous distributions. We also compare these procedures in terms of efficiency, defined as the ratio of the probability of a correct selection to the expected size of the selected subset.
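
The efficiency measure named in this abstract, P(correct selection) divided by the expected subset size, can be illustrated with a small Monte Carlo sketch. The selection rule below is an assumption swapped in for illustration (a classical rule for normal means with known unit variance that keeps every population whose sample mean is close to the largest one); it is not one of the paper's nonparametric procedures, and the function name and parameters `n`, `d`, and `reps` are hypothetical.

```python
import numpy as np

def simulate_efficiency(means, n=10, d=1.0, reps=5000, seed=0):
    """Estimate P(CS), E[S], and their ratio for an assumed selection rule.

    Assumed rule (illustrative, not from the paper): keep population i when
    its sample mean is within d / sqrt(n) of the largest sample mean, with
    unit-variance normal observations.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    best = int(np.argmax(means))  # index of the truly best population
    # Sample means of n unit-variance normal observations per population.
    xbar = rng.normal(means, 1.0 / np.sqrt(n), size=(reps, means.size))
    selected = xbar >= xbar.max(axis=1, keepdims=True) - d / np.sqrt(n)
    p_cs = selected[:, best].mean()          # probability of correct selection
    exp_size = selected.sum(axis=1).mean()   # expected selected subset size
    return p_cs, exp_size, p_cs / exp_size

p_cs, exp_size, efficiency = simulate_efficiency([0.0, 0.0, 0.0, 1.0])
```

A larger `d` raises the probability of a correct selection but also enlarges the selected subset, which is the trade-off the ratio is meant to capture.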

A Bayes-P* Selection Procedure for Normal Means with Common Unknown Variance (분산이 미지인 정규모집단의 평균에 대한 베이즈-P* 선택방법에 관한 연구)

  • 김우철; 전종우; 한경수
    • The Korean Journal of Applied Statistics / v.3 no.2 / pp.79-89 / 1990
  • For selecting a subset of k normal populations containing the one with the largest mean, a Bayes-$P^*$ selection procedure is considered when the common variance is unknown. Performance of the Bayes-$P^*$ selection procedure is compared with a well-known classical procedure through a simulation study. Some frequentist characteristics of the Bayes-$P^*$ procedure are also studied.

Evaluating Variable Selection Techniques for Multivariate Linear Regression (다중선형회귀모형에서의 변수선택기법 평가)

  • Ryu, Nahyeon; Kim, Hyungseok; Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers / v.42 no.5 / pp.314-326 / 2016
  • Variable selection techniques select a subset of relevant variables for a particular learning algorithm in order to improve both the accuracy and the efficiency of the prediction model. We conduct an empirical analysis to evaluate and compare seven well-known variable selection techniques for the multiple linear regression model, one of the most commonly used regression models in practice. The techniques we apply are forward selection, backward elimination, stepwise selection, genetic algorithm (GA), ridge regression, lasso (Least Absolute Shrinkage and Selection Operator), and elastic net. In experiments with 49 regression data sets, GA yielded the lowest error rates, while lasso reduced the number of variables the most. In terms of computational efficiency, forward selection, backward elimination, and lasso require less time than the other techniques.
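
As an illustration of the first technique listed in this abstract, here is a minimal sketch of greedy forward selection for ordinary least squares. The abstract does not say which selection criterion or stopping rule the authors used, so the residual sum of squares criterion and the fixed `max_features` limit are assumptions, as is the function name.

```python
import numpy as np

def forward_selection(X, y, max_features):
    """Greedy forward selection for least-squares regression.

    At each step, add the column that gives the lowest residual sum of
    squares (RSS) for an ordinary least-squares fit with an intercept.
    The RSS criterion and the fixed stopping size are assumptions.
    """
    n, p = X.shape
    selected = []
    remaining = list(range(p))
    while remaining and len(selected) < max_features:
        best_rss, best_j = None, None
        for j in remaining:
            cols = selected + [j]
            # Design matrix: intercept column plus the candidate columns.
            A = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if best_rss is None or rss < best_rss:
                best_rss, best_j = rss, j
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Backward elimination runs the same loop in reverse, starting from all columns and dropping the one whose removal increases the RSS the least.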

Development and implementation of statistical prediction procedure for field penetration index using ridge regression with best subset selection (최상부분집합이 고려된 능형회귀를 적용한 현장관입지수에 대한 통계적 예측기법 개발 및 적용)

  • Lee, Hang-Lo; Song, Ki-Il; Kim, Kyoung Yul
    • Journal of Korean Tunnelling and Underground Space Association / v.19 no.6 / pp.857-870 / 2017
  • The use of shield TBMs is gradually increasing with the urbanization of social infrastructure. Reliable estimation of the advance rate is very important for accurately estimating construction period and cost. This requires a prediction model of the advance rate that reasonably accounts for ground properties. In this study, a statistical prediction procedure for the field penetration index (FPI) was modularized, based on a database collected from the field, to calculate the penetration rate of a shield TBM. FPI was selected as the output parameter, and the module includes procedures for eliminating abnormal data, preprocessing the dataset, and ridge regression with best subset selection. The procedure was finally validated using a field dataset.
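
The abstract does not describe how ridge regression and best subset selection are combined, so the following is only one plausible sketch: enumerate every feature subset up to a size limit, fit a closed-form ridge model on a training split, and keep the subset with the lowest validation error. The function names, the penalty `alpha`, the size limit `max_size`, and the train/validation split are all assumptions made for illustration.

```python
import itertools
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients on centered data (intercept unpenalized)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta

def best_subset_ridge(X_tr, y_tr, X_va, y_va, alpha=1.0, max_size=3):
    """Exhaustive search over feature subsets, scored by validation MSE."""
    best_mse, best_cols = np.inf, ()
    for size in range(1, max_size + 1):
        for cols in itertools.combinations(range(X_tr.shape[1]), size):
            idx = list(cols)
            beta, intercept = ridge_fit(X_tr[:, idx], y_tr, alpha)
            pred = X_va[:, idx] @ beta + intercept
            mse = float(np.mean((y_va - pred) ** 2))
            if mse < best_mse:
                best_mse, best_cols = mse, cols
    return best_mse, best_cols
```

Exhaustive enumeration grows combinatorially with the number of candidate features, so this form is practical only for a small feature set.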

Comparison of Feature Selection Processes for Image Retrieval Applications

  • Choi, Young-Mee; Choo, Moon-Won
    • Journal of Korea Multimedia Society / v.14 no.12 / pp.1544-1548 / 2011
  • Feature selection, the process of choosing a subset of the original features, is considered a crucial preprocessing step in image processing applications. A large pool of techniques has already been developed in the machine learning and data mining fields. In this paper, two approaches, classification without feature selection and classification with feature selection, are investigated to compare their predictive effectiveness. Color co-occurrence features are used to define image features. The standard Sequential Forward Selection algorithm is used to identify relevant features and redundancy among them. Four color spaces (RGB, YCbCr, HSV, and Gaussian space) are considered for computing color co-occurrence features. Gray-level image features are also considered for performance comparison. The experimental results are presented.

Prediction Model of Hypertension Using Sociodemographic Characteristics Based on Machine Learning (머신러닝 기반 사회인구학적 특징을 이용한 고혈압 예측모델)

  • Lee, Bum Ju
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.541-546 / 2021
  • Recently, various identification and prediction models for hypertension have been developed around the world using clinical information, based on artificial intelligence and machine learning. However, most previous studies on such models give little consideration to non-invasive and cost-effective variables or to race, region, and country. Therefore, the objective of this study is to present an easily understood hypertension prediction model that uses only general and simple sociodemographic variables. The data used in this study were taken from the Korea National Health and Nutrition Examination Survey (2018). In men, the model using naive Bayes with the wrapper-based feature subset selection method showed the highest predictive performance (ROC = 0.790, kappa = 0.396). In women, the model using naive Bayes with the correlation-based feature subset selection method showed the highest predictive performance (ROC = 0.850, kappa = 0.495). We found that the predictive performance for hypertension based only on sociodemographic variables was higher in women than in men. Because they rely on simple sociodemographic characteristics, our machine learning models may be readily used in the field of public health and epidemiology in the future.

A Study on the Applicability of 2-Poisson Model for Selecting Korean Subject Words (2-포아송 모형을 이용한 한글 주제어 선정에 관한 연구)

  • 정영미; 최대식
    • Journal of the Korean Society for Information Management / v.17 no.1 / pp.129-148 / 2000
  • Experiments were performed on three subsets of a Korean test collection in order to determine whether 2-Poisson model's Z value is a good measure for selecting subject words from a document to be indexed. It was found that subject word selection based on the Z value was effective for only one subset with short texts, i.e., the Science and Technology subset. Correlation analyses between 2-Poisson model's Z and TF.IDF weight for the three subsets showed that the correlation was relatively high for two test subsets with short texts, i.e., the Science and Technology subset and the Newspaper subset.

An Improved Sample Balanced Genetic Algorithm and Extreme Learning Machine for Accurate Alzheimer Disease Diagnosis

  • Sachnev, Vasily; Suresh, Sundaram
    • Journal of Computing Science and Engineering / v.10 no.4 / pp.118-127 / 2016
  • This paper presents an improved sample balanced genetic algorithm and Extreme Learning Machine (iSBGA-ELM) for accurate diagnosis of Alzheimer disease (AD) and identification of biomarkers associated with AD. The proposed AD diagnosis approach uses a set of magnetic resonance imaging scans from the Open Access Series of Imaging Studies (OASIS) public database to build an efficient AD classifier. The approach consists of two steps: "voxels selection" based on the iSBGA and "AD classification" based on the ELM. In the first step, the proposed iSBGA searches for a robust subset of voxels with promising properties for further AD diagnosis. The robust subset of voxels chosen by the iSBGA is then used to build an AD classifier based on the ELM. A robust subset of voxels maintains a high generalization performance of AD classification in various scenarios and highlights the importance of the chosen voxels for AD research. The AD classifier with maximum classification accuracy is created using an optimal subset of robust voxels and represents the final AD diagnosis approach. Experiments with the proposed iSBGA-ELM on the OASIS data set showed an average testing accuracy of 87%, indicating that the approach is efficient for AD diagnosis and improves on existing techniques.