• Title/Abstract/Keywords: Stepwise selection

Artificial Intelligence-Based Stepwise Selection of Bearings

  • Seo, Tae-Sul;Soonhung Han
    • 한국지능정보시스템학회:학술대회논문집
    • /
    • 한국지능정보시스템학회, 2001: The Pacific Asian Conference on Intelligent Systems 2001
    • /
    • pp.219-223
    • /
    • 2001
  • Within a mechanical system such as an automobile, the number of standard machine parts is increasing, so parts selection is becoming more important than ever. Selecting appropriate bearings in the preliminary design phase of a machine is also important. In this paper, three decision-making approaches are compared to find a model appropriate to the bearing selection problem. An artificial neural network, trained with real design cases, is used to select a bearing mechanism in the first step. Then, the subtype of the bearing is selected by the weighting factor method. Finally, types of peripherals such as lubrication methods are determined by a rule-based expert system.
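
The second step in the abstract above, subtype selection by the weighting factor method, can be illustrated as a weighted-sum ranking. This is only a minimal sketch: the criteria names, weights, ratings, and candidate subtypes are hypothetical and do not come from the paper.

```python
# Hypothetical criteria and integer weights (not taken from the paper).
WEIGHTS = {"load_capacity": 5, "speed_limit": 3, "cost": 2}

# Hypothetical 0-10 ratings for each candidate bearing subtype.
CANDIDATES = {
    "deep_groove_ball": {"load_capacity": 6, "speed_limit": 9, "cost": 8},
    "cylindrical_roller": {"load_capacity": 9, "speed_limit": 6, "cost": 4},
    "tapered_roller": {"load_capacity": 8, "speed_limit": 5, "cost": 6},
}


def weighted_score(ratings, weights):
    """Return the weighted sum of one candidate's ratings."""
    return sum(weights[name] * ratings[name] for name in weights)


def select_subtype(candidates, weights):
    """Return the candidate name with the highest weighted score."""
    return max(candidates, key=lambda name: weighted_score(candidates[name], weights))


scores = {name: weighted_score(r, WEIGHTS) for name, r in CANDIDATES.items()}
best = select_subtype(CANDIDATES, WEIGHTS)
print(scores, best)
```

With these illustrative numbers the ranking depends entirely on the chosen weights, which is why the weights would normally be elicited from design requirements rather than fixed in code.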

Variable selection in partial linear regression using the least angle regression

  • 서한손;윤민;이학배
    • 응용통계연구
    • /
    • Vol. 34, No. 6
    • /
    • pp.937-944
    • /
    • 2021
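  • This study addresses the problem of variable selection in partial linear models. Because a partial linear model involves both nonparametric estimation, such as estimation of a smoothing parameter, and estimation for the linear explanatory variables, variable selection is not easy. We present a variable selection method that uses LARS, a fast forward selection method. The proposed method applies the t-test, a comparison of all possible regression models, or stepwise selection to the variables screened by LARS. To compare the efficiency of the proposed methods, an example using real data and simulation results are presented.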

A Bayes Criterion for Selecting Variables in MDA

  • 김혜중;유희경
    • 응용통계연구
    • /
    • Vol. 11, No. 2
    • /
    • pp.435-449
    • /
    • 1998
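  • This study proposes, through a Bayesian approach, a variable selection criterion needed in multiple discriminant analysis (MDA). The Bayes criterion for selecting discriminant variables is set up by deriving, with an objective Bayesian method, the default Bayes factor needed to test the homogeneity of the mean vectors of several normal population distributions. The default Bayes factor is derived using the imaginary training sample method developed by Spiegelhalter and Smith (1982). In addition, using the distributional properties of the proposed criterion, we discuss a test for the additional discriminating power that extra discriminant variables (or groups of variables) contribute to MDA, as well as a criterion for selecting such additional variables (or groups of variables). The newly obtained criterion can serve as the variable selection criterion not only for the optimal subset selection method but also for each stepwise method, and it is also applicable to two-group discriminant analysis; in this sense it can unify the existing discriminant variable selection criteria developed in various forms under sampling theory. A simulation study was conducted to evaluate the usefulness of the proposed criterion under the optimal subset selection method and the stepwise methods.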

Feature Selection for Multi-Class Support Vector Machines Using an Impurity Measure of Classification Trees: An Application to the Credit Rating of S&P 500 Companies

  • Hong, Tae-Ho;Park, Ji-Young
    • Asia pacific journal of information systems
    • /
    • Vol. 21, No. 2
    • /
    • pp.43-58
    • /
    • 2011
  • Support vector machines (SVMs), a machine learning technique, have been applied not only to binary classification problems such as bankruptcy prediction but also to multi-class problems such as corporate credit ratings. However, depending on the selection of predictors, the performance of SVMs can easily fall below that of the best alternative model, even though SVMs are distinguished by successful classification and prediction in many dichotomous and multi-class problems. To overcome this weakness, this study proposes an approach for selecting features for multi-class SVMs that utilizes the impurity measures of classification trees. For the selection of input features, we employed the C4.5 and CART algorithms, along with the stepwise method of discriminant analysis, which is a well-known method for selecting features. We built a multi-class SVM model for credit rating using the above methods and present experimental results with data on S&P 500 companies.

Discriminant Model V for Syndrome Differentiation Diagnosis based on Sex in Stroke Patients

  • 강병갑;이정섭;고미미;권세혁;방옥선
    • 동의생리병리학회지
    • /
    • Vol. 25, No. 1
    • /
    • pp.138-143
    • /
    • 2011
  • Despite abundant clinical resources on stroke patients, objective and logical data analyses or diagnostic systems have not been established in oriental medicine. As part of research on the standardization and objectification of syndrome differentiation for stroke, this study set out to develop a statistical diagnostic tool that discriminates the 4 subtypes of syndrome differentiation using essential indices, taking sex into account. Discriminant analysis was carried out using clinical data collected from 1,448 stroke patients whose syndrome differentiation subtypes were identically diagnosed by two clinical experts with more than 3 years of experience. An empirical discriminant model (V) for each sex was constructed using 61 significant symptom and sign indices selected by stepwise selection. We compared discriminant model (V) with discriminant model (IV), which uses 33 significant symptom and sign indices selected by stepwise selection. Development of a statistical diagnostic tool discriminating 4 subtypes by sex: The discriminant model with 24 significant indices in women and 19 significant indices in men was developed for discriminating the 4 subtypes of syndrome differentiation, namely phlegm-dampness, qi-deficiency, yin-deficiency, and fire-heat. Diagnostic accuracy and prediction rate of syndrome differentiation by sex: The overall diagnostic accuracy and prediction rate for the 4 syndrome differentiation subtypes were 74.63% (403/540) and 68.46% (89/130) in women using 24 symptom and sign indices, and 72.05% (446/619) and 70.44% (112/159) in men using 19 symptom and sign indices. These results are almost the same as the overall diagnostic accuracy (73.68%) and prediction rate (70.59%) obtained with discriminant model (IV) using 33 symptom and sign indices selected by stepwise selection.
Considering sex, the statistical discriminant model (V), with 24 significant symptom and sign indices in women and 19 in men instead of 33 indices, could be used in the field of oriental medicine, contributing to the objectification of syndrome differentiation in keeping with the parsimony rule.
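
The percentages quoted in this abstract can be rechecked directly from the reported counts. The short sketch below only redoes that arithmetic; the dictionary keys and function name are illustrative, and the counts are copied from the abstract.

```python
# (correct, total) counts as reported in the abstract.
REPORTED = {
    ("women", "diagnostic accuracy"): (403, 540),
    ("women", "prediction rate"): (89, 130),
    ("men", "diagnostic accuracy"): (446, 619),
    ("men", "prediction rate"): (112, 159),
}


def percent(correct, total):
    """Return correct/total as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)


rates = {key: percent(c, n) for key, (c, n) in REPORTED.items()}
total_patients = sum(n for _, n in REPORTED.values())

for (sex, measure), rate in rates.items():
    print(f"{sex:5s} {measure:19s} {rate:.2f}%")
print("sum of denominators:", total_patients)
```

The four rates reproduce the reported 74.63%, 68.46%, 72.05%, and 70.44%, and the four denominators add up to 1,448, the number of patients stated in the abstract.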

Developing a Molecular Prognostic Predictor of a Cancer based on a Small Sample

  • Kim Inyoung;Lee Sunho;Rha Sun Young;Kim Byungsoo
    • 한국통계학회:학술대회논문집
    • /
    • 한국통계학회 2004년도 학술발표논문집
    • /
    • pp.195-198
    • /
    • 2004
  • One important problem in a cancer microarray study is to identify a set of genes from which a molecular prognostic indicator can be developed. A parallel problem is to validate the chosen set of genes. In this note we develop a K-fold cross-validation procedure by combining a 'pre-validation' technique and a bootstrap resampling procedure in Cox regression. The pre-validation technique computes the microarray predictor of a case without having seen the true class label of the case. It was suggested by Tibshirani and Efron (2002) to avoid possible over-fitting in a regression in which a microarray-based predictor is employed. The bootstrap resampling procedure for Cox regression was proposed by Sauerbrei and Schumacher (1992) as a means of overcoming the instability of a stepwise selection procedure. We apply this K-fold cross-validation to microarray data on 92 gastric cancers from an experiment conducted at the Cancer Metastasis Research Center, Yonsei University. We also share some of our experience with 'false positive' results due to information leak.
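
The pre-validation idea described above, where each case is scored by a model fitted without that case's fold, might be sketched as follows. This is not the authors' procedure: a nearest-centroid score stands in for the Cox regression predictor, the bootstrap step is omitted, and the data, function names, and fold scheme are all hypothetical.

```python
def k_folds(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_centroids(rows, labels):
    """Stand-in model: the mean feature vector of each class label."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def score(centroids, row):
    """Positive when the row lies closer to the class-1 centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return sq_dist(row, centroids[0]) - sq_dist(row, centroids[1])


def prevalidate(rows, labels, k):
    """Score every case with a model that never saw that case's fold."""
    out = [None] * len(rows)
    for fold in k_folds(len(rows), k):
        held_out = set(fold)
        train = [i for i in range(len(rows)) if i not in held_out]
        # Assumes every training split still contains both classes.
        model = fit_centroids([rows[i] for i in train], [labels[i] for i in train])
        for i in fold:
            out[i] = score(model, rows[i])
    return out


# Hypothetical toy data: two features, binary labels.
rows = [[0.1, 1.0], [0.2, 0.8], [0.0, 1.1], [0.9, 0.1], [1.0, 0.0], [0.8, 0.2]]
labels = [0, 0, 0, 1, 1, 1]
prevalidated = prevalidate(rows, labels, k=3)
print(prevalidated)
```

Because any gene or feature selection would have to happen inside `fit_centroids` (or its replacement), the held-out labels never influence the score of a held-out case, which is the kind of information leak the abstract warns about.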

A study on equating method based on regression analysis

  • 조장식
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 21, No. 3
    • /
    • pp.513-521
    • /
    • 2010
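  • Most universities conduct course evaluations as part of faculty performance assessment. However, course evaluation results are strongly affected by characteristics of the course such as class size, lecture type, year level, course category, and average grade. Unless the effects of these course characteristics on the evaluation results are removed, the results carry a bias serious enough that instructors cannot trust their fairness and objectivity. For fair course evaluation, post-adjusted scores that remove the bias due to course characteristics are therefore required. This study proposes a method for computing post-adjusted scores by equating course evaluation results using regression analysis with stepwise variable selection. The proposed method is compared with existing methods.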

Selection of markers in the framework of multivariate receiver operating characteristic curve analysis in binary classification

  • Sameera, G;Vishnu, Vardhan R
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 26, No. 2
    • /
    • pp.79-89
    • /
    • 2019
  • Classification models pertaining to receiver operating characteristic (ROC) curve analysis have been extended from the univariate to the multivariate setup by linearly combining the available multiple markers. One such classification model is multivariate ROC curve analysis. However, not all markers contribute in a real scenario, and some may mask the contribution of other markers in classifying the individuals/objects. This paper addresses this issue by developing an algorithm that helps identify the important markers that are significant and true contributors. The proposed variable selection framework is supported by real datasets and a simulation study; it is shown to provide insight into each individual marker's significance in providing a classifier rule/linear combination with a good extent of classification.

Investigation of Chemical Sensor Array Optimization Methods for DADSS

  • Choi, Jang-Sik;Jeon, Jin-Young;Byun, Hyung-Gi
    • 센서학회지
    • /
    • Vol. 25, No. 1
    • /
    • pp.13-19
    • /
    • 2016
  • Nowadays, most major automobile manufacturers are very interested, and actively involved, in developing a driver alcohol detection system for safety (DADSS), which serves to prevent driving under the influence. DADSS measures the blood alcohol concentration (BAC) from the driver's breath and limits the ignition of the vehicle's engine if the BAC exceeds the reference value. In this study, to optimize the sensor array of the DADSS, we selected sensors by using three different methods, configured the sensor arrays, and then compared their performance. Wilks' lambda, stepwise elimination, and a filter method (using a principal component) were used as the sensor selection methods [2,3]. We compared the performance of the arrays using selectivity and sensitivity as criteria, and used Sammon mapping to analyze the cluster type of each gas. The sensor array configured by using the stepwise elimination method exhibited the highest sensitivity and selectivity and yielded the best visual result after Sammon mapping.

Variable Selection for Logistic Regression Model Using Adjusted Coefficients of Determination

  • 홍종선;함주형;김호일
    • 응용통계연구
    • /
    • Vol. 18, No. 2
    • /
    • pp.435-443
    • /
    • 2005
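  • In logistic regression models, the coefficient of determination is defined in more varied ways than in linear regression models, and its values are very small, so it can hardly be regarded as a statistic for evaluating logistic regression models. Liao and McGee (2003) proposed two kinds of adjusted coefficients of determination that are insensitive to the addition of inappropriate explanatory variables or to changes in sample size. In this study, four kinds of coefficients of determination, including the adjusted ones, are used as variable selection criteria in logistic regression models applied to real data, and they are compared with existing variable selection methods such as forward selection, backward elimination, stepwise selection, and the AIC statistic; their appropriateness and efficiency are discussed.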
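
Forward selection driven by AIC, one of the existing methods the abstract compares against, can be sketched as a greedy loop. To keep the example short and dependency-free, ordinary least squares with a Gaussian AIC is swapped in for logistic regression, so this is an illustration of the selection loop only; the data and all names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def aic(columns, y):
    """Gaussian AIC (up to an additive constant) of an OLS fit with intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    return n * math.log(rss / n) + 2 * p


def forward_select(data, y):
    """Repeatedly add the variable that lowers AIC most; stop when none does."""
    selected, best = [], aic([], y)
    while True:
        trials = {name: aic([data[s] for s in selected] + [data[name]], y)
                  for name in data if name not in selected}
        if not trials:
            return selected
        name = min(trials, key=trials.get)
        if trials[name] >= best:
            return selected
        selected.append(name)
        best = trials[name]


# Hypothetical data: y is roughly 2 * x1 plus small noise; x2 is unrelated.
data = {"x1": [1, 2, 3, 4, 5, 6, 7, 8], "x2": [3, 1, 4, 1, 5, 9, 2, 6]}
y = [2.1, 3.8, 6.05, 8.1, 9.9, 11.85, 14.05, 16.15]
selected = forward_select(data, y)
print(selected)
```

Replacing `aic` with another criterion, for example the negative of an adjusted coefficient of determination, would turn the same loop into a selection rule of the kind the study evaluates.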