• Title/Abstract/Keywords: Subspace methods

160 search results; processing time 0.022 s

Improving an Ensemble Model Using Instance Selection Method

  • 민성환
    • 산업경영시스템학회지
    • /
    • Vol. 39, No. 1
    • /
    • pp.105-115
    • /
    • 2016
  • Ensemble classification combines individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it randomly draws a subset of the features for each classifier in the ensemble. The instance selection technique selects critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating the two. Therefore, this study proposes a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are then used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models in terms of classification accuracy, levels of diversity, and average classification rates of base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed the other models, including the single model, the instance selection model, and the original random subspace ensemble model.

Optimization of Random Subspace Ensemble for Bankruptcy Prediction

  • 민성환
    • 한국IT서비스학회지
    • /
    • Vol. 14, No. 4
    • /
    • pp.121-135
    • /
    • 2015
  • Ensemble classification uses multiple classifiers instead of a single classifier. Recently, ensemble classifiers have attracted much attention in the data mining community, and ensemble learning techniques have proved to be very useful for improving prediction accuracy. Bagging, boosting, and random subspace are the most popular ensemble methods. In random subspace, each base classifier is trained on a randomly chosen feature subspace of the original feature space. The outputs of the different base classifiers are usually aggregated by a simple majority vote. In this study, we applied the random subspace method to the bankruptcy problem and proposed a method for optimizing the random subspace ensemble. A genetic algorithm was used to optimize the classifier subset of the random subspace ensemble for bankruptcy prediction. The proposed genetic-algorithm-based random subspace ensemble model was applied to the bankruptcy prediction problem using a real data set and compared with other models. Experimental results showed that the proposed model outperformed the other models.
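
The random subspace procedure described in this abstract (train each base classifier on a random feature subset, then aggregate by majority vote) can be sketched as follows. This is only an illustration under stated assumptions: the nearest-centroid base learner, the synthetic two-class data, and all function names are hypothetical choices made for this sketch, and the genetic-algorithm optimization of the classifier subset proposed in the paper is not included.

```python
import random
from collections import Counter

def train_centroid(X, y, feats):
    """Fit a nearest-centroid classifier using only the feature indices in feats."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
    return feats, centroids

def predict_one(model, row):
    """Return the class whose centroid is closest in the model's feature subspace."""
    feats, centroids = model
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, centroids[c]))
    return min(centroids, key=dist)

def random_subspace_fit(X, y, n_estimators=15, subspace_size=2, seed=0):
    """Train each base classifier on its own randomly drawn feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    return [
        train_centroid(X, y, sorted(rng.sample(range(n_features), subspace_size)))
        for _ in range(n_estimators)
    ]

def random_subspace_predict(ensemble, row):
    """Aggregate the base classifiers' outputs by simple majority vote."""
    votes = Counter(predict_one(model, row) for model in ensemble)
    return votes.most_common(1)[0][0]

# Synthetic data (an assumption of this sketch): two well-separated Gaussian classes.
data_rng = random.Random(1)
X = [[data_rng.gauss(5.0 * label, 1.0) for _ in range(4)]
     for label in (0, 1) for _ in range(20)]
y = [label for label in (0, 1) for _ in range(20)]

ensemble = random_subspace_fit(X, y)
print(random_subspace_predict(ensemble, [0.0, 0.0, 0.0, 0.0]))
print(random_subspace_predict(ensemble, [5.0, 5.0, 5.0, 5.0]))
```

An odd number of base classifiers is used here so that a two-class majority vote cannot tie.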

Tutorial: Dimension reduction in regression with a notion of sufficiency

  • Yoo, Jae Keun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 23, No. 2
    • /
    • pp.93-103
    • /
    • 2016
  • In this paper, we discuss dimension reduction of predictors $\mathbf{X} \in \mathbb{R}^p$ in a regression of $Y \mid \mathbf{X}$ with a notion of sufficiency, which is called sufficient dimension reduction. In sufficient dimension reduction, the original predictors $\mathbf{X}$ are replaced by their lower-dimensional linear projection without loss of information on selected aspects of the conditional distribution. Depending on the aspects, the central subspace, the central mean subspace, and the central $k^{th}$-moment subspace are defined and investigated as primary interests. Then the relationships among the three subspaces and the changes in the three subspaces under non-singular transformations of $\mathbf{X}$ are studied. We discuss the two conditions that guarantee the existence of the three subspaces; these constrain the marginal distribution of $\mathbf{X}$ and the conditional distribution of $Y \mid \mathbf{X}$. A general approach to estimating the subspaces is also introduced, along with an explanation of the conditions commonly assumed in most sufficient dimension reduction methodologies.
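
For orientation, the three subspaces named in this abstract are usually characterized by the conditional independence statements below. This is a summary of the standard definitions in the sufficient dimension reduction literature, not a quotation from the paper; the symbols $\boldsymbol{\eta}$, $\mathbf{A}$ and $\mathbf{Z}$ are notation assumed here. Each subspace is the intersection of all $\operatorname{span}(\boldsymbol{\eta})$ satisfying its condition, provided that the intersection itself satisfies it.

$$
\begin{aligned}
\mathcal{S}_{Y\mid\mathbf{X}} &: \quad Y \perp\!\!\!\perp \mathbf{X} \mid \boldsymbol{\eta}^{\top}\mathbf{X} \\
\mathcal{S}_{E(Y\mid\mathbf{X})} &: \quad Y \perp\!\!\!\perp E(Y\mid\mathbf{X}) \mid \boldsymbol{\eta}^{\top}\mathbf{X} \\
\mathcal{S}^{(k)}_{Y\mid\mathbf{X}} &: \quad Y \perp\!\!\!\perp \{E(Y\mid\mathbf{X}),\dots,E(Y^{k}\mid\mathbf{X})\} \mid \boldsymbol{\eta}^{\top}\mathbf{X}
\end{aligned}
$$

When all three exist, they are nested, and a non-singular transformation $\mathbf{Z} = \mathbf{A}^{\top}\mathbf{X}$ maps the central subspace linearly:

$$
\mathcal{S}_{E(Y\mid\mathbf{X})} \subseteq \mathcal{S}^{(k)}_{Y\mid\mathbf{X}} \subseteq \mathcal{S}_{Y\mid\mathbf{X}},
\qquad
\mathcal{S}_{Y\mid\mathbf{X}} = \mathbf{A}\,\mathcal{S}_{Y\mid\mathbf{Z}}.
$$

The transformation rule follows because $\boldsymbol{\gamma}^{\top}\mathbf{Z} = (\mathbf{A}\boldsymbol{\gamma})^{\top}\mathbf{X}$, so any basis $\boldsymbol{\gamma}$ that is sufficient on the $\mathbf{Z}$ scale corresponds to $\mathbf{A}\boldsymbol{\gamma}$ on the $\mathbf{X}$ scale.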

Note on the estimation of informative predictor subspace and projective-resampling informative predictor subspace

  • 유재근
    • 응용통계연구
    • /
    • Vol. 35, No. 5
    • /
    • pp.657-666
    • /
    • 2022
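  • The informative predictor subspace is useful for estimating the central subspace when the assumptions required by the usual sufficient dimension reduction methods are not satisfied. Recently, Ko and Yoo (2022) newly defined the projective-resampling informative predictor subspace, rather than the informative predictor subspace, in multivariate regression by using the projective-resampling methodology proposed by Li et al. (2008). This subspace is contained in the existing informative predictor subspace but contains the central subspace. In this paper, we propose a method for directly estimating the informative predictor subspace in multivariate regression and compare it with the method of Ko and Yoo (2022) both theoretically and through simulation studies. According to the simulation studies, the Ko-Yoo method estimates the central subspace more accurately than the estimation method proposed in this paper, and it is more efficient in the sense that its estimates vary less.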

Application of the Krylov Subspace Method to the Incompressible Navier-Stokes Equations

  • 맹주성;최일곤;임연우
    • 대한기계학회논문집B
    • /
    • Vol. 24, No. 7
    • /
    • pp.907-915
    • /
    • 2000
  • Preconditioned Krylov subspace methods were applied to the incompressible Navier-Stokes equations to accelerate convergence. Three Krylov subspace methods combined with five preconditioners were tested on the lid-driven cavity flow problem. The MILU-preconditioned CG method showed very fast and stable convergence. Among the combinations tested, the GMRES/MILU-CG solver for the momentum and pressure correction equations was found to be the least dependent on the number of grid points. A guideline for stopping the inner iterations for each equation is offered.
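
To make the idea of a preconditioned Krylov solver concrete, here is a minimal preconditioned conjugate gradient (CG) sketch. It is not the solver from the paper: a simple Jacobi (diagonal) preconditioner stands in for MILU, the test matrix is a small 1D Poisson-type system chosen only because it is symmetric positive definite, and the function names `matvec`, `dot` and `pcg` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, tol=1e-10, max_iter=500):
    """Jacobi-preconditioned CG for a symmetric positive definite system A x = b."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner M^-1
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for k in range(max_iter):
        # Stop once the relative residual norm is small enough.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Tridiagonal (2, -1) test matrix; the right-hand side is built so that the
# exact solution is a vector of ones.
n = 20
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = matvec(A, [1.0] * n)
x, iters = pcg(A, b)
print(iters, max(abs(xi - 1.0) for xi in x))
```

The relative-residual test at the top of the loop is one common stopping rule; the paper's own guideline for stopping the inner iterations is not reproduced here.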

PARALLEL PERFORMANCE OF MULTISPLITTING METHODS WITH PREWEIGHTING

  • Han, Yu-Du;Yun, Jae-Heon
    • 대한수학회지
    • /
    • Vol. 49, No. 4
    • /
    • pp.805-827
    • /
    • 2012
  • In this paper, we first study the convergence of a special type of multisplitting method with preweighting, and then we provide some comparison results for those multisplitting methods. Next, we propose both a parallel implementation of an SOR-like multisplitting method with preweighting and an application of this method as a parallel preconditioner for the Krylov subspace method. Lastly, we provide parallel performance results for both the SOR-like multisplitting method with preweighting and the Krylov subspace method with the parallel preconditioner to evaluate the parallel efficiency of the proposed methods.

Codebook design for subspace distribution clustering hidden Markov model

  • 조영규;육동석
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2005년도 춘계 학술대회 발표논문집
    • /
    • pp.87-90
    • /
    • 2005
  • Today's state-of-the-art speech recognition systems typically use continuous distribution hidden Markov models with mixtures of Gaussian distributions. To obtain higher recognition accuracy, the hidden Markov models typically require a huge number of Gaussian distributions. Such speech recognition systems require too much memory to run and are too slow for large applications. Many approaches have been proposed for the design of compact acoustic models. One of these models is the subspace distribution clustering hidden Markov model, which can represent the original full-space distributions as combinations of a small number of subspace distribution codebooks. Therefore, how to construct the codebook is an important issue in this approach. In this paper, we report some experimental results on various quantization methods for building more accurate models.

Investigating SIR, DOC and SAVE for the Polychotomous Response

  • Lee, Hak-Bae;Lee, Hee-Min
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 3
    • /
    • pp.501-506
    • /
    • 2012
  • This paper investigates the central subspace related to SIR, DOC and SAVE when the response takes more than two values. The subspaces constructed by SIR, DOC and SAVE are investigated and compared. The SAVE paradigm is the most comprehensive. In addition, the SAVE subspace coincides with the central subspace when the conditional distribution of the predictors given the response is normal.
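
As background for the comparison in this abstract, the population kernel matrices commonly used for SIR and SAVE with a categorical response $Y \in \{1,\dots,h\}$ can be written as below. These are the standard textbook forms, stated here as an assumption about what the paper works with; $\mathbf{Z}$ denotes the standardized predictors and $p_y = P(Y = y)$, both notation introduced for this note. The DOC kernel is not written out because its form for more than two response values is not given in the abstract.

$$
\mathbf{M}_{\mathrm{SIR}} = \sum_{y=1}^{h} p_y \, E(\mathbf{Z}\mid Y=y)\, E(\mathbf{Z}\mid Y=y)^{\top},
\qquad
\mathbf{M}_{\mathrm{SAVE}} = \sum_{y=1}^{h} p_y \,\bigl\{\mathbf{I}_p - \operatorname{Cov}(\mathbf{Z}\mid Y=y)\bigr\}^{2}.
$$

SIR uses only the within-class means, whereas SAVE uses the within-class covariances, which is in line with the abstract's remark that the SAVE paradigm is the most comprehensive of the three.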

Investigation of Efficiency of Starting Iteration Vectors for Calculating Natural Modes

  • 김병완;경조현;홍사영;조석규;이인원
    • 한국소음진동공학회논문집
    • /
    • Vol. 15, No. 1
    • /
    • pp.112-117
    • /
    • 2005
  • Two modified versions of the subspace iteration method using accelerated starting vectors are proposed to efficiently calculate the free vibration modes of structures. The proposed methods employ accelerated Lanczos vectors as starting iteration vectors in order to accelerate the convergence of the subspace iteration method. The two methods differ in the number of starting vectors: when the number of required modes is p, the first method uses 2p starting vectors and the second uses 1.5p. To investigate the efficiency of the proposed methods, two numerical examples are presented.
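
The core loop that subspace iteration builds on can be sketched as simultaneous inverse iteration with Gram-Schmidt orthonormalization. This is a deliberately stripped-down illustration and not the proposed method: it assumes an identity mass matrix, starts from random vectors instead of accelerated Lanczos vectors, iterates on exactly p vectors rather than 2p or 1.5p, and omits the Rayleigh-Ritz projection of the full method. The spring-chain stiffness matrix and all function names are hypothetical.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(K, x):
    return [dot(row, x) for row in K]

def solve(K, b):
    """Solve K x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(K[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def orthonormalize(vectors):
    """Modified Gram-Schmidt, processing the vectors in the given order."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

def lowest_modes(K, n_modes, n_iter=200, seed=0):
    """Approximate the n_modes smallest eigenpairs of a symmetric positive definite K."""
    rng = random.Random(seed)
    n = len(K)
    V = orthonormalize([[rng.uniform(-1.0, 1.0) for _ in range(n)]
                        for _ in range(n_modes)])
    for _ in range(n_iter):
        # Apply K^-1 to every vector, then re-orthonormalize the block.
        V = orthonormalize([solve(K, v) for v in V])
    eigvals = [dot(v, matvec(K, v)) for v in V]  # Rayleigh quotients of unit vectors
    return eigvals, V

# Stiffness matrix of a fixed-fixed chain of unit springs (tridiagonal 2, -1).
n = 6
K = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
eigvals, modes = lowest_modes(K, n_modes=2)
print(eigvals)
```

For this matrix the eigenvalues are known in closed form, 2 - 2cos(kπ/(n+1)), which gives a convenient check on the computed values.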

Comparison of black and gray box models of subspace identification under support excitations

  • Datta, Diptojit;Dutta, Anjan
    • Structural Monitoring and Maintenance
    • /
    • Vol. 4, No. 4
    • /
    • pp.365-379
    • /
    • 2017
  • This paper presents a comparison of the black-box and the physics-based gray-box models for subspace identification of structures subjected to support excitation. The study compares the damage detection capabilities of both methods for linear time-invariant (LTI) systems as well as linear time-varying (LTV) systems, extending the gray-box model to time-varying systems by using short-time windows. The numerically simulated IASC-ASCE Phase-I benchmark building has been used to compare the two methods for different damage scenarios. The efficacy of the two methods for the identification of stiffness parameters has been studied in the presence of different levels of sensor noise to simulate on-field conditions. The proposed extension of the gray-box model for LTV systems has been shown to outperform the black-box model in capturing the variation in stiffness parameters for the benchmark building.