• Title/Summary/Keyword: Dynamic Classifier selection

A Multiple Classifier System based on Dynamic Classifier Selection having Local Property (지역적 특성을 갖는 동적 선택 방법에 기반한 다중 인식기 시스템)

  • 송혜정;김백섭
    • Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.339-346 / 2003
  • This paper proposes a multiple classifier system built from a large number of micro classifiers. Each micro classifier is trained on a local set of training patterns: the k nearest neighboring training patterns of a given training pattern form the local region used to train one micro classifier. Each training pattern is associated with one or more micro classifiers. Two types of micro classifiers are adopted: SVM with a linear kernel and SVM with an RBF kernel. Classification is done by selecting the best micro classifier among those in the vicinity of the incoming test pattern. The goodness of each micro classifier is measured by the weighted sum of correctly classified training patterns in the vicinity of the test pattern. Experiments on the Elena database show that the proposed method gives better classification accuracy than conventional classifiers such as SVM and k-NN, and than conventional classifier combination/selection schemes.
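
The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the classifiers here are plain Python callables rather than locally trained SVMs, and the weighting `1 / (1 + distance)` is an assumption, since the abstract does not specify the weights.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_classifier(classifiers, train_X, train_y, x, k=3):
    """Return the index of the classifier with the highest weighted
    count of correctly classified training patterns near x."""
    # Indices of the k training patterns closest to the test pattern.
    neighbors = sorted(range(len(train_X)),
                       key=lambda i: euclidean(train_X[i], x))[:k]
    best_idx, best_score = 0, -1.0
    for idx, clf in enumerate(classifiers):
        score = 0.0
        for i in neighbors:
            if clf(train_X[i]) == train_y[i]:
                # Assumed weighting: closer neighbors count more.
                score += 1.0 / (1.0 + euclidean(train_X[i], x))
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx

def classify(classifiers, train_X, train_y, x, k=3):
    chosen = select_classifier(classifiers, train_X, train_y, x, k)
    return classifiers[chosen](x)

# Toy data (hypothetical): each stand-in classifier is only locally correct.
train_X = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
train_y = [0, 0, 0, 1, 1, 1]
always_zero = lambda p: 0   # accurate only in the left region
always_one = lambda p: 1    # accurate only in the right region
classifiers = [always_zero, always_one]
```

With this toy data, a test pattern near the left cluster selects `always_zero` and one near the right cluster selects `always_one`.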

Classifier Selection using Feature Space Attributes in Local Region (국부적 영역에서의 특징 공간 속성을 이용한 다중 인식기 선택)

  • Shin Dong-Kuk;Song Hye-Jeong;Kim Baeksop
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1684-1690 / 2004
  • This paper presents a method for classifier selection that uses distribution information of the training samples in a small region surrounding a sample. The conventional DCS-LA (Dynamic Classifier Selection - Local Accuracy) selects a classifier dynamically by comparing the local accuracy of each classifier at test time, which inevitably requires a long classification time. In the proposed approach, by contrast, the best classifier in a local region is stored in the FSA (Feature Space Attribute) table during training, and testing is done by simply referring to the table. This enables fast classification because local accuracy does not have to be computed during testing. Two feature space attributes are used: the entropy and the density of the k training samples around each sample. Each sample in the feature space is mapped to a point in the attribute space spanned by these two attributes. The attribute space is divided into regular rectangular cells, and the local accuracy of each classifier is attached to each cell. The cells with their associated local accuracies make up the FSA table. During testing, the cell to which a test sample belongs is first determined by calculating the two attributes, and then the most accurate classifier is chosen from the FSA table. To show the effectiveness of the proposed algorithm, it is compared with the conventional DCS-LA using the Elena database. The experiments show that the accuracy of the proposed algorithm is almost the same as that of DCS-LA, while classification is about four times faster.
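
A minimal sketch of the table-lookup idea follows. Several details are assumptions because the abstract does not give them: density is approximated by the mean distance to the k nearest training samples, the cell sizes `e_step` and `d_step` are arbitrary, and unseen cells fall back to a default classifier. The classifiers are stand-in callables.

```python
import math
from collections import Counter, defaultdict

def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def attributes(x, train_X, train_y, k, skip=None):
    """Return (label entropy, mean distance) over the k nearest training
    samples; `skip` excludes a sample's own index during training."""
    ranked = sorted(((dist(x, t), y)
                     for i, (t, y) in enumerate(zip(train_X, train_y))
                     if i != skip), key=lambda pair: pair[0])[:k]
    counts = Counter(y for _, y in ranked)
    entropy = -sum((c / k) * math.log2(c / k) for c in counts.values())
    mean_dist = sum(d for d, _ in ranked) / k  # assumed density proxy
    return entropy, mean_dist

def cell_of(entropy, mean_dist, e_step=0.5, d_step=2.0):
    # Regular rectangular cells in the two-attribute space.
    return int(entropy // e_step), int(mean_dist // d_step)

def build_fsa_table(classifiers, train_X, train_y, k):
    """Training time: count correct answers per classifier in each cell
    and keep the index of the most accurate classifier."""
    hits = defaultdict(lambda: [0] * len(classifiers))
    for i, (x, y) in enumerate(zip(train_X, train_y)):
        cell = cell_of(*attributes(x, train_X, train_y, k, skip=i))
        for c, clf in enumerate(classifiers):
            hits[cell][c] += clf(x) == y
    return {cell: h.index(max(h)) for cell, h in hits.items()}

def classify(table, classifiers, train_X, train_y, k, x, default=0):
    """Test time: compute the two attributes, look up the cell, classify."""
    cell = cell_of(*attributes(x, train_X, train_y, k))
    return classifiers[table.get(cell, default)](x)

# Toy data (hypothetical): a pure region and a mixed region.
train_X = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
train_y = [0, 0, 0, 1, 0, 1]
classifiers = [lambda p: 0, lambda p: 1]
table = build_fsa_table(classifiers, train_X, train_y, k=2)
```

Only the attribute computation and one dictionary lookup happen at test time; the per-classifier accuracy counting is done once in `build_fsa_table`.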

Structural Damage Assessment Based on PNN -Application to Railway Bridge (확률신경망을 이용한 구조물 손상평가-철도교 적용)

  • 조효남;이성칠;오달수;최윤석
    • Proceedings of the Computational Structural Engineering Institute Conference / 2002.10a / pp.321-329 / 2002
  • Artificial neural networks have been used for damage assessment by many researchers, but there are still some barriers that must be overcome to improve their accuracy and efficiency. The major problems with the conventional neural network are the need for many training patterns in the learning process and the ambiguity in how the network structure relates to the convergence of the solution. In this paper, a PNN (probabilistic neural network) is used as a pattern classifier to detect damage in a railway bridge from its dynamic response. The mode shape and the natural frequency of the structure are compared as training patterns in order to select an appropriate training pattern for PNN-based damage detection in the railway bridge.
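
For readers unfamiliar with PNNs, the following is a minimal sketch of the general technique (a Gaussian Parzen-window classifier), not the network used in the paper. The feature values, class names, and the smoothing parameter `sigma` are hypothetical.

```python
import math

def pnn_classify(train_X, train_y, x, sigma=0.5):
    """Pick the class whose training patterns give the largest average
    Gaussian kernel response at x."""
    sums, counts = {}, {}
    for pattern, label in zip(train_X, train_y):
        sq = sum((a - b) ** 2 for a, b in zip(pattern, x))
        sums[label] = sums.get(label, 0.0) + math.exp(-sq / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])

# Hypothetical training patterns: two natural frequencies per state.
train_X = [(10.0, 25.0), (10.1, 24.9), (9.2, 23.5), (9.1, 23.6)]
train_y = ["intact", "intact", "damaged", "damaged"]
```

A PNN needs no iterative training: the training patterns themselves are the network's pattern units, so only `sigma` has to be chosen.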

Dynamic Classifier Selection Using Self-Organizing Maps (자기조직화지도를 이용한 동적 분류기 선택(1))

  • 이관희;이일병
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.250-252 / 2003
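  • In pattern recognition, a multiple classifier system combines the outputs of several classifiers to improve overall performance. It is already well known that using multiple classifiers can give better results than a single classifier. Because classifiers with different structures provide complementary information, each classifier performs well locally in some part of the input space. This paper proposes a method for selecting the classifier that performs best locally. The proposed method builds a self-organizing map over the given input space while the classifiers are being trained and evaluates the classifiers at each node; when an input is presented, the classifier that performs best at the corresponding node is selected, which improves overall performance.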

A Three-Step Preprocessing Algorithm for Enhanced Classification of E-Mail Recommendation System (이메일 추천 시스템의 분류 향상을 위한 3단계 전처리 알고리즘)

  • Jeong Ok-Ran;Cho Dong-Sub
    • The Transactions of the Korean Institute of Electrical Engineers D / v.54 no.4 / pp.251-258 / 2005
  • The results of automatic document classification may differ significantly depending on the characteristics of the documents being classified, as well as on the classifier's performance. This research identifies the characteristics of e-mail documents and applies a three-step preprocessing algorithm that can minimize their atypical characteristics. In the first stage, an uncertainty-based sampling algorithm using Mean Absolute Deviation (MAD) is used to select the training documents for rule generation at classification time. In the second stage, a method that assigns weighted values by attribute is applied to increase the discriminating capability of the terms that appear in the title of the e-mail document. In the third and last stage, the accuracy of classification for each category is increased by using the Dynamic Threshold of the Naive Bayesian Presumptive Algorithm. We also implemented an E-Mail Recommendation System using the three-step preprocessing algorithm, which enables users to classify mail directly and optimally by recommending the applicable category when a mail arrives.

Improving Naïve Bayes Text Classifiers with Incremental Feature Weighting (점진적 특징 가중치 기법을 이용한 나이브 베이즈 문서분류기의 성능 개선)

  • Kim, Han-Joon;Chang, Jae-Young
    • The KIPS Transactions:PartB / v.15B no.5 / pp.457-464 / 2008
  • In real-world operational environments, most text classification systems suffer from insufficient training documents and a lack of prior knowledge of the feature space. In this regard, Naïve Bayes is known to be an appropriate algorithm for operational text classification, since the classification model can be evolved easily by incrementally updating its pre-learned classification model and feature space. This paper proposes a technique for improving the Naïve Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in Naïve Bayes considers the degree of feature importance as well as the feature distribution. A more accurate classification model can be developed by incorporating feature weights into the Naïve Bayes learning algorithm, rather than by learning with a reduced feature set. In addition, we have extended a conventional feature update algorithm for incremental feature weighting in a dynamic operational environment. To evaluate the proposed method, we perform experiments on various document collections and show that the traditional Naïve Bayes classifier can be significantly improved by the proposed technique.
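
One way to fold feature weights into parameter estimation is sketched below. This is an illustrative assumption, not the paper's formula: each occurrence of a term contributes its weight (default 1.0) instead of a raw count of 1 to a multinomial Naïve Bayes estimate with Laplace smoothing. The weights and documents are hypothetical.

```python
import math
from collections import defaultdict

def train_weighted_nb(docs, labels, weights, alpha=1.0):
    """Multinomial Naive Bayes where each term occurrence adds its
    feature weight to the class-conditional count."""
    vocab = sorted({t for d in docs for t in d})
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    cond = {}
    for c in classes:
        weighted = defaultdict(float)
        for doc, y in zip(docs, labels):
            if y == c:
                for t in doc:
                    weighted[t] += weights.get(t, 1.0)
        total = sum(weighted.values()) + alpha * len(vocab)
        cond[c] = {t: math.log((weighted[t] + alpha) / total) for t in vocab}
    return prior, cond

def predict(model, doc):
    prior, cond = model
    # Terms outside the training vocabulary are ignored.
    scores = {c: prior[c] + sum(cond[c][t] for t in doc if t in cond[c])
              for c in prior}
    return max(scores, key=scores.get)

# Toy corpus (hypothetical) with hand-picked feature weights.
docs = [["cheap", "pills"], ["cheap", "offer"],
        ["meeting", "agenda"], ["meeting", "offer"]]
labels = ["spam", "spam", "ham", "ham"]
weights = {"cheap": 2.0, "meeting": 2.0}
model = train_weighted_nb(docs, labels, weights)
plain = train_weighted_nb(docs, labels, {})
```

Because the estimate is built from running sums, new documents or updated weights could be folded in incrementally by adjusting those sums, which matches the operational setting the abstract describes.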

Object Classification Method Using Dynamic Random Forests and Genetic Optimization

  • Kim, Jae Hyup;Kim, Hun Ki;Jang, Kyung Hyun;Lee, Jong Min;Moon, Young Shik
    • Journal of the Korea Society of Computer and Information / v.21 no.5 / pp.79-89 / 2016
  • In this paper, we propose an object classification method using a genetic algorithm and a dynamic random forest consisting of an optimal combination of unit trees. A random forest is an ensemble classification model that combines unit decision trees based on bagging; by randomizing the training samples, feature selection, and so on assigned to each decision tree, it can ensure good generalization performance when a large number of trees are combined. However, because the random forest is composed of randomly built unit trees, it shows excellent classification performance only when a sufficient number of trees are combined. There is no quantitative method for determining the number of trees, so there is no choice but to keep generating random trees. The proposed algorithm builds a random forest from an optimal combination of trees while maintaining the generalization performance of the random forest. To achieve this, the problem of improving classification performance is cast as an optimization problem of finding the optimal tree combination, and a genetic algorithm is applied to solve it. Experiments show that the proposed algorithm improves classification performance by about 3~5% over the existing random forest in specific cases, such as a common database and our own infrared database. In addition, the optimal tree combination was reached at about 55~60% of the maximum number of trees.
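
The idea of searching for a tree subset with a genetic algorithm can be sketched as follows. This is a simplified illustration, not the paper's algorithm: trees are represented only by their precomputed binary predictions on a validation set, fitness is majority-vote accuracy, and the population size, mutation rate, and elitism are arbitrary assumptions.

```python
import random

def vote_accuracy(mask, tree_preds, y):
    """Majority-vote accuracy of the trees selected by a 0/1 mask."""
    chosen = [p for m, p in zip(mask, tree_preds) if m]
    if not chosen:
        return 0.0
    correct = 0
    for j, target in enumerate(y):
        votes = sum(p[j] for p in chosen)          # labels are 0 or 1
        pred = 1 if 2 * votes > len(chosen) else 0
        correct += pred == target
    return correct / len(y)

def ga_select(tree_preds, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Evolve 0/1 masks over the trees; returns the best mask found."""
    rng = random.Random(seed)
    n = len(tree_preds)
    fitness = lambda m: vote_accuracy(m, tree_preds, y)
    # Start from the full forest plus random subsets.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = pop[:2]                               # elitism: keep the best two
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[:pop_size // 2], 2)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy validation data (hypothetical): two accurate trees, three inverted ones.
y = [1, 0, 1, 0, 1, 0]
inverted = [1 - v for v in y]
tree_preds = [y, y, inverted, inverted, inverted]
best = ga_select(tree_preds, y)
```

Because the full forest is in the initial population and the best masks are always carried over, the returned subset is never worse on the validation set than using every tree.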

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposes a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data included 1800 externally non-audited firms that filed for bankruptcy (900 cases) or non-bankruptcy (900 cases). Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performance of the proposed model with that of other models. To evaluate the effectiveness of the proposed model, its classification accuracy was compared with that of other models. The Q-statistic values and average classification accuracies of the base classifiers were also investigated. The experimental results showed that the proposed model outperformed other models, such as the single model and the random subspace ensemble model.
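
The structure being optimized, a KNN ensemble in which every base classifier has its own k and its own feature subset, can be sketched as below. The genetic search itself is omitted; the `members` list stands in for whatever configuration such a search would produce, and the data are hypothetical rather than financial ratios from the study.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k, features):
    """k-nearest-neighbor vote using only the given feature indices."""
    def d(row):
        return math.sqrt(sum((row[f] - x[f]) ** 2 for f in features))
    nearest = sorted(range(len(train_X)), key=lambda i: d(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def ensemble_predict(train_X, train_y, x, members):
    """Majority vote over base classifiers given as (k, feature_subset)."""
    votes = Counter(knn_predict(train_X, train_y, x, k, feats)
                    for k, feats in members)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): the third feature is noise.
train_X = [(0, 0, 5), (0, 1, 0), (1, 0, 5), (5, 5, 0), (5, 6, 5), (6, 5, 0)]
train_y = [0, 0, 0, 1, 1, 1]
# Each member: (k, feature indices). A GA would search over these settings.
members = [(1, (0, 1)), (3, (0, 2)), (3, (1, 2))]
```

In a GA formulation, one chromosome would encode the k value and feature mask of every member, with accuracy on a held-out portion of the training data as the fitness value, as the abstract describes.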