• Title/Summary/Keyword: classifier ensemble


Classification of Remote Sensing Data using Random Selection of Training Data and Multiple Classifiers (훈련 자료의 임의 선택과 다중 분류자를 이용한 원격탐사 자료의 분류)

  • Park, No-Wook;Yoo, Hee Young;Kim, Yihyun;Hong, Suk-Young
    • Korean Journal of Remote Sensing / v.28 no.5 / pp.489-499 / 2012
  • This paper presents a classifier ensemble framework for remote sensing data classification that combines classification results generated from both different training sets and different classifiers. The core of the framework is to increase the diversity among classification results, and thereby improve classification accuracy, by varying both the training sets and the classifiers. First, training sets with different sampling densities are generated and used as inputs for supervised classification with classifiers that have different discrimination capabilities. The resulting preliminary classification results are then combined by majority voting to produce the final classification result. A case study of land-cover classification using multi-temporal ENVISAT ASAR data sets illustrates the potential of the framework. In the case study, nine classification results were combined, generated from three training sets and three classifiers: a maximum likelihood classifier, a multi-layer perceptron classifier, and a support vector machine. The results showed that the framework extracted complementary information for discriminating the land-cover classes of interest and achieved the best classification accuracy. Among the combinations compared, combining classification results from classifiers with little diversity did not improve classification accuracy. It is therefore recommended to ensure greater diversity between classifiers when designing multiple classifier systems.
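
The majority voting step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `majority_vote`, the sample labels, and the tie-breaking rule are all assumptions introduced here.

```python
from collections import Counter

def majority_vote(label_maps):
    """Fuse per-pixel labels from several classification results by majority vote."""
    fused = []
    for votes in zip(*label_maps):            # votes = all labels for one pixel
        counts = Counter(votes)
        top = max(counts.values())
        # Assumed tie-break: among tied labels, the earliest-listed result wins.
        fused.append(next(v for v in votes if counts[v] == top))
    return fused

# Hypothetical labels for 4 pixels from 3 training sets x 3 classifiers = 9 results.
label_maps = [
    ["rice", "water", "forest", "urban"],
    ["rice", "water", "forest", "rice"],
    ["rice", "forest", "forest", "urban"],
    ["water", "water", "forest", "urban"],
    ["rice", "water", "urban", "urban"],
    ["rice", "water", "forest", "urban"],
    ["rice", "water", "forest", "forest"],
    ["forest", "water", "forest", "urban"],
    ["rice", "urban", "forest", "urban"],
]
fused = majority_vote(label_maps)
```

Voting only helps when the individual results disagree in different places, which is consistent with the abstract's recommendation to favor diverse classifiers.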

BKS Fusion of Classifier Ensemble for Prediction of Diabetes (당뇨병의 예측을 위한 분류기 앙상블의 BKS 결합)

  • 박한샘;조성배
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.265-267 / 2004
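  • With improved economic conditions and changing lifestyles, the number of diabetes patients in Korea has been rising, making the prediction and treatment of diabetes an important concern. This paper uses data from two surveys, conducted in 1993 and 1995, of various physical indices of residents of the Yeoncheon area in Gyeonggi Province, and addresses the problem of predicting from the first-survey data whether the same person remains normal or progresses to diabetes by the second survey. Because measures such as blood glucose level and waist circumference are known to influence the onset of diabetes, future onset can be predicted from current data, which is worthwhile because it gives patients more accurate information. Classifiers are used for the prediction, and several classifiers are combined with BKS to raise the prediction rate. The BKS (behavior knowledge space) combination method requires no assumption of independence among classifiers and can give good results when the data set is large and representative. In experiments, BKS fusion outperformed the single classifiers and also performed better than combination by voting.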
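
A rough sketch of BKS fusion, under stated assumptions: each distinct combination of classifier decisions indexes a cell, and the cell returns the true class seen most often for that combination during training. The names `fit_bks` and `predict_bks`, the sample data, and the fallback to a plain vote for unseen combinations are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict

def fit_bks(decisions, truths):
    """Build the BKS table: one cell per observed combination of classifier outputs."""
    table = defaultdict(Counter)
    for combo, truth in zip(decisions, truths):
        table[tuple(combo)][truth] += 1
    return table

def predict_bks(table, combo):
    """Return the most frequent true class recorded for this combination."""
    cell = table.get(tuple(combo))
    if cell:
        return cell.most_common(1)[0][0]
    # Assumed fallback for an empty cell: plain majority vote over the outputs.
    return Counter(combo).most_common(1)[0][0]

# Hypothetical training data: outputs of three classifiers and the true outcome.
train_decisions = [
    ("normal", "diabetic", "diabetic"),
    ("normal", "diabetic", "diabetic"),
    ("normal", "diabetic", "diabetic"),
    ("diabetic", "diabetic", "diabetic"),
]
train_truths = ["normal", "normal", "diabetic", "diabetic"]

table = fit_bks(train_decisions, train_truths)
seen = predict_bks(table, ("normal", "diabetic", "diabetic"))
unseen = predict_bks(table, ("diabetic", "diabetic", "normal"))
```

In this toy data the first combination maps to "normal" even though two of three classifiers say "diabetic", which shows how BKS can depart from voting. The table needs enough training samples per cell, in line with the abstract's remark about large, representative data.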


Optimal Classifier Ensemble for Lymphoma Cancer Using Genetic Algorithm (유전자 알고리즘을 이용한 림프종 암의 최적 분류기 앙상블)

  • 박찬호;조성배
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.356-358 / 2003
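  • Advances in DNA microarray technology have made it possible to obtain expression information for thousands of genes at once, and a system that classifies such data effectively can predict whether a new sample is in a normal or diseased state. Many feature selection methods and classification techniques are available for such a system, but it is hard to find a feature selection method or classifier that performs well in every situation. An ensemble of feature-classifier pairs can give stable, improved performance, but when many feature selection methods and classifiers are available, the number of possible ensembles grows so large that evaluating every combination is nearly impossible. To address this, the paper proposes a method that uses a genetic algorithm to find the optimal ensemble without computing every ensemble result. Applied to lymphoma cancer data, it effectively found an optimal ensemble with a combined result of 100%.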
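
One way to picture the search is a genetic algorithm over bit masks, where each bit says whether a feature-classifier pair joins the ensemble and fitness is the accuracy of the voted result. The sketch below is an assumption-laden illustration: the encoding, the operators (truncation selection, one-point crossover, bit-flip mutation), the parameter values, and the names `vote_accuracy` and `ga_select` are not taken from the paper.

```python
import random
from collections import Counter

def vote_accuracy(mask, member_preds, truths):
    """Accuracy of a majority vote over the members switched on in mask."""
    chosen = [preds for bit, preds in zip(mask, member_preds) if bit]
    if not chosen:
        return 0.0                            # an empty ensemble predicts nothing
    hits = 0
    for i, truth in enumerate(truths):
        votes = Counter(preds[i] for preds in chosen)
        hits += votes.most_common(1)[0][0] == truth
    return hits / len(truths)

def ga_select(member_preds, truths, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search for a good member subset; needs at least 2 members and pop_size >= 4."""
    rng = random.Random(seed)
    n = len(member_preds)

    def fitness(mask):
        return vote_accuracy(mask, member_preds, truths)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Hypothetical validation predictions of five feature-classifier pairs.
truths = [1, 0, 1, 1, 0, 0]
member_preds = [
    [1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
]
best_mask, best_acc = ga_select(member_preds, truths)
```

With only five candidates an exhaustive search would be trivial. The genetic search pays off when the number of feature-classifier pairs makes the 2^n subsets too many to enumerate, which is the situation the abstract describes.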


An Experimental Study on Categorization of Web Documents Using an Ensemble Classifier (복합 분류기를 이용한 웹 문서 범주화에 관한 실험적 연구)

  • 이혜원;정영미
    • Proceedings of the Korean Society for Information Management Conference / 2003.08a / pp.73-82 / 2003
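  • This study classifies web documents with an ensemble classifier that extracts various features from the documents, obtains multiple category predictions from two classifiers, and integrates them into a single result. First, categorization experiments were run on the various feature sets using the widely used kNN (k nearest neighbor) and Naive Bayes classifiers, and the performance of ensemble classifiers that integrate the resulting category predictions was compared. In addition, all category predictions produced by the single classifiers were integrated, and the single-classifier and ensemble-classifier cases were compared to identify which performs better.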


Empirical Evaluation of Ensemble Approach for Diagnostic Knowledge Management (진단지식관리를 위한 앙상블 기법의 실증적 평가)

  • Ha, Sung-Ho;Zhang, Zhen-Yu
    • The Journal of Information Systems / v.20 no.3 / pp.237-255 / 2011
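  • Over the past several decades, researchers have proposed a variety of tools and methodologies for developing effective clinical support systems, and new ones continue to be developed. Among these efforts, accurate diagnosis of elderly patients who present to the emergency room with chest pain has been an important issue. Many researchers have devoted themselves to intelligent medical decision making and system development to improve physicians' diagnostic ability, but most diagnostic decisions in conventional medical systems rely on a single classifier and do not deliver satisfactory performance. This paper therefore applies an ensemble strategy to help physicians diagnose chest pain in elderly patients more accurately and quickly. An ensemble combining decision tree, artificial neural network, and SVM models was applied to data collected in an actual emergency room, and it showed markedly better diagnostic performance than a single classifier.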

Neural Networks-Based Method for Electrocardiogram Classification

  • Maksym Kovalchuk;Viktoriia Kharchenko;Andrii Yavorskyi;Igor Bieda;Taras Panchenko
    • International Journal of Computer Science & Network Security / v.23 no.9 / pp.186-191 / 2023
  • Neural networks are widely used to solve a huge variety of tasks. Machine learning methods are also used for signal and time series analysis, including electrocardiograms. Contemporary wearable devices, both medical and non-medical such as smart watches, can gather data continuously in real time. The data can be transferred for analysis or analyzed on the device, providing a preliminary diagnosis or at least detecting serious deviations. Different methods are used for this kind of analysis, ranging from medically oriented ones based on distinctive features of the signal to machine learning and deep learning approaches. Here we demonstrate a neural network-based approach to this task: an ensemble of 1D CNN classifiers followed by a final selection classifier using logistic regression, random forest, or support vector machine, and we draw conclusions from a comparison with other approaches.

Object Classification Method Using Dynamic Random Forests and Genetic Optimization

  • Kim, Jae Hyup;Kim, Hun Ki;Jang, Kyung Hyun;Lee, Jong Min;Moon, Young Shik
    • Journal of the Korea Society of Computer and Information / v.21 no.5 / pp.79-89 / 2016
  • In this paper, we propose an object classification method using a genetic algorithm and a dynamic random forest consisting of an optimal combination of unit trees. A random forest is an ensemble classification model that combines unit decision trees based on bagging; by randomizing the training samples and the feature selection assigned to each decision tree, it achieves good generalization performance when a large number of trees are combined. However, because a random forest is composed of randomly built unit trees, it shows excellent classification performance only when a sufficient number of trees are combined. There is no quantitative way to determine the number of trees, so there is no choice but to keep generating random trees repeatedly. The proposed algorithm builds a random forest from an optimal combination of trees while maintaining the generalization performance of the random forest. To achieve this, improving classification performance is cast as an optimization problem of finding the optimal tree combination, which is solved with a genetic algorithm. In experiments, the proposed algorithm improved classification performance by about 3~5% over the existing random forest in specific cases such as a common database and our own infrared database. In addition, the optimal tree combination was found at about 55~60% of the maximum number of trees.

Human Action Recognition in Still Image Using Weighted Bag-of-Features and Ensemble Decision Trees (가중치 기반 Bag-of-Feature와 앙상블 결정 트리를 이용한 정지 영상에서의 인간 행동 인식)

  • Hong, June-Hyeok;Ko, Byoung-Chul;Nam, Jae-Yeal
    • The Journal of Korean Institute of Communications and Information Sciences / v.38A no.1 / pp.1-9 / 2013
  • This paper proposes a human action recognition method that uses bag-of-features (BoF) based on CS-LBP (center-symmetric local binary pattern) and a spatial pyramid, together with a random forest classifier. To construct the BoF, an image is divided into dense regular grids and features are extracted from each patch. Code words, which make up the visual vocabulary, are formed by k-means clustering of a random subset of patches. For enhanced action discrimination, local BoF histograms are estimated from three subdivided levels of a spatial pyramid, and a weighted BoF histogram is generated by concatenating the local histograms. For action classification, a random forest, which is an ensemble of decision trees, is built to model the distribution of each action class. The random forest combined with the weighted BoF histogram is successfully applied to Stanford Action 40, which includes various human action images, and its classification performance is better than that of other methods. Furthermore, the proposed method allows action recognition to be performed in near real time.
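
The weighted, concatenated pyramid histogram can be illustrated roughly as below. The abstract does not give the weights or the grid layout, so the 1x1, 2x2, 4x4 subdivision, the `level_weights` values, and the function name `pyramid_bof` are assumptions made only for this sketch. Code-word assignment (CS-LBP plus k-means) is taken as already done.

```python
def pyramid_bof(patches, vocab_size, level_weights=(0.25, 0.25, 0.5)):
    """Concatenate weighted code-word histograms over a three-level spatial pyramid.

    patches: list of (x, y, word) with x, y in [0, 1) and word in range(vocab_size).
    level_weights: hypothetical per-level weights, coarsest level first.
    """
    feature = []
    for level, weight in enumerate(level_weights):
        cells = 2 ** level                    # 1x1, 2x2, 4x4 grid of cells
        hists = [[0.0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in patches:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += weight
        for hist in hists:                    # append local histograms in cell order
            feature.extend(hist)
    return feature

# Hypothetical patches: normalized position plus assigned code-word index.
patches = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.6, 0.2, 0)]
feature = pyramid_bof(patches, vocab_size=2)
```

The resulting vector has `vocab_size * (1 + 4 + 16)` entries and would serve as the input to the random forest. Giving finer levels a larger weight emphasizes where in the image a code word occurs.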

Web Mining Using Fuzzy Integration of Multiple Structure Adaptive Self-Organizing Maps (다중 구조적응 자기구성지도의 퍼지결합을 이용한 웹 마이닝)

  • 김경중;조성배
    • Journal of KIISE:Software and Applications / v.31 no.1 / pp.61-70 / 2004
  • Finding an appropriate web site is difficult because the exponentially growing web contains millions of documents. Web search can be personalized by recommending suitable web sites based on a user profile, but a more efficient method of estimating preference is needed because a user's evaluations of web content reflect many aspects of his or her characteristics. Because a user profile is non-linear, it must be estimated with a classifier, and a combination of classifiers is needed to capture its diverse properties. The structure adaptive self-organizing map (SASOM), an enhanced SOM model suited to pattern classification and visualization, may be useful for web mining. The fuzzy integral is a combination method that uses subjectively defined relevance of the classifiers. In this paper, user profiles are estimated with an ensemble of independently trained SASOMs combined by fuzzy integral, and the method is evaluated on the Syskill & Webert UCI benchmark data. Experimental results show that the proposed method performs better than the previous naive Bayes classifier as well as voting of SASOMs.

Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction (기업부실 예측 데이터의 불균형 문제 해결을 위한 앙상블 학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.15 no.3 / pp.1-15 / 2009
  • In a classification problem, data imbalance occurs when the instances of one class greatly outnumber those of the other class. Such data sets often yield a default classifier because of a skewed decision boundary, which reduces classification accuracy. This paper proposes Geometric Mean-based Boosting (GM-Boost) to resolve the data imbalance problem. Because GM-Boost introduces the notion of the geometric mean, its learning process considers both the majority and minority classes and reinforces learning on misclassified data. An empirical study of bankruptcy prediction for Korean companies shows that GM-Boost achieves higher classification accuracy than previous methods used for imbalanced data, including under-sampling, over-sampling, and AdaBoost, and that its learning performance is robust regardless of the degree of data imbalance.
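
The abstract does not say exactly how GM-Boost uses the geometric mean. As a hedged illustration of why the measure suits imbalanced data, the sketch below computes the commonly used geometric mean of sensitivity and specificity. The function name `geometric_mean_score` and the sample data are assumptions, and this is not the GM-Boost algorithm itself.

```python
import math

def geometric_mean_score(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical imbalanced sample: 1 bankrupt firm (label 1) among 10.
y_true = [1] + [0] * 9
default_pred = [0] * 10                       # "default classifier": always majority
accuracy = sum(t == p for t, p in zip(y_true, default_pred)) / len(y_true)
gmean = geometric_mean_score(y_true, default_pred)
```

Here the default classifier reaches 90% accuracy but a geometric mean of zero, because it never identifies the minority class. A learner guided by the geometric mean therefore cannot score well by ignoring the minority side.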
