• Title/Summary/Keyword: classifier combination (분류기 결합)


Improving Korean Part-of-Speech Tagging Using The Lexical Specific Classifier (어휘별 분류기를 이용한 한국어 품사 부착의 성능 향상)

  • Choi, Won-Jong;Lee, Do-Gil;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology
    • /
    • 2006.10e
    • /
    • pp.133-139
    • /
    • 2006
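  • Various models have been proposed for Korean morphological analysis and part-of-speech tagging, and automatic taggers with eojeol-level accuracy above 95% have been reported. However, because morphological analysis and part-of-speech tagging strongly affect the performance of every natural language processing system, even small errors matter. This study proposes a lexical-specific classifier that is trained on the lexical and part-of-speech features of the morphemes surrounding the target eojeol, together with eojeol features, and that post-processes the tagging output of an automatic tagger. In experiments, post-processing with the lexical-specific classifier alone reduced eojeol-level errors by 6.86% $(95.251\% \rightarrow 95.577\%)$, and combining it sequentially with a previously proposed post-processing method based on part-of-speech-specific features reduced errors by 16.91% $(95.251\% \rightarrow 96.054\%)$. In particular, because the proposed method can also correct the lexical form of a morpheme, it can further improve the performance of the post-processing method based on part-of-speech-specific features.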
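
The error-reduction figures in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the bracketed percentages are eojeol-level accuracies and that "error reduction" means the relative drop in error rate; the function name is illustrative and does not come from the paper.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Relative error reduction in percent, given two accuracies in percent."""
    err_before = 100.0 - acc_before  # error rate before post-processing
    err_after = 100.0 - acc_after    # error rate after post-processing
    return 100.0 * (err_before - err_after) / err_before


# Lexical-specific classifier alone: 95.251% -> 95.577%
print(round(relative_error_reduction(95.251, 95.577), 2))  # 6.86

# Sequential combination with the POS-specific post-processor: 95.251% -> 96.054%
print(round(relative_error_reduction(95.251, 96.054), 2))  # 16.91
```

Under that reading, both reported reductions (6.86% and 16.91%) are reproduced from the stated accuracies.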


Study of Joint Histogram Based Statistical Features for Early Detection of Lung Disease (폐질환 조기 검출을 위한 결합 히스토그램 기반의 통계적 특징 인자에 대한 연구)

  • Won, Chul-ho
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.10 no.4
    • /
    • pp.259-265
    • /
    • 2016
  • In this paper, a new method is proposed to classify lung tissues (Broncho vascular, Emphysema, Ground Glass Reticular, Ground Glass, Honeycomb, and Normal) for early detection of lung disease. A total of 459 statistical features were extracted from a joint histogram matrix based on multi-resolution analysis, volumetric LBP, and CT intensity, and the dominant features were then selected using AdaBoost learning. The accuracy of the proposed features and of 3D AMFM was 90.1% and 85.3%, respectively. The proposed joint-histogram-based features show better classification results than 3D AMFM in terms of accuracy, sensitivity, and specificity.
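
As a rough illustration of what a joint histogram is, the sketch below counts co-occurrences of two already-quantised per-voxel measurements. The input values, the variable names, and the choice of pairing an intensity bin with an LBP bin are assumptions made for the example; the abstract does not specify how the paper builds its matrix or which statistics it derives from it.

```python
from collections import Counter


def joint_histogram(a, b):
    """Count how often each pair (a[i], b[i]) occurs over paired, quantised values."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return Counter(zip(a, b))


# Hypothetical quantised measurements for six voxels.
intensity_bins = [0, 0, 1, 2, 2, 2]
lbp_bins = [1, 1, 0, 3, 3, 0]

hist = joint_histogram(intensity_bins, lbp_bins)
total = sum(hist.values())

# Normalising the counts gives a joint probability table, from which
# statistics such as means or entropies could be computed as features.
joint_prob = {pair: count / total for pair, count in hist.items()}
print(hist[(2, 3)], joint_prob[(0, 1)])
```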

Improving an Ensemble Model by Optimizing Bootstrap Sampling (부트스트랩 샘플링 최적화를 통한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Internet Computing and Services
    • /
    • v.17 no.2
    • /
    • pp.49-57
    • /
    • 2016
  • Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving prediction accuracy. Bagging is one of the most popular ensemble learning techniques and is known to be successful in increasing the prediction accuracy of individual classifiers. Bagging draws bootstrap samples from the training sample, applies the classifier to each bootstrap sample, and then combines the predictions of these classifiers to get the final classification result. Bootstrap samples are simple random samples selected from the original training data, so, because of this randomness, not all bootstrap samples are equally informative. In this study, we propose a new method for improving the performance of the standard bagging ensemble by optimizing bootstrap samples. A genetic algorithm is used to optimize the bootstrap samples of the ensemble in order to improve the prediction accuracy of the ensemble model. The proposed model is applied to a bankruptcy prediction problem using a real dataset from Korean companies. The experimental results show the effectiveness of the proposed model.
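
The standard bagging procedure described in this abstract (bootstrap sampling, one classifier per sample, majority vote) can be sketched as follows. This is a minimal illustration of plain bagging only: the genetic-algorithm optimisation of bootstrap samples proposed in the paper is not shown, and the 1-nearest-neighbour base learner, the toy data, and all names are assumptions made for the example.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]


def one_nn(sample):
    """Toy base learner: predict the label of the nearest training point (1-D)."""
    def predict(x):
        return min(sample, key=lambda point: abs(point[0] - x))[1]
    return predict


def bagging(data, n_estimators, seed=0):
    """Train one base learner per bootstrap sample and combine by majority vote."""
    rng = random.Random(seed)
    models = [one_nn(bootstrap_sample(data, rng)) for _ in range(n_estimators)]

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]
    return predict


# Hypothetical (feature, label) pairs standing in for company data.
data = [(0.1, "healthy"), (0.3, "healthy"), (0.4, "healthy"),
        (1.6, "bankrupt"), (1.8, "bankrupt"), (2.0, "bankrupt")]

predict = bagging(data, n_estimators=25)
print(predict(0.2), predict(1.9))
```

An odd number of estimators is used so that a two-class vote cannot tie.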

Application of Bayesian Probability Rule to the Combination of Spectral and Temporal Contextual Information in Land-cover Classification (토지 피복 분류에서 분광 영상정보와 시간 문맥 정보의 결합을 위한 베이지안 확률 규칙의 적용)

  • Lee, Sang-Won;Park, No-Wook
    • Korean Journal of Remote Sensing
    • /
    • v.27 no.4
    • /
    • pp.445-455
    • /
    • 2011
  • A probabilistic classification framework is presented that incorporates temporal contextual information derived from an existing land-cover map in order to improve the classification accuracy of land-cover classes that cannot be discriminated well using spectral information alone. The transition probability is computed from the existing land-cover map and training data, and is treated as the a priori probability. By combining the a priori probability with the conditional probability computed from spectral information via a Bayesian combination rule, the a posteriori probability is computed and the final land-cover types are then determined. Compared with conventional classification methods that require heavy computational loads to incorporate temporal contextual information, the method presented in this paper can be applied to any probabilistic classification algorithm in a simple way. A case study on crop classification using time-series MODIS data sets is carried out to illustrate the applicability of the presented method. The classification accuracies of the land-cover classes that were poorly classified from spectral information alone, owing to the low resolution of the MODIS data, were much improved by combining the temporal contextual information. The presented probabilistic method is expected to be useful both for updating existing land-cover maps and for improving classification accuracy.
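
The Bayesian combination rule described here amounts to multiplying a prior by a class-conditional probability and normalising. The sketch below illustrates that step for a single pixel; the class names, the transition table, and the spectral probabilities are made-up numbers, not values from the paper.

```python
def bayes_combine(prior, likelihood):
    """Posterior over classes: normalise prior[c] * likelihood[c]."""
    unnormalised = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalised.values())
    return {c: value / total for c, value in unnormalised.items()}


# Hypothetical transition probabilities P(current class | class in the old map).
transition = {
    "rice": {"rice": 0.8, "corn": 0.2},
    "corn": {"rice": 0.3, "corn": 0.7},
}

# Hypothetical conditional probabilities from spectral information for one pixel.
spectral = {"rice": 0.45, "corn": 0.55}

# The old map labels this pixel "rice", so that row serves as the prior.
posterior = bayes_combine(transition["rice"], spectral)
label = max(posterior, key=posterior.get)
print(label, posterior)
```

In this toy case the spectral information alone slightly favours "corn", while the prior from the existing map moves the final decision to "rice", which is the kind of correction the abstract describes.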

Recognition of Handwritten Numerals using SVM Classifiers (SVM 분류기를 이용한 필기체 숫자인식)

  • Park, Joong-Jo;Kim, Kyoung-Min
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.8 no.3
    • /
    • pp.136-142
    • /
    • 2007
  • Recent research on recognition systems has shown that SVM (Support Vector Machine) classifiers often achieve higher recognition rates than other classifiers. In this paper, we present a handwritten numeral recognition algorithm using SVM classifiers. The numeral features used in our algorithm are mesh features, directional features obtained with Kirsch operators, and concavity features, where the first two represent the foreground information of numerals and the last represents the background information. These features complement each other. Since an SVM is basically a binary classifier, several binary SVMs must be constructed and combined to obtain a multi-class classifier. We use two strategies for implementing multi-class SVM classifiers, "one against one" and "one against the rest", and examine their performance on the features used. The method is tested on the CENPARMI handwritten numeral database, and a recognition rate of 98.45% is achieved.
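
The two strategies for building a multi-class classifier from binary ones can be sketched as below. No SVM is trained here: `pair_decision` and `rest_score` are hypothetical stand-ins for trained binary SVMs, based on distance to made-up 1-D class centres, so only the combination logic is illustrated.

```python
from collections import Counter
from itertools import combinations

# Hypothetical 1-D class centres used by the stand-in binary classifiers.
centroids = {0: 0.0, 1: 1.0, 2: 2.0}


def pair_decision(a, b, x):
    """Stand-in for a binary SVM trained on class a versus class b."""
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b


def rest_score(c, x):
    """Stand-in for the decision value of an SVM trained on class c versus the rest."""
    return -abs(x - centroids[c])


def one_against_one(x, classes):
    """Each pairwise classifier votes for one class; the most-voted class wins."""
    votes = Counter(pair_decision(a, b, x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


def one_against_rest(x, classes):
    """Pick the class whose class-versus-rest classifier gives the largest score."""
    return max(classes, key=lambda c: rest_score(c, x))


classes = sorted(centroids)
print(one_against_one(1.2, classes), one_against_rest(1.2, classes))

# For ten digit classes, one-against-one needs one classifier per pair of
# classes, while one-against-the-rest needs one classifier per class.
n_pairwise = len(list(combinations(range(10), 2)))
print(n_pairwise, 10)
```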


On Optimizing Dissimilarity-Based Classifications Using a DTW and Fusion Strategies (DTW와 퓨전기법을 이용한 비유사도 기반 분류법의 최적화)

  • Kim, Sang-Woon;Kim, Seung-Hwan
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.47 no.2
    • /
    • pp.21-28
    • /
    • 2010
  • This paper reports experimental results on optimizing dissimilarity-based classification (DBC) by simultaneously using dynamic time warping (DTW) and a multiple fusion strategy (MFS). DBC is a way of defining classifiers among classes; they are not based on the feature measurements of individual samples, but rather on a suitable dissimilarity measure among the samples. In DTW, the dissimilarity is measured in two steps: first, we adjust the object samples by finding the best warping path with a correlation-coefficient-based DTW technique. We then compute the dissimilarity distance between the adjusted objects with conventional measures. In MFS, fusion strategies are used repeatedly, both in generating dissimilarity matrices and in designing classifiers: we first combine the dissimilarity matrices obtained with the DTW technique into a new matrix. After training some base classifiers on the new matrix, we again combine the results of the base classifiers. Our experimental results on well-known benchmark databases demonstrate that the proposed mechanism further improves classification accuracy compared with previous approaches. This suggests that the method could also be applied to other high-dimensional tasks, such as multimedia information retrieval.
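
To make the idea of a DTW-based dissimilarity representation concrete, the sketch below implements the classic dynamic-programming form of DTW with an absolute-difference local cost and uses it to build a pairwise dissimilarity matrix. This is the textbook variant, not the correlation-coefficient-based technique the paper describes, and the fusion of several matrices is not shown; the sample sequences are made up.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in a only
                                     cost[i][j - 1],      # advance in b only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def dissimilarity_matrix(samples):
    """Pairwise DTW costs; row i is the dissimilarity representation of sample i."""
    return [[dtw_distance(x, y) for y in samples] for x in samples]


samples = [[1, 2, 3], [1, 2, 2, 3], [3, 2, 1]]
matrix = dissimilarity_matrix(samples)
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: the repeated 2 is absorbed by warping
```

In a dissimilarity-based classifier, each row of such a matrix would serve as the feature vector of the corresponding sample.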

A Study on Facial Expression Recognition using Boosted Local Binary Pattern (Boosted 국부 이진 패턴을 적용한 얼굴 표정 인식에 관한 연구)

  • Won, Chulho
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.12
    • /
    • pp.1357-1367
    • /
    • 2013
  • Recently, among image-based methods for facial expression recognition, research has been carried out using the ULBP block histogram feature with an SVM classifier. Owing to the properties of the LBP introduced by Ojala, such as high discriminative capability, robustness to illumination changes, and simple computation, LBP is widely used in the field of image recognition. In this paper, we combine $LBP_{8,2}$ and $LBP_{8,1}$ to describe micro features in addition to shift and size changes when calculating the ULBP block histogram. From 660 sub-windows of $LBP_{8,1}$ and 550 sub-windows of $LBP_{8,2}$, 1210 ULBP histogram features were extracted, and 50 weak classifiers were generated using AdaBoost. By using the hybrid ULBP histogram feature that combines $LBP_{8,1}$ and $LBP_{8,2}$ together with an SVM classifier, the facial expression recognition rate was improved, as confirmed through various experiments. The facial expression recognition rate of 96.3% obtained with the hybrid boosted ULBP block histogram shows the superiority of the proposed method.
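
For readers unfamiliar with LBP, the sketch below computes a basic 8-neighbour LBP code for the centre of a 3x3 patch and tests whether the pattern is "uniform" (at most two 0/1 transitions around the circle). It assumes that ULBP refers to these uniform patterns and uses the square 3x3 neighbourhood as a common approximation of $LBP_{8,1}$; the bit ordering and the example patch are arbitrary choices, and the block histograms, AdaBoost selection, and SVM stages of the paper are not shown.

```python
def lbp_code(patch):
    """8-bit LBP code of the centre pixel of a 3x3 patch (square neighbourhood)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if patch[r][c] >= center else 0 for r, c in offsets]
    return sum(bit << i for i, bit in enumerate(bits))


def is_uniform(code):
    """True if the circular 8-bit pattern has at most two 0/1 transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2


# Hypothetical patch: bright top row, darker elsewhere (a horizontal edge).
patch = [[5, 5, 5],
         [1, 3, 1],
         [1, 1, 1]]

code = lbp_code(patch)
print(code, is_uniform(code))  # 7 True
```

Only the three top neighbours are at least as bright as the centre, so the three lowest bits are set, giving code 7, which is a uniform pattern.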

Context-Aware Fusion with Support Vector Machine (Support Vector Machine을 이용한 문맥 인지형 융합)

  • Heo, Gyeong-Yong;Kim, Seong-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.6
    • /
    • pp.19-26
    • /
    • 2014
  • An ensemble classifier system is a widely used multi-classifier system that combines the results from each classifier and, as a result, achieves a better classification result than any single classifier used. Several methods have been used to build an ensemble classifier, including boosting, a cascade method in which examples misclassified in the previous stage are used to boost the performance in the current stage. Boosting is, however, a serial method that does not form a complete feedback loop. In this paper, we propose the context-sensitive SVM ensemble (CASE), which adopts the SVM, one of the best classifiers in terms of classification rate, as the base classifier and uses a clustering method to divide the feature space into contexts. As CASE divides the feature space and trains the SVMs simultaneously, the result from one component can be applied to the other, and CASE achieves better results than boosting. Experimental results demonstrate the usefulness of the proposed method.

Feature Selection for Multiple K-Nearest Neighbor classifiers using GAVaPS (GAVaPS를 이용한 다수 K-Nearest Neighbor classifier들의 Feature 선택)

  • Lee, Hee-Sung;Lee, Jae-Hun;Kim, Eun-Tai
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.6
    • /
    • pp.871-875
    • /
    • 2008
  • This paper deals with feature selection for multiple k-nearest neighbor (k-NN) classifiers using the Genetic Algorithm with Varying Population Size (GAVaPS). Because we use multiple k-NN classifiers, the feature selection problem for them is very hard and has a large search space. To solve this problem, we employ GAVaPS, which outperforms the simple genetic algorithm (SGA). Further, we propose an efficient combining method for multiple k-NN classifiers using GAVaPS. Experiments are performed to demonstrate the efficiency of the proposed method.