• Title/Summary/Keyword: 단일분류

Search Result 786, Processing Time 0.027 seconds

Coarse-to-fine Classifier Ensemble Selection using Clustering and Genetic Algorithms (군집화와 유전 알고리즘을 이용한 거친-섬세한 분류기 앙상블 선택)

  • Kim, Young-Won;Oh, Il-Seok
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.9
    • /
    • pp.857-868
    • /
    • 2007
  • The good classifier ensemble should have a high complementarity among classifiers in order to produce a high recognition rate and its size is small in order to be efficient. This paper proposes a classifier ensemble selection algorithm with coarse-to-fine stages. for the algorithm to be successful, the original classifier pool should be sufficiently diverse. This paper produces a large classifier pool by combining several different classification algorithms and lots of feature subsets. The aim of the coarse selection is to reduce the size of classifier pool with little sacrifice of recognition performance. The fine selection finds near-optimal ensemble using genetic algorithms. A hybrid genetic algorithm with improved searching capability is also proposed. The experimentation uses the worldwide handwritten numeral databases. The results showed that the proposed algorithm is superior to the conventional ones.

A study on end-to-end speaker diarization system using single-label classification (단일 레이블 분류를 이용한 종단 간 화자 분할 시스템 성능 향상에 관한 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.6
    • /
    • pp.536-543
    • /
    • 2023
  • Speaker diarization, which labels for "who spoken when?" in speech with multiple speakers, has been studied on a deep neural network-based end-to-end method for labeling on speech overlap and optimization of speaker diarization models. Most deep neural network-based end-to-end speaker diarization systems perform multi-label classification problem that predicts the labels of all speakers spoken in each frame of speech. However, the performance of the multi-label-based model varies greatly depending on what the threshold is set to. In this paper, it is studied a speaker diarization system using single-label classification so that speaker diarization can be performed without thresholds. The proposed model estimate labels from the output of the model by converting speaker labels into a single label. To consider speaker label permutations in the training, the proposed model is used a combination of Permutation Invariant Training (PIT) loss and cross-entropy loss. In addition, how to add the residual connection structures to model is studied for effective learning of speaker diarization models with deep structures. The experiment used the Librispech database to generate and use simulated noise data for two speakers. When compared with the proposed method and baseline model using the Diarization Error Rate (DER) performance the proposed method can be labeling without threshold, and it has improved performance by about 20.7 %.

Real Time Monocular Navigation using VFH (단일 카메라를 이용한 VFH 기반의 실시간 주행 기술 개발)

  • Jo, Jang-won;Ju, Jin Sun;Ko, Eunjeong;Kim, Eun Yi
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.04a
    • /
    • pp.348-351
    • /
    • 2010
  • 본 논문에서는 단일 카메라로부터 주어진 영상을 실시간으로 장애물과 비장애물 영역으로 분류한 후 VFH 를 이용하여 안전한 경로를 선정하는 실시간 주행 시스템을 개발한다. 제안된 시스템은 점유 그리드맵 생성기와 VFH 기반의 선정기로 구성된다. 점유 그리드맵 생성기는 입력된 $320{\times}240$ 영상의 색조와 명도 정보를 이용하여 실시간으로 배경과 장애물 영역을 분류하고, 이를 바탕으로 위험도에 따라 10 개의 그레이 레벨을 가지는 $32{\times}24$ 의 점유 그리드맵을 생성한다. VFH를 이용하여 폴라 히스토그램을 작성한 후 밀도가 낮은 곳으로 주행 경로를 결정 한다. 제안된 기술의 효율성을 증명하기 위하여 다양한 형태의 장애물을 포함하는 실내 및 실외 환경에서 평가하였으며 센서 기반의 그 결과는 기존의 센서기반의 주행시스템과 비교 되었다. 그 결과 제안된 시스템은 88%의 정확도를 보였으며, 기존의 시스템보다 실시간으로 빠르고 안전한 주행을 수행할 수 있음이 증명되었다.

The Bi-Cross Pretraining Method to Enhance Language Representation (Bi-Cross 사전 학습을 통한 자연어 이해 성능 향상)

  • Kim, Sung-ju;Kim, Seonhoon;Park, Jinseong;Yoo, Kang Min;Kang, Inho
    • Annual Conference on Human and Language Technology
    • /
    • 2021.10a
    • /
    • pp.320-325
    • /
    • 2021
  • BERT는 사전 학습 단계에서 다음 문장 예측 문제와 마스킹된 단어에 대한 예측 문제를 학습하여 여러 자연어 다운스트림 태스크에서 높은 성능을 보였다. 본 연구에서는 BERT의 사전 학습 문제 중 다음 문장 예측 문제에 대해 주목했다. 다음 문장 예측 문제는 자연어 추론 문제와 질의 응답 문제와 같이 임의의 두 문장 사이의 관계를 모델링하는 문제들에 성능 향상을 위해 사용되었다. 하지만 BERT의 다음 문장 예측 문제는 두 문장을 특수 토큰으로 분리하여 단일 문자열 형태로 모델에 입력으로 주어지는 cross-encoding 방식만을 학습하기 때문에 문장을 각각 인코딩하는 bi-encoding 방식의 다운스트림 태스크를 고려하지 않은 점에서 아쉬움이 있다. 본 논문에서는 기존 BERT의 다음 문장 예측 문제를 확장하여 bi-encoding 방식의 다음 문장 예측 문제를 추가적으로 사전 학습하여 단일 문장 분류 문제와 문장 임베딩을 활용하는 문제에서 성능을 향상 시키는 Bi-Cross 사전 학습 기법을 소개한다. Bi-Cross 학습 기법은 영화 리뷰 감성 분류 데이터 셋인 NSMC 데이터 셋에 대해 학습 데이터의 0.1%만 사용하는 학습 환경에서 Bi-Cross 사전 학습 기법 적용 전 모델 대비 5점 가량의 성능 향상이 있었다. 또한 KorSTS의 bi-encoding 방식의 문장 임베딩 성능 평가에서 Bi-Cross 사전 학습 기법 적용 전 모델 대비 1.5점의 성능 향상을 보였다.

  • PDF

Hybrid Multiple Classifier Systems (하이브리드 다중 분류기시스템)

  • Kim In-cheol
    • Journal of Intelligence and Information Systems
    • /
    • v.10 no.2
    • /
    • pp.133-145
    • /
    • 2004
  • Combining multiple classifiers to obtain improved performance over the individual classifier has been a widely used technique. The task of constructing a multiple classifier system(MCS) contains two different issues : how to generate a diverse set of base-level classifiers and how to combine their predictions. In this paper, we review the characteristics of the existing multiple classifier systems: bagging, boosting, and stacking. And then we propose new MCSs: stacked bagging, stacked boosting, bagged stacking, and boasted stacking. These MCSs are a sort of hybrid MCSs that combine advantageous characteristics of the existing ones. In order to evaluate the performance of the proposed schemes, we conducted experiments with nine different real-world datasets from UCI KDD archive. The result of experiments showed the superiority of our hybrid MCSs, especially bagged stacking and boosted stacking, over the existing ones.

  • PDF

An Analytical Study on Automatic Classification of Domestic Journal articles Based on Machine Learning (기계학습에 기초한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for information Management
    • /
    • v.35 no.2
    • /
    • pp.37-62
    • /
    • 2018
  • This study examined the factors affecting the performance of automatic classification based on machine learning for domestic journal articles in the field of LIS. In particular, In view of the classification performance that assigning automatically the class labels to the articles in "Journal of the Korean Society for Information Management", I investigated the characteristics of the key factors(weighting schemes, training set size, classification algorithms, label assigning methods) through the diversified experiments. Consequently, It is effective to apply each element appropriately according to the classification environment and the characteristics of the document set, and a fairly good performance can be obtained by using a simpler model. In addition, the classification of domestic journals can be considered as a multi-label classification that assigns more than one category to a specific article. Therefore, I proposed an optimal classification model using simple and fast classification algorithm and small learning set considering this environment.

Fault Classification for Rotating Machinery Using Support Vector Machines with Optimal Features Corresponding to Each Fault Type (결함유형별 최적 특징과 Support Vector Machine 을 이용한 회전기계 결함 분류)

  • Kim, Yang-Seok;Lee, Do-Hwan;Kim, Seong-Kook
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.34 no.11
    • /
    • pp.1681-1689
    • /
    • 2010
  • Several studies on the use of Support Vector Machines (SVMs) for diagnosing rotating machinery have been successfully carried out, but the fault classification depends on the input features as well as a multi-classification scheme, binary optimizer, kernel function, and the parameter to be used in the kernel function. Most of the published papers on multiclass SVM applications report the use of the same features to classify the faults. In this study, simple statistical features are determined on the basis of time domain vibration signals for various fault conditions, and the optimal features for each fault condition are selected. Then, the optimal features are used in the SVM training and in the classification of each fault condition. Simulation results using experimental data show that the results of the proposed stepwise classification approach with a relatively short training time are comparable to those for a single multi-class SVM.

An Analytical Study on Performance Factors of Automatic Classification based on Machine Learning (기계학습에 기초한 자동분류의 성능 요소에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for information Management
    • /
    • v.33 no.2
    • /
    • pp.33-59
    • /
    • 2016
  • This study examined the factors affecting the performance of automatic classification for the domestic conference papers based on machine learning techniques. In particular, In view of the classification performance that assigning automatically the class labels to the papers in Proceedings of the Conference of Korean Society for Information Management using Rocchio algorithm, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, label assigning methods) through the diversified experiments. Consequently, It is more effective that apply proper parameters (${\beta}$, ${\lambda}$) and training set size (more than 5 years) according to the classification environments and properties of the document set. and If the performance is equivalent, I discovered that the use of the more simple methods (single weighting schemes) is very efficient. Also, because the classification of domestic papers is corresponding with multi-label classification which assigning more than one label to an article, it is necessary to develop the optimum classification model based on the characteristics of the key factors in consideration of this environment.

Cancer Diagnosis System using Genetic Algorithm and Multi-boosting Classifier (Genetic Algorithm과 다중부스팅 Classifier를 이용한 암진단 시스템)

  • Ohn, Syng-Yup;Chi, Seung-Do
    • Journal of the Korea Society for Simulation
    • /
    • v.20 no.2
    • /
    • pp.77-85
    • /
    • 2011
  • It is believed that the anomalies or diseases of human organs are identified by the analysis of the patterns. This paper proposes a new classification technique for the identification of cancer disease using the proteome patterns obtained from two-dimensional polyacrylamide gel electrophoresis(2-D PAGE). In the new classification method, three different classification methods such as support vector machine(SVM), multi-layer perceptron(MLP) and k-nearest neighbor(k-NN) are extended by multi-boosting method in an array of subclassifiers and the results of each subclassifier are merged by ensemble method. Genetic algorithm was applied to obtain optimal feature set in each subclassifier. We applied our method to empirical data set from cancer research and the method showed the better accuracy and more stable performance than single classifier.

CUDA Optimization of Super-Resolution Algorithm using ELBP Classifier (ELBP 분류기를 이용한 초해상도 기법의 CUDA 최적화)

  • Choi, Ji Hoon;Song, Byung Cheol
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2016.06a
    • /
    • pp.92-94
    • /
    • 2016
  • 저해상도 영상을 고해상도 영상으로 복원하기 위한 다양한 방법의 초해상도 기법이 존재한다. 다양한 기법들 중에서도 ELBP 분류기를 이용한 초해상도 기법[1]은 단일 영상 기반의 초해상도 기법으로 사전에 학습된 필터를 이용하여 고해상도 영상을 획득하는 기법이다. 그러나 해당 알고리즘을 일반적인 CPU 환경에서 수행할 경우 실시간으로 영상을 획득하는데 어려움이 존재한다. 본 논문에서는 지역메모리를 이용한 GPU 환경에서의 최적화를 수행하여 ELBP 분류기를 이용한 초해상도 기법의 가속성을 보인다. 먼저, 알고리즘에 대하여 간단히 설명하고 CUDA 가속화 기법[2]을 차례로 적용했을 때 얻을 수 있는 가속 성능을 확인한다. 최종적으로 본 논문은 CPU 환경과 비교했을 때 5 배의 가속 효과를 얻을 수 있다.

  • PDF