• Title/Abstract/Keywords: Multi-class Classification

218 search results (processing time: 0.033 s)

A Hierarchical Clustering Method Based on SVM for Real-time Gas Mixture Classification

  • Kim, Guk-Hee;Kim, Young-Wung;Lee, Sang-Jin;Jeon, Gi-Joon
    • 한국지능시스템학회논문지 / Vol. 20, No. 5 / pp. 716-721 / 2010
  • In this work we address the use of the support vector machine (SVM) in a multi-class gas classification system. The objective is to classify single gases and their mixtures with a semiconductor-type electronic nose. Typical multi-class SVM schemes are One vs. One (OVO) and One vs. All (OVA). However, studies of these schemes show weaknesses in calculation time, decision time, and the reject region. We propose a hierarchical clustering method (HCM) based on the SVM for real-time gas mixture classification. Experimental results show that the proposed method performs better than the typical SVM-based multi-class systems, and that in an embedded system it classifies single gases and their mixtures more easily and quickly than BP-MLP and Fuzzy ARTMAP.
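
The cost difference between OVO and OVA, and the reject region mentioned in the abstract, can be illustrated with a minimal sketch. It is not the paper's HCM; `pairwise_winner` is a hypothetical stand-in for a trained binary SVM, and treating a tied vote as "rejected" is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def num_binary_classifiers(k):
    """Return (one-vs-one, one-vs-all) binary classifier counts for k classes."""
    return k * (k - 1) // 2, k


def ovo_vote(pairwise_winner, classes):
    """Majority vote over all class pairs.

    pairwise_winner(a, b) is a hypothetical callable returning a or b.
    Returns the winning class, or None when the top vote count is tied
    (the ambiguous "reject region" of one-vs-one voting).
    """
    votes = Counter(pairwise_winner(a, b) for a, b in combinations(classes, 2))
    ranked = votes.most_common()
    if not ranked:
        return None  # fewer than two classes: nothing to vote on
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top classes
    return ranked[0][0]
```

With k classes, OVO evaluates k(k-1)/2 binary classifiers per decision while OVA evaluates k, which is one reason decision time matters on an embedded target.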

지지벡터기계를 이용한 다중 분류 문제의 학습과 성능 비교 (Learning and Performance Comparison of Multi-class Classification Problems based on Support Vector Machine)

  • 황두성
    • 한국멀티미디어학회논문지 / Vol. 11, No. 7 / pp. 1035-1042 / 2008
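  • As a binary classifier, the support vector machine has outperformed conventional pattern classifiers on binary classification problems across a variety of applications. Extending the maximum-margin classification theory underlying the support vector machine to multi-class problems is difficult. This paper discusses learning strategies for applying support vector machines to multi-class problems and compares their performance. Depending on how the training data are partitioned, a support vector machine can be applied easily to multi-class problems without modifying its inherent binary classification characteristics. Performance was compared on various benchmark data sets according to the chosen learning strategy, kernel function, and training time, and the results were compared with test results of a neural network trained by error backpropagation. In the comparison experiments with the neural network model, the support vector machine proved applicable and effective for general multi-class classification problems.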


다중 클래스 SVMs를 이용한 얼굴 인식의 성능 개선 (The Performance Improvement of Face Recognition Using Multi-Class SVMs)

  • 박성욱;박종욱
    • 대한전자공학회논문지SP / Vol. 41, No. 6 / pp. 43-49 / 2004
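  • In conventional multi-class SVMs, the number of binary SVMs grows as the number of classes increases, so classification takes a long time. To reduce classification time, this paper proposes a method that reduces the number of classes by applying NNR in the PCA+LDA feature subspace. The proposed method uses a simple NNR in the PCA+LDA feature subspace to extract the face classes closest to the input test feature data, thereby reducing the number of face classes. By reducing the number of classes, the method requires fewer training runs and fewer comparisons than conventional multi-class SVMs and, as a result, greatly reduces the classification time for a single test image. Experiments also confirmed that the proposed method has a lower error rate than the NNC technique and the same error rate as conventional multi-class SVMs, but with faster classification.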
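
The saving from shortlisting classes before running binary SVMs can be sketched as follows. This is only an illustration: nearest class centroids stand in for the paper's NNR step, one-vs-one comparison counting is assumed, and the function names are hypothetical.

```python
import math


def nearest_classes(x, centroids, m):
    """Return the m class labels whose centroids are closest to x.

    centroids: dict mapping class label -> feature vector (a stand-in for
    the nearest-neighbor step; the paper's exact rule is not given here).
    """
    return sorted(centroids, key=lambda c: math.dist(x, centroids[c]))[:m]


def pairwise_comparisons(k):
    """Number of one-vs-one SVM evaluations needed among k classes."""
    return k * (k - 1) // 2
```

For instance, shortlisting 5 of 40 face classes would cut the pairwise evaluations for one test image from 780 to 10 under these assumptions.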

다분류 SVM을 이용한 DEA기반 벤처기업 효율성등급 예측모형 (The Prediction of DEA based Efficiency Rating for Venture Business Using Multi-class SVM)

  • 박지영;홍태호
    • Asia Pacific Journal of Information Systems / Vol. 19, No. 2 / pp. 139-155 / 2009
  • For the last few decades, many studies have tried to identify venture companies' success factors and unique features in order to find the sources of their competitive advantages over rivals. Such venture companies have shown a tendency to give high returns to investors, generally by making the best use of information technology. For this reason, many venture companies are keen on attracting investors' attention. Investors generally make their investment decisions by carefully examining the evaluation criteria of the alternatives. To them, credit rating information provided by international rating agencies such as Standard and Poor's, Moody's, and Fitch is a crucial source on pivotal concerns such as a company's stability, growth, and risk status. But this type of information is generated only for companies issuing corporate bonds, not for venture companies. Therefore, this study proposes a method for evaluating venture businesses and presents empirical results using financial data of Korean venture companies listed on KOSDAQ in the Korea Exchange. In addition, this paper used a multi-class SVM to predict the DEA-based efficiency rating of venture businesses derived from the proposed method. Our approach sheds light on ways to locate efficient companies generating high levels of profit. Above all, in determining effective ways to evaluate a venture firm's efficiency, it is important to understand the major contributing factors of such efficiency. Therefore, this paper is built on the following two ideas for classifying which venture companies are more efficient: i) constructing a DEA-based multi-class rating for sample companies and ii) developing a multi-class SVM-based efficiency prediction model for classifying all companies.
First, Data Envelopment Analysis (DEA) is a non-parametric multiple input-output efficiency technique that measures the relative efficiency of decision making units (DMUs) using a linear programming based model. It is non-parametric because it requires no assumption on the shape or parameters of the underlying production function. DEA has already been widely applied to evaluating the relative efficiency of DMUs. Recently, a number of DEA-based studies have evaluated the efficiency of various types of companies, such as internet companies and venture companies. It has also been applied to corporate credit ratings. In this study we used DEA to sort venture companies into efficiency-based ratings. The Support Vector Machine (SVM), on the other hand, is a popular technique for solving data classification problems. In this paper, we employed SVM to classify the efficiency ratings of IT venture companies according to the results of DEA. The SVM method was first developed by Vapnik (1995). As one of many machine learning techniques, SVM is based on a statistical theory. Thus far, the method has shown good performance, especially in generalization capacity in classification tasks, resulting in numerous applications in many areas of business. SVM is basically an algorithm that finds the maximum margin hyperplane, which gives the maximum separation between classes. The support vectors are the training points closest to the maximum margin hyperplane. When the classes cannot be separated linearly, a kernel function can be used: for nonlinear class boundaries, the inputs are transformed so that the original input space is mapped into a high-dimensional dot-product feature space. Many studies have applied SVM to bankruptcy prediction, financial time series forecasting, and credit rating estimation. In this study we employed SVM to develop a data mining-based efficiency prediction model.
We used the Gaussian radial basis function as the kernel function of the SVM. For the multi-class SVM, we adopted the one-against-one approach, which is based on binary classification, and two all-together methods proposed by Weston and Watkins (1999) and Crammer and Singer (2000), respectively. In this research, we used corporate information on 154 companies listed on the KOSDAQ market in the Korea Exchange. We obtained the companies' financial information for 2005 from KIS (Korea Information Service, Inc.). Using these data, we constructed a multi-class rating from DEA efficiency and built a data mining-based multi-class prediction model. Among the three multi-class approaches, the Weston and Watkins method had the best hit ratio on the test data set. In multi-class problems such as efficiency ratings of venture businesses, it is useful for investors to know the class to within a one-class error when the exact class is difficult to determine in the actual market. We therefore reported accuracy within 1-class errors, and the Weston and Watkins method showed 85.7% accuracy on our test samples. We conclude that the DEA-based multi-class approach in venture business generates more information than the binary classification problem, notwithstanding its efficiency level. We believe this model can help investors in decision making, as it provides a reliable tool for evaluating venture companies in the financial domain. For future research, we see the need to improve areas such as the variable selection process, the parameter selection of the kernel function, the generalization, and the sample size of each class.
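
The "accuracy within 1-class errors" measure can be made concrete with a short sketch. It assumes the efficiency ratings are encoded as consecutive integers (an assumption, since the encoding is not stated here); the function name is hypothetical.

```python
def accuracy_within(y_true, y_pred, tolerance=1):
    """Fraction of predictions whose ordinal rating is within `tolerance`
    classes of the true rating (tolerance=0 is the ordinary hit ratio)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)
```

Setting `tolerance=0` gives the exact hit ratio, so both figures can be reported from the same predictions.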

mRMR과 수정된 입자군집화 방법을 이용한 다범주 분류를 위한 최적유전자집단 구성 (A hybrid method to compose an optimal gene set for multi-class classification using mRMR and modified particle swarm optimization)

  • 이선호
    • 응용통계연구 / Vol. 33, No. 6 / pp. 683-696 / 2020
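  • An optimal gene set for predicting the multi-class phenotype of a sample is a collection of genes that predicts the phenotype accurately with a small number of genes. Several statistics already exist for detecting differentially expressed genes, and combining them with K-means clustering makes it possible to select differentially expressed genes with little redundancy. Building on these, we propose a modified particle swarm optimization method that composes a gene set capable of accurate multi-class classification with few genes. Using the well-known ALL data (248 cases) and SRBCT data (83 cases), we show that the proposed method can find an optimal gene set.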
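
The mRMR idea named in the title (maximum relevance, minimum redundancy) can be sketched as a greedy selection. This is a generic difference-form illustration, not the paper's hybrid method; the relevance scores and the `redundancy` callable are hypothetical inputs that would in practice come from a statistic such as mutual information.

```python
def mrmr_select(relevance, redundancy, n_select):
    """Greedy mRMR: repeatedly add the gene maximizing
    relevance minus mean redundancy with the genes already chosen.

    relevance: dict gene -> score with the class label.
    redundancy: callable (gene_a, gene_b) -> similarity score.
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < n_select:
        def score(g):
            if not selected:
                return relevance[g]
            penalty = sum(redundancy(g, s) for s in selected) / len(selected)
            return relevance[g] - penalty
        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

A highly relevant gene can lose to a less relevant one when it largely duplicates a gene already in the set, which is how the selected set stays small.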

온라인 리뷰에서 평점의 분류 (Classification of ratings in online reviews)

  • 최동준;최호식;박창이
    • Journal of the Korean Data and Information Science Society / Vol. 27, No. 4 / pp. 845-854 / 2016
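  • Sentiment analysis, or opinion mining, is a text mining technique used to identify individuals' subjective information or opinions in documents such as blogs, reviews, newspaper articles, and social networks. Previous studies on classifying ratings from review text in rated online reviews considered only binary classification. However, because neutral opinions can exist in addition to positive and negative ones, multi-class classification is more appropriate than binary classification. This study considers the multi-class classification of ratings based on review text. In preprocessing, words associated with ratings are extracted using the chi-square statistic and used as input variables to compare the predictive power of multi-class classifiers such as support vector machines and the proportional odds model.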
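
The chi-square word-selection step can be illustrated with a minimal sketch. It computes the Pearson chi-square statistic for one word from a contingency table of word presence against rating class; the table layout and function name are assumptions for illustration, and the paper's selection threshold is not given here.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table.

    table[i][j] = number of reviews with word-presence state i
    (0 = absent, 1 = present) and rating class j.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0:
        return 0.0
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat
```

Words whose statistic is large are strongly associated with the rating and would be kept as input variables; words near zero carry little information about it.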

Multi-Class SVM+MTL for the Prediction of Corporate Credit Rating with Structured Data

  • Ren, Gang;Hong, Taeho;Park, YoungKi
    • Asia Pacific Journal of Information Systems / Vol. 25, No. 3 / pp. 579-596 / 2015
  • Many studies have focused on the prediction of corporate credit rating using various data mining techniques. One of the most frequently used algorithms is support vector machines (SVM), and recently, novel techniques such as SVM+ and SVM+MTL have emerged. This paper intends to show the applicability of such new techniques to multi-classification and corporate credit rating and compare them with conventional SVM regarding prediction performance. We solve multi-class SVM+ and SVM+MTL problems by constructing several binary classifiers. Furthermore, to demonstrate the robustness and outstanding performance of SVM+MTL algorithm over other techniques, we utilized four typical multi-class processing methods in our experiments. The results show that SVM+MTL outperforms both conventional SVM and novel SVM+ in predicting corporate credit rating. This study contributes to the literature by showing the applicability of new techniques such as SVM+ and SVM+MTL and the outperformance of SVM+MTL over conventional techniques. Thus, this study enriches solving techniques for addressing multi-class problems such as corporate credit rating prediction.

가우시안 기반 Hyper-Rectangle 생성을 이용한 효율적 단일 분류기 (An Efficient One Class Classifier Using Gaussian-based Hyper-Rectangle Generation)

  • 김도균;최진영;고정한
    • 산업경영시스템학회지 / Vol. 41, No. 2 / pp. 56-64 / 2018
  • In recent years, imbalanced data has become one of the most important and frequent issues for quality control in industry. For example, defect rates have been drastically reduced thanks to highly developed technology and quality management, so that only a few defective data can be obtained from a production process. Therefore, quality classification must be performed under the condition that one class (the defective dataset) is much smaller than the other class (the good dataset). However, traditional multi-class classification methods are not appropriate for such imbalanced datasets, since they classify data using the difference between one class and the others, which can hardly be found in imbalanced datasets. One-class classification, which thoroughly learns the patterns of the target class, is more suitable for imbalanced datasets since it focuses only on data in the target class. So far, several one-class classification methods have been suggested, such as the one-class support vector machine, neural network, and decision tree. The one-class support vector machine and neural network can guarantee a good classification rate, and the decision tree can provide a set of rules that can be clearly interpreted. However, the classifiers obtained from the former two methods consist of complex mathematical functions and cannot be easily understood by users. In the case of the decision tree, the criterion for rule generation is ambiguous. Therefore, as an alternative, a one-class classifier using hyper-rectangles was proposed, which classifies precisely compared to other methods and also generates rules clearly understood by users. In this paper, we suggest an approach for overcoming the limitations of those previous one-class classification algorithms. Specifically, the suggested approach produces an improved one-class classifier using hyper-rectangles generated with a Gaussian function.
The performance of the suggested algorithm is verified by a numerical experiment that uses several datasets from the UCI Machine Learning Repository.
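
Why a hyper-rectangle classifier yields readable rules can be seen in a small sketch. It fits a single box from per-feature mean and standard deviation of the target class only; this is an illustrative simplification and not the paper's generation procedure, and the width parameter `z` is a hypothetical choice.

```python
from statistics import mean, stdev


def fit_hyper_rectangle(samples, z=3.0):
    """Fit per-feature bounds [mean - z*std, mean + z*std] on target-class data.

    A single Gaussian-style box is an illustrative simplification; `z` is a
    hypothetical width parameter, not a value from the paper.
    Requires at least two samples (sample standard deviation).
    """
    columns = list(zip(*samples))
    return [(mean(c) - z * stdev(c), mean(c) + z * stdev(c)) for c in columns]


def is_target(x, box):
    """Return True when every feature of x lies inside its interval."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(x, box))
```

Each fitted box reads directly as a rule of the form "lower bound <= feature <= upper bound for every feature", which is the interpretability the abstract contrasts with SVM and neural network classifiers.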

Optimal Solution of Classification (Prediction) Problem

  • Mohammad S. Khrisat
    • International Journal of Computer Science & Network Security / Vol. 23, No. 9 / pp. 129-133 / 2023
  • The classification (prediction) problem is how to use specific features to obtain a predicted class. Wheat seed specifications for 3 classes of seeds are used in the prediction process. A multiple linear regression model is built, and its prediction error ratio is calculated. To improve the prediction ratio, an ANN model is built and trained. The results are examined to show how to make a prediction tool capable of computing a predicted class number very close to the target class number.

레이블 매핑을 이용한 다중 이미지 분류 (Multiple image classification using label mapping)

  • 전승제;이동준;이동휘
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2022 Spring Conference / pp. 367-369 / 2022
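  • In this paper, to examine precisely the results for images that a trained model failed to classify, we implemented multi-class image classification and checked the predictions by mapping labels to each class. We built and trained a CNN model on Kaggle's Intel Image Classification dataset and, for the images in the test dataset, used label mapping to compare the mapped label values of the multi-class images with the values the model predicted.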
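
Label mapping for inspecting failed predictions can be sketched as below. The six class names are those of the Intel Image Classification dataset; the index order and the helper name are assumptions of this sketch, and the index lists would come from a trained model that is not shown.

```python
# Class names of the Intel Image Classification dataset; the index order
# (alphabetical folder order) is an assumption of this sketch.
LABELS = {0: "buildings", 1: "forest", 2: "glacier", 3: "mountain", 4: "sea", 5: "street"}


def misclassified(true_idx, pred_idx, labels=LABELS):
    """Return (position, true label, predicted label) for each wrong prediction."""
    return [
        (i, labels[t], labels[p])
        for i, (t, p) in enumerate(zip(true_idx, pred_idx))
        if t != p
    ]
```

Reporting names rather than integer indices makes it easier to see which class pairs the model confuses.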
