• Title/Abstract/Keywords: One-class Classification

353 search results (processing time: 0.026 s)

The Prediction of DEA based Efficiency Rating for Venture Business Using Multi-class SVM

  • 박지영;홍태호
    • Asia pacific journal of information systems
    • /
    • Vol. 19, No. 2
    • /
    • pp.139-155
    • /
    • 2009
  • For the last few decades, many studies have tried to explore and unveil venture companies' success factors and unique features in order to identify the sources of such companies' competitive advantages over their rivals. Such venture companies have shown a tendency to give high returns to investors, generally by making the best use of information technology. For this reason, many venture companies are keen on attracting investors' attention. Investors generally make their investment decisions by carefully examining the evaluation criteria of the alternatives. To them, credit rating information provided by international rating agencies such as Standard and Poor's, Moody's, and Fitch is a crucial source on such pivotal concerns as companies' stability, growth, and risk status. But this type of information is generated only for companies issuing corporate bonds, not for venture companies. Therefore, this study proposes a method for evaluating venture businesses by presenting our recent empirical results using financial data of Korean venture companies listed on KOSDAQ in the Korea Exchange. In addition, this paper uses multi-class SVM to predict the DEA-based efficiency rating for venture businesses derived from our proposed method. Our approach sheds light on ways to locate efficient companies generating high levels of profit. Above all, in determining effective ways to evaluate a venture firm's efficiency, it is important to understand the major contributing factors of such efficiency. Therefore, this paper is built on the following two ideas for classifying which venture companies are more efficient: i) making a DEA-based multi-class rating for sample companies and ii) developing a multi-class SVM-based efficiency prediction model for classifying all companies.
First, Data Envelopment Analysis (DEA) is a non-parametric multiple input-output efficiency technique that measures the relative efficiency of decision making units (DMUs) using a linear programming based model. It is non-parametric because it requires no assumption on the shape or parameters of the underlying production function. DEA has already been widely applied for evaluating the relative efficiency of DMUs. Recently, a number of DEA-based studies have evaluated the efficiency of various types of companies, such as internet companies and venture companies. It has also been applied to corporate credit ratings. In this study we used DEA to sort venture companies into efficiency-based ratings. The Support Vector Machine (SVM), on the other hand, is a popular technique for solving data classification problems. In this paper, we employed SVM to classify the efficiency ratings of IT venture companies according to the results of DEA. The SVM method was first developed by Vapnik (1995). As one of many machine learning techniques, SVM is based on a statistical theory. Thus far, the method has shown good performance, especially in generalization capacity in classification tasks, resulting in numerous applications in many areas of business. SVM is basically an algorithm that finds the maximum margin hyperplane, which gives the maximum separation between classes. The support vectors are the training points closest to the maximum margin hyperplane. If the classes cannot be separated linearly, we can use a kernel function: in the case of nonlinear class boundaries, the original input space is mapped into a high-dimensional dot-product feature space. Many studies have applied SVM to the prediction of bankruptcy, the forecasting of financial time series, and the estimation of credit ratings. In this study we employed SVM to develop a data mining-based efficiency prediction model.
We used the Gaussian radial basis function as the kernel function of SVM. For multi-class SVM, we adopted the one-against-one approach among binary classification methods, and two all-together methods proposed by Weston and Watkins (1999) and Crammer and Singer (2000), respectively. In this research, we used corporate information on 154 companies listed on the KOSDAQ market in the Korea Exchange. We obtained the companies' financial information for 2005 from KIS (Korea Information Service, Inc.). Using these data, we made a multi-class rating with DEA efficiency and built a data mining-based multi-class prediction model. Among the three multi-classification methods, the Weston and Watkins method had the best hit ratio on the test data set. In multi-class problems such as efficiency ratings of venture businesses, it is very useful for investors to know the class within a one-class error when the exact class is difficult to determine in the actual market. We therefore present accuracy results within 1-class errors, and the Weston and Watkins method showed 85.7% accuracy on our test samples. We conclude that the DEA-based multi-class approach for venture businesses generates more information than binary classification, regardless of the efficiency level. We believe this model can help investors in decision making, as it provides a reliable tool for evaluating venture companies in the financial domain. For future research, we see the need to improve areas such as the variable selection process, the parameter selection of the kernel function, generalization, and the sample size for the multi-class setting.
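
The "accuracy within 1-class errors" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the efficiency ratings are encoded as consecutive integers; the function name `hit_ratio` and the sample ratings are hypothetical and are not taken from the paper.

```python
def hit_ratio(actual, predicted, tolerance=0):
    """Share of predictions within `tolerance` rating classes of the actual rating."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical ordinal ratings (illustration only, not data from the paper).
actual = [1, 2, 3, 4, 2, 3, 1, 4]
predicted = [1, 3, 3, 2, 2, 4, 1, 4]

exact = hit_ratio(actual, predicted)                    # exact-class hit ratio
within_one = hit_ratio(actual, predicted, tolerance=1)  # one-class difference allowed
print(exact, within_one)
```

With `tolerance=1`, a prediction one rating away from the true rating still counts as a hit, which is why this figure is never lower than the exact hit ratio.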

Topic Classification for Suicidology

  • Read, Jonathon;Velldal, Erik;Ovrelid, Lilja
    • Journal of Computing Science and Engineering
    • /
    • Vol. 6, No. 2
    • /
    • pp.143-150
    • /
    • 2012
  • Computational techniques for topic classification can support qualitative research by automatically applying labels in preparation for qualitative analyses. This paper presents an evaluation of supervised learning techniques applied to one such use case, namely, that of labeling emotions, instructions and information in suicide notes. We train a collection of one-versus-all binary support vector machine classifiers, using cost-sensitive learning to deal with class imbalance. The features investigated range from a simple bag-of-words and n-grams over stems, to information drawn from syntactic dependency analysis and WordNet synonym sets. The experimental results are complemented by an analysis of systematic errors in both the output of our system and the gold-standard annotations.
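
As a rough illustration of cost-sensitive weighting in a one-versus-all setup, the sketch below weights the positive and negative class of each binary problem inversely to its frequency. This heuristic is an assumption made for illustration; the abstract does not say how the costs were set, and the function name and label counts are hypothetical.

```python
def one_vs_all_weights(labels, target):
    """Inverse-frequency weights for the binary problem `target` versus the rest."""
    n = len(labels)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both the target class and the rest must be present")
    # Each side gets weight n / (2 * count), so the rarer side costs more to misclassify.
    return {"positive": n / (2 * n_pos), "negative": n / (2 * n_neg)}


# Hypothetical label counts (illustration only).
labels = ["emotion"] * 6 + ["instructions"] * 3 + ["information"] * 1
weights = one_vs_all_weights(labels, "information")
print(weights)
```

Such weights would typically be passed to the binary classifier as per-class misclassification costs.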

Construction of A Nonlinear Classification Algorithm Using Quadratic Functions

  • 김락상
    • 한국경영과학회지
    • /
    • Vol. 25, No. 4
    • /
    • pp.55-65
    • /
    • 2000
  • This paper presents a linear programming based algorithm for pattern classification. Pattern classification is considered critical in artificial intelligence and business applications. Previous methods employing linear programming have been aimed at two-group discrimination with one or more linear discriminant functions. Therefore, there are some limitations in applying available linear programming formulations directly to general multi-class classification problems. The algorithm proposed in this manuscript is based on quadratic or polynomial discriminant functions, which allow more flexibility in covering the class regions in the N-dimensional space. The proposed algorithm is compared experimentally with other competitive methods of pattern classification and is shown to be competitive enough for a general-purpose classifier.

Design of Nearest Prototype Classifier by using Differential Evolutionary Algorithm

  • 노석범;안태천
    • 한국지능시스템학회논문지
    • /
    • Vol. 21, No. 4
    • /
    • pp.487-492
    • /
    • 2011
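  • In this paper, we propose a method that applies a differential evolution algorithm to determine the positions of the prototypes in order to improve the performance of the Nearest Prototype Classifier, which has the simplest structure. The position vectors of the prototypes are determined by the differential evolution algorithm, and we also propose a class-label assignment algorithm that determines the class label of each prototype obtained in this way. To evaluate the proposed algorithm, we present comparisons with existing pattern classifiers.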
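
The decision rule of a nearest prototype classifier can be sketched as follows. This covers only the prediction step with Euclidean distance, which is an assumption; the differential evolution search for prototype positions and the paper's label-assignment algorithm are not reproduced. The prototype coordinates and labels are hypothetical.

```python
import math


def predict(prototypes, x):
    """Return the class label of the prototype closest to x (Euclidean distance)."""
    if not prototypes:
        raise ValueError("at least one prototype is required")
    position, label = min(prototypes, key=lambda proto: math.dist(proto[0], x))
    return label


# Hypothetical prototypes as (position vector, class label) pairs.
prototypes = [((0.0, 0.0), "A"), ((5.0, 5.0), "B"), ((0.0, 5.0), "A")]

print(predict(prototypes, (1.0, 1.0)))
print(predict(prototypes, (4.0, 4.5)))
```

An optimizer such as differential evolution would treat the prototype positions as the search variables and use the classification error of `predict` on training data as the objective.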

Physiological Responses-Based Emotion Recognition Using Multi-Class SVM with RBF Kernel

  • 마카라 완니;고광은;박승민;심귀보
    • 제어로봇시스템학회논문지
    • /
    • Vol. 19, No. 4
    • /
    • pp.364-371
    • /
    • 2013
  • Emotion recognition is an important part of developing human-human and human-computer interaction. In this paper, we focus on the performance of a multi-class SVM (Support Vector Machine) with a Gaussian RBF (Radial Basis Function) kernel, which is used to recognize emotions from physiological signals and to improve the accuracy of emotion recognition. In the experimental paradigm for data acquisition, visual stimuli from the IAPS (International Affective Picture System) are used to induce emotional states such as fear, disgust, joy, and neutral in each subject. To pre-process the data, the raw acquired signals are split into the trials of each session. The mean and standard deviation are then extracted as features for the subsequent classification step. The experimental results show that the proposed multi-class SVM with a Gaussian RBF kernel and the OVO (One-Versus-One) method classifies these four emotions successfully.
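
The mean and standard deviation features described in the abstract above could be computed per trial roughly as follows. The use of the population standard deviation, the function name, and the sample values are assumptions for illustration; the abstract does not specify them.

```python
from statistics import mean, pstdev


def trial_features(trial):
    """Summarize one trial of a physiological signal as [mean, standard deviation]."""
    if not trial:
        raise ValueError("a trial must contain at least one sample")
    return [mean(trial), pstdev(trial)]


# Hypothetical signal samples from one trial (illustration only).
trial = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
features = trial_features(trial)
print(features)
```

One such feature vector per trial, possibly concatenated across signal channels, would then be the input to the classifier.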

A Comparison of Classification Techniques in Hyperspectral Image

  • 가칠오;김대성;변영기;김용일
    • 한국측량학회: Conference Proceedings
    • /
    • 한국측량학회 2004 Autumn Conference Proceedings
    • /
    • pp.251-256
    • /
    • 2004
  • Image classification is one of the most important areas of study in remote sensing. In general, MLC (Maximum Likelihood Classification), which takes the distribution of the training information into account, is the most effective method, but it produces poor results when the same technique is applied to actual hyperspectral images. The purpose of this research is to determine which classification algorithm is the most effective and suitable for hyperspectral image classification. To this end, we apply the MLC algorithm, which uses distribution information, and the SAM (Spectral Angle Mapper) and SFF (Spectral Feature Fitting) algorithms, which use the average information of the training classes, to both a multispectral image and a hyperspectral image. Quantitative and visual analysis using confusion matrices confirms that the SAM and SFF algorithms, which use spectral patterns in the vector domain, are more effective for hyperspectral image classification than MLC, which considers the distribution.
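
A minimal sketch of the Spectral Angle Mapper idea: each pixel spectrum is treated as a vector and assigned to the class whose mean spectrum makes the smallest angle with it. The class names and spectra below are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero vectors")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cosine)


def classify(pixel, class_means):
    """Assign the pixel to the class whose mean spectrum has the smallest angle."""
    return min(class_means, key=lambda name: spectral_angle(pixel, class_means[name]))


# Hypothetical three-band mean spectra (illustration only).
class_means = {"water": [0.3, 0.2, 0.1], "vegetation": [0.1, 0.2, 0.35]}
print(classify([0.2, 0.4, 0.6], class_means))
```

Because only the angle matters, scaling a spectrum by a constant (for example, a brightness change) leaves the result unchanged.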

Vibration Anomaly Detection of One-Class Classification using Multi-Column AutoEncoder

  • Sang-Min, Kim;Jung-Mo, Sohn
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 28, No. 2
    • /
    • pp.9-17
    • /
    • 2023
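  • In this paper, we propose a one-class classification vibration anomaly detection system for bearing fault diagnosis. An accurate fault diagnosis system is essential for reducing the economic and time losses caused by bearing failures, and deep learning-based fault diagnosis systems are being widely studied to address the problem. However, abnormal data are difficult to obtain in real data collection environments for deep learning training, which leads to data bias. We therefore use a one-class classification method that relies on normal data only. The usual approach learns the compression and reconstruction process with an AutoEncoder to extract the features of the vibration data, and then trains a one-class classifier on the extracted features to perform anomaly detection. However, this approach does not consider the frequency characteristics of vibration data and therefore cannot extract their features efficiently. To solve this problem, we propose an AutoEncoder model that takes the frequency characteristics of vibration data into account. The classification performance was accuracy 0.910, precision 1.0, recall 0.820, and f1-score 0.901. The network design that considers frequency characteristics showed better performance than existing methods.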
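
The four scores reported in the abstract are mutually consistent: the F1-score is the harmonic mean of precision and recall, 2 × 1.0 × 0.820 / (1.0 + 0.820) ≈ 0.901. The sketch below reproduces all four from one confusion matrix. The counts are hypothetical, chosen only because they match the reported figures; the abstract does not give the actual counts or say which class is treated as positive.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a balanced test set of 200 samples (not from the paper).
metrics = binary_metrics(tp=82, fp=0, fn=18, tn=100)
print({name: round(value, 3) for name, value in metrics.items()})
```

A precision of 1.0 together with a recall of 0.820 means no false positives but some missed positives under this reading.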

An Implementation of Pan-So-Ri Classification Program Using Naive Bayesian Classifier

  • 김원종;이강복;김명관
    • 한국인터넷방송통신학회논문지
    • /
    • Vol. 11, No. 3
    • /
    • pp.153-159
    • /
    • 2011
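  • Pansori is one of Korea's traditional music forms, in which a story is sung, and it is divided into two schools (Dongpyeonje and Seopyeonje). People without knowledge of pansori find it difficult to tell the two schools apart by listening. This paper describes the implementation of a pansori classification program that uses PCD (Pitch Class Distribution) and a naive Bayesian classifier. The occurrence frequency of each scale tone was used as the attribute values for the classifier. The experiment was run twice, with the probability values rounded at different positions; the better of the two results correctly classified Dongpyeonje with 80% accuracy and Seopyeonje with 97%, for 88% overall. This result was applied to the implemented program.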
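
A pitch class distribution can be sketched as the relative frequency of each of the 12 pitch classes in a note sequence. The MIDI note-number representation, the function name, and the sample notes are assumptions for illustration; the abstract does not describe how pitches were encoded.

```python
from collections import Counter


def pitch_class_distribution(midi_notes):
    """Relative frequency of each of the 12 pitch classes (0 = C, ..., 11 = B)."""
    if not midi_notes:
        raise ValueError("at least one note is required")
    counts = Counter(note % 12 for note in midi_notes)
    total = len(midi_notes)
    return [counts.get(pitch_class, 0) / total for pitch_class in range(12)]


# Hypothetical melody as MIDI note numbers (illustration only).
notes = [60, 62, 64, 60, 72, 67, 67, 64]
pcd = pitch_class_distribution(notes)
print(pcd)
```

The resulting 12-element vector is the kind of attribute set a naive Bayesian classifier could take as input.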

Pharmaceutical Usefulness of Biopharmaceutics Classification System: Overview and New Trend

  • Youn, Yu-Seok;Lee, Ju-Ho;Jeong, Seong-Hoon;Shin, Beom-Soo;Park, Eun-Seok
    • Journal of Pharmaceutical Investigation
    • /
    • Vol. 40, Special Issue
    • /
    • pp.1-7
    • /
    • 2010
  • Since the introduction of the biopharmaceutics classification system (BCS) in 1995, it has been viewed as an effective tool for categorizing drugs in terms of predicting bioavailability (BA) and bioequivalence (BE). The BCS consists of four drug categories: class I (highly soluble and highly permeable), class II (low soluble and highly permeable), class III (highly soluble and low permeable) and class IV (low soluble and low permeable), and almost all drugs belong to one of these categories. Classifying drugs into four categories according to their solubility and permeability is simple and relatively uncontroversial, and thus the FDA adopted the BCS as a science-based approach in establishing a series of regulatory guidance documents for the industry. Indeed, many pharmaceutical companies have gained substantial benefits, which translate directly into reduced costs and fewer failures in the early stage of drug development. Recently, using dissolution characteristics (e.g. intrinsic dissolution rate) instead of solubility has improved the classification by correlating more closely with in vivo drug dissolution than solubility by itself. Furthermore, a newly modified version of the BCS, the biopharmaceutics drug disposition classification system (BDDCS), which classifies drugs into four categories according to solubility and metabolism, has been introduced and has gained much attention as a new insight into drug classification. This report gives a brief overview of the BCS and its implications, and also introduces the recent trend in drug classification.
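
The four BCS categories listed in the abstract can be written as a simple lookup on two yes/no properties. The sketch takes the "highly soluble" and "highly permeable" decisions as given booleans, since the abstract does not state the thresholds behind them; the function name is hypothetical.

```python
def bcs_class(highly_soluble, highly_permeable):
    """Map solubility and permeability flags to the BCS class (I to IV)."""
    if highly_soluble and highly_permeable:
        return "I"
    if highly_permeable:
        return "II"   # low solubility, high permeability
    if highly_soluble:
        return "III"  # high solubility, low permeability
    return "IV"       # low solubility, low permeability


print(bcs_class(highly_soluble=False, highly_permeable=True))
```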

SUPPORT Applications for Classification Trees

  • Lee, Sang-Bock;Park, Sun-Young
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 15, No. 3
    • /
    • pp.565-574
    • /
    • 2004
  • Classification tree algorithms, such as CART by Breiman et al. (1984), recursively partition the data space with the aim of making the distribution of the class variable as pure as possible within each partition, and consist of several steps. In the SUPPORT (smoothed and unsmoothed piecewise-polynomial regression trees) method of Chaudhuri et al. (1994), a weighted averaging technique is used to combine piecewise polynomial fits into a smooth one. We focus on applying SUPPORT to a binary class variable. A logistic model is considered in the calculation techniques, and the results show good classification rates compared with other methods such as CART, QUEST, and CHAID.
