• Title/Summary/Keyword: ECOC

Comparison Study of Multi-class Classification Methods

  • Bae, Wha-Soo;Jeon, Gab-Dong;Seok, Kyung-Ha
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.2
    • /
    • pp.377-388
    • /
    • 2007
  • Among multi-class classification methods, the ECOC (Error Correcting Output Coding) method is known to have a low classification error rate. This paper aims to suggest an effective multi-class classification method (1) by comparing various encoding and decoding methods within ECOC and (2) by comparing ECOC with direct classification. Both SVM (Support Vector Machine) and logistic regression models were used as binary classifiers in the comparison.
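
As a rough illustration of the encoding and decoding steps mentioned in this abstract, the sketch below uses a small exhaustive code matrix for four classes and Hamming-distance decoding. The class labels, the matrix, and the function names are assumptions chosen for illustration; the abstract does not say which encoding or decoding methods the paper compares.

```python
# Illustrative 4-class code matrix (assumed, not taken from the paper).
# Each row is a class codeword; each column defines one binary problem.
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits, code=CODE_MATRIX):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(code, key=lambda label: hamming(code[label], bits))


def min_row_distance(code=CODE_MATRIX):
    """Smallest Hamming distance between any two class codewords."""
    labels = list(code)
    return min(
        hamming(code[p], code[q])
        for i, p in enumerate(labels)
        for q in labels[i + 1:]
    )


# Outputs of the 7 binary classifiers: codeword "C" with one bit flipped.
noisy = (0, 0, 1, 1, 0, 1, 1)
print(decode(noisy))
print(min_row_distance())
```

With a minimum row distance of d, Hamming decoding tolerates up to floor((d - 1) / 2) wrong binary outputs, which is the sense in which the code is "error correcting".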

Comparison of Various Criteria for Designing ECOC

  • Seok, Kyeong-Ha;Lee, Seung-Chul;Jeon, Gab-Dong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.2
    • /
    • pp.437-447
    • /
    • 2006
  • Error Correcting Output Coding (ECOC) is used to solve multi-class problems and is known to improve classification accuracy. In this paper, we compare various criteria for designing the code matrix during encoding. In addition, we propose an ensemble that uses the ability of each classifier during decoding. We investigate the justification of the proposed method using real and synthetic data.

Data-Adaptive ECOC for Multicategory Classification

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.19 no.1
    • /
    • pp.25-36
    • /
    • 2008
  • Error Correcting Output Codes (ECOC) can improve generalization performance when applied to multicategory classification problems. In this study we propose a new criterion for selecting the hyperparameters of the ECOC scheme. Instead of the margins of the data, we propose to use the probability of misclassification error, since it makes the criterion simple. Using this we obtain an upper bound on the leave-one-out error of the OVA (one vs all) method. Our experiments on real and synthetic data indicate that the bound leads to good estimates of the parameters.

Hyperparameter Selection for APC-ECOC

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.19 no.4
    • /
    • pp.1219-1231
    • /
    • 2008
  • The main objective of this paper is to develop a leave-one-out (LOO) bound for all pairwise comparison error correcting output codes (APC-ECOC). To avoid using classifiers whose corresponding target values are 0 in APC-ECOC, and to avoid requiring pilot estimates, we developed a bound based on the mean misclassification probability (MMP). It can be used to tune kernel hyperparameters. Our empirical experiment using the kernel mean squared estimate (KMSE) as the binary classifier indicates that the bound leads to good estimates of kernel hyperparameters.

Solving Multi-class Problem using Support Vector Machines (Support Vector Machines을 이용한 다중 클래스 문제 해결)

  • Ko, Jae-Pil
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.12
    • /
    • pp.1260-1270
    • /
    • 2005
  • Support Vector Machines (SVM) are well known as a representative learner among kernel methods. SVM, which is based on statistical learning theory, shows good generalization performance and has been applied to various pattern recognition problems. However, SVM is fundamentally designed for two-class classification, so a multi-class problem cannot be solved directly with a binary SVM. One-Per-Class (OPC) and All-Pairs have been applied with SVM to face recognition, which is a multi-class problem. Both are output coding methods, a general approach to solving multi-class problems with multiple binary classifiers: a complex multi-class problem is decomposed into a set of binary problems, and the outputs of the binary classifiers are then combined to reconstruct the multi-class decision. In this paper, we introduce output coding methods as an approach for extending binary SVM to multi-class SVM and propose new output coding schemes based on Error-Correcting Output Codes (ECOC), the dominant theoretical foundation of output coding methods. From experiments on face recognition, we give empirical results on the properties of output coding methods, including our proposed ones.
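
The two decompositions named in this abstract can be written as code matrices. The sketch below is a minimal illustration under the usual conventions (+1 and -1 mark the positive and negative class of a binary problem, 0 marks a class that the binary problem ignores); the function names are hypothetical and the paper's own proposed coding schemes are not reproduced here.

```python
from itertools import combinations


def one_per_class(k):
    """K x K matrix: column c separates class c (+1) from all others (-1)."""
    return [[1 if r == c else -1 for c in range(k)] for r in range(k)]


def all_pairs(k):
    """K x K(K-1)/2 matrix: one column per unordered class pair (i, j).

    Class i is the positive class (+1), class j the negative class (-1),
    and every other class is left out of that binary problem (0).
    """
    pairs = list(combinations(range(k), 2))
    return [
        [1 if r == i else -1 if r == j else 0 for (i, j) in pairs]
        for r in range(k)
    ]


for row in one_per_class(3):
    print(row)
for row in all_pairs(3):
    print(row)
```

One-Per-Class trains K binary classifiers on all the data, while All-Pairs trains K(K-1)/2 classifiers, each on the data of two classes only.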

Support vector ensemble for incipient fault diagnosis in nuclear plant components

  • Ayodeji, Abiodun;Liu, Yong-kuo
    • Nuclear Engineering and Technology
    • /
    • v.50 no.8
    • /
    • pp.1306-1313
    • /
    • 2018
  • The randomness and incipient nature of certain faults in reactor systems warrant a robust and dynamic detection mechanism. Existing models and methods for fault diagnosis using different mathematical/statistical inferences lack the capability to detect incipient and novel faults. To this end, we propose a fault diagnosis method that utilizes the flexibility of the data-driven Support Vector Machine (SVM) for component-level fault diagnosis. The technique integrates separately-built, separately-trained, specialized SVM modules capable of component-level fault diagnosis into a coherent intelligent system, with each SVM module monitoring sub-units of the reactor coolant system. To evaluate the model, marginal faults selected from the failure mode and effect analysis (FMEA) are simulated in the steam generator and pressure boundary of the Chinese CNP300 PWR (Qinshan I NPP) reactor coolant system, using a best-estimate thermal-hydraulic code, RELAP5/SCDAP Mod4.0. A multiclass SVM model is trained with component-level parameters that represent the steady state and selected faults in the components. For optimization purposes, we considered and compared the performance of different multiclass models in MATLAB, using different coding matrices as well as different kernel functions, on representative data derived from the simulation of Qinshan I NPP. An optimum predictive model - the Error Correcting Output Code (ECOC) with a TernaryComplete coding matrix - was obtained from experiments and utilized to diagnose the incipient faults. Some of the important diagnostic results and heuristic model evaluation methods are presented in this paper.
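
For readers unfamiliar with the ternary complete design named in this abstract, the sketch below enumerates it under the usual definition: every column over {+1, 0, -1} that contains at least one +1 and at least one -1, counted once per sign-flipped pair. This is an illustrative enumeration, not MATLAB's implementation, and the column order is an arbitrary choice made here.

```python
from itertools import product


def ternary_complete(k):
    """Return a K-row ternary complete coding matrix as a list of rows.

    A column assigns each class to the positive group (+1), the negative
    group (-1), or leaves it out (0). A column and its negation define the
    same binary problem, so only the one whose first nonzero entry is +1
    is kept.
    """
    columns = []
    for col in product((1, 0, -1), repeat=k):
        if 1 not in col or -1 not in col:
            continue  # a binary problem needs both a positive and a negative class
        first_nonzero = next(v for v in col if v != 0)
        if first_nonzero == 1:
            columns.append(col)
    # Transpose so that rows correspond to classes.
    return [list(row) for row in zip(*columns)]


matrix = ternary_complete(3)
for row in matrix:
    print(row)
print(len(matrix[0]))  # number of binary learners
```

The number of columns is (3^K - 2^(K+1) + 1) / 2, so the design grows quickly with the number of classes K.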

Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae
    • Asia pacific journal of information systems
    • /
    • v.19 no.2
    • /
    • pp.157-178
    • /
    • 2009
  • Corporate credit rating is a very important factor in the market for corporate debt. Information concerning corporate operations is often disseminated to market participants through the changes in credit ratings that are published by professional rating agencies, such as Standard and Poor's (S&P) and Moody's Investor Service. Since these agencies generally require a large fee for the service, and the periodically provided ratings sometimes do not reflect the default risk of the company at the time, it may be advantageous for bond-market participants to be able to classify credit ratings before the agencies actually publish them. As a result, it is very important for companies (especially, financial companies) to develop a proper model of credit rating. From a technical perspective, the credit rating constitutes a typical, multiclass, classification problem because rating agencies generally have ten or more categories of ratings. For example, S&P's ratings range from AAA for the highest-quality bonds to D for the lowest-quality bonds. The professional rating agencies emphasize the importance of analysts' subjective judgments in the determination of credit ratings. However, in practice, a mathematical model that uses the financial variables of companies plays an important role in determining credit ratings, since it is convenient to apply and cost efficient. These financial variables include the ratios that represent a company's leverage status, liquidity status, and profitability status. Several statistical and artificial intelligence (AI) techniques have been applied as tools for predicting credit ratings. Among them, artificial neural networks are most prevalent in the area of finance because of their broad applicability to many business problems and their preeminent ability to adapt. 
However, artificial neural networks also have many defects, including the difficulty in determining the values of the control parameters and the number of processing elements in the layer as well as the risk of over-fitting. Of late, because of their robustness and high accuracy, support vector machines (SVMs) have become popular as a solution for problems requiring accurate prediction. An SVM's solution may be globally optimal because SVMs seek to minimize structural risk. On the other hand, artificial neural network models may tend to find locally optimal solutions because they seek to minimize empirical risk. In addition, no parameters need to be tuned in SVMs, barring the upper bound for non-separable cases in linear SVMs. However, since SVMs were originally devised for binary classification, they are not intrinsically geared for multiclass classification, as in credit ratings. Thus, researchers have tried to extend the original SVM to multiclass classification. Hitherto, a variety of techniques to extend standard SVMs to multiclass SVMs (MSVMs) have been proposed in the literature. Only a few types of MSVM, however, have been tested in prior studies that apply MSVMs to credit rating. In this study, we examined six different techniques of MSVMs: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) Method of Weston and Watkins, and (6) Method of Crammer and Singer. In addition, we examined the prediction accuracy of some modified versions of conventional MSVM techniques. To find the most appropriate technique of MSVMs for corporate bond rating, we applied all the techniques of MSVMs to a real-world case of credit rating in Korea. The best application is in corporate bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. For our study the research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea. 
The data set is comprised of the bond-ratings for the year 2002 and various financial variables for 1,295 companies from the manufacturing industry in Korea. We compared the results of these techniques with one another, and with those of traditional methods for credit ratings, such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). As a result, we found that DAGSVM with an ordered list was the best approach for the prediction of bond rating. In addition, we found that the modified version of ECOC approach can yield higher prediction accuracy for the cases showing clear patterns.
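
To make the DAGSVM technique listed in this abstract more concrete, the sketch below shows the standard decision-DAG evaluation over pairwise classifiers: compare the first and last class in a list, drop the loser, and repeat until one class remains. The rating labels, the toy pairwise rule, and the function names are hypothetical stand-ins for trained binary SVMs; the abstract does not describe how the paper orders its list.

```python
def dag_predict(classes, prefers):
    """Evaluate a decision DAG over pairwise classifiers.

    `classes` is the ordered candidate list. `prefers(a, b)` returns
    whichever of the two classes the pairwise classifier favours.
    Each step eliminates one class, so K classes need K - 1 evaluations.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if prefers(first, last) == first:
            remaining.pop()       # the last class is eliminated
        else:
            remaining.pop(0)      # the first class is eliminated
    return remaining[0]


# Hypothetical ordered rating list and a toy pairwise rule (assumptions).
RATINGS = ["AAA", "AA", "A", "BBB"]


def toy_pairwise(score):
    """Build a stand-in pairwise classifier from a latent score."""
    def prefers(a, b):
        da = abs(RATINGS.index(a) - score)
        db = abs(RATINGS.index(b) - score)
        return a if da <= db else b
    return prefers


print(dag_predict(RATINGS, toy_pairwise(1.2)))
```

Because only K - 1 of the K(K-1)/2 pairwise classifiers are evaluated per sample, prediction is cheaper than full One-Against-One voting, at the cost of depending on the class order.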

Analytical Methods of Levoglucosan, a Tracer for Cellulose in Biomass Burning, by Four Different Techniques

  • Bae, Min-Suk;Lee, Ji-Yi;Kim, Yong-Pyo;Oak, Min-Ho;Shin, Ju-Seon;Lee, Kwang-Yul;Lee, Hyun-Hee;Lee, Sun-Young;Kim, Young-Joon
    • Asian Journal of Atmospheric Environment
    • /
    • v.6 no.1
    • /
    • pp.53-66
    • /
    • 2012
  • A comparison of analytical approaches for Levoglucosan ($C_6H_{10}O_5$, commonly formed from the pyrolysis of carbohydrates such as cellulose), which is used as a molecular marker in biomass burning, is made among four different analytical systems: 1) a Spectrothermography technique for the evaluation of thermograms of carbon using an Elemental Carbon & Organic Carbon Analyzer, 2) a mass spectrometry technique using a Gas Chromatography/mass spectrometer (GC/MS), 3) an Aerosol Mass Spectrometer (AMS) for the identification of the particle size distribution and chemical composition, and 4) two dimensional Gas Chromatography with Time of Flight mass spectrometry (GC${\times}$GC-TOFMS) for defining the signature of Levoglucosan in terms of the chemical analytical process. First, Spectrothermography, which is defined as the graphical representation of the carbon, can be measured as a function of temperature during the thermal separation process and spectrothermographic analysis. GC/MS can detect mass fragment ions of Levoglucosan characterized by its base peak at m/z 60, 73 in mass fragment-grams by methylation and m/z 217, 204 by trimethylsilyl derivatives (TMS-derivatives). AMS can be used to analyze the base peak at m/z 60.021, 73.029 in mass fragment-grams with a multiple-peak Gaussian curve fit algorithm. In the analysis of TMS derivatives, GC${\times}$GC-TOFMS can detect m/z 73 as the base ion for the identification of Levoglucosan. It can also observe m/z 217 and 204 together with m/z 333. Although the ratios of m/z 217 and m/z 204 to the base ion (m/z 73) in the mass spectrum of GC${\times}$GC-TOFMS are lower than those of GC/MS, Levoglucosan can be separated and characterized from D (-) +Ribose in the mixture of sugar compounds. 
Finally, the environmental significance of Levoglucosan is discussed with respect to its health effects, offering important opportunities for clinical and potential epidemiological research aimed at reducing the incidence of cardiovascular and respiratory diseases.

A Study on the Direction of Cultural City Designation Project in the Case of European Capitals of Culture (유럽문화수도 사례로 본 문화도시 지정사업의 방향성 고찰)

  • Kim, Sun Young;Yi, Eui Shin
    • Korean Association of Arts Management
    • /
    • no.52
    • /
    • pp.135-156
    • /
    • 2019
  • The purpose of this study is to derive more practical and concrete policy implications for the successful implementation of the Cultural City Designation Project, which is emerging as a main topic of cultural policy. To this end, the background and implementation system of the European Capitals of Culture (ECOC), which is the subject of benchmarking in various aspects, were examined. As a result, it was confirmed that the Cultural City Designation Project may reveal limitations in its background and process, and the suggested improvements are as follows. First, rather than creating an ideal cultural city model to achieve its goals in a short period of time, efforts should be made to secure diversity and expand insufficient infrastructure in accordance with local autonomous decisions. Second, in order to secure the continuity of the project, it is necessary to secure and educate professional manpower for organizational operation in the form of an independent or direct agency of each local government. Finally, careful policy consideration should be made at the national level to balance regional interests. Therefore, there is a need for an organized 'government-level organization' that can take on the role of the city selection process, support system, and ex post evaluation. In short, successful cultural city projects require critical acceptance and efforts to remedy fundamental problems rather than unconditionally benchmarking overseas cases in terms of cultural policy.