• Title/Summary/Keyword: Classification Problem

Corporate Credit Rating using Partitioned Neural Network and Case-Based Reasoning (신경망 분리모형과 사례기반추론을 이용한 기업 신용 평가)

  • Kim, David;Han, In-Goo;Min, Sung-Hwan
    • Journal of Information Technology Applications and Management / v.14 no.2 / pp.151-168 / 2007
  • A corporate credit rating is an assessment of the relative level of risk associated with the timely payments required by a debt obligation. In this study, the corporate credit rating model employs artificial intelligence methods, including neural networks (NN) and case-based reasoning (CBR). We first propose three partitioned neural network classification models, each of which converts a multi-group classification problem into two-group classification problems: the Ordinal Pairwise Partitioning (OPP) model, a binary classification model, and a simple classification model. The experimental results show that the partitioned NN outperformed the conventional NN. In addition, we apply CBR, which has recently been widely used as a problem-solving and learning tool in both academia and business. Besides being easier to design than an NN model, the CBR model shows good classification capability, achieving the highest hit ratio in corporate credit rating.
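
The idea of turning a multi-group problem into two-group problems can be illustrated with a small sketch. It is a generic ordinal threshold decomposition, not the paper's OPP model: for K ordered ratings it builds K-1 binary targets of the form "rating > k" and counts the positive answers. The one-dimensional feature, the stump learner, and all function names are assumptions made for illustration.

```python
from typing import List, Sequence


def make_binary_targets(labels: Sequence[int], num_classes: int) -> List[List[int]]:
    """For each threshold k = 1..K-1, build 0/1 targets answering 'label > k?'."""
    return [[1 if y > k else 0 for y in labels] for k in range(1, num_classes)]


def fit_stump(xs: Sequence[float], targets: Sequence[int]) -> float:
    """Hypothetical two-group learner: choose the cut with the fewest training
    errors, where the stump predicts 1 when x > cut."""
    values = sorted(set(xs))
    candidates = [values[0] - 1.0] + values  # first candidate predicts 1 everywhere
    best_cut, best_err = candidates[0], len(xs) + 1
    for cut in candidates:
        err = sum((1 if x > cut else 0) != t for x, t in zip(xs, targets))
        if err < best_err:
            best_cut, best_err = cut, err
    return best_cut


def fit_ordinal(xs: Sequence[float], labels: Sequence[int], num_classes: int) -> List[float]:
    """Train one two-group model per threshold."""
    return [fit_stump(xs, t) for t in make_binary_targets(labels, num_classes)]


def predict_rating(x: float, cuts: Sequence[float]) -> int:
    """Combine the binary answers: rating = 1 + number of thresholds exceeded."""
    return 1 + sum(1 for cut in cuts if x > cut)


# Toy data: one made-up score per firm, ratings 1 (lowest) to 3 (highest).
scores = [0.1, 0.2, 1.1, 1.3, 2.2, 2.5]
ratings = [1, 1, 2, 2, 3, 3]
cuts = fit_ordinal(scores, ratings, num_classes=3)
```

Any two-group classifier, such as a small neural network, could replace the stump; only the way the binary answers are combined depends on the ordering of the classes.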

Fast Pattern Classification with the Multi-layer Cellular Nonlinear Networks (CNN) (다층 셀룰라 비선형 회로망(CNN)을 이용한 고속 패턴 분류)

  • 오태완;이혜정;손홍락;김형석
    • The Transactions of the Korean Institute of Electrical Engineers D / v.52 no.9 / pp.540-546 / 2003
  • A fast pattern classification algorithm based on dynamic programming with Cellular Nonlinear Networks (CNN) is proposed. The CNN is an analog parallel processing architecture, and dynamic programming is an efficient computational algorithm for optimization problems. Combining the merits of these two technologies yields fast pattern classification with optimization. In CNN-based dynamic programming, if exemplars and test patterns are presented as the goals and the start positions, respectively, the optimal paths from the test patterns to their closest exemplars are found. These paths are used as aggregating keys for the classification. The algorithm resembles conventional neural network-based methods in its use of exemplar patterns but differs in its use of the most-likely-path finding of dynamic programming. Pattern classification performs well regardless of the degree of nonlinearity of the class borders.
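
The step of finding the closest exemplar by path search can be sketched sequentially. The code below is a software stand-in, not the analog CNN circuit: it assumes patterns are mapped to cells of a 4-connected grid with unit step cost, and it spreads class labels outward from the exemplars in a breadth-first wavefront, which is dynamic programming on path length. The grid encoding and the function name are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def label_by_nearest_exemplar(
    rows: int, cols: int, exemplars: Dict[Cell, str]
) -> List[List[Optional[str]]]:
    """Assign every grid cell the class of the exemplar reachable by the
    shortest 4-connected path (ties go to the exemplar reached first)."""
    labels: List[List[Optional[str]]] = [[None] * cols for _ in range(rows)]
    queue = deque()
    for (r, c), cls in exemplars.items():
        labels[r][c] = cls  # exemplars act as the goals of the search
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] is None:
                labels[nr][nc] = labels[r][c]  # wavefront carries the class label
                queue.append((nr, nc))
    return labels


# Two made-up exemplars on a 3 x 5 grid; a test pattern is classified by
# looking up the label stored at its own cell.
grid = label_by_nearest_exemplar(3, 5, {(1, 0): "A", (1, 4): "B"})
```

Because every cell is labeled in one pass, classifying a test pattern afterwards is a single lookup, whatever the shape of the resulting class borders.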

Fuzzy Classification Rule Learning by Decision Tree Induction

  • Lee, Keon-Myung;Kim, Hak-Joon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.3 no.1 / pp.44-51 / 2003
  • Knowledge acquisition is a bottleneck in implementing knowledge-based systems. Decision tree induction is a useful machine learning approach for extracting classification knowledge from a set of training examples. Much real-world data contains fuzziness due to observation error, uncertainty, subjective judgement, and so on. To cope with this problem, there has been some work on fuzzy classification rule learning. This paper surveys the kinds of fuzzy classification rules. In addition, it presents a fuzzy classification rule learning method based on decision tree induction and shows some experimental results for the method.

An Optimal Weighting Method in Supervised Learning of Linguistic Model for Text Classification

  • Mikawa, Kenta;Ishida, Takashi;Goto, Masayuki
    • Industrial Engineering and Management Systems / v.11 no.1 / pp.87-93 / 2012
  • This paper discusses a new weighting method for text analysis from the viewpoint of supervised learning. The term frequency-inverse document frequency (tf-idf) measure is a well-known weighting method for information retrieval, and it can also be used for text analysis. However, it is an empirical weighting method for information retrieval whose effectiveness has not been established theoretically. Therefore, a more effective weighting measure may be obtainable for document classification problems. In this study, we propose an optimal weighting method for document classification problems from the viewpoint of supervised learning. Because it makes use of the training data, the proposed measure is better suited to the text classification problem than the tf-idf measure. The effectiveness of our proposal is demonstrated by simulation experiments on text classification problems involving newspaper articles and customer reviews posted on a web site.
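
For reference, the tf-idf baseline mentioned in the abstract can be sketched as follows. This is one common variant (relative term frequency multiplied by the logarithm of the inverse document frequency); the abstract does not say which variant the authors compare against, and the paper's proposed supervised weighting is not reproduced here. The function name and the toy documents are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def tf_idf(docs: Sequence[Sequence[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per tokenized document, with
    weight = (count / document length) * log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term at most once per document
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Made-up, already tokenized documents.
docs = [["stock", "price", "rise"], ["stock", "market"], ["film", "review"]]
weights = tf_idf(docs)
```

Note that the weights depend only on term counts, never on class labels, which is the gap a supervised weighting method aims to close.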

Design of One-Class Classifier Using Hyper-Rectangles (Hyper-Rectangles를 이용한 단일 분류기 설계)

  • Jeong, In Kyo;Choi, Jin Young
    • Journal of Korean Institute of Industrial Engineers / v.41 no.5 / pp.439-446 / 2015
  • The importance of the one-class classification problem has recently been increasing. However, most existing algorithms are limited in the information they provide about what affects the prediction of the target value. Motivated by this observation, we propose an efficient one-class classifier using hyper-rectangles (H-RTGLs) that can be produced from intervals containing observations. Specifically, we generate intervals for each feature and integrate them. To generate intervals, we consider two approaches: (i) interval merging and (ii) clustering. We evaluate the performance of the proposed methods by computing classification accuracy using the area under the ROC curve and compare them with other one-class classification algorithms on four datasets from the UCI repository. Since the H-RTGLs constructed for a given data set make the classification factors visible, we can discern which features affect the classification result and extract patterns inherent in the data set.
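
A minimal sketch of the interval-merging idea follows. It assumes a simple rule that is not taken from the paper: sorted values of a feature are merged into one interval while consecutive gaps stay within a hypothetical `max_gap`, and a point is accepted as the target class when every feature falls inside one of that feature's intervals. The paper's actual merging and integration rules may differ.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def merge_intervals(values: Sequence[float], max_gap: float) -> List[Interval]:
    """Group sorted values into intervals, starting a new interval whenever
    the gap to the previous value exceeds max_gap."""
    ordered = sorted(values)
    intervals = [[ordered[0], ordered[0]]]
    for v in ordered[1:]:
        if v - intervals[-1][1] <= max_gap:
            intervals[-1][1] = v  # extend the current interval
        else:
            intervals.append([v, v])  # open a new interval
    return [(lo, hi) for lo, hi in intervals]


def fit(samples: Sequence[Sequence[float]], max_gap: float) -> List[List[Interval]]:
    """Build the interval list for each feature (column) of the target class."""
    return [merge_intervals(column, max_gap) for column in zip(*samples)]


def is_target(model: List[List[Interval]], x: Sequence[float]) -> bool:
    """Accept x only if each coordinate lies in some interval of its feature."""
    return all(
        any(lo <= xi <= hi for lo, hi in intervals)
        for intervals, xi in zip(model, x)
    )


# Made-up target-class observations with two features.
train = [(1.0, 10.0), (1.2, 10.5), (1.4, 11.0), (5.0, 10.2)]
model = fit(train, max_gap=0.6)
```

The fitted model is just a list of intervals per feature, so a rejected point can be explained by naming the feature whose value fell outside every interval.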

Online Selective-Sample Learning of Hidden Markov Models for Sequence Classification

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.15 no.3 / pp.145-152 / 2015
  • We consider an online selective-sample learning problem for sequence classification, where the goal is to learn a predictive model using a stream of data samples whose class labels can be selectively queried by the algorithm. Given that there is a limit to the total number of queries permitted, the key issue is choosing the most informative and salient samples for their class labels to be queried. Recently, several aggressive selective-sample algorithms have been proposed under a linear model for static (non-sequential) binary classification. We extend the idea to hidden Markov models for multi-class sequence classification by introducing reasonable measures for the novelty and prediction confidence of the incoming sample with respect to the current model, on which the query decision is based. For several sequence classification datasets/tasks in online learning setups, we demonstrate the effectiveness of the proposed approach.
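
The query decision under a label budget can be sketched generically. The code assumes that the current model yields one score per class for each incoming sample and that prediction confidence is measured by the margin between the two best scores; the novelty measure and the hidden Markov model itself are omitted. The `threshold` and `budget` parameters and the function names are hypothetical.

```python
from typing import List, Sequence


def margin(class_scores: Sequence[float]) -> float:
    """Confidence proxy: gap between the best and second-best class scores."""
    best, second = sorted(class_scores, reverse=True)[:2]
    return best - second


def select_queries(
    stream: Sequence[Sequence[float]], threshold: float, budget: int
) -> List[int]:
    """Walk the stream once and return the indices whose labels are queried:
    low-margin samples, until the query budget is used up."""
    queried: List[int] = []
    for index, scores in enumerate(stream):
        if len(queried) >= budget:
            break  # no queries left
        if margin(scores) < threshold:
            queried.append(index)
    return queried


# Made-up per-class scores for five incoming samples.
stream = [[0.9, 0.1], [0.55, 0.45], [0.5, 0.5], [0.6, 0.4], [0.52, 0.48]]
picked = select_queries(stream, threshold=0.15, budget=2)
```

In an online setup the model would be updated after each queried label, so later margins would depend on earlier queries; the sketch keeps the scores fixed to isolate the decision rule.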

A Data-centric Analysis to Evaluate Suitable Machine-Learning-based Network-Attack Classification Schemes

  • Huong, Truong Thu;Bac, Ta Phuong;Thang, Bui Doan;Long, Dao Minh;Quang, Le Anh;Dan, Nguyen Minh;Hoang, Nguyen Viet
    • International Journal of Computer Science & Network Security / v.21 no.6 / pp.169-180 / 2021
  • Since machine learning was invented, many machine learning-based algorithms, from shallow learning to deep learning models, have been developed to solve classification tasks. This abundance makes it difficult to choose a suitable classification algorithm that can improve classification/detection efficiency in a given network context. Related questions are whether an algorithm provides good performance and why it works for some problems and not for others. In this paper, we present a data-centric analysis that provides a way to select a suitable classification algorithm. This data-centric approach offers a new viewpoint for exploring relationships between classification performance and the facts and figures of data sets.

Naive Bayes classifiers boosted by sufficient dimension reduction: applications to top-k classification

  • Yang, Su Hyeong;Shin, Seung Jun;Sung, Wooseok;Lee, Choon Won
    • Communications for Statistical Applications and Methods / v.29 no.5 / pp.603-614 / 2022
  • The naive Bayes classifier is one of the most straightforward classification tools and directly estimates the class probability. However, because it relies on the assumption that the predictors are independent, which is rarely satisfied in real-world problems, its application is limited in practice. In this article, we propose employing sufficient dimension reduction (SDR) to substantially improve the performance of the naive Bayes classifier, which often deteriorates when the number of predictors is not very small. The improvement is not surprising, as SDR reduces the predictor dimension without sacrificing classification information, and the predictors in the reduced space are constructed to be uncorrelated. With SDR, therefore, naive Bayes is no longer naive. We applied the proposed naive Bayes classifier after SDR to build a recommendation system for eyewear frames based on customers' face shapes, demonstrating its utility in the top-k classification problem.
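
Top-k classification with naive Bayes can be sketched as ranking the classes by their estimated posterior and returning the k best. The code below assumes Gaussian class-conditional densities and leaves out the SDR step entirely, so it shows only the plain classifier that the paper sets out to improve. The function names, the variance floor, and the toy data are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Model = Dict[str, Tuple[float, List[Tuple[float, float]]]]


def fit_gaussian_nb(X: Sequence[Sequence[float]], y: Sequence[str]) -> Model:
    """Estimate a prior and per-feature (mean, variance) for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model: Model = {}
    for label, rows in rows_by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))  # 1e-9 keeps the variance positive
        model[label] = (len(rows) / len(X), stats)
    return model


def log_scores(model: Model, x: Sequence[float]) -> Dict[str, float]:
    """Unnormalized log posterior: log prior plus a sum of per-feature
    Gaussian log densities (the independence assumption)."""
    scores = {}
    for label, (prior, stats) in model.items():
        total = math.log(prior)
        for xi, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (xi - mean) ** 2 / (2 * var)
        scores[label] = total
    return scores


def top_k(model: Model, x: Sequence[float], k: int) -> List[str]:
    """Return the k classes with the highest posterior score, best first."""
    scores = log_scores(model, x)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up two-feature data with three classes.
X = [(0.0, 0.1), (0.2, -0.1), (5.0, 5.1), (5.2, 4.9), (10.0, 10.1), (10.2, 9.9)]
y = ["A", "A", "B", "B", "C", "C"]
nb = fit_gaussian_nb(X, y)
ranked = top_k(nb, (4.8, 5.0), k=2)
```

In a recommendation setting the k returned classes would be shown as candidates rather than committing to a single prediction.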

Data-Adaptive ECOC for Multicategory Classification

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.19 no.1 / pp.25-36 / 2008
  • Error Correcting Output Codes (ECOC) can improve generalization performance when applied to multicategory classification problems. In this study, we propose a new criterion for selecting the hyperparameters of the ECOC scheme. Instead of the margins of the data, we propose using the probability of misclassification error, since it makes the criterion simple. Using this criterion, we obtain an upper bound on the leave-one-out error of the OVA (one-vs-all) method. Our experiments on real and synthetic data indicate that the bound leads to good estimates of the parameters.
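
The error-correcting property of ECOC can be illustrated with a small decoding sketch. Each class is assigned a binary codeword, one binary classifier is trained per codeword bit, and a new point is assigned to the class whose codeword is closest in Hamming distance to the predicted bits. The 7-bit code matrix for four classes and the function names are illustrative assumptions; the paper's hyperparameter selection criterion is not shown.

```python
from typing import Dict, Sequence

# Illustrative code matrix: any two codewords differ in 4 positions,
# so a single wrong bit can still be decoded correctly.
CODE_MATRIX: Dict[str, Sequence[int]] = {
    "c0": (1, 1, 1, 1, 1, 1, 1),
    "c1": (0, 0, 0, 0, 1, 1, 1),
    "c2": (0, 0, 1, 1, 0, 0, 1),
    "c3": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(code_matrix: Dict[str, Sequence[int]], bits: Sequence[int]) -> str:
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], bits))


# Predicted bits equal to the c2 codeword with its sixth bit flipped.
noisy = (0, 0, 1, 1, 0, 1, 1)
decoded = ecoc_decode(CODE_MATRIX, noisy)
```

OVA corresponds to the special case in which each class's codeword has a single 1 in its own position, which leaves less room to correct a wrong bit.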

WHEN CAN SUPPORT VECTOR MACHINE ACHIEVE FAST RATES OF CONVERGENCE?

  • Park, Chang-Yi
    • Journal of the Korean Statistical Society / v.36 no.3 / pp.367-372 / 2007
  • Classification, as a tool for extracting information from data, plays an important role in science and engineering. Among the various classification methodologies, the support vector machine has recently seen significant developments. The central problem this paper addresses is the accuracy of the support vector machine. In particular, we are interested in situations where the support vector machine can achieve fast rates of convergence to the Bayes risk. Through learning examples, we illustrate that the support vector machine may yield fast rates if the space spanned by the adopted kernel is sufficiently large.