• Title/Summary/Keyword: kernel discriminant analysis

Support Vector Bankruptcy Prediction Model with Optimal Choice of RBF Kernel Parameter Values using Grid Search (Support Vector Machine을 이용한 부도예측모형의 개발 -격자탐색을 이용한 커널 함수의 최적 모수 값 선정과 기존 부도예측모형과의 성과 비교-)

  • Min Jae H.;Lee Young-Chan
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.30 no.1
    • /
    • pp.55-74
    • /
    • 2005
  • Bankruptcy prediction has drawn considerable research interest, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies a relatively new machine learning technique, support vector machines (SVMs), to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability. To this end, we use a grid search with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of SVM, we compare its performance with multiple discriminant analysis (MDA), logistic regression analysis (Logit), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that SVM outperforms the other methods.
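
The grid search with 5-fold cross-validation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' setup: it assumes scikit-learn is available, uses a synthetic data set in place of the bankruptcy data, and the candidate ranges for `C` and `gamma` are illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a bankruptcy data set (assumption, not the paper's data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Exponentially spaced candidates for the RBF-SVM parameters (illustrative ranges).
param_grid = {
    "svc__C": [2.0 ** k for k in range(-3, 8, 2)],
    "svc__gamma": [2.0 ** k for k in range(-9, 2, 2)],
}

# Scaling sits inside the pipeline so each fold is scaled on its own training part.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
search = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

Every (C, gamma) pair is scored by its mean accuracy over the five folds, and the pair with the highest mean is kept.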

Face recognition invariant to partial occlusions

  • Aisha, Azeem;Muhammad, Sharif;Hussain, Shah Jamal;Mudassar, Raza
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.7
    • /
    • pp.2496-2511
    • /
    • 2014
  • Face recognition is considered a complex biometric problem in the field of image processing, mainly due to the constraints imposed by variation in the appearance of facial images. These variations in appearance are caused by differences in expression and/or occlusions (sunglasses, scarves, etc.). This paper discusses incremental Kernel Fisher Discriminant Analysis on sub-classes for dealing with partial occlusions and varying expressions. The framework divides classes into fixed-size sub-classes for effective feature extraction. For this purpose, it modifies traditional Linear Discriminant Analysis into an incremental approach in the kernel space. Experiments are performed on the AR, ORL, Yale B and MIT-CBCL face databases. The results show a significant improvement in face recognition.

Corporate credit rating prediction using support vector machines

  • Lee, Yong-Chan
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2005.11a
    • /
    • pp.571-578
    • /
    • 2005
  • Corporate credit rating analysis has drawn considerable research interest, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the corporate credit rating problem in an attempt to suggest a new model with better explanatory power and stability. To this end, the researcher uses a grid search with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of SVM, the researcher compares its performance with those of multiple discriminant analysis (MDA), case-based reasoning (CBR), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that SVM outperforms the other methods.

Study on Faults Diagnosis of Induction Motor Using KPCA Feature Extraction Technique (KPCA 특징추출기법을 이용한 유도전동기 결함 진단 연구)

  • Han, Sang-Bo;Hwang, Don-Ha;Kang, Dong-Sik
    • Proceedings of the KIEE Conference
    • /
    • 2007.07a
    • /
    • pp.1063-1064
    • /
    • 2007
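  • This study discusses the results of applying diagnosis algorithms to signals from a flux sensor mounted inside a test motor, with the aim of developing an induction motor diagnosis system, and reports the fault classification accuracy of each classifier. Features were extracted using Kernel Principal Component Analysis (KPCA), and the test samples were classified using LDA (Linear Discriminant Analysis) and k-NN (k-Nearest Neighbors) classifiers. For broken rotor bars and eccentricity (dynamic/static), both classifiers achieved a high classification accuracy of 95% or more, but LDA showed low classification rates for the normal condition, faulty bearings, and shaft deformation.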
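
The KPCA feature-extraction step named in this abstract can be sketched with NumPy as below. This is a rough illustration under stated assumptions: the RBF kernel, the value of `gamma`, and the random `signals` array standing in for flux-sensor features are all hypothetical, since the abstract does not give the kernel or the data.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma):
    """Pairwise RBF kernel exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def kernel_pca(X, n_components, gamma):
    """Return training-set projections and the matching eigenvalues."""
    n = X.shape[0]
    K = rbf_kernel_matrix(X, gamma)
    # Center the kernel matrix, which centers the data in feature space.
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]
    # Projection of training sample i on component j is sqrt(lambda_j) * v_ij.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0)), eigvals

rng = np.random.default_rng(0)
signals = rng.normal(size=(40, 5))  # hypothetical stand-in for sensor features
features, eigvals = kernel_pca(signals, n_components=2, gamma=0.1)
print(features.shape)
```

The resulting low-dimensional `features` would then be passed to a classifier such as LDA or k-NN.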

Malicious Code Detection using the Effective Preprocessing Method Based on Native API (Native API 의 효과적인 전처리 방법을 이용한 악성 코드 탐지 방법에 관한 연구)

  • Bae, Seong-Jae;Cho, Jae-Ik;Shon, Tae-Shik;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.22 no.4
    • /
    • pp.785-796
    • /
    • 2012
  • In this paper, we propose an effective behavior-based detection technique that uses the frequency of system calls to detect malicious code when the number of training samples is smaller than the number of system-call properties. We collect Native API calls, which are Windows kernel data generated by running program code, and adopt the normalized frequency of Native APIs as the basic properties. The basic properties are then transformed into new properties by GLDA (Generalized Linear Discriminant Analysis), which discriminates effectively between malicious code and normal code even when the number of training samples is smaller than the number of properties. To detect the malicious code, kNN (k-Nearest Neighbor) classification, one of the Bayesian classification techniques, was used. We compared the proposed detection method with other methods on the collected Native APIs to verify its efficiency. The results show that the proposed detection method has a lower false positive rate than the other methods at the threshold value where the detection rate is 100%.
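
The two ends of the pipeline in this abstract, normalized call frequencies as features and kNN as the classifier, can be sketched as below. The GLDA transform in between is deliberately omitted, and the call traces, labels, vocabulary, and `k` are invented toy values for illustration only.

```python
from collections import Counter
import math

def normalized_frequency(calls, vocabulary):
    """Map a call trace to relative frequencies over a fixed vocabulary."""
    counts = Counter(calls)
    total = sum(counts[name] for name in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[name] / total for name in vocabulary]

def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

vocabulary = ["NtCreateFile", "NtWriteFile", "NtOpenKey", "NtSetValueKey"]

# Toy traces (hypothetical): file-heavy traces are labeled normal here.
traces = [
    (["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtCreateFile"], "normal"),
    (["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtOpenKey"], "normal"),
    (["NtOpenKey", "NtSetValueKey", "NtSetValueKey", "NtCreateFile"], "malicious"),
    (["NtOpenKey", "NtSetValueKey", "NtOpenKey", "NtSetValueKey"], "malicious"),
]
train = [(normalized_frequency(calls, vocabulary), label) for calls, label in traces]

query_calls = ["NtOpenKey", "NtSetValueKey", "NtSetValueKey", "NtWriteFile"]
prediction = knn_predict(train, normalized_frequency(query_calls, vocabulary))
print(prediction)
```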

Real-time Fault Diagnosis of Induction Motor Using Clustering and Radial Basis Function (클러스터링과 방사기저함수 네트워크를 이용한 실시간 유도전동기 고장진단)

  • Park, Jang-Hwan;Lee, Dae-Jong;Chun, Myung-Geun
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.20 no.6
    • /
    • pp.55-62
    • /
    • 2006
  • For the fault diagnosis of three-phase induction motors, we construct an experimental unit and then develop a diagnosis algorithm based on pattern recognition. The experimental unit consists of a machinery module for the induction motor drive and a data acquisition module to obtain the fault signal. As the first step of the diagnosis procedure, preprocessing simplifies and normalizes the acquired current. To simplify the data, the three-phase current is transformed into the magnitude of the Concordia vector. Next, feature extraction is performed by kernel principal component analysis (KPCA) and linear discriminant analysis (LDA). Finally, a classifier based on a radial basis function (RBF) network is used. To show its effectiveness, the proposed diagnostic system has been tested intensively with data acquired under different electrical and mechanical faults with varying load.
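
The reduction of three phase currents to a single Concordia-vector magnitude can be sketched as below. The abstract does not state which scaling the authors used; this sketch assumes the power-invariant form with the factor sqrt(2/3), and the balanced test currents are synthetic.

```python
import math

def concordia_magnitude(i_a, i_b, i_c):
    """Magnitude of the (alpha, beta) current vector, power-invariant scaling."""
    i_alpha = math.sqrt(2.0 / 3.0) * (i_a - 0.5 * i_b - 0.5 * i_c)
    i_beta = (i_b - i_c) / math.sqrt(2.0)
    return math.hypot(i_alpha, i_beta)

# Balanced sinusoidal currents of amplitude 10 A, sampled over one period.
amplitude = 10.0
magnitudes = []
for step in range(12):
    theta = 2.0 * math.pi * step / 12
    i_a = amplitude * math.cos(theta)
    i_b = amplitude * math.cos(theta - 2.0 * math.pi / 3.0)
    i_c = amplitude * math.cos(theta + 2.0 * math.pi / 3.0)
    magnitudes.append(concordia_magnitude(i_a, i_b, i_c))

print(min(magnitudes), max(magnitudes))
```

For a balanced set the magnitude is constant at sqrt(3/2) times the phase amplitude, so any variation over time in the one-dimensional signal reflects a departure from balanced sinusoidal operation.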

Identifying Causes of Industrial Process Faults Using Nonlinear Statistical Approach (공정 이상원인의 비선형 통계적 방법을 통한 진단)

  • Cho, Hyun-Woo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.13 no.8
    • /
    • pp.3779-3784
    • /
    • 2012
  • Real-time monitoring and diagnosis of industrial processes is an important operational task for quality and safety reasons. The objective of fault diagnosis or identification is to find the process variables responsible for causing a specific fault in the process. This helps process operators investigate root causes more effectively. This work assesses the applicability of combining a nonlinear statistical technique, kernel Fisher discriminant analysis, with a preprocessing method as a tool for on-line fault identification. To compare its performance with an existing linear principal component analysis (PCA) identification scheme, a case study on a benchmark process was performed; it showed that the proposed fault identification scheme produced more reliable diagnosis results than the linear method.
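
For readers unfamiliar with kernel Fisher discriminant analysis, a two-class version can be sketched with NumPy as below. This follows the common textbook formulation and is not the paper's method: the preprocessing step and the fault-identification logic are not reproduced, and the RBF kernel, `gamma`, `reg`, and the concentric-ring data are assumptions chosen for illustration.

```python
import numpy as np

def rbf(A, B, gamma):
    """RBF kernel between every row of A and every row of B."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kfd(X, y, gamma, reg=1e-3):
    """Two-class kernel Fisher discriminant; labels in y must be 0 or 1."""
    n = len(y)
    K = rbf(X, X, gamma)
    within = np.zeros((n, n))
    class_means = []
    for label in (0, 1):
        K_c = K[:, y == label]  # kernel columns belonging to this class
        n_c = K_c.shape[1]
        class_means.append(K_c.mean(axis=1))
        centering = np.eye(n_c) - np.full((n_c, n_c), 1.0 / n_c)
        within += K_c @ centering @ K_c.T  # within-class scatter in kernel form
    # The ridge term keeps the within-class matrix invertible.
    alpha = np.linalg.solve(within + reg * np.eye(n), class_means[1] - class_means[0])
    # Decision threshold: midpoint between the two projected class means.
    threshold = 0.5 * (alpha @ class_means[0] + alpha @ class_means[1])
    return alpha, threshold

def predict_kfd(X_train, alpha, threshold, X_new, gamma):
    """Project new points on the discriminant and compare with the threshold."""
    return (rbf(X_new, X_train, gamma) @ alpha > threshold).astype(int)

def ring(radius, count, offset=0.0):
    angles = offset + 2.0 * np.pi * np.arange(count) / count
    return np.column_stack([radius * np.cos(angles), radius * np.sin(angles)])

# Two concentric rings: not separable by any linear discriminant in the input space.
X_train = np.vstack([ring(1.0, 30), ring(3.0, 30)])
y_train = np.array([0] * 30 + [1] * 30)
X_test = np.vstack([ring(1.1, 30, offset=0.1), ring(2.9, 30, offset=0.1)])
y_test = y_train.copy()

alpha, threshold = fit_kfd(X_train, y_train, gamma=0.5)
predictions = predict_kfd(X_train, alpha, threshold, X_test, gamma=0.5)
accuracy = float(np.mean(predictions == y_test))
print(accuracy)
```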

Hybrid Learning Architectures for Advanced Data Mining: An Application to Binary Classification for Fraud Management (개선된 데이터마이닝을 위한 혼합 학습구조의 제시)

  • Kim, Steven H.;Shin, Sung-Woo
    • Journal of Information Technology Application
    • /
    • v.1
    • /
    • pp.173-211
    • /
    • 1999
  • The task of classification permeates all walks of life, from business and economics to science and public policy. In this context, nonlinear techniques from artificial intelligence have often proven to be more effective than the methods of classical statistics. The objective of knowledge discovery and data mining is to support decision making through the effective use of information. The automated approach to knowledge discovery is especially useful when dealing with large data sets or complex relationships. For many applications, automated software may find subtle patterns which escape the notice of manual analysis, or whose complexity exceeds the cognitive capabilities of humans. This paper explores the utility of a collaborative learning approach involving integrated models in the preprocessing and postprocessing stages. For instance, a genetic algorithm performs feature-weight optimization in a preprocessing module. Moreover, an inductive tree, artificial neural network (ANN), and k-nearest neighbor (kNN) techniques serve as postprocessing modules. More specifically, the postprocessors act as second-order classifiers which determine the best first-order classifier on a case-by-case basis. In addition to the second-order models, a voting scheme is investigated as a simple but efficient postprocessing model. The first-order models consist of statistical and machine learning models such as logistic regression (logit), multivariate discriminant analysis (MDA), ANN, and kNN. The genetic algorithm, inductive decision tree, and voting scheme act as kernel modules for collaborative learning. These ideas are explored against the background of a practical application relating to financial fraud management, which exemplifies a binary classification problem.

Power Signal Recognition with High Order Moment Features for Non-Intrusive Load Monitoring (비간섭 전력 부하 감시용 고차 적률 특징을 갖는 전력 신호 인식)

  • Min, Hwang-Ki;An, Taehun;Lee, Seungwon;Lee, Seong Ro;Song, Iickho
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39C no.7
    • /
    • pp.608-614
    • /
    • 2014
  • A pattern recognition (PR) system is addressed for non-intrusive load monitoring. To effectively recognize two appliances (for example, an electric iron and a cook top), we propose a novel feature extraction method based on high order moments of power signals. Simulation results confirm that the PR system with the proposed high order moment features and kernel discriminant analysis can effectively separate two appliances.
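
The abstract does not say which moments are used as features. As one plausible reading, the sketch below computes the standardized third and fourth central moments (skewness and kurtosis) of a sampled waveform. The two waveforms are synthetic placeholders, not measurements of the appliances named in the abstract.

```python
import math

def standardized_moment(samples, order):
    """Central moment of the given order divided by variance ** (order / 2)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    if variance == 0:
        return 0.0
    central = sum((x - mean) ** order for x in samples) / n
    return central / variance ** (order / 2)

def moment_features(samples):
    """Feature pair (skewness, kurtosis) for one signal window."""
    return standardized_moment(samples, 3), standardized_moment(samples, 4)

# One period of a pure sine and of a half-wave rectified sine (hypothetical loads).
n_samples = 1000
sine = [math.sin(2.0 * math.pi * k / n_samples) for k in range(n_samples)]
rectified = [max(x, 0.0) for x in sine]

sine_features = moment_features(sine)
rectified_features = moment_features(rectified)
print(sine_features, rectified_features)
```

The symmetric sine has zero skewness while the rectified waveform is right-skewed, which illustrates how higher-order moments can separate waveform shapes that share the same period.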

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. In particular, the mutation operator prevents GA from falling into local optima, so it can find a globally optimal or near-optimal solution. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, together with their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross-validation to our dataset. In order to examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial package that supports GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than those of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
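
The McNemar test used here to compare two classifiers on the same validation cases can be sketched as below. The abstract does not say which variant was applied; this sketch assumes the chi-square form with continuity correction, and the disagreement counts are hypothetical, not results from the paper.

```python
import math

def mcnemar(b, c):
    """McNemar chi-square test with continuity correction.

    b: cases model A classified correctly and model B incorrectly
    c: cases model A classified incorrectly and model B correctly
    Returns (statistic, p_value) for one degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical disagreement counts between two models on one validation set.
statistic, p_value = mcnemar(b=25, c=8)
print(round(statistic, 3), p_value < 0.01)
```

Only the cases where the two models disagree enter the statistic, so the test is suited to paired comparisons on a shared validation set.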