• Title/Summary/Keyword: One-Class SVM


FSVM for Multi Class Classification

  • Lee, Sun-Young;Kim, Sung-Soo
    • Proceedings of the KIEE Conference / 2005.07d / pp.3004-3006 / 2005
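  • A support vector machine (SVM) learns a decision surface that separates input data into two different classes. Conventional SVMs apply only to two-class problems, yet in many applications the input data must be classified into several classes. A multi-class classification problem can generally be decomposed into several two-class problems to which a conventional SVM can be applied. For example, with the one-against-all method, an n-class problem is converted into n two-class problems. In this paper, we show that when classifying input patterns into multiple classes, applying a different upper bound of the soft-margin algorithm to each class, based on fuzzy membership, yields better learning ability than the conventional SVM.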

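The one-against-all decomposition mentioned in the abstract above can be sketched as a relabeling step. This is a minimal illustration, not the paper's implementation: the function name `one_against_all_labels` and the sample labels are hypothetical, and the fuzzy-membership weighting of the soft-margin bound is not shown.

```python
def one_against_all_labels(labels):
    """Turn one multi-class label list into n binary (+1/-1) label lists.

    For each class, samples of that class become +1 and all others -1,
    so an n-class problem yields n two-class problems.
    """
    classes = sorted(set(labels))
    return {
        target: [1 if label == target else -1 for label in labels]
        for target in classes
    }


# Hypothetical labels for four samples drawn from three classes.
labels = ["a", "b", "c", "a"]
binary_problems = one_against_all_labels(labels)
for target, binary in binary_problems.items():
    print(target, binary)
```

Each of the resulting label lists could then be used to train one two-class SVM.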

Multi-class Cancer Classification by Integrating OVR SVMs based on Subsumption Architecture

  • Hong Jin-Hyuk;Cho Sung-Bae
    • Proceedings of the Korean Information Science Society Conference / 2006.06a / pp.37-39 / 2006
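  • The support vector machine (SVM) was originally designed for binary classification, but various classifier-generation and combination strategies have recently been devised so that it can also be applied to multi-class classification. In this paper, we propose a method that effectively resolves the ties that frequently occur in multi-class classification systems using OVR SVMs, by dynamically organizing the SVMs generated with the OVR (One-Vs-Rest) strategy using an NB (Naive Bayes) classifier. We applied the method to multi-class cancer classification using gene expression data, and used the Pearson correlation coefficient to select genes useful for building the NB classifier from the high-dimensional data. We confirmed the usefulness of the proposed method by applying it to the GCM cancer dataset, a representative multi-class cancer classification dataset with 14 cancer types and 16,063 gene expression levels.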

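To illustrate the tie problem the abstract above refers to, the sketch below combines the yes/no outputs of OVR classifiers and flags the ambiguous cases. It assumes a tie means that zero or several classifiers claim a sample; the function `ovr_vote` and the class names are hypothetical, and the paper's NB-based resolution step is not reproduced.

```python
def ovr_vote(decisions):
    """Combine One-Vs-Rest outputs for one sample.

    decisions maps a class name to True when that class's binary
    classifier claims the sample. Returns (label, is_tie).
    """
    claimed = [name for name, accepted in decisions.items() if accepted]
    if len(claimed) == 1:
        return claimed[0], False
    # Zero or several classifiers claimed the sample: ambiguous.
    return None, True


# Hypothetical outputs for three samples.
clear = ovr_vote({"leukemia": True, "lymphoma": False, "melanoma": False})
several = ovr_vote({"leukemia": True, "lymphoma": True, "melanoma": False})
nobody = ovr_vote({"leukemia": False, "lymphoma": False, "melanoma": False})
print(clear, several, nobody)
```

Samples flagged as ties are the ones that would need a secondary decision rule.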

APPLICATION OF SUPPORT VECTOR MACHINE TO THE PREDICTION OF GEO-EFFECTIVE HALO CMES

  • Choi, Seong-Hwan;Moon, Yong-Jae;Vien, Ngo Anh;Park, Young-Deuk
    • Journal of The Korean Astronomical Society / v.45 no.2 / pp.31-38 / 2012
  • In this study, we apply a Support Vector Machine (SVM) to the prediction of geo-effective halo coronal mass ejections (CMEs). The SVM is a machine learning algorithm used for classification and regression analysis. We use halo and partial halo CMEs from January 1996 to April 2010 in the SOHO/LASCO CME Catalog for training and prediction. We also use their associated X-ray flare classes to identify front-side halo CMEs (stronger than B1 class), and the Dst index to determine geo-effective halo CMEs (stronger than -50 nT). Combinations of the speed and angular width of the CMEs and their associated X-ray classes are used as input features of the SVM. We search for the best model by cross-validation over the kernel functions of the SVM and their parameters. As a result, we obtain the following statistical parameters for the best model, which uses the CME speed and its associated X-ray flare class as input features: Accuracy=0.66, PODy=0.76, PODn=0.49, FAR=0.72, Bias=1.06, CSI=0.59, TSS=0.25. The statistical parameters obtained with the SVM are much better than those from simple classifications based on constant classifiers.
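The statistical parameters listed in the abstract above are typically derived from a 2x2 contingency table of predicted versus observed events. The sketch below uses the commonly cited forecast-verification definitions; the abstract does not state the paper's formulas, so these definitions are an assumption and may differ from the paper's, and the counts are hypothetical.

```python
def verification_scores(hits, false_alarms, misses, correct_negatives):
    """Compute common verification scores from a 2x2 contingency table.

    hits: event predicted and observed
    false_alarms: event predicted but not observed
    misses: event observed but not predicted
    correct_negatives: event neither predicted nor observed
    """
    total = hits + false_alarms + misses + correct_negatives
    pod_yes = hits / (hits + misses)
    pod_no = correct_negatives / (correct_negatives + false_alarms)
    return {
        "Accuracy": (hits + correct_negatives) / total,
        "PODy": pod_yes,
        "PODn": pod_no,
        "FAR": false_alarms / (hits + false_alarms),
        "Bias": (hits + false_alarms) / (hits + misses),
        "CSI": hits / (hits + false_alarms + misses),
        "TSS": pod_yes + pod_no - 1.0,
    }


# Hypothetical counts, not taken from the paper.
scores = verification_scores(hits=3, false_alarms=1, misses=1, correct_negatives=5)
for name, value in scores.items():
    print(f"{name}={value:.2f}")
```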

Credit Card Bad Debt Prediction Model based on Support Vector Machine

  • Kim, Jin Woo;Jhee, Won Chul
    • Journal of Information Technology Services / v.11 no.4 / pp.233-250 / 2012
  • In this paper, credit card delinquency means the possibility that bad debt occurs in the near future among normal accounts that currently have no debt, and the problem is to predict, on a monthly basis, the occurrence of delinquency 3 months in advance. This prediction is a typical binary classification problem, but it suffers from data imbalance: instances of the target class are very few. To predict bad debt occurrence effectively, a Support Vector Machine (SVM) with the kernel trick is adopted, using credit card usage and payment patterns as its inputs. SVM is widely accepted in the data mining community because of its prediction accuracy and its resistance to overfitting. However, SVM is known to be limited in its ability to process large-scale data. To resolve the difficulties in applying SVM to bad debt occurrence prediction, two-stage clustering is suggested as an effective data reduction method, and ensembles of SVM models are adopted to mitigate the data imbalance intrinsic to the target problem. In experiments with real-world data from one of the major domestic credit card companies, the suggested approach shows prediction accuracy superior to traditional data mining approaches that use neural networks, decision trees, or logistic regression. The SVM ensemble model learned from the T2 training set shows the best prediction results among the alternatives considered, and it is noteworthy that the performance of neural networks with T2 is better than that of SVM with T1. These results prove that the suggested approach is very effective for both SVM training and the classification problem of data imbalance.

Simultaneous Optimization of Gene Selection and Tumor Classification Using Intelligent Genetic Algorithm and Support Vector Machine

  • Huang, Hui-Ling;Ho, Shinn-Ying
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.57-62 / 2005
  • Microarray gene expression profiling technology is one of the most important research topics in clinical diagnosis of disease. Given thousands of genes, only a small number of them show strong correlation with a certain phenotype. Identifying such an optimal subset from thousands of genes is intractable, yet it plays a crucial role when classifying multi-class gene expression profiles from tumor samples. This paper proposes an efficient classifier design method that simultaneously selects the most relevant genes using an intelligent genetic algorithm (IGA) and designs an accurate classifier using a Support Vector Machine (SVM). IGA, with an intelligent crossover operation based on orthogonal experimental design, can efficiently solve large-scale parameter optimization problems. Therefore, the parameters of the SVM as well as the binary parameters for gene selection are all encoded in a chromosome to achieve simultaneous optimization of gene selection and the associated SVM for accurate tumor classification. The effectiveness of the proposed method, IGA/SVM, is evaluated using four benchmark datasets. Computer simulation shows that IGA/SVM performs better than the existing method in terms of classification accuracy.

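As a rough illustration of encoding gene-selection bits and SVM parameters in one chromosome, the sketch below decodes a flat list into a gene subset and a parameter set. The layout (a 0/1 mask followed by two real values) and the parameter names `C` and `gamma` are assumptions made for this example; the abstract does not specify the actual encoding used by IGA/SVM.

```python
def decode_chromosome(chromosome, n_genes):
    """Split a chromosome into selected gene indices and SVM parameters.

    Assumed layout (hypothetical): the first n_genes entries are 0/1
    selection bits, followed by two real values read as C and gamma.
    """
    mask = chromosome[:n_genes]
    c_value, gamma = chromosome[n_genes:n_genes + 2]
    selected = [index for index, bit in enumerate(mask) if bit == 1]
    return selected, {"C": c_value, "gamma": gamma}


# Hypothetical chromosome: 4 gene bits, then C=10.0 and gamma=0.5.
selected_genes, svm_params = decode_chromosome([1, 0, 1, 0, 10.0, 0.5], n_genes=4)
print(selected_genes, svm_params)
```

A genetic algorithm would evaluate each decoded candidate by training an SVM on the selected genes with the decoded parameters.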

One-class Classification based Fault Classification for Semiconductor Process Cyclic Signal

  • Cho, Min-Young;Baek, Jun-Geol
    • IE interfaces / v.25 no.2 / pp.170-177 / 2012
  • Process control is essential to operate the semiconductor process efficiently. This paper considers fault classification of semiconductor cyclic signals for process control. In general, process signals take different patterns depending on the cause of the fault. If faults can be classified by cause, process control can be improved through a definite and rapid diagnosis. One of the most important things in fault classification is reaching a definite diagnosis, even when a signal is classified several times. This paper proposes a method in which one-class classifiers treat each fault cause as its own class. The Hotelling T² chart, kNNDD (k-Nearest Neighbor Data Description), and Distance-based Novelty Detection are used as the one-class classifiers. PCA (Principal Component Analysis) is also used to reduce the data dimension, because process signals are generally very long. In the experiment, data are generated based on real signal patterns from a semiconductor process. The objective of the experiment is to compare the proposed method with SVM (Support Vector Machine). Most of the experimental results show that the proposed method using Distance-based Novelty Detection performs well in classification and diagnosis problems.
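The idea of one data description per fault cause can be sketched with a nearest-neighbor data description for k = 1. This is a simplified illustration under stated assumptions: the acceptance rule follows the usual nearest-neighbor data description (compare the distance to the nearest training point with that point's own nearest-neighbor distance), the fault names and 2-D points are hypothetical stand-ins for PCA-reduced signals, and the paper's exact classifiers and thresholds are not reproduced.

```python
import math


def nearest(point, data, skip_index=None):
    """Return (index, distance) of the nearest item in data to point."""
    best_index, best_distance = None, math.inf
    for index, other in enumerate(data):
        if index == skip_index:
            continue
        distance = math.dist(point, other)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance


def nndd_accepts(point, train, threshold=1.0):
    """Accept point if it is no farther from its nearest training
    neighbor than that neighbor is from its own nearest neighbor
    (scaled by threshold). Requires at least two training points."""
    index, to_neighbor = nearest(point, train)
    _, neighbor_spacing = nearest(train[index], train, skip_index=index)
    return to_neighbor <= threshold * neighbor_spacing


# Hypothetical training sets, one per fault cause.
fault_classes = {
    "fault_A": [(0, 0), (1, 0), (0, 1), (1, 1)],
    "fault_B": [(10, 10), (11, 10), (10, 11), (11, 11)],
}
signal = (0.5, 0.5)
accepted = [name for name, train in fault_classes.items()
            if nndd_accepts(signal, train)]
print(accepted)
```

A signal accepted by exactly one description gets that fault cause; a signal accepted by none or by several would need further diagnosis.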

Novelty Detection Methods for Response Modeling

  • Lee Hyeong-Ju;Jo Seong-Jun
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.05a / pp.1825-1831 / 2006
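  • In this paper, we propose using novelty detection methods to alleviate class imbalance in response modeling. For a catalog mailing task on the DMEF4 dataset, we apply two novelty detection methods, one-class support vector machine (1-SVM) and learning vector quantization for novelty detection (LVQ-ND), and compare them with binary classification methods. When the response rate was low, the novelty detection methods showed higher accuracy, whereas when the response rate was relatively high, the SVM with adjusted misclassification costs performed best. In addition, the novelty detection methods achieved high profit when the mailing cost was low, while the SVM model achieved the highest profit when the mailing cost was high.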


Fault Diagnosis Management Model using Machine Learning

  • Yang, Xitong;Lee, Jaeseung;Jung, Heokyung
    • Journal of information and communication convergence engineering / v.17 no.2 / pp.128-134 / 2019
  • Based on the concept of Industry 4.0, various sensors are attached to facilities and equipment to collect data in real time and diagnose faults using analysis techniques. Diagnostic technology continuously monitors faults or performance degradation of facilities and equipment in operation and diagnoses abnormal symptoms to ensure safety and availability through maintenance before failure occurs. In this paper, we propose a model that analyzes the data and diagnoses the state or failure using machine learning. The proposed model consists of a support vector machine (SVM)-based diagnosis model and a self-learning one-class SVM-based diagnosis model. In the future, we expect that this model can be applied to facilities across industry after applying actual data to the proposed diagnosis model, conducting experiments, and verifying it with model performance evaluation indices.

A Study on Novelty Detection of GPS Data Using Human Mobility and OCSVM (One-class SVM)

  • Kim, Woo-Joong;Song, Ha-Yoon
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1060-1063 / 2011
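  • When choosing how to travel to a destination, people are influenced by the purpose of the trip, the destination, the departure time, and so on. Along with these parameters, however, an important consideration is human habit. In other words, the way a person chooses to reach a destination is closely tied to habit. From this, we can conjecture that, because of habit, human movement mostly stays within a particular range. This can be extended to detecting anomalies in which data measured by the GPS devices that people commonly carry fall outside that range. That is, we can infer that data measured by GPS devices can be classified per individual. In this paper, based on coordinates actually traveled by a person, we computed the change per unit time and mapped it onto coordinates. We then classified the data with a one-class support vector machine (OCSVM) and found that the size of the class and the density inside the class depend on the variable in the OCSVM kernel function, with a trade-off between the two.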
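The preprocessing step described in the abstract above, computing the change per unit time from a person's recorded coordinates, might look like the sketch below. It assumes each record is a (seconds, latitude, longitude) tuple and that the change is taken per hour along each axis; the abstract does not give the exact feature definition, so the function `hourly_change` and the sample track are hypothetical.

```python
def hourly_change(track):
    """Convert a time-ordered GPS track into per-hour change points.

    track: list of (seconds, latitude, longitude) tuples.
    Returns a list of (d_lat_per_hour, d_lon_per_hour) pairs, one per
    consecutive pair of records with a positive time difference.
    """
    points = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(track, track[1:]):
        hours = (t1 - t0) / 3600.0
        if hours <= 0:
            continue  # skip duplicate or out-of-order timestamps
        points.append(((lat1 - lat0) / hours, (lon1 - lon0) / hours))
    return points


# Hypothetical track: moving for 30 minutes, then stationary for 30 minutes.
track = [(0, 37.00, 127.00), (1800, 37.01, 127.02), (3600, 37.01, 127.02)]
change_points = hourly_change(track)
print(change_points)
```

The resulting 2-D points are the kind of input a one-class classifier could be trained on.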

Vibration Data Denoising and Performance Comparison Using Denoising Auto Encoder Method

  • Jang, Jun-gyo;Noh, Chun-myoung;Kim, Sung-soo;Lee, Soon-sup;Lee, Jae-chul
    • Journal of the Korean Society of Marine Environment & Safety / v.27 no.7 / pp.1088-1097 / 2021
  • Vibration data of mechanical equipment inevitably contain noise. This noise adversely affects the maintenance of mechanical equipment. Accordingly, the performance of a learning model depends on how effectively the noise in the data is removed. In this study, the noise was removed using the Denoising Auto Encoder (DAE) technique, which does not include a characteristic extraction process when preprocessing time series data. In addition, the performance was compared with that of the Wavelet Transform, which is widely used for machine signal processing. The performance comparison was conducted by calculating the failure detection rate. For a more accurate comparison, a classification performance evaluation criterion, the F-1 Score, was calculated. Failure data were detected using the One-Class SVM technique. The performance comparison revealed that the DAE technique performed better than the Wavelet Transform technique in terms of failure diagnosis and error rate.
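The F-1 Score used as the evaluation criterion above is the harmonic mean of precision and recall. A minimal sketch follows, treating label 1 as "failure"; the labels are hypothetical and do not come from the paper.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Compute the F1 score for the given positive label."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and detector output (1 = failure, 0 = normal).
truth = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
score = f1_score(truth, predicted)
print(round(score, 3))
```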