Search | Korea Science

LS-SVM for large data sets

Park, Hongrak;Hwang, Hyungtae;Kim, Byungju
- Journal of the Korean Data and Information Science Society / v.27 no.2 / pp.549-557 / 2016
In this paper we propose a multiclassification method for large data sets that ensembles least squares support vector machines (LS-SVMs) trained on principal components instead of the raw input vectors. For multiclassification we use the revised one-vs-all method, a voting scheme that combines several binary classifications. The revised one-vs-all method is performed using the hat matrix of the LS-SVM ensemble, which is obtained by ensembling LS-SVMs, each trained on a random sample drawn from the whole large training data set. The leave-one-out cross validation (CV) function is used to select the optimal values of the hyper-parameters that affect the performance of the multiclass LS-SVM ensemble. We present the generalized cross validation function to reduce the computational burden of the leave-one-out CV function. Experimental results from real data sets are then obtained to illustrate the performance of the proposed multiclass LS-SVM ensemble.
https://doi.org/10.7465/jkdi.2016.27.2.549
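As a rough illustration of the hat matrix and generalized cross validation (GCV) idea mentioned in the abstract above, the following is a minimal sketch for a single LS-SVM with an RBF kernel. It is not the authors' implementation: the ensembling over random subsamples, the principal-component step, and the revised one-vs-all voting are not shown, and the function names (`rbf_kernel`, `invert`, `lssvm_hat_matrix`, `gcv_score`), the kernel choice, and the toy data are all assumptions made for this example.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def lssvm_hat_matrix(X, gamma, sigma):
    """Hat matrix H of one LS-SVM, so that the fitted values equal H times y.

    The LS-SVM solves [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    With A = K + I/gamma this gives alpha = M y and fitted = y - alpha / gamma.
    """
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], sigma) for j in range(n)] for i in range(n)]
    A_inv = invert([[K[i][j] + (1.0 / gamma if i == j else 0.0)
                     for j in range(n)] for i in range(n)])
    u = [sum(row) for row in A_inv]   # A^{-1} 1
    s = sum(u)                        # 1^T A^{-1} 1
    # M = A^{-1} - u u^T / s, and H = I - M / gamma
    return [[(1.0 if i == j else 0.0) - (A_inv[i][j] - u[i] * u[j] / s) / gamma
             for j in range(n)] for i in range(n)]

def gcv_score(X, y, gamma, sigma):
    """GCV score: mean squared residual divided by (1 - trace(H)/n)^2."""
    n = len(y)
    H = lssvm_hat_matrix(X, gamma, sigma)
    fitted = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    trace = sum(H[i][i] for i in range(n))
    return (rss / n) / (1.0 - trace / n) ** 2

# Toy usage (hypothetical data): pick the regularization value with the smallest GCV score.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
grid = [0.1, 1.0, 10.0, 100.0]
best_gamma = min(grid, key=lambda g: gcv_score(X, y, g, sigma=1.0))
print("gamma with the smallest GCV score:", best_gamma)
```

The appeal of GCV in this setting is that it needs only the trace of the hat matrix rather than one refit per left-out observation, which is presumably the computational saving the abstract refers to.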

Retrieval of oceanic primary production using support vector machines

Tang, Shilin;Chen, Chuqun;Zhan, Haigang
- Proceedings of the KSRS Conference / v.1 / pp.114-117 / 2006
One of the most important tasks of ocean color observations is to determine the distribution of phytoplankton primary production. A variety of bio-optical algorithms have been developed to estimate primary production from these parameters. In this communication, we investigated the possibility of using a novel universal approximator, support vector machines (SVMs), as the nonlinear transfer function between oceanic primary production and the information that can be directly retrieved from satellite data. The VGPM (Vertically Generalized Production Model) dataset was used to evaluate the proposed approach. The PPARR2 (Primary Production Algorithm Round Robin 2) dataset was used to further compare the precision of the VGPM model and the SVM model. Using this SVM model to calculate the global ocean primary production gives 45.5 PgC $\mathrm{yr}^{-1}$, which is slightly higher than the VGPM result.

Sentiment Classification for Korean Tweets via Semi-Supervised Learning

Seo, Hyeong-Won;Noh, Kyung-Mok;Cheon, Min-A;Kim, Jae-Hoon
- Annual Conference on Human and Language Technology / 2012.10a / pp.123-125 / 2012
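This paper describes a method for efficiently expanding the training corpus needed for sentiment classification with machine learning. A training corpus generally has to be labeled appropriately, but the volume of data is so large that people cannot label every item by hand. Many semi-supervised learning methods have already been studied as a solution, and we confirmed that the sentiment classifier still performs well when such a method is applied to the sentiment classification of tweets, which are short documents.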

Target Detection and Navigation System for a mobile Robot

Kim, Il-Wan;Kwon, Ho-Sang;Kim, Young-Joong;Lim, Myo-Taeg
- Institute of Control, Robotics and Systems: Conference Proceedings / 2005.06a / pp.2337-2341 / 2005
This paper presents a target detection method using support vector machines (SVMs) and a navigation system using a behavior-based fuzzy controller. The SVM is a machine-learning method based on the principle of structural risk minimization, which performs well when applied to data outside the training set. We formulate the detection of target objects as a supervised-learning problem and apply an SVM to decide, at each location in the image, whether a target object is present. The behavior-based fuzzy controller is implemented as a set of individual prioritized behaviors: the highest-level behavior is target seeking, the middle-level behavior is obstacle avoidance, and the lowest-level behavior is an emergency behavior. We implemented and tested the proposed method on our mobile robot, a Pioneer2-AT. A comparison with a neural-network-based detection method illustrates the advantage of the proposed SVM-based method.

GMM-Based Maghreb Dialect Identification System

Nour-Eddine, Lachachi;Abdelkader, Adla
- Journal of Information Processing Systems / v.11 no.1 / pp.22-38 / 2015
While Modern Standard Arabic is the formal spoken and written language of the Arab world, dialects are the major communication mode for everyday life. Therefore, identifying a speaker's dialect is critical in the Arabic-speaking world for speech processing tasks, such as automatic speech recognition or identification. In this paper, we examine two approaches that reduce the Universal Background Model (UBM) in the automatic dialect identification system across the following five Arabic Maghreb dialects: Moroccan, Tunisian, and three dialects from the western (Oranian), central (Algiersian), and eastern (Constantinian) regions of Algeria. We applied our approaches to the Maghreb dialect detection domain, which contains a collection of 10-second utterances, and compared the precision obtained on the dialect samples by a baseline GMM-UBM system with that of our own improved GMM-UBM system, which uses a Reduced UBM algorithm. Our experiments show that our approaches significantly improve identification performance over purely acoustic features, with an identification rate of 80.49%.
https://doi.org/10.3745/JIPS.02.0015

Multi-class Cancer Classification by Integrating OVR SVMs based on Subsumption Architecture

Hong Jin-Hyuk;Cho Sung-Bae
- Proceedings of the Korean Information Science Society Conference / 2006.06a / pp.37-39 / 2006
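The support vector machine (SVM) was originally designed for binary classification, but various classifier-generation and combination strategies have recently been devised so that it can also be applied to multiclass classification. This paper proposes a method that effectively resolves the ties that frequently occur in multiclass classification systems built from OVR (One-Vs-Rest) SVMs by dynamically organizing the SVMs generated with the OVR strategy using an NB (Naive Bayes) classifier. We applied the method to multiclass cancer classification with gene expression data, using the Pearson correlation coefficient to select genes useful for building the NB classifier from the high-dimensional data. We confirmed the usefulness of the proposed method on the GCM cancer data set, a representative multiclass cancer classification data set with 14 cancer types and 16,063 gene expression levels.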
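One way to picture the tie-resolution idea in the abstract above is to let the Naive Bayes posteriors decide the order in which the one-vs-rest SVMs are consulted, and to accept the first SVM that fires. The sketch below is only an interpretation under that assumption, not the authors' procedure; the function name `classify_with_nb_ordering`, the dictionary inputs, and the fallback rule are hypothetical.

```python
def classify_with_nb_ordering(svm_margins, nb_posteriors):
    """Pick a class from OVR SVM outputs, using NB posteriors to order the SVMs.

    svm_margins:   dict mapping class -> decision value of that class's OVR SVM
                   (positive means "this class", negative means "rest").
    nb_posteriors: dict mapping class -> Naive Bayes posterior probability.
    """
    # Visit the OVR SVMs in order of decreasing NB probability.
    order = sorted(nb_posteriors, key=nb_posteriors.get, reverse=True)
    for label in order:
        if svm_margins[label] > 0:
            return label   # the first accepting SVM wins, so two accepting SVMs cannot tie
    return order[0]        # no SVM accepted: fall back to the class NB ranks highest

# Two SVMs accept the sample; the NB ordering decides between them.
margins = {"lung": 0.4, "breast": 0.7, "colon": -1.2}
posteriors = {"lung": 0.5, "breast": 0.3, "colon": 0.2}
print(classify_with_nb_ordering(margins, posteriors))
```

Because the SVMs are evaluated sequentially rather than by comparing votes, the case where several OVR SVMs claim the sample (or none does) always yields a single answer.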

Semiparametric Kernel Fisher Discriminant Approach for Regression Problems

Park, Joo-Young;Cho, Won-Hee;Kim, Young-Il
- International Journal of Fuzzy Logic and Intelligent Systems / v.3 no.2 / pp.227-232 / 2003
Recently, support vector learning has attracted an enormous amount of interest in the areas of function approximation, pattern classification, and novelty detection. One of the main reasons for the success of support vector machines (SVMs) seems to be the availability of global and sparse solutions. Among the approaches that share the same reasons for success and exhibit similarly good performance is the KFD (kernel Fisher discriminant) approach. In this paper, we consider the problem of function approximation utilizing both predetermined basis functions and the KFD approach for regression. After reviewing support vector regression, the semi-parametric approach for including predetermined basis functions, and KFD regression, this paper presents an extension of the conventional KFD approach for regression that can utilize predetermined basis functions. The applicability of the presented method is illustrated via a regression example.
https://doi.org/10.5391/IJFIS.2003.3.2.227

Support Vector Bankruptcy Prediction Model with Optimal Choice of RBF Kernel Parameter Values using Grid Search

Min Jae H.;Lee Young-Chan
- Journal of the Korean Operations Research and Management Science Society / v.30 no.1 / pp.55-74 / 2005
Bankruptcy prediction has drawn a lot of research interest in the previous literature, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies a relatively new machine learning technique, support vector machines (SVMs), to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability. To serve this purpose, we use a grid search technique with 5-fold cross-validation to find the optimal values of the parameters of the SVM kernel function. In addition, to evaluate the prediction accuracy of the SVM, we compare its performance with multiple discriminant analysis (MDA), logistic regression analysis (Logit), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that the SVM outperforms the other methods.
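The grid search with 5-fold cross-validation described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's setup: to stay dependency-free it uses a simple RBF-weighted voting rule (`rbf_vote`) as a stand-in for an actual SVM, searches only the kernel width `gamma` (a real RBF SVM would normally also search the cost parameter), and all function names, grid values, and data are hypothetical.

```python
import math
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def rbf_vote(X_train, y_train, X_test, gamma):
    """Stand-in classifier (NOT an SVM): sign of the RBF-weighted sum of training labels."""
    preds = []
    for x in X_test:
        score = sum(label * math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
                    for z, label in zip(X_train, y_train))
        preds.append(1 if score >= 0 else -1)
    return preds

def grid_search_cv(X, y, param_grid, fit_predict, k=5, seed=0):
    """Return (best_params, best_accuracy) over param_grid using k-fold cross-validation."""
    folds = k_fold_indices(len(X), k, seed)
    best_params, best_acc = None, -1.0
    for params in param_grid:
        correct = 0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in range(len(X)) if i not in held_out]
            preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                                [X[i] for i in fold], **params)
            correct += sum(p == y[i] for p, i in zip(preds, fold))
        acc = correct / len(X)   # every sample is held out exactly once
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc

# Toy data: two well-separated one-dimensional clusters.
X = [[0.1 * i] for i in range(10)] + [[10.0 + 0.1 * i] for i in range(10)]
y = [-1] * 10 + [1] * 10
grid = [{"gamma": 2.0 ** e} for e in range(-5, 6, 2)]   # log-spaced candidate values
best_params, best_acc = grid_search_cv(X, y, grid, rbf_vote, k=5)
print(best_params, best_acc)
```

Swapping `rbf_vote` for a function that trains and applies a real SVM leaves the search loop unchanged, since the loop only relies on the `fit_predict(X_train, y_train, X_test, **params)` contract.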

Weighted Support Vector Machines with the SCAD Penalty

Jung, Kang-Mo
- Communications for Statistical Applications and Methods / v.20 no.6 / pp.481-490 / 2013
Classification is an important research area as data can be easily obtained even if the number of predictors becomes huge. The support vector machine (SVM) is widely used to classify a subject into a predetermined group because it has a sound theoretical background and performs better than other methods in many applications. The SVM can be viewed as a penalized method with the hinge loss function and a penalty function. Instead of the $L_2$ penalty function, Fan and Li (2001) proposed the smoothly clipped absolute deviation (SCAD) penalty, which satisfies good statistical properties. Despite their strengths, SVMs are not robust when there are outliers in the data. We develop a robust SVM method using a weight function with the SCAD penalty function based on the local quadratic approximation. We compare the performance of the proposed SVM with that of SVMs using the $L_1$ and $L_2$ penalty functions.
https://doi.org/10.5351/CSAM.2013.20.6.481
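For reference, the SCAD penalty of Fan and Li (2001) named in the abstract above can be written as a short piecewise function. The sketch below shows only the penalty and its derivative for a single coefficient; the weighting scheme and the local quadratic approximation used in the paper are not reproduced. The function names are hypothetical, and the default `a = 3.7` is the value commonly suggested for SCAD.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for one coefficient; assumes lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        return lam * t                                    # L1-like near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))  # quadratic blend
    return (a + 1) * lam * lam / 2                        # constant for large coefficients

def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1)

# The penalty grows like lam * |theta| for small values and then levels off.
for value in (0.5, 2.0, 10.0):
    print(value, scad_penalty(value, lam=1.0), scad_derivative(value, lam=1.0))
```

Unlike the $L_1$ penalty, whose derivative stays at `lam` for every coefficient size, the SCAD derivative drops to zero once `|theta|` exceeds `a * lam`, so large coefficients are not shrunk further.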

RISKY MODULE PREDICTION FOR NUCLEAR I&C SOFTWARE

Kim, Young-Mi;Kim, Hyeon-Soo
- Nuclear Engineering and Technology / v.44 no.6 / pp.663-672 / 2012
As software-based digital I&C (Instrumentation and Control) systems become more prevalent in nuclear plants, enhancing software dependability has become an important issue in the area of nuclear I&C systems. Critical attributes of software dependability are safety and reliability. These attributes are closely related to software failures caused by faults. Software testing and V&V (Verification and Validation) activities are hence important for enhancing software dependability. If the risky modules of safety-critical software can be predicted, testing and V&V activities can be focused more efficiently and effectively. It should also make it possible to better allocate resources for regulation activities. We propose a prediction technique to estimate risky software modules by adopting machine learning models based on software complexity metrics. An empirical study with various machine learning algorithms was carried out to compare prediction performance. Experimental results show that SVMs (Support Vector Machines) perform as well as or better than the other methods.
https://doi.org/10.5516/NET.04.2011.023
