• Title/Abstract/Keywords: kernel learning

247 search results, processing time 0.031 s

A transductive least squares support vector machine with the difference convex algorithm

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 2 / pp.455-464 / 2014
  • Unlabeled examples are easier and less expensive to obtain than labeled examples. Semisupervised approaches are used to utilize such examples in an effort to boost the predictive performance. This paper proposes a novel semisupervised classification method named transductive least squares support vector machine (TLS-SVM), which is based on the least squares support vector machine. The proposed method utilizes the difference convex algorithm to derive nonconvex minimization solutions for the TLS-SVM. A generalized cross validation method is also developed to choose the hyperparameters that affect the performance of the TLS-SVM. The experimental results confirm the successful performance of the proposed TLS-SVM.
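
For orientation, the supervised least squares support vector machine that the TLS-SVM builds on can be sketched as a single linear system. The sketch below is not the paper's method: it omits the transductive step and the difference convex algorithm entirely, and the RBF kernel, the parameter names `kernel_gamma` and `reg`, and the XOR toy data are all assumptions made for illustration.

```python
import math

def rbf_kernel(x, z, kernel_gamma=1.0):
    """Gaussian (RBF) kernel; kernel_gamma is an illustrative choice."""
    return math.exp(-kernel_gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def train_ls_svm(X, y, reg=10.0):
    """Solve [[0, y^T], [y, Omega + I/reg]] [b; alpha] = [0; 1].

    Omega[i][j] = y[i] * y[j] * K(X[i], X[j]); labels must be +1 or -1.
    """
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf_kernel(X[i], X[j])
            row.append(omega + (1.0 / reg if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def predict(X, y, b, alpha, x):
    """Sign of the LS-SVM decision function at x."""
    score = b + sum(a * yi * rbf_kernel(xi, x) for a, yi, xi in zip(alpha, y, X))
    return 1 if score > 0 else -1

# Hypothetical toy data: XOR, which is not linearly separable.
X_train = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y_train = [-1, -1, 1, 1]
b, alpha = train_ls_svm(X_train, y_train)
predictions = [predict(X_train, y_train, b, alpha, x) for x in X_train]
```

Because the equality constraints turn the SVM quadratic program into a linear system, training needs no QP solver; every training point receives a nonzero multiplier, so the solution is not sparse.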

Kernel Perceptron Boosting for Effective Learning of Imbalanced Data

  • 오장민;장병탁
    • Proceedings of the Korean Information Science Society Conference / Korean Information Science Society 2001 Spring Conference Proceedings, Vol.28 No.1 (B) / pp.304-306 / 2001
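  • In many real-world problems, general pattern classification algorithms struggle with imbalanced data. Existing techniques that assign equal importance to every training example often fail to capture the characteristics of the problem. This paper proposes a perceptron-based boosting technique to address the imbalanced data problem. Boosting builds an ensemble machine by concentrating on the data that are hard to learn. Boosting requires a weak learner, but depending on the problem a conventional perceptron may fail to satisfy the weak learner condition. We therefore use a kernel perceptron, which introduces a kernel, to increase the representational power of the learner. On a document filtering problem using the Reuters-21578 document collection, the boosting technique outperformed multilayer neural networks and naive Bayes classifiers, and the sampling tendency of boosting was analyzed through experiments on artificial data.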
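
The kernel perceptron named in this abstract can be sketched in a few lines. This is only the base learner in its textbook dual form; the boosting loop, the example reweighting, and the paper's actual kernel are not shown, and the RBF kernel, the `gamma` value, and the XOR toy data are assumptions made for illustration.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def train_kernel_perceptron(X, y, kernel=rbf_kernel, epochs=20):
    """Return per-example mistake counts alpha; labels must be +1 or -1."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, xi in enumerate(X):
            # Dual-form score: weighted sum of kernel values to past mistakes.
            score = sum(a * yj * kernel(xj, xi) for a, yj, xj in zip(alpha, y, X))
            if y[i] * score <= 0:
                alpha[i] += 1  # misclassified: add this example to the expansion
                mistakes += 1
        if mistakes == 0:
            break  # a full pass with no mistakes: training set is separated
    return alpha

def predict(X, y, alpha, x, kernel=rbf_kernel):
    """Classify x with the learned kernel expansion."""
    score = sum(a * yj * kernel(xj, x) for a, yj, xj in zip(alpha, y, X))
    return 1 if score > 0 else -1

# Hypothetical toy data: XOR, which a plain linear perceptron cannot separate.
X_train = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y_train = [-1, -1, 1, 1]
alpha = train_kernel_perceptron(X_train, y_train)
predictions = [predict(X_train, y_train, alpha, x) for x in X_train]
```

The XOR data illustrates the abstract's point about representational power: a linear perceptron never converges on it, while the kernelized version separates it after a few passes.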

A Novel Image Classification Method for Content-based Image Retrieval via a Hybrid Genetic Algorithm and Support Vector Machine Approach

  • Seo, Kwang-Kyu
    • Journal of the Semiconductor & Display Technology / Vol. 10, No. 3 / pp.75-81 / 2011
  • This paper presents a novel method for image classification based on a hybrid genetic algorithm (GA) and support vector machine (SVM) approach which can significantly improve the classification performance for content-based image retrieval (CBIR). Though SVM has been widely applied to CBIR, it has some problems such as the kernel parameters setting and feature subset selection of SVM which impact the classification accuracy in the learning process. This study aims at simultaneously optimizing the parameters of SVM and feature subset without degrading the classification accuracy of SVM using GA for CBIR. Using the hybrid GA and SVM model, we can classify more images in the database effectively. Experiments were carried out on a large-size database of images and experiment results show that the classification accuracy of conventional SVM may be improved significantly by using the proposed model. We also found that the proposed model outperformed all the other models such as neural network and typical SVM models.

Corporate credit rating prediction using support vector machines

  • 이영찬
    • Proceedings of the Korea Intelligent Information Systems Society Conference / Korea Intelligent Information Systems Society 2005 Joint Fall Conference / pp.571-578 / 2005
  • Corporate credit rating analysis has attracted considerable research interest, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the corporate credit rating problem in an attempt to suggest a new model with better explanatory power and stability. To this end, the researcher uses a grid search with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of SVM, the researcher compares its performance with those of multiple discriminant analysis (MDA), case-based reasoning (CBR), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that SVM outperforms the other methods.

Robust Real-time Intrusion Detection System

  • Kim, Byung-Joo;Kim, Il-Kon
    • Journal of Information Processing Systems / Vol. 1, No. 1 / pp.9-13 / 2005
  • Computer security has become a critical issue with the rapid development of business and other transaction systems over the Internet. The application of artificial intelligence, machine learning and data mining techniques to intrusion detection systems has been increasing recently. But most research is focused on improving the classification performance of a classifier. Selecting important features from input data simplifies the problem and leads to faster and more accurate detection. Thus selecting important features is an important issue in intrusion detection. Another issue is that most intrusion detection systems operate off-line, which is not suitable for real-time intrusion detection. In this paper, we develop a real-time intrusion detection system that combines an on-line feature extraction method with the Least Squares Support Vector Machine classifier. Applying the proposed system to KDD CUP 99 data, experimental results show that it has a remarkable feature extraction and classification performance compared to existing off-line intrusion detection systems.

Estimating soil moisture using machine learning approach: A Case Study to Yongdam watershed

  • 응웬딘휘;권현한
    • Proceedings of the Korea Water Resources Association Conference / Korea Water Resources Association 2018 Conference / pp.167-167 / 2018
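  • Soil moisture denotes the average amount of water contained in the soil and is one of the most important hydrologic variables from the perspective of the hydrologic cycle. This study aims to develop a soil moisture prediction technique using the Support Vector Machine (SVM), a representative machine learning method, with remote-sensing-based soil moisture data, precipitation, temperature, and other variables as predictors. SVM is a machine learning method that uses a kernel function to interpret complex nonlinear relationships under a linear assumption, and as a global model it has been applied in a variety of hydrometeorological fields. The advantage of SVM is that, by tolerating a certain amount of error, it generalizes better than the conventional artificial neural network (ANN), and it is particularly well suited for use as a prediction model. The purpose of this study is to fit a model using past soil moisture data together with precipitation, temperature, and satellite-observation-based information, and to extend it to ungauged watersheds. The proposed model is applied to the Yongdam Dam experimental watershed, and its suitability is evaluated by comparison with an existing ANN model and multiple regression analysis.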

Enhance Health Risks Prediction Mechanism in the Cloud Using RT-TKRIBC Technique

  • Konduru, Venkateswara Raju;Bharamgoudra, Manjula R
    • Journal of information and communication convergence engineering / Vol. 19, No. 3 / pp.166-174 / 2021
  • A large volume of patient data is generated from various devices used in healthcare applications. With increase in the volume of data generated in the healthcare industry, more wellness monitoring is required. A cloud-enabled analysis of healthcare data that predicts patient risk factors is required. Machine learning techniques have been developed to address these medical care problems. A novel technique called the radix-trie-based Tanimoto kernel regressive infomax boost classification (RT-TKRIBC) technique is introduced to analyze the heterogeneous health data in the cloud to predict the health risks and send alerts. The infomax boost ensemble technique improves the prediction accuracy by finding the maximum mutual information, thereby minimizing the mean square error. The performance evaluation of the proposed RT-TKRIBC technique is realized through extensive simulations in the cloud environment, which provides better prediction accuracy and less prediction time than those provided by the state-of-the-art methods.
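
The abstract names a Tanimoto kernel without defining it. A minimal sketch of the standard Tanimoto similarity for real-valued vectors follows; how RT-TKRIBC actually uses it inside the radix-trie and infomax boost steps is not stated here, and the function name and the zero-vector convention are assumptions.

```python
def tanimoto_kernel(x, z):
    """Tanimoto similarity: <x, z> / (|x|^2 + |z|^2 - <x, z>)."""
    dot = sum(a * b for a, b in zip(x, z))
    denom = sum(a * a for a in x) + sum(b * b for b in z) - dot
    if denom == 0:
        return 1.0  # assumed convention: two all-zero vectors count as identical
    return dot / denom

# Hypothetical binary feature vectors sharing one active feature out of three in total.
similarity = tanimoto_kernel([1, 0, 1], [1, 1, 0])
```

For binary vectors this reduces to the Jaccard index: shared active features divided by features active in either vector.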

Prediction Model of the Number of Spectators in Korean Baseball League Using Machine Learning

  • 서원빈;길이만
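    • Proceedings of the Korea Institute of Information and Communication Engineering Conference / Korea Institute of Information and Communication Engineering 2019 Spring Conference / pp.330-333 / 2019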
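  • This study proposes, as a time-series model, a GKFN (network with Gaussian kernel functions) model, which differs from the ARIMA model mainly used for attendance prediction. It separately builds an MLP (multilayer perceptron) model based on an analysis of the correlations among several variables, and finally proposes a new model that combines the two with weights based on their RMSE values. The GKFN model computes a smoothness measure for phase space analysis and is trained while increasing the number of kernels. The MLP model uses correlation coefficients to analyze the variables that affect attendance (team-related features such as date and weather) and is built and trained on the highly correlated variables. With these models we aimed to predict the daily attendance of the professional baseball team Kia Tigers. Attendance prediction can be put to positive use by both the club and spectators. The training data were the daily attendance figures of the Kia Tigers over the nine years from 2010 to 2018.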

Respiratory Motion Correction on PET Images Based on 3D Convolutional Neural Network

  • Hou, Yibo;He, Jianfeng;She, Bo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 7 / pp.2191-2208 / 2022
  • Motion blur in PET (positron emission tomography) images induced by respiratory motion reduces imaging quality. Although existing methods perform well for respiratory motion correction in medical practice, many aspects can still be improved. In this paper, an improved 3D unsupervised framework, Res-Voxel, based on the U-Net network is proposed for motion correction. Res-Voxel, with its multiple residual structure, may improve the ability to predict the deformation field, and it uses a smaller convolution kernel to reduce the number of model parameters and the amount of computation required. The proposed method is tested on simulated PET imaging data and clinical data. Experimental results demonstrate that the proposed method achieved Dice indices of 93.81%, 81.75% and 75.10% on the simulated geometric phantom data, voxel phantom data and the clinical data, respectively. It is demonstrated that the proposed method can improve the registration and correction performance of PET images.

Gaussian Process Regression and Its Application to Mathematical Finance

  • 임현철
    • Journal for History of Mathematics / Vol. 35, No. 1 / pp.1-18 / 2022
  • This paper presents a statistical machine learning method that generates the implied volatility surface when market data are scarce. We apply the practitioner's Black-Scholes model and Gaussian process regression to construct a Bayesian inference system that takes observed volatilities as prior information and estimates the posterior distribution of the unobserved volatilities. The variance, rather than the volatility, is the target of the estimation, and the radial basis function is applied to the mean and kernel function of the Gaussian process regression. We present two types of Gaussian process regression methods and empirically analyze them.
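
The posterior estimate described in this abstract can be illustrated with textbook Gaussian process regression in one dimension. The sketch below is not the paper's model: it assumes a zero prior mean (the paper applies a radial basis function to the mean as well), and the hyperparameters, the `noise_var` jitter, and the sample variances are hypothetical values chosen only to make it runnable.

```python
import math

def rbf(x1, x2, length_scale=1.0, signal_var=1.0):
    """Radial basis function (squared exponential) covariance."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior(x_train, y_train, x_new, noise_var=1e-6):
    """Posterior mean and variance at x_new under a zero-mean GP prior."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(a, x_new) for a in x_train]
    weights = solve(K, y_train)              # K^{-1} y
    mean = sum(k * w for k, w in zip(k_star, weights))
    v = solve(K, k_star)                     # K^{-1} k_*
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

# Hypothetical observed variances at three points on an arbitrary axis.
x_obs = [0.0, 1.0, 2.0]
y_obs = [0.040, 0.050, 0.045]
mean_at_obs, var_at_obs = gp_posterior(x_obs, y_obs, 1.0)
mean_far, var_far = gp_posterior(x_obs, y_obs, 10.0)
```

At an observed point the posterior reproduces the observation with near-zero variance, while far from the data the variance returns to the prior level, which is the behavior that makes the approach usable when observations are sparse.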