• Title/Summary/Keyword: vector data

An Outlier Data Analysis using Support Vector Regression (Support Vector Regression을 이용한 이상치 데이터분석)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.6 / pp.876-880 / 2008
  • Outliers are observations that are much larger or smaller than most observations in a given data set. They arise from various sources. The result of an analysis that includes outliers may depend heavily on them. In general, data analysis is carried out after removing outliers. However, in data mining applications such as fraud detection and intrusion detection, outliers are kept in the training data because they carry crucial information. In regression, simple and multiple regression models require outliers to be eliminated from the training data, using standardized and studentized residuals, to construct a good model. In this paper, we use support vector regression (SVR), based on statistical learning theory, to analyze data with outliers in regression. We verify the improved performance of our approach through experiments on synthetic data sets.
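
    A minimal sketch of why SVR can be less sensitive to outliers than least-squares regression, assuming the standard ε-insensitive loss. The abstract does not state the loss or its settings, so the `eps` value and the residuals below are hypothetical illustrations, not the paper's setup.

```python
def squared_loss(residual):
    """Least-squares loss: grows quadratically with the residual."""
    return residual ** 2


def eps_insensitive_loss(residual, eps=0.1):
    """Standard SVR loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(residual) - eps)


# Hypothetical residuals; the last one plays the role of an outlier.
residuals = [0.05, -0.5, 10.0]

for r in residuals:
    print(f"residual={r:6.2f}  squared={squared_loss(r):8.3f}  "
          f"eps-insensitive={eps_insensitive_loss(r):6.3f}")
```

    Under this assumption, the outlying residual contributes linearly rather than quadratically to the objective, so a single extreme point pulls the fitted function less than it would in ordinary least squares.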

Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process

  • Hwang, Changha
    • Journal of the Korean Data and Information Science Society / v.27 no.2 / pp.523-530 / 2016
  • Most classical control charts assume that processes are serially independent, and autocorrelation among variables makes them unreliable. To address this issue, a variety of statistical approaches have been employed to estimate the serial structure of the process. In this paper, we propose a multioutput least squares support vector regression and apply it to construct a residual multivariate cumulative sum control chart for detecting changes in the process mean vector. Numerical studies demonstrate that the proposed control chart, based on multioutput least squares support vector regression, gives more satisfactory results in detecting small shifts in the process mean vector.

Issues Related to the Use of Time Series in Model Building and Analysis: Review Article

  • Wei, William W.S.
    • Communications for Statistical Applications and Methods / v.22 no.3 / pp.209-222 / 2015
  • Time series are used in many studies for model building and analysis. We must be very careful to understand the kind of time series data used in the analysis. In this review article, we begin with some issues related to the use of aggregate and systematic sampling time series. Since several time series are often used in a study of the relationship of variables, we also consider vector time series modeling and analysis. Although the basic procedures of model building for univariate time series and vector time series are the same, some important phenomena are unique to vector time series. Therefore, we also discuss some issues related to vector time series models. Understanding these issues is important when we use time series data in modeling and analysis, regardless of whether the series is univariate or multivariate.

Support Vector Machine based Cluster Merging (Support Vector Machines 기반의 클러스터 결합 기법)

  • Choi, Byung-In;Rhee, Frank Chung-Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.3 / pp.369-374 / 2004
  • A cluster merging algorithm is proposed that merges the convex clusters resulting from the Fuzzy Convex Clustering (FCC) method into non-convex clusters. This is achieved by proposing a fast and reliable distance measure between two convex clusters using Support Vector Machines (SVM), improving accuracy and speed over existing conventional methods. In doing so, it is possible to reduce the number of clusters without losing their representation of the data. In this paper, results for several data sets are given to show the validity of our distance measure and algorithm.

Modeling properties of self-compacting concrete: support vector machines approach

  • Siddique, Rafat;Aggarwal, Paratibha;Aggarwal, Yogesh;Gupta, S.M.
    • Computers and Concrete / v.5 no.5 / pp.461-473 / 2008
  • The paper explores the potential of the Support Vector Machines (SVM) approach in predicting the 28-day compressive strength and slump flow of self-compacting concrete. A total of 80 data points collected from the existing literature were used in the present work. To compare the performance of the technique, prediction was also done using a back propagation neural network model. For this data set, the RBF kernel worked well in comparison to polynomial kernel based support vector machines and provided a root mean square error of 4.688 MPa (correlation coefficient = 0.942) for 28-day compressive strength prediction and a root mean square error of 7.825 cm (correlation coefficient = 0.931) for slump flow. The results obtained for RMSE and correlation coefficient suggest that the Support Vector Machine approach performs comparably to the neural network approach for both 28-day compressive strength and slump flow prediction.

Selective Encryption Algorithm Using Hybrid Transform for GIS Vector Map

  • Van, Bang Nguyen;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Information Processing Systems / v.13 no.1 / pp.68-82 / 2017
  • Nowadays, geographic information systems (GIS) are developed and implemented in many areas. A huge volume of vector map data has been accessed unlawfully by hackers, pirates, or unauthorized users. For this reason, we need methods that help to protect GIS data for storage, multimedia applications, and transmission. In our paper, a selective encryption method is presented based on vertex randomization and hybrid transform in the GIS vector map. In the proposed algorithm, polylines and polygons are the targets for encryption. Objects are classified in each layer, and all coordinates of the significant objects are encrypted by key sets generated using a chaotic map before changing them in the DWT and DFT domains. Experimental results verify efficient visualization owing to low complexity and high security performance owing to the random processes.

LS-SVM for large data sets

  • Park, Hongrak;Hwang, Hyungtae;Kim, Byungju
    • Journal of the Korean Data and Information Science Society / v.27 no.2 / pp.549-557 / 2016
  • In this paper we propose a multiclassification method for large data sets that ensembles least squares support vector machines (LS-SVM) with principal components instead of the raw input vectors. We use the revised one-vs-all method for multiclassification, which is a voting scheme based on combining several binary classifications. The revised one-vs-all method is performed using the hat matrix of the LS-SVM ensemble, which is obtained by ensembling LS-SVMs each trained on a random sample from the whole large training data. The leave-one-out cross validation (CV) function is used to find the optimal values of the hyper-parameters that affect the performance of the multiclass LS-SVM ensemble. We present the generalized cross validation function to reduce the computational burden of the leave-one-out CV functions. Experimental results from real data sets are then obtained to illustrate the performance of the proposed multiclass LS-SVM ensemble.

ECG Data Compression Technique Using Wavelet Transform and Vector Quantization on PMS-B Algorithm (웨이브렛 변환과 평균예측검색 알고리즘의 벡터양자화를 이용한 심전도 데이터 압축기법)

  • Eun, J.S.;Shin, J.
    • Proceedings of the KOSOMBE Conference / v.1996 no.11 / pp.225-228 / 1996
  • ECG data are used for diagnostic purposes in many clinical situations, especially heart disease. In this paper, an efficient ECG data compression technique using the wavelet transform and high-speed vector quantization based on the PMS-B algorithm is proposed. In general, ECG data compression techniques are divided into two categories: direct and transform methods. The direct data compression techniques include the AZTEC, TP, CORTES, FAN and SAPA algorithms, while the transform methods include the K-L, Fourier, Walsh, and wavelet transforms. In this paper, we applied wavelet analysis to the ECG data. In particular, vector quantization based on the PMS-B algorithm was applied to the wavelet coefficients in the higher frequency regions, whereas the coefficients in the lower frequency regions were scalar quantized by PCM. Finally, the quantized indices were compressed by an LZW lossless entropy encoder. Simulation results show that the technique achieves a sufficient compression ratio while keeping a clinically acceptable PRD.

Enhancing Gene Expression Classification of Support Vector Machines with Generative Adversarial Networks

  • Huynh, Phuoc-Hai;Nguyen, Van Hoa;Do, Thanh-Nghi
    • Journal of information and communication convergence engineering / v.17 no.1 / pp.14-20 / 2019
  • Currently, microarray gene expression data are used for the effective classification of cancers, which helps address problems relating to cancer causes and treatment regimens. However, the sample size of gene expression data is often restricted, because the cost of microarray technology in human studies is high. We propose enhancing the gene expression classification of support vector machines with generative adversarial networks (GAN-SVMs). A GAN that generates new data from the original training datasets was implemented. The GAN was used in conjunction with nonlinear SVMs that efficiently classify gene expression data. Numerical test results on 20 low-sample-size and very high-dimensional microarray gene expression datasets from the Kent Ridge Biomedical and Array Expression repositories indicate that the model is more accurate than state-of-the-art classifying models.

Support Vector Median Regression

  • Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society / v.14 no.1 / pp.67-74 / 2003
  • Median regression analysis has robustness properties which make it an attractive alternative to regression based on the mean. The support vector machine (SVM) is widely used in real-world regression tasks. In this paper, we propose a new SV median regression based on the check function. We illustrate how the proposed SVM performs and compare it with the SVM based on the absolute deviation loss function.
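
    For orientation, the check function is usually written as below. This is the standard quantile regression definition and is an assumption here, since the abstract does not give the paper's exact formulation; the symbols $r$ (residual) and $\tau$ (quantile level) are introduced only for illustration.

```latex
\rho_\tau(r) = r\,\bigl(\tau - \mathbb{1}\{r < 0\}\bigr)
  = \begin{cases}
      \tau\, r,        & r \ge 0, \\
      (\tau - 1)\, r,  & r < 0,
    \end{cases}
\qquad
\rho_{1/2}(r) = \tfrac{1}{2}\,\lvert r \rvert .
```

    Under this definition the median case $\tau = 1/2$ is the absolute deviation loss scaled by one half.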