• Title/Summary/Keyword: Data Principal


A study on the properties of sensitivity analysis in principal component regression and latent root regression (주성분회귀와 고유값회귀에 대한 감도분석의 성질에 대한 연구)

  • Shin, Jae-Kyoung;Chang, Duk-Joon
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.321-328 / 2009
  • In regression analysis, the ordinary least squares estimates of regression coefficients become poor when the correlations among predictor variables are high. This phenomenon, called multicollinearity, causes serious problems in actual data analysis. Many methods have been proposed to overcome multicollinearity, including ridge regression, shrinkage estimators, and methods based on principal component analysis (PCA) such as principal component regression (PCR) and latent root regression (LRR). In the last decade, many statisticians have discussed sensitivity analysis (SA) in ordinary multiple regression and the same topic in PCR, LRR and logistic principal component regression (LPCR). PCA plays an important role in those methods, and many statisticians have discussed SA in PCA and related multivariate methods. We introduce the methods of PCR and LRR, introduce the methods of SA in PCR and LRR, and discuss the properties of SA in PCR and LRR.

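As an illustration of the PCR idea summarized above, here is a minimal sketch, not the authors' procedure. The function name `pcr_fit` and the synthetic, nearly collinear data are assumptions made for the example. It regresses the response on the leading principal component scores and maps the result back to coefficients on the original predictors.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the leading PC scores."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Right singular vectors of the centered data are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # (p, k) loadings
    scores = Xc @ V                          # (n, k) component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                         # coefficients on the original predictors
    intercept = y_mean - x_mean @ beta
    return beta, intercept

# Hypothetical data: x2 is almost a copy of x1, so the predictors are collinear.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + 0.1 * rng.normal(size=n)

beta_full, b0_full = pcr_fit(X, y, 3)   # all components: same fit as OLS
beta_pcr, b0_pcr = pcr_fit(X, y, 2)     # drop the smallest-variance direction
```

Keeping all components reproduces the ordinary least squares fit. Dropping the smallest-variance component removes the direction along which the two collinear predictors can barely be told apart, which is what stabilizes their coefficients.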

A Hashing Method Using PCA-based Clustering (PCA 기반 군집화를 이용한 해슁 기법)

  • Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.3 no.6 / pp.215-218 / 2014
  • In hashing-based methods for approximate nearest neighbor (ANN) search, data points are mapped to k-bit binary codes and nearest neighbors are searched in the binary embedding space. In this paper, we present a hashing method using a PCA-based clustering method, Principal Direction Divisive Partitioning (PDDP). PDDP is a clustering method which repeatedly partitions the cluster with the largest variance into two clusters by using the first principal direction. The proposed hashing method utilizes the first principal direction as a projective direction for binary coding. Experimental results demonstrate that the proposed method is competitive with other hashing methods.
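The abstract does not spell out how the bits are formed, so the following is only one plausible reading, sketched under stated assumptions. Each PDDP split contributes one bit, given by the side of the splitting hyperplane a point falls on, and "largest variance" is taken to mean the sum of per-feature variances. The names `pddp_hash_fit` and `pddp_hash_encode` are hypothetical.

```python
import numpy as np

def first_principal_direction(points):
    """Return the mean and the unit first principal direction of a point set."""
    mean = points.mean(axis=0)
    _, _, Vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, Vt[0]

def total_variance(points):
    """Sum of per-feature variances; clusters with < 2 points count as zero."""
    return points.var(axis=0).sum() if len(points) > 1 else 0.0

def pddp_hash_fit(X, n_bits):
    """Run n_bits PDDP splits; keep each split's (mean, direction) as a hash function."""
    clusters = [X]
    hyperplanes = []
    for _ in range(n_bits):
        # Pick the cluster with the largest total variance and split it in two.
        idx = max(range(len(clusters)), key=lambda i: total_variance(clusters[i]))
        pts = clusters.pop(idx)
        mean, direction = first_principal_direction(pts)
        side = (pts - mean) @ direction > 0
        clusters += [pts[side], pts[~side]]
        hyperplanes.append((mean, direction))
    return hyperplanes

def pddp_hash_encode(X, hyperplanes):
    """Map each row of X to a binary code, one bit per stored hyperplane."""
    bits = [(X - mean) @ direction > 0 for mean, direction in hyperplanes]
    return np.stack(bits, axis=1).astype(np.uint8)

# Hypothetical data: two well-separated blobs in the plane.
rng = np.random.default_rng(0)
left = rng.normal(loc=[-5.0, 0.0], scale=1.0, size=(50, 2))
right = rng.normal(loc=[5.0, 0.0], scale=1.0, size=(50, 2))
X = np.vstack([left, right])

hyperplanes = pddp_hash_fit(X, n_bits=4)
codes = pddp_hash_encode(X, hyperplanes)   # shape (100, 4), entries 0 or 1
```

With this data the first bit separates the two blobs, and later bits refine the partition inside whichever cluster had the largest variance at each step.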

A Study on the Compression and Major Pattern Extraction Method of Origin-Destination Data with Principal Component Analysis (주성분분석을 이용한 기종점 데이터의 압축 및 주요 패턴 도출에 관한 연구)

  • Kim, Jeongyun;Tak, Sehyun;Yoon, Jinwon;Yeo, Hwasoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.4 / pp.81-99 / 2020
  • Origin-destination data have been collected and utilized for demand analysis and service design in various fields such as public transportation and traffic operation. As the utilization of big data becomes important, there is an increasing need to store raw origin-destination data for big data analysis. However, it is not practical to store and analyze the raw data over a long period of time, since the size of the data grows as a power of the number of collection points. To overcome this storage limitation and enable long-period pattern analysis, this study proposes a methodology for compressing origin-destination data and analyzing the compressed data. The proposed methodology is applied to public transit data of Sejong and Seoul. We first measure the reconstruction error and the data size for each truncated matrix. Then, to determine a range of principal components for removing random data, we measure the level of regularity based on covariance coefficients of the demand data reconstructed with each range of principal components. Based on the distribution of the covariance coefficients, we find the range of principal components that covers the regular demand. The ranges are determined as 1~60 and 1~80 for Sejong and Seoul, respectively.
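The trade-off between reconstruction error and stored size for each truncated matrix can be sketched as follows. The matrix layout (rows are days, columns are origin-destination pairs), the relative Frobenius error, the element-count measure of size, and the synthetic demand data are all assumptions for illustration, not the paper's setup.

```python
import numpy as np

def truncate(M, rank):
    """Keep the leading `rank` principal components of the column-centered matrix."""
    mean = M.mean(axis=0)
    U, s, Vt = np.linalg.svd(M - mean, full_matrices=False)
    scores = U[:, :rank] * s[:rank]      # (days, rank)
    loadings = Vt[:rank]                 # (rank, od_pairs)
    return mean, scores, loadings

def compression_report(M, rank):
    """Return (relative reconstruction error, number of stored values)."""
    mean, scores, loadings = truncate(M, rank)
    approx = mean + scores @ loadings
    rel_err = np.linalg.norm(M - approx) / np.linalg.norm(M)
    stored = mean.size + scores.size + loadings.size
    return rel_err, stored

# Hypothetical demand: a weekly pattern per OD pair plus random noise.
rng = np.random.default_rng(0)
n_days, n_pairs = 60, 400
weekly = np.sin(2 * np.pi * np.arange(n_days) / 7)[:, None]
base = rng.uniform(10, 100, size=n_pairs)
amplitude = rng.uniform(0, 10, size=n_pairs)
demand = base + weekly * amplitude + rng.normal(scale=1.0, size=(n_days, n_pairs))

for rank in (1, 5, 20):
    rel_err, stored = compression_report(demand, rank)
    print(f"rank={rank:2d}  rel_err={rel_err:.4f}  stored={stored}  raw={demand.size}")
```

In this toy setting the regular weekly pattern is captured by the first component, and further components mostly reproduce noise, which mirrors the idea of choosing a component range that covers only the regular demand.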

The Comparison of Singular Value Decomposition and Spectral Decomposition

  • Shin, Yang-Gyu
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.1135-1143 / 2007
  • The singular value decomposition and the spectral decomposition are useful matrix computation methods for multivariate techniques such as principal component analysis and multidimensional scaling. These techniques aim to find a simpler geometric structure for the data points, and the two decompositions are the methods used for this purpose. In this paper, the singular value decomposition and the spectral decomposition are compared.

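The abstract does not detail the comparison, but the standard link between the two decompositions in PCA can be checked numerically. In the sketch below, on random data chosen only for illustration, the squared singular values of a centered data matrix equal the eigenvalues of its cross-product matrix, and the right singular vectors equal the eigenvectors up to sign.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Xc = X - X.mean(axis=0)

# Singular value decomposition of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Spectral (eigen) decomposition of the symmetric matrix Xc^T Xc.
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)
order = np.argsort(eigvals)[::-1]        # eigh returns ascending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Squared singular values match the eigenvalues ...
same_values = np.allclose(s**2, eigvals)
# ... and right singular vectors match the eigenvectors up to sign.
same_vectors = np.allclose(np.abs(Vt @ eigvecs), np.eye(4), atol=1e-8)
```

One practical difference is that the SVD route never forms the cross-product matrix, which avoids squaring the condition number of the data.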

A Comparison on Independent Component Analysis and Principal Component Analysis -for Classification Analysis-

  • Kim, Dae-Hak;Lee, Ki-Lak
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.717-724 / 2005
  • We often extract new features from the original features in order to reduce the dimension of the feature space and improve classification. In this paper, we show that a feature extraction method based on independent component analysis can be used for classification. Entropy and mutual information are used to select ordered features. The classification performance of independent component analysis is compared with that of principal component analysis on three real data sets.


Sensitivity Analysis in Principal Component Regression : Numerical Investigation (주성분회귀(主成分回歸)에서의 민감도분석(敏感度分析) : 수치적(數値的) 연구(硏究))

  • Shin, Jae-Kyoung;Tarumi, Tomoyuki;Tanaka, Yutaka
    • Journal of the Korean Data and Information Science Society / v.2 / pp.1-9 / 1991
  • Shin, Tarumi and Tanaka (1989) discussed a method of sensitivity analysis in principal component regression (PCR) based on an influence function derived by Tanaka (1988). The present paper is its continuation. In this paper we first consider two new influence measures, then apply the proposed method to various data sets and discuss some properties of sensitivity analysis in PCR.


Unified Non-iterative Algorithm for Principal Component Regression, Partial Least Squares and Ordinary Least Squares

  • Kim, Jong-Duk
    • Journal of the Korean Data and Information Science Society / v.14 no.2 / pp.355-366 / 2003
  • A unified procedure for principal component regression (PCR), partial least squares (PLS) and ordinary least squares (OLS) is proposed. The process gives solutions for PCR, PLS and OLS in a unified and non-iterative way. This enables us to see the interrelationships among the three regression coefficient vectors, and it is seen that the so-called E-matrix in the solution expression plays the key role in differentiating the methods. In addition to setting out the procedure, the paper also supplies a robust numerical algorithm for its implementation, which is used to show how the procedure performs on a real world data set.


On Useful Principal Component Features for EEG Classification (뇌파 분류에 유용한 주성분 특징)

  • Park, Sungcheol;Lee, Hyekyoung;Park, Seungjin
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.178-180 / 2003
  • An EEG-based brain computer interface (BCI) provides a new communication channel between the human brain and a computer. EEG data is a multivariate time series, so a hidden Markov model (HMM) might be a good choice for classification. However, EEG is very noisy and contains artifacts, so useful features are expected to improve the performance of the HMM. In this paper we address the usefulness of principal component features with a hidden Markov model (HMM). We show that some selected principal component features can suppress small noises and artifacts, and hence improve classification performance. An experimental study on the classification of EEG data during imagination of a left, right, up or down hand movement confirms the validity of our proposed method.


Application of Principal Components Analysis Method to Wireless Sensor Network Based Structural Monitoring Systems

  • Congyi, Zhang;Mission, Jose Leo;Kim, Sung-Ho;Youk, Yui-Su;Kim, Hyeong-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.1 / pp.11-17 / 2008
  • Typical wireless sensor networks used in structural monitoring are continuous types, in which data are transmitted at all times and may include irrelevant and insignificant information. Continuous wireless monitoring systems often have to handle large volumes of data, which may degrade system performance. The proposed method is an event-triggered monitoring system that captures and transmits relevant data only. An error signal generated by Principal Components Analysis (PCA) is used as an index for event detection and selective data transmission. With this new monitoring scheme, the remote server is relieved of unwanted data by receiving only relevant information from the wireless sensor networks. The performance of the proposed scheme was verified with simulation studies.
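A PCA error signal of the kind described above can be sketched as a reconstruction residual compared against a threshold. The class name `PcaEventDetector`, the empirical-quantile threshold, and the simulated four-sensor data are assumptions for illustration. The abstract does not say how the paper defines its error signal or threshold.

```python
import numpy as np

class PcaEventDetector:
    """Flags samples whose PCA reconstruction error exceeds a threshold."""

    def __init__(self, n_components, quantile=0.99):
        self.n_components = n_components
        self.quantile = quantile

    def fit(self, X_normal):
        self.mean_ = X_normal.mean(axis=0)
        _, _, Vt = np.linalg.svd(X_normal - self.mean_, full_matrices=False)
        self.components_ = Vt[: self.n_components]
        # Threshold: an empirical quantile of the training errors (an assumption).
        self.threshold_ = np.quantile(self.error(X_normal), self.quantile)
        return self

    def error(self, X):
        """Squared residual after projecting onto the retained components."""
        Xc = np.atleast_2d(X) - self.mean_
        residual = Xc - (Xc @ self.components_.T) @ self.components_
        return (residual ** 2).sum(axis=1)

    def should_transmit(self, X):
        """True for samples that look like events and are worth sending."""
        return self.error(X) > self.threshold_

# Hypothetical training data: four sensors driven by one shared latent signal.
rng = np.random.default_rng(0)
loadings = np.array([1.0, 0.8, -0.5, 0.3])
latent = rng.normal(size=(500, 1))
X_train = latent * loadings + 0.05 * rng.normal(size=(500, 4))

detector = PcaEventDetector(n_components=1).fit(X_train)

normal_sample = 0.7 * loadings                       # follows the usual pattern
event_sample = normal_sample + np.array([0.0, 0.0, 1.0, 0.0])  # one sensor deviates
flags = detector.should_transmit(np.vstack([normal_sample, event_sample]))
```

A sensor node using such a detector would transmit only the flagged samples, so the server receives data when the readings depart from the correlation structure learned during normal operation.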