• Title/Summary/Keyword: Principal component biplot

Independent Component Biplot (독립성분 행렬도)

  • Lee, Su Jin;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics, v.27 no.1, pp.31-41, 2014
  • The biplot is a useful graphical method for simultaneously exploring the rows and columns of a two-way data matrix. In particular, the principal component factor biplot describes the interrelationships among many variables in terms of a few underlying but unobservable random variables called factors. When the unobservable variables are mutually independent and non-Gaussian, principal component factor analysis cannot clearly describe these interrelationships. In this study, we therefore apply the independent component analysis of Jutten and Herault (1991), which decomposes a mixture of non-Gaussian signals into its independent components, and we propose an independent component biplot for interpreting the independent component analysis graphically.

Resistant h-Plot for a Sample Variance-Covariance Matrix

  • Park, Yong-Seok
    • Journal of the Korean Statistical Society, v.24 no.2, pp.407-417, 1995
  • The h-plot is a graphical technique for displaying the structure of one population's variance-covariance matrix. It follows the mathematical algorithm of the principal component biplot, which is based on the singular value decomposition. However, the singular value decomposition is known not to be resistant, i.e., it is very sensitive to small changes in the input data. In this article, since the mathematical algorithm of the h-plot is equivalent to that of the principal component biplot of Choi and Huh (1994), we derive the resistant h-plot.
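
As a rough illustration of the SVD-based construction mentioned in this abstract, the sketch below computes row and column markers for a principal component biplot with NumPy. The function name `pc_biplot`, the scaling by sqrt(n - 1), and the toy data are assumptions chosen for illustration; they follow a common textbook factorization and are not taken from the paper. With this scaling the column markers H satisfy H Hᵀ ≈ S, the sample variance-covariance matrix, which is the quantity an h-plot displays.

```python
import numpy as np

def pc_biplot(X, k=2):
    """Return row markers G (n x k), column markers H (p x k), and centered data."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                      # column-center the data matrix
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    G = np.sqrt(n - 1) * U[:, :k]                # row markers (standardized PC scores)
    H = Vt[:k].T * d[:k] / np.sqrt(n - 1)        # column markers; H @ H.T approximates cov(X)
    return G, H, Xc

# Toy data of exact rank 2 (hypothetical), so the 2-D biplot reproduces it exactly.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 4))
G, H, Xc = pc_biplot(X)
```

For real data of higher rank, `G @ H.T` is only the best rank-2 least-squares approximation of the centered matrix. Because it is built on the ordinary SVD, a single outlying row can change G and H noticeably, which is the lack of resistance the abstract refers to.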

Comparison of Shape Variability in Principal Component Biplot with Missing Values

  • Shin, Sang-Min;Choi, Yong-Seok;Lee, Nae-Young
    • The Korean Journal of Applied Statistics, v.21 no.6, pp.1109-1116, 2008
  • Biplots are the multivariate analogue of scatter plots. They are useful for giving a graphical description of the data matrix, for detecting patterns, and for displaying results found by more formal methods of analysis. However, when some values are missing in the data matrix, most biplots are not directly applicable. We are particularly interested in the shape variability of the principal component biplot, the most popular type of biplot, when values are missing. To study this, we estimate the missing data using the EM algorithm and mean imputation at various missing rates. Even after the missing values are estimated, the resulting biplots take different shapes depending on the imputation method and the missing rate. We therefore propose an RMS (root mean square) measure for comparing the shape variability between the original biplots and the estimated biplots.
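
The comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: only mean imputation is shown (the EM step is omitted), and the RMS is taken between the rank-2 fitted matrices of the complete and imputed data, which is one plausible definition that avoids the sign ambiguity of SVD axes. The helper names `mean_impute`, `rank2_fit`, and `rms_difference`, the 10% missing rate, and the random data are all hypothetical.

```python
import numpy as np

def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def rank2_fit(X):
    """Rank-2 approximation of the column-centered matrix (what a 2-D biplot displays)."""
    Xc = X - X.mean(axis=0)
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    return (U[:, :2] * d[:2]) @ Vt[:2]

def rms_difference(A, B):
    """Root mean square of the elementwise differences (assumed definition)."""
    return float(np.sqrt(np.mean((A - B) ** 2)))

rng = np.random.default_rng(1)
X_full = rng.normal(size=(30, 5))            # hypothetical complete data
mask = rng.random(X_full.shape) < 0.1        # roughly 10% of entries set missing
X_miss = X_full.copy()
X_miss[mask] = np.nan
X_imp = mean_impute(X_miss)
rms = rms_difference(rank2_fit(X_full), rank2_fit(X_imp))
```

Repeating this for several missing rates and imputation methods would give a table of RMS values to compare, in the spirit of the study.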

Resistant Singular Value Decomposition and Its Statistical Applications

  • Park, Yong-Seok;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society, v.25 no.1, pp.49-66, 1996
  • The singular value decomposition is one of the most useful methods in matrix computation. It provides dimension reduction, which is the central idea in many multivariate analyses. However, this method is not resistant, i.e., it is very sensitive to small changes in the input data. In this article, we derive a resistant version of the singular value decomposition for principal component analysis. We also apply it to the biplot, which resembles principal component analysis in that it reduces the dimension of an n x p data matrix. We thus obtain a resistant principal component analysis and a resistant biplot based on the resistant singular value decomposition. These provide graphical multivariate data analyses that are relatively little influenced by outlying observations.

SVM-Guided Biplot of Observations and Variables

  • Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods, v.20 no.6, pp.491-498, 2013
  • We consider support vector machines (SVM) to predict Y with p numerical variables $X_1, \ldots, X_p$. This paper aims to build a biplot of the p explanatory variables in which the first dimension indicates the direction of the SVM classification and/or regression fit. We use the geometric scheme of kernel principal component analysis, adapted to map n observations onto a two-dimensional projection plane, one axis of which is determined a priori by an SVM model.

Arrow Diagrams for Kernel Principal Component Analysis

  • Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods, v.20 no.3, pp.175-184, 2013
  • Kernel principal component analysis (PCA) maps observations in a nonlinear feature space to a reduced-dimensional plane of principal components. We do not need to specify the feature space explicitly because the procedure uses the kernel trick. In this paper, we propose a graphical scheme for representing variables in kernel principal component analysis. In addition, we propose an index that measures the importance of individual variables in the principal component plane.
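
The kernel trick mentioned in this abstract can be illustrated with a short NumPy sketch of kernel PCA scores. It assumes a Gaussian (RBF) kernel with a hypothetical bandwidth parameter `gamma`; the paper's kernel choice, its arrow diagrams for variables, and its importance index are not reproduced here. The feature space is never constructed: only the n x n Gram matrix is centered and eigendecomposed.

```python
import numpy as np

def kernel_pca_scores(X, gamma=1.0, k=2):
    """Return kernel PCA scores (n x k) and the centered Gram matrix for an RBF kernel."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    K = np.exp(-gamma * np.maximum(D2, 0.0))         # Gram matrix; no explicit feature map
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                   # centering in feature space
    evals, evecs = np.linalg.eigh(Kc)                # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]              # indices of the k largest
    lam = np.maximum(evals[order], 0.0)
    scores = evecs[:, order] * np.sqrt(lam)          # coordinates of the n observations
    return scores, Kc

rng = np.random.default_rng(2)
X = rng.normal(size=(15, 3))                         # hypothetical data
scores, Kc = kernel_pca_scores(X, gamma=0.5)
```

Plotting the two columns of `scores` gives the principal component plane for the observations; representing the original variables on that plane is the additional step the paper addresses.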

Representing variables in the latent space (분석변수들의 잠재공간 표현)

  • Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics, v.30 no.4, pp.555-566, 2017
  • For multivariate datasets with a large number of variables, classical dimension reduction methods such as principal component analysis may not be effective for data visualization. The underlying reason is that the dimensionality of the space of variables is often larger than two or three, while visualization is most effective for the human eye in two or three dimensions. This paper proposes a working procedure that first partitions the variables into several "latent" clusters, then explores the individual data subsets, and finally integrates the findings. We use the R package "ClustOfVar" for partitioning variables around latent dimensions and the principal component biplot method to visualize within-cluster patterns. Additionally, we use the technique of embedding supplementary variables to examine the relationships between within-cluster variables and outside variables.

The Use of a Biplot in Studying the Career Maturity of College Freshmen (행렬도를 이용한 대학 신입생의 진로의식 분석)

  • Choi, Hye-Mi;Park, Chan-Yong;Lee, Sang-Hyeop;Chung, Sung-Suk
    • The Korean Journal of Applied Statistics, v.23 no.5, pp.933-941, 2010
  • The biplot is a modern graphical methodology for projecting high-dimensional data onto a low-dimensional subspace that is rich in information on variation in the data, correlation among variables, and class separation. To construct biplots, we use the BiplotGUI package in R, a free statistical software environment of increasing popularity. Using data from questionnaires given to Chonbuk National University freshmen in 2009, we study the relationship between career goals and career maturity by applying the biplot method.

Additive Main Effects and Multiplicative Interaction Analysis of Host-Pathogen Relationship in Rice-Bacterial Blight Pathosystem

  • Nayak, D.;Bose, L.K.;Singh, S.;Nayak, P.
    • The Plant Pathology Journal, v.24 no.3, pp.337-351, 2008
  • Host-pathogen interaction in the rice bacterial blight pathosystem was analyzed to better understand the relationship and to recognize stable pathogenicity among populations of Xanthomonas oryzae pv. oryzae. A total of 52 bacterial strains, isolated from diseased leaf samples collected from 12 rice-growing states and one Union Territory of India, were inoculated on 16 rice varieties, each possessing known genes for resistance. Analysis of variance revealed that the host genotypes (G) accounted for the largest proportion (78.4%) of the total sum of squares (SS), followed by 16.5% due to the pathogen isolates (I) and 5.1% due to the $I \times G$ interactions. Application of the Additive Main effects and Multiplicative Interaction (AMMI) model revealed that the first two interaction principal component axes (IPCA) accounted for 66.8% and 21.5% of the interaction SS, respectively. The biplot generated from the isolate and genotypic scores of the first two IPCAs revealed groups of host genotypes and pathogen isolates falling into four sectors. A group of five isolates with high virulence, high absolute IPCA-1 scores, moderate IPCA-2 scores, low values of the AMMI stability index $D_i$, and minimal deviations from additive main effects, as displayed in the AMMI biplot and the response plot, were identified as possessing stable pathogenicity across the 16 host genotypes. The largest group of 27 isolates, with low virulence, small IPCA-1 and IPCA-2 scores, low $D_i$ values, and minimal deviations from additive main effect predictions, possessed stable pathogenicity for low virulence. The AMMI analysis and biplot display facilitated a better understanding of the host-pathogen interaction and of the adaptability of pathogen isolates to specific host genotypes, and helped identify isolates showing stable pathogenicity and the most discriminating host genotypes. These results could be useful in location-specific breeding programs aiming to deploy resistant host genotypes in bacterial blight disease control strategies.
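
The AMMI decomposition summarized in this abstract can be sketched on a two-way table of cell means: remove the additive main effects, then apply the SVD to the interaction residuals. The code below is a minimal sketch under stated assumptions; the function name `ammi_interaction`, the symmetric square-root scaling of the scores, and the random 6 x 4 table are hypothetical and are not the study's data or software. The squared singular values divided by their sum give each IPCA's share of the interaction sum of squares, the kind of quantity reported above as 66.8% and 21.5%.

```python
import numpy as np

def ammi_interaction(Y):
    """Double-center a two-way table and decompose the interaction by SVD."""
    Y = np.asarray(Y, dtype=float)
    grand = Y.mean()
    row_eff = Y.mean(axis=1, keepdims=True) - grand   # additive row main effects
    col_eff = Y.mean(axis=0, keepdims=True) - grand   # additive column main effects
    Z = Y - grand - row_eff - col_eff                 # interaction residuals
    U, d, Vt = np.linalg.svd(Z, full_matrices=False)
    share = d ** 2 / np.sum(d ** 2)                   # share of interaction SS per IPCA
    row_scores = U[:, :2] * np.sqrt(d[:2])            # IPCA-1 and IPCA-2 row scores (assumed scaling)
    col_scores = Vt[:2].T * np.sqrt(d[:2])            # IPCA-1 and IPCA-2 column scores
    return Z, share, row_scores, col_scores

rng = np.random.default_rng(3)
Y = rng.normal(size=(6, 4))                           # hypothetical table, e.g. isolates x genotypes
Z, share, row_scores, col_scores = ammi_interaction(Y)
```

Plotting `row_scores` and `col_scores` on the same axes gives an AMMI biplot of the first two IPCAs; rows and columns with scores near the origin contribute little to the interaction.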

A Comparison Study for Ordination Methods in Ecology (생태학의 통계적 서열화 방법 비교에 관한 연구)

  • Ko, Hyeon-Seok;Jhun, Myoungshic;Jeong, Hyeong Chul
    • The Korean Journal of Applied Statistics, v.28 no.1, pp.49-60, 2015
  • Various ordination methods, such as correspondence analysis and canonical correspondence analysis, are used in community ecology to visualize relationships among species, sites, and environmental variables. Ter Braak (1986), Jackson and Somers (1991), and Parmer (1993) compared ordination methods using eigenvalues and distance graphs. However, these comparisons did not show the relationship between the population and the biplot because they were based only on surveyed data. In this paper, we introduce a method that measures the extent to which a biplot reflects population information, so that ordination methods can be compared objectively.