• Title/Summary/Keyword: Biplot

Resistant h-Plot for a Sample Variance-Covariance Matrix

  • Park, Yong-Seok
    • Journal of the Korean Statistical Society / v.24 no.2 / pp.407-417 / 1995
  • The h-plot is a graphical technique for displaying the structure of one population's variance-covariance matrix. It follows the mathematical algorithm of the principal component biplot, which is based on the singular value decomposition. However, the singular value decomposition is known not to be resistant, i.e., it is very sensitive to small changes in the input data. In this article, since the mathematical algorithm of the h-plot is equivalent to that of the principal component biplot of Choi and Huh (1994), we derive the resistant h-plot.
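
The classical (non-resistant) h-plot described in this abstract can be illustrated with a minimal sketch. It assumes the usual construction in which the p variable markers are the rows of a matrix H whose inner products approximate the variance-covariance matrix S; for a symmetric positive semi-definite S, the eigendecomposition used here coincides with its singular value decomposition. The function name `h_plot_coords` and the example matrix are hypothetical, and the resistant version derived in the article is not shown.

```python
import numpy as np

def h_plot_coords(S, dim=2):
    """Return a p x dim matrix H of variable markers with H @ H.T approximating S."""
    S = np.asarray(S, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dim]       # indices of the dim largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)      # guard against tiny negative round-off
    return eigvecs[:, order] * np.sqrt(lam)       # scale each eigenvector by sqrt(eigenvalue)

# Hypothetical 3 x 3 variance-covariance matrix (illustration only).
S = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

H = h_plot_coords(S, dim=2)                       # 2D markers to plot as vectors from the origin
approx_sd = np.linalg.norm(H, axis=1)             # marker lengths approximate standard deviations
```

In such a display, the cosine of the angle between two markers approximates the correlation between the corresponding variables, to the extent that two dimensions capture S.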

SVM-Guided Biplot of Observations and Variables

  • Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods / v.20 no.6 / pp.491-498 / 2013
  • We consider support vector machines (SVM) to predict Y with p numerical variables $X_1$, $\ldots$, $X_p$. This paper aims to build a biplot of the p explanatory variables in which the first dimension indicates the direction of the SVM classification and/or regression fit. We use the geometric scheme of kernel principal component analysis, adapted to map the n observations onto a two-dimensional projection plane, one axis of which is determined a priori by an SVM model.

Local Projective Display of Multivariate Numerical Data

  • Huh, Myung-Hoe;Lee, Yong-Goo
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.661-668 / 2012
  • For displaying multivariate numerical data on a 2D plane by projection, the principal components biplot and GGobi are the two main tools of data visualization. The biplot is very useful for capturing the global shape of the dataset, representing $n$ observations and $p$ variables simultaneously on a single graph. GGobi shows a dynamic movie of the images of $n$ observations projected onto a sequence of unit vectors floating on the $p$-dimensional sphere. Although both methods are valuable, each has drawbacks: the biplot is too condensed to describe the detailed parts of the data, and GGobi is too burdensome for ordinary data analyses. In this paper, the "local projective display" (LPD) is proposed for visualizing multivariate numerical data. The main steps of the LPD are 1) $k$-means clustering of the data into $k$ subsets, 2) drawing $k$ principal components biplots of the individual subsets, and 3) sequencing the $k$ plots by Hurley's (2004) endlink algorithm for cognitive continuity.
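
Steps 1 and 2 of the procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions: the k-means routine is a plain Lloyd iteration with a simple deterministic initialization, the biplot uses principal component scores and loadings from the SVD of each centered subset, and the function names and simulated data are hypothetical. Step 3 (Hurley's endlink sequencing) is not implemented here.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's algorithm; initial centers are evenly spaced rows of X (an assumption)."""
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(n_iter):
        # Squared distance from every observation to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def pca_biplot(X, dim=2):
    """Return observation markers G (scores) and variable markers H (loadings)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    G = U[:, :dim] * s[:dim]      # n x dim principal component scores
    H = Vt[:dim].T                # p x dim loadings
    return G, H

def local_biplots(X, k):
    """Step 1: cluster into k subsets. Step 2: one PCA biplot per subset."""
    labels = kmeans(X, k)
    plots = {j: pca_biplot(X[labels == j])
             for j in range(k) if np.sum(labels == j) > 2}
    return labels, plots

# Simulated data: two well-separated groups of 30 observations on 3 variables.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (30, 3)), rng.normal(10.0, 1.0, (30, 3))])
labels, plots = local_biplots(X, k=2)
```

Each entry of `plots` holds the coordinates needed to draw one local biplot; how the resulting plots are ordered for display is left to the sequencing step.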

Comparison of Shape Variability in Principal Component Biplot with Missing Values

  • Shin, Sang-Min;Choi, Yong-Seok;Lee, Nae-Young
    • The Korean Journal of Applied Statistics / v.21 no.6 / pp.1109-1116 / 2008
  • Biplots are the multivariate analogue of scatter plots. They are useful for giving a graphical description of the data matrix, for detecting patterns, and for displaying results found by more formal methods of analysis. However, when some values are missing in the data matrix, most biplots are not directly applicable. In particular, we are interested in the shape variability of the principal component biplot, the most popular type of biplot, when values are missing. To this end, we estimate the missing data using the EM algorithm and mean imputation at various missing rates. Even after the missing values of the incomplete data are estimated, the resulting biplots have different shapes depending on the imputation method and the missing rate. We therefore propose an RMS (root mean square) measure for quantifying and comparing the shape variability between the original biplots and the estimated biplots.

A Study on the Use of the Biplot for Multiresponse Data (다반응값 자료에 대한 biplot의 활용에 관한 연구)

  • 장대흥
    • Communications for Statistical Applications and Methods / v.3 no.1 / pp.1-9 / 1996
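  • In response surface analysis, optimizing multiple responses is more complex than optimizing a single response. In such multiresponse problems, it is important to assess the relationships among the response variables and the explanatory variables, as well as their importance. The biplot can serve as a useful graphical tool for this assessment.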

A Study on the Relationship between Player Characteristic Factors and Competitive Factors of Tennis Grand Slams Competition Using Canonical Correlation Biplot and Procrustes Analysis (테니스 그랜드슬램대회의 선수특성요인과 경기요인에 대한 분석연구 -정준상관 행렬도와 프로크러스티즈 분석의 응용-)

  • Choi, Tae-Hoon;Choi, Yong-Seok;Shin, Sang-Min
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.855-864 / 2009
  • The canonical correlation biplot is a 2-dimensional plot for graphically investigating the relationship between two sets of variables, and the relationship between observations and variables, in canonical correlation analysis. Recently, Choi and Choi (2008) suggested a method for investigating the relationship between skill and competition score factors of KLPGA players using the canonical correlation biplot and cluster analysis. Procrustes analysis is a very useful tool for comparing shape between configurations. Therefore, in this study, we provide a method for investigating the relationship between player characteristic factors and competitive factors of tennis Grand Slam competitions using the canonical correlation biplot and Procrustes analysis.

Relationship between Physical Fitness and Basic Skill Factors for KTA Players Using the Partial Canonical Correlation Biplot Removing the Linear Effect of the Set of Covariate Variables and Procrustes Analysis (공변량요인 효과를 제거한 편정준상관 행렬도와 프로크러스티즈 분석을 응용한 남자 테니스선수의 체력요인 및 기초기술요인에 대한 분석연구)

  • Choi, Tae-Hoon;Choi, Yong-Seok
    • Communications for Statistical Applications and Methods / v.19 no.1 / pp.97-105 / 2012
  • The generalized canonical correlation biplot is a 2-dimensional plot for graphically investigating the relationship between more than three sets of variables and the relationship between observations and variables. Recently, Choi and Choi (2010) used this biplot to investigate the relationship among physique, physical fitness and basic skill factors of Korea Tennis Association (KTA) players; here, however, we consider a set of covariate variables that linearly affects two sets of variables. In this case, if we apply the generalized canonical correlation biplot, we cannot clearly interpret the other two sets of variables because of the effect of the set of covariate variables. Yeom and Choi (2011) provided a partial canonical correlation analysis that removes the linear effect of the set of covariate variables on the two sets of variables. In addition, Procrustes analysis is a useful tool for comparing shape between configurations. In this study, we investigate the relationship between physical fitness and basic skill factors of KTA players using a partial canonical correlation biplot and Procrustes analysis. We compare shapes and shape variabilities for the generalized, partial and simple canonical correlation biplots.

Exploratory Analysis of Gene Expression Data Using Biplot (행렬도를 이용한 유전자발현자료의 탐색적 분석)

  • Park, Mi-Ra
    • The Korean Journal of Applied Statistics / v.18 no.2 / pp.355-369 / 2005
  • Genome sequencing and microarray technology produce ever-increasing amounts of complex data that need statistical analysis. Visualization is an effective analytic technique that exploits the ability of the human brain to process large amounts of data. In this study, a biplot approach is applied to microarray data to examine the relationship between genes and samples. A supplementary data method for classifying a new sample into a known category is suggested. The methods are validated by applying them to well-known microarray data sets such as those of Golub et al. (1999), Alizadeh et al. (2000), and Ross et al. (2000). The results are compared with those of several clustering methods. A modified graph that combines a partitioning method with the biplot is also suggested.

A Study on Shape Variability in Canonical Correlation Biplot with Missing Values (결측값이 있는 정준상관 행렬도의 형상변동 연구)

  • Hong, Hyun-Uk;Choi, Yong-Seok;Shin, Sang-Min;Ka, Chang-Wan
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.955-966 / 2010
  • The canonical correlation biplot is useful for giving a graphical description of a data matrix that reflects the association between two sets of variables, for detecting patterns, and for displaying results found by more formal methods of analysis. However, when some values are missing in the data, most biplots are not directly applicable. To solve this problem, we estimate the missing data using the median, mean, EM algorithm and MCMC imputation methods at various missing rates. Even after the missing values of the incomplete data are estimated, the resulting biplots have different shapes depending on the imputation method and the missing rate. We therefore use the RMS (root mean square) proposed by Shin et al. (2007) and the PS (Procrustes statistic) for measuring and comparing the shape variability between the original biplots and the estimated biplots.
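
One common form of a Procrustes statistic for comparing two configurations, such as an original and an estimated biplot, can be sketched as follows. The abstract does not give the exact definition used in the paper, so this is an assumption: both configurations are centered and scaled to unit size, then matched by the best orthogonal transformation (rotation or reflection) and scaling, and the residual sum of squares is reported. The function name and the example coordinates are hypothetical.

```python
import numpy as np

def procrustes_statistic(A, B):
    """Residual sum of squares after optimally matching A to B (0 means identical shape)."""
    A0 = A - A.mean(axis=0)                  # remove location
    B0 = B - B.mean(axis=0)
    A0 = A0 / np.linalg.norm(A0)             # remove size (unit Frobenius norm)
    B0 = B0 / np.linalg.norm(B0)
    # Singular values of A0' B0 give the fit of the best orthogonal match.
    s = np.linalg.svd(A0.T @ B0, compute_uv=False)
    return 1.0 - s.sum() ** 2

# Hypothetical 2D marker coordinates of four points.
A = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])

# Same shape: rotated by 30 degrees, doubled in size, and shifted.
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = 2.0 * A @ R + np.array([3.0, -1.0])

# Different shape: one point moved.
C = A.copy()
C[3] = [2.0, 0.5]

ps_same = procrustes_statistic(A, B)         # essentially zero
ps_diff = procrustes_statistic(A, C)         # strictly positive
```

A statistic of this kind ignores differences in position, orientation and size, so only genuine shape differences between the two biplots contribute to it.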

Resistant Singular Value Decomposition and Its Statistical Applications

  • Park, Yong-Seok;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society / v.25 no.1 / pp.49-66 / 1996
  • The singular value decomposition is one of the most useful methods in matrix computation. It provides dimension reduction, which is the central idea in many multivariate analyses. However, this method is not resistant, i.e., it is very sensitive to small changes in the input data. In this article, we derive a resistant version of the singular value decomposition for principal component analysis. We also give its statistical application to the biplot, which resembles principal component analysis in its dimension reduction of an n x p data matrix. We thus derive the resistant principal component analysis and biplot based on the resistant singular value decomposition. They provide graphical multivariate data analyses that are relatively little influenced by outlying observations.
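
The lack of resistance mentioned in this abstract can be illustrated with a small numerical sketch. The data are hypothetical and chosen so that the effect is easy to see: a single outlying observation is enough to turn the leading principal direction obtained from the ordinary singular value decomposition by 90 degrees. The resistant decomposition derived in the article is not reproduced here.

```python
import numpy as np

def leading_direction(X):
    """First right singular vector of the column-centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

# Eleven observations lying exactly along the first coordinate axis.
t = np.arange(-5.0, 6.0)
clean = np.column_stack([t, np.zeros_like(t)])

# The same data with one outlying observation added.
contaminated = np.vstack([clean, [0.0, 100.0]])

v_clean = leading_direction(clean)          # points along the first axis (up to sign)
v_out = leading_direction(contaminated)     # points along the second axis (up to sign)
```

Because a biplot is built from these singular vectors, a display based on the ordinary decomposition would be dominated by the single outlier in this example.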
