• Title/Abstract/Keywords: multivariate data analysis

Analysis of the Florida Vote Results in the 2000 US Presidential Election (Statistical Outliers in Florida Counties at the Presidential Election 2000)

  • 김현철
    • 응용통계연구
    • /
    • Vol. 15, No. 1
    • /
    • pp.21-32
    • /
    • 2002
  • We examined the Florida vote data from the 2000 presidential election using multivariate regression analysis. We found several outliers, including Palm Beach County. This implies that, to argue that the "Butterfly Ballot" made Palm Beach County an outlier, one should analyze the number of disqualified double-punched ballots in addition to the votes.

REGIONAL CLASSIFICATION OF SHIZUOKA PREFECTURE WITH GIS BASED ON THE DATA OF WEATHER DISASTERS

  • HOTTA Asumi;IWASAKI Kazutaka
    • 대한원격탐사학회: Conference Proceedings
    • /
    • 대한원격탐사학회, 2005, Proceedings of ISRS 2005
    • /
    • pp.65-68
    • /
    • 2005
  • For effective disaster prevention, it is necessary to have some idea of when, where, why and what kind of weather disasters may occur, and how large they may be. However, the regional characteristics of Shizuoka Prefecture from the viewpoint of weather disasters have not been studied before. In this study, the authors gathered data on how many times weather disasters occurred in Shizuoka Prefecture over the last fourteen years, and then divided the prefecture into regions using multivariate analysis. The authors applied principal component analysis to these data, and then applied cluster analysis to the principal component scores found to be significant in that analysis. Finally, the authors set the regional division based on these clusters and described the regional characteristics. This study could contribute to weather disaster prevention in Shizuoka Prefecture.
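The two-step procedure described in this abstract (principal component analysis, then clustering of the component scores) can be sketched as follows. This is a minimal illustration and not the authors' procedure: the disaster counts are hypothetical, only the first component is used, and the clustering step is assumed to be a two-group k-means, which the abstract does not specify.

```python
import math

# Hypothetical counts of three disaster types for six municipalities.
counts = [[12, 9, 3], [11, 10, 2], [13, 8, 4],
          [2, 3, 9], [3, 2, 10], [1, 4, 8]]

def first_principal_component(rows, iters=200):
    """Return column means and the leading eigenvector of the covariance matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):  # power iteration on the covariance matrix
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def pc_scores(rows, means, v):
    """Project each centered row onto the component direction v."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]

def two_means_1d(scores, iters=50):
    """Split one-dimensional scores into two clusters (Lloyd's algorithm)."""
    lo, hi = min(scores), max(scores)  # initial cluster centers
    labels = []
    for _ in range(iters):
        labels = [0 if abs(s - lo) <= abs(s - hi) else 1 for s in scores]
        group0 = [s for s, lab in zip(scores, labels) if lab == 0]
        group1 = [s for s, lab in zip(scores, labels) if lab == 1]
        lo = sum(group0) / len(group0)
        hi = sum(group1) / len(group1)
    return labels

means, direction = first_principal_component(counts)
scores = pc_scores(counts, means, direction)
labels = two_means_1d(scores)
print(labels)
```

With these made-up counts, the first three and last three municipalities fall into separate clusters, which would correspond to two candidate regions.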

Matrix Formation in Univariate and Multivariate General Linear Models

  • Arwa A. Alkhalaf
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 24, No. 4
    • /
    • pp.44-50
    • /
    • 2024
  • This paper offers an overview of matrix formation and calculation techniques within the framework of General Linear Models (GLMs). It takes a sequential approach, beginning with a detailed exploration of matrix formation and calculation methods in regression analysis and univariate analysis of variance (ANOVA). Subsequently, it extends the discussion to cover multivariate analysis of variance (MANOVA). The primary objective of this study was to provide a clear and accessible explanation of the underlying matrices that play a crucial role in GLMs. It does so by linking essentially different statistical methods through the fundamental principles and algebraic foundations that underpin GLM estimation. The insights presented here aim to assist researchers, statisticians, and data analysts in enhancing their understanding of GLMs and their practical implementation in diverse research domains. This paper contributes to a better comprehension of the matrix-based techniques that can be extended to GLMs.
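For orientation, the standard textbook matrix forms that such an overview typically covers are summarized below. The notation is an assumption and may differ from the paper's: $\mathbf{X}$ is an $n \times k$ design matrix assumed to have full column rank, and $\mathbf{H}$ denotes the hypothesis sums-of-squares-and-cross-products matrix for the effect being tested.

```latex
% Univariate GLM (regression / ANOVA): one response vector y
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}

% Multivariate GLM (MANOVA): an n x p response matrix Y
\mathbf{Y} = \mathbf{X}\mathbf{B} + \boldsymbol{\mathcal{E}},
\qquad
\hat{\mathbf{B}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}

% Error SSCP matrix and Wilks' lambda for a hypothesis with SSCP matrix H
\mathbf{E} = (\mathbf{Y} - \mathbf{X}\hat{\mathbf{B}})^{\top}(\mathbf{Y} - \mathbf{X}\hat{\mathbf{B}}),
\qquad
\Lambda = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|}
```

The multivariate estimator has the same form as the univariate one, applied column by column to $\mathbf{Y}$; the multivariate part enters through the matrices $\mathbf{E}$ and $\mathbf{H}$ used for testing.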

Discriminant analysis using empirical distribution function

  • Kim, Jae Young;Hong, Chong Sun
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 28, No. 5
    • /
    • pp.1179-1189
    • /
    • 2017
  • In this study, we propose an alternative method for discriminant analysis that uses a multivariate empirical distribution function to express multivariate data as a simple one-dimensional statistic. The method reduces to estimating the optimal threshold based on classification accuracy measures and an empirical distribution function of data composed of classes. It can also be represented visually on a two-dimensional plane and discussed with some measures in ROC curves, surfaces, and manifolds. To explore the usefulness of this method for discriminant analysis, we compared the proposed method with existing methods through simulations and illustrative examples. We find that the proposed method may perform better in some cases.

Resistant Singular Value Decomposition and Its Statistical Applications

  • Park, Yong-Seok;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society
    • /
    • Vol. 25, No. 1
    • /
    • pp.49-66
    • /
    • 1996
  • The singular value decomposition is one of the most useful methods in matrix computation. It yields dimension reduction, which is the central idea in many multivariate analyses. But this method is not resistant, i.e., it is very sensitive to small changes in the input data. In this article, we derive a resistant version of the singular value decomposition for principal component analysis. We also give its statistical application to the biplot, which is similar to principal component analysis in terms of the dimension reduction of an n × p data matrix. We thereby obtain a resistant principal component analysis and a resistant biplot based on the resistant singular value decomposition. They provide graphical multivariate data analyses that are relatively little influenced by outlying observations.

Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances

  • 이범석;김성영
    • 한국가스학회지
    • /
    • Vol. 11, No. 3
    • /
    • pp.13-18
    • /
    • 2007
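  • Multiple linear regression (MLR), a representative multivariate statistical analysis method, was used to regress and predict the flash points of binary mixtures. Predicting the flash point of flammable substances is an important part of assessing fire and explosion hazards in practical chemical process design. In this study, MLR was applied to experimental flash point data for binary mixtures using only the physical properties of the pure components, and the resulting model was used to predict the flash points of new mixtures. To assess the regression performance of MLR for binary-mixture flash points and its predictive performance for new mixtures, the results were compared with estimates from existing flash point estimation methods, namely Raoult's law and the Van Laar equation.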

Analysis of Multivariate System Using Mahalanobis Taguchi System

  • 홍정의;권홍규
    • 산업경영시스템학회지
    • /
    • Vol. 32, No. 1
    • /
    • pp.20-25
    • /
    • 2009
  • The Mahalanobis Taguchi System (MTS) is a pattern information technology that has been used in various diagnostic applications to make quantitative decisions by constructing a multivariate measurement scale using data analytic methods, without any assumption regarding the statistical distribution. The MTS performs Taguchi's fractional factorial design using the Mahalanobis Distance (MD) as a performance metric. In this work, MTS is used to analyze the Wisconsin Breast Cancer data, which has ten attributes. Ten different tests are conducted on the data to determine whether the patient has cancer. MTS is also used to reduce the number of tests and to define the relationship between each attribute and the diagnosis result. The accuracy of diagnosis is compared with that of two previous studies.
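The Mahalanobis distance underlying this kind of measurement scale can be illustrated with a small two-variable sketch. This is not the MTS procedure itself (which typically works with standardized variables and a scaled distance); the reference data and the function name `mahalanobis_sq_2d` are hypothetical, and the 2 × 2 covariance matrix is inverted analytically to keep the example dependency-free.

```python
def mahalanobis_sq_2d(point, reference):
    """Squared Mahalanobis distance of a 2-D point from a reference group."""
    n = len(reference)
    mx = sum(x for x, _ in reference) / n
    my = sum(y for _, y in reference) / n
    # Sample variances and covariance of the reference group.
    sxx = sum((x - mx) ** 2 for x, _ in reference) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in reference) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d' S^{-1} d, with S^{-1} = (1/det) * [[syy, -sxy], [-sxy, sxx]]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical "normal" group with positively correlated measurements.
reference = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]

along = mahalanobis_sq_2d((5, 5), reference)    # follows the correlation
against = mahalanobis_sq_2d((5, 1), reference)  # contradicts the correlation
print(along, against)
```

Both test points lie at the same Euclidean distance from the group mean (3, 3), but the point that contradicts the correlation pattern receives a much larger Mahalanobis distance, which is the property a multivariate diagnostic scale relies on.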

Projection Pursuit K-Means Visual Clustering

  • Kim, Mi-Kyung;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society
    • /
    • Vol. 31, No. 4
    • /
    • pp.519-532
    • /
    • 2002
  • K-means clustering is a well-known partitioning method for multivariate observations. The method has recently been implemented widely in data mining software because of its computational efficiency in handling large data sets. However, it does not yield a suitable visual display of multivariate observations, which is especially important in the exploratory stage of data analysis. The aim of this study is to develop a K-means clustering method that enables visual display of multivariate observations in a low-dimensional space, for which the projection pursuit method is adopted. We propose a computationally inexpensive and reliable algorithm and provide two numerical examples.

Geographical Classification of Angelica gigas using UHPLC-DAD Combined Multivariate Analyses

  • 김정률;이동영;성상현;김진웅
    • 생약학회지
    • /
    • Vol. 44, No. 4
    • /
    • pp.332-335
    • /
    • 2013
  • Geographical classification of A. gigas was performed in the present study using UHPLC-DAD combined with multivariate data analysis techniques. Six active constituents were isolated from A. gigas: nodakenin, marmesin, decursinol, demethylsuberosin, decursin and decursinol angelate. These constituents were simultaneously determined in 168 A. gigas samples using UHPLC-DAD. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to classify the samples according to geographical origin (Korea and China). Samples of A. gigas from Korea and China were correctly classified at rates of 81.6% and 93.8%, respectively, using PLS-DA Y prediction. This result demonstrates the potential of UHPLC-DAD combined with multivariate analysis techniques as an accurate and rapid method for classifying A. gigas according to geographical origin.

THE USE OF MULTIVARIATE STATISTICS TO EVALUATE THE RESPONSE OF RICE STRAW VARIETIES TO CHEMICAL TREATMENT

  • Vadiveloo, J.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 9, No. 1
    • /
    • pp.83-89
    • /
    • 1996
  • Multivariate statistical procedures were used to analyse data on the chemical composition and in vitro digestibility of four varieties of rice straw after treatment with 4% NaOH solution, 4% urea solution or distilled water (control) for 48 hours. For each treatment, stepwise discriminant analysis identified the variables that maximized differences between varieties, and the eigenvectors from principal component analysis quantified the contribution of these criterion variables to varietal differences. The overall response of varieties to chemical treatment was demonstrated qualitatively, by cluster analysis, and quantitatively, from the magnitude of the principal component scores. The analysis revealed that the urea and control treatments elicited the same response, whereas NaOH had the greatest effect on the poorest straw variety. Similar analyses conducted on the botanical fractions of the varieties showed that the relative response of the inflorescence, stem, leaf blade and leaf sheath fractions was not altered by chemical treatment.