• Title/Abstract/Keywords: multivariate analysis

Search results: 3,128 items (processing time: 0.028 s)

Canonical Correlation Biplot

  • Park, Mi-Ra;Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods / Vol. 3, No. 1 / pp. 11-19 / 1996
  • Canonical correlation analysis is a multivariate technique for identifying and quantifying the statistical relationship between two sets of variables. As with most multivariate techniques, its main objective is to reduce the dimensionality of the dataset. It would be particularly useful if high-dimensional data could be represented in a low-dimensional space. In this study, we construct statistical graphs for paired sets of multivariate data. Specifically, plots of the observations as well as the variables are proposed. We discuss the geometric interpretation and goodness-of-fit of the proposed plots. We also provide a numerical example.

Rank Tests for Multivariate Linear Models in the Presence of Missing Data

  • Lee, Jae-Won;David M. Reboussin
    • Journal of the Korean Statistical Society / Vol. 26, No. 3 / pp. 319-332 / 1997
  • The application of multivariate linear rank statistics to data with item nonresponse is considered. Only a modest extension of the complete data techniques is required when the missing data may be thought of as a random sample, and an appropriate modification of the covariances is derived. A proof of the asymptotic multivariate normality is given. A review of some related results in the literature is presented and applications including longitudinal and repeated measures designs are discussed.

Analyzing Operation Deviation in the Deasphalting Process Using Multivariate Statistics Analysis Method

  • Park, Joo-Hwang;Kim, Jong-Soo;Kim, Tai-Suk
    • 한국멀티미디어학회논문지 / Vol. 17, No. 7 / pp. 858-865 / 2014
  • In systems such as an MES (Manufacturing Execution System), various sensors collect data in real time and store it as big data for process monitoring. Mining this big data in a distributed computing system can improve the whole process. In this paper, a system for analyzing the causes of operation deviation was built using big data collected from the deasphalting process at two different plants. By applying multivariate statistical analysis to the data collected through the MES, the main cause of operation deviation was analyzed, and we present this analysis as an example. Forward stepwise regression yielded a regression equation that explains a 52% increase in performance compared with the existing model. The suggested method can replace the manual analysis currently used in petrochemical processes, which risks being subjective depending on the tester, with an objective analysis based on numbers and statistics.

A Goodness-of-Fit Test for Multivariate Normal Distribution Using Modified Squared Distance

  • Yim, Mi-Hong;Park, Hyun-Jung;Kim, Joo-Han
    • Communications for Statistical Applications and Methods / Vol. 19, No. 4 / pp. 607-617 / 2012
  • The goodness-of-fit test for the multivariate normal distribution is important because most multivariate statistical methods are based on the assumption of multivariate normality. We propose goodness-of-fit test statistics for multivariate normality based on the modified squared distance. The empirical percentage points of the null distribution of the proposed statistics are presented via numerical simulations. We compare the performance of several test statistics through a Monte Carlo simulation.
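
The abstract does not define the "modified" squared distance, so the following is only a sketch of the plain squared Mahalanobis distance on which such statistics are commonly built. The function name `squared_distances`, the restriction to two variables, and the sample points are assumptions made for illustration, not taken from the paper. Under multivariate normality these distances are approximately chi-square distributed with p degrees of freedom (here p = 2).

```python
def squared_distances(points):
    """Squared Mahalanobis distance of each bivariate point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Unbiased sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    out = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^{-1} [dx dy]^T, using the closed-form 2x2 inverse.
        out.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return out


# Hypothetical bivariate sample (illustrative values only).
points = [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2), (4.0, 4.8), (5.0, 6.5), (6.0, 6.9)]
d2 = squared_distances(points)
```

A quick sanity check on any such implementation is that the distances sum to p(n - 1) when the unbiased covariance estimate is used, which gives 10 for this six-point sample.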

A Comparison Study of Multivariate Binary and Continuous Outcomes

  • Pak, Dae-Woo;Cho, Hyung-Jun
    • 응용통계연구 / Vol. 25, No. 4 / pp. 605-612 / 2012
  • Multivariate data with multiple outcomes are often generated in various fields. The outcomes may be a mixture of continuous and discrete variables. Because of this complexity, such data are often handled by applying regression analysis to each outcome separately, even though the outcomes are associated with each other. This univariate approach results in low efficiency of the parameter estimates. We study the efficiency gains of multivariate approaches relative to the univariate approach for mixed data that include continuous and binary outcomes. All approaches yield consistent parameter estimates with complete data. By jointly estimating parameters using multivariate methods, it is generally possible to obtain more accurate estimates than with a univariate approach. The association between continuous and binary outcomes creates a gap in efficiency between the multivariate and univariate approaches. We provide guidance for analyzing mixed data.

주성분분석에 의한 결손 자료의 영향값 검출에 대한 연구 (Detecting Influential Observations in Multivariate Statistical Analysis of Incomplete Data by PCA)

  • 김현정;문승호;신재경
    • 응용통계연구 / Vol. 13, No. 2 / pp. 383-392 / 2000
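  • Since the late 1970s, influence analysis and sensitivity analysis have been studied in a range of multivariate methods, including regression analysis, in order to detect influential observations. Such work is also needed for incomplete data that contain missing values. In this regard, Kim et al. (1998) addressed methods of sensitivity analysis in multivariate analysis of incomplete data, focusing on the maximum likelihood estimates of the mean vector and the variance-covariance matrix. Whereas Kim et al. (1998) used Cook's D statistic, this paper examines a method for detecting influential observations in multivariate data with missing values using principal components. The missing values are imputed by the EM algorithm, and the PCA statistics are then derived.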

순수 성분의 물성 자료를 이용한 2성분계 혼합물의 인화점에 대한 다변량 통계 분석 및 예측 (Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances)

  • 이범석;김성영
    • 한국가스학회지 / Vol. 11, No. 3 / pp. 13-18 / 2007
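  • Multiple linear regression (MLR), a representative method of multivariate statistical analysis, was used to fit and predict the flash points of binary mixtures. Predicting the flash points of flammable substances is an important part of assessing fire and explosion hazards in practical chemical process design. In this study, MLR was applied to experimental flash point data for binary mixtures using only the physical properties of the pure components, and the fitted model was used to predict the flash points of new mixtures. To assess the regression performance of MLR for binary mixtures and its prediction performance for new mixtures, the results were compared with estimates from existing flash point estimation methods, namely Raoult's law and the Van Laar equation.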
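
As a minimal sketch of the multiple linear regression step described in this abstract, the code below fits least-squares coefficients through the normal equations. The helper names (`solve`, `fit_mlr`, `predict`), the two predictors, and the data are hypothetical: the numbers are synthetic, carry no physical meaning, and are not the paper's flash point data or its chosen descriptors.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return [intercept, b1, b2, ...] minimizing the squared error."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def predict(coef, row):
    """Evaluate the fitted linear model at one predictor row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Synthetic placeholder data: two hypothetical predictors per mixture.
rows = [(0.0, 10.0), (0.2, 12.0), (0.4, 11.0), (0.6, 15.0), (0.8, 13.0), (1.0, 16.0)]
y = [2.0 + 3.0 * a - 0.5 * b for a, b in rows]  # exact linear response, for checking
coef = fit_mlr(rows, y)
new_value = predict(coef, (0.5, 12.0))
```

Because the synthetic response is exactly linear, the fit should recover the generating coefficients, which makes the sketch easy to verify. For real data, a library routine based on QR or SVD is usually preferable to the normal equations for numerical stability.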

다변량분석법을 이용한 충청북도 읍면단위 농촌계획 수립을 위한 지역유형구분 분석 (A Classification of Regional Pattern Analysis for the Planning in Chungbuk using Multivariate Analysis)

  • 윤성수;주호길
    • 농촌계획 / Vol. 11, No. 2 / pp. 35-41 / 2005
  • The basic concept of rural planning needs to shift from an economy based on production and sales to the experience of natural resources and traditional culture. Multivariate analysis is required to set the development direction for rural districts. This study examines methods for classifying rural villages using existing data, with a view to applying the results to the principal viewpoints of development. The classification methods used are PCA, CA, and a combination of the two, together with a revised method for localizing the rural districts. We carry out a classification of regional patterns for rural district planning in Chungbuk province.

Box-Cox변환을 이용한 다변량 공정능력 분석 (Analysis of Multivariate Process Capability Using Box-Cox Transformation)

  • 문혜진;정영배
    • 산업경영시스템학회지 / Vol. 42, No. 2 / pp. 18-27 / 2019
  • Process control methods based on statistical analysis apply an analysis method or mathematical model under the assumption that the process characteristic is normally distributed. However, data collected in real time by automatic measurement systems often do not follow a normal distribution. Among statistical analysis tools, the process capability index (PCI) is widely used at production sites as a measure of process capability. However, the PCI has usually been used without testing the process data for normality. If an analysis method that assumes normality is applied even though the assumption is violated, it produces incorrect results and can lead to wrong actions. When the normality assumption is violated, non-normal data can be transformed toward normality using an appropriate normalizing transformation. Of the various transformations available, this paper considers the Box-Cox transformation. The purpose of the study is to extend the analysis method for the multivariate process capability index using the Box-Cox transformation. The study proposes a multivariate process capability index that can be used whether or not the data are normally distributed. Through computational examples, we compare and discuss the multivariate process capability index before and after the Box-Cox transformation when the process data are not normally distributed.
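
The Box-Cox transformation mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it transforms a single positive-valued characteristic and chooses the parameter by a grid search over the profile log-likelihood. The function names, the grid, and the sample values are hypothetical. Transforming each characteristic separately is one simple way to apply this to multivariate data, and the abstract does not say whether the paper does so.

```python
import math


def boxcox(x, lam):
    """Box-Cox transform of a strictly positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return math.log(x)  # limiting case lam -> 0
    return (x ** lam - 1.0) / lam


def profile_loglik(data, lam):
    """Profile log-likelihood of lam under a normal model for the transformed data."""
    n = len(data)
    z = [boxcox(x, lam) for x in data]
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n  # maximum likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in data)


def best_lambda(data, grid=None):
    """Pick lam from a grid by maximizing the profile log-likelihood."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    return max(grid, key=lambda lam: profile_loglik(data, lam))


# Hypothetical right-skewed measurements (illustrative only, not from the paper).
sample = [math.exp(v) for v in (0.1, 0.4, 0.5, 0.9, 1.0, 1.2, 1.5, 1.9, 2.3, 3.0)]
lam_hat = best_lambda(sample)
transformed = [boxcox(x, lam_hat) for x in sample]
```

A capability index would then be computed on `transformed`, with the specification limits passed through the same transform so that both stay on one scale.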

Multivariate analysis of longitudinal surveys for population median

  • Priyanka, Kumari;Mittal, Richa
    • Communications for Statistical Applications and Methods / Vol. 24, No. 3 / pp. 255-269 / 2017
  • This article explores the analysis of longitudinal surveys in which the same units are investigated on several occasions. A multivariate exponential ratio-type estimator is proposed for estimating the finite population median at the current occasion in two-occasion longitudinal surveys. Information on several additional auxiliary variables, which are stable over time and readily available on both occasions, is utilized. Properties of the proposed multivariate estimator, including the optimum replacement strategy, are presented. The proposed estimator is compared with the sample median estimator when there is no matching from a previous occasion, and with the exponential ratio-type estimator in successive sampling when information is available on only one additional auxiliary variable. The merits of the proposed estimator are justified by empirical interpretations and validated by a simulation study on some natural populations.