• Title/Abstract/Keywords: Multivariate Data

Search results: 1,996 items (processing time: 0.028 seconds)

Fault Detection Method for Multivariate Process using Mahalanobis Distance and ICA

  • 정승환;김성신
    • Journal of Korea Institute of Information, Electronics, and Communication Technology / Vol. 14, No. 1 / pp. 22-28 / 2021
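  • Multivariate processes such as chemical processes, mechanical processes, and power plants operate with many facilities connected in complex ways, so a fault in one system can have a critical impact on the entire process. In addition, process data are measured in unstable environments and are therefore likely to contain outliers. Monitoring technology that removes outliers from the measured data and detects system faults in advance is thus essential. To perform fault detection on several types of processes, this paper uses data generated from a dynamic process model and a multivariate process model. The dynamic process models a process with autoregressive characteristics, and the multivariate process describes the situation in which a particular sensor fails. For the data generated from the two processes, outliers are first removed using the Mahalanobis distance, and independent component analysis (ICA) is then applied to perform fault detection. The proposed method is compared with the conventional single-model ICA. In the experiments, the proposed method improved on conventional ICA by 0.84%p for bias data and 6.82%p for drift data in the dynamic process, and by 3.78%p in the multivariate process, showing superior fault detection performance.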
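
The outlier-removal step described in this abstract can be illustrated with a minimal sketch. It assumes two variables only (so the covariance matrix can be inverted by hand) and a hypothetical cutoff of 3.0; the function names and the threshold are illustrative assumptions, not taken from the paper, and the subsequent ICA step is not shown.

```python
import math

def mean_and_cov_2d(points):
    """Sample mean and 2x2 sample covariance (n - 1 denominator) of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of one point, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # d^2 = [dx dy] S^{-1} [dx dy]^T, with S^{-1} = (1/det) [[syy, -sxy], [-sxy, sxx]]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def remove_outliers(points, threshold=3.0):
    """Keep points whose distance is at most threshold (an assumed cutoff)."""
    mean, cov = mean_and_cov_2d(points)
    return [p for p in points if mahalanobis_2d(p, mean, cov) <= threshold]
```

Because the mean and covariance are estimated from the same data being screened, the largest attainable distance is bounded by (n - 1)/sqrt(n), so a fixed cutoff such as 3.0 only makes sense for reasonably large samples.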

Multivariate control charts for monitoring correlation coefficients in dispersion matrix

  • Chang, Duk-Joon;Heo, Sun-Yeong
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 5 / pp. 1037-1044 / 2012
  • Multivariate control charts for effectively monitoring every component in the dispersion matrix of a multivariate normal process are considered. The numerical results show that multivariate control charts based on the sample statistic $V_i$ by Hotelling or $W_i$ by Alt do not work effectively when the correlation coefficient components of the dispersion matrix increase. We propose a combined procedure that monitors every component of the dispersion matrix by simultaneously operating two control charts: one controlling the variance components and one controlling the correlation coefficients. Our numerical results show that the proposed combined procedure is efficient for detecting changes in both the variances and the correlation coefficients of the dispersion matrix.

Multivariate Cumulative Sum Control Chart for Dispersion Matrix

  • 장덕준;신재경
    • Journal of the Korean Data and Information Science Society / Vol. 13, No. 2 / pp. 21-29 / 2002
  • Several control statistics for simultaneously monitoring the dispersion matrix of several quality variables are presented, since different control statistics can be used to describe variability. Multivariate cumulative sum (CUSUM) control charts are proposed, and their performance is evaluated in terms of average run length (ARL). Multivariate Shewhart charts are also proposed for comparison with the proposed CUSUM charts. The numerical results show that multivariate CUSUM charts are more efficient than multivariate Shewhart charts for small or moderate shifts. We also found that a small reference value for the CUSUM chart is more efficient for small shifts.
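
The roles of the reference value and the run length in this abstract can be seen in a generic one-sided CUSUM recursion. The sketch below assumes a scalar control statistic has already been computed for each sample; the abstract does not specify the paper's dispersion-matrix statistics, so the function and parameter names here are hypothetical.

```python
def cusum_run_length(stats, reference_value, decision_limit):
    """Return the 1-based sample index of the first signal, or None if no signal.

    Generic upper CUSUM: S_t = max(0, S_{t-1} + (y_t - k)), signalling when S_t > h,
    where k is reference_value and h is decision_limit.
    """
    s = 0.0
    for t, y in enumerate(stats, start=1):
        s = max(0.0, s + (y - reference_value))
        if s > decision_limit:
            return t
    return None
```

The ARL is the average of such run lengths over many simulated sequences. A smaller reference value lets small sustained shifts accumulate faster, which is consistent with the abstract's last finding.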

Relevance of Multivariate Analysis in Management Research

  • Ojha, Sateesh Kumar
    • Journal of Information Technology Applications and Management / Vol. 23, No. 3 / pp. 25-34 / 2016
  • Research often reaches misleading conclusions when variables are not analyzed properly. Across the functional areas of management, it is essential that all latent and observed variables be properly understood so that management decisions are relevant and effective. The objective of this paper is to investigate the use of different multivariate tools for analysis in management research, whether applied or basic. The data sources are both primary and secondary. The primary source is the observation of research articles in the proceedings of various conferences, and the secondary source is publications related to multivariate analysis. The study reveals the reasons such research tools are not used. The preliminary finding is that most studies do not use such analytical tools in a comprehensive manner. Carelessness in fixing the research design is the main reason an appropriate design is not used.

A Development of Multivariate Analysis System by Using Excel

  • 한상태;강현철;한정훈
    • The Korean Journal of Applied Statistics / Vol. 17, No. 1 / pp. 165-172 / 2004
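  • Recently, research on implementing multivariate data analysis as a system has been conducted from various angles. A common feature of these studies is that they provide a GUI (Graphical User Interface) system that lets general users conveniently apply advanced statistical analysis techniques. As an extension of this work, this study develops a system using Excel, the office program most widely used across all sectors of society, so that general users can easily perform multivariate data analysis interactively.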

A fast approximate fitting for mixture of multivariate skew t-distribution via EM algorithm

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods / Vol. 27, No. 2 / pp. 255-268 / 2020
  • A mixture of multivariate canonical fundamental skew t-distributions (CFUST) has been of interest in various fields, notably in the unsupervised learning community. However, fitting the model via the EM algorithm suffers from significant processing time, mainly because many multivariate t-cdfs (cumulative distribution functions) must be calculated in the E-step. In this article, we provide an approximate but fast method that calculates them in univariate fashion, as the product of successively conditional univariate t-cdfs with Taylor's first-order approximation. By replacing all multivariate t-cdfs in the E-step with the proposed approximate versions, we obtain admissible results for fitting the model, with an 85% reduction in time for the 5-dimensional skewness case of the Australian Institute of Sport data set. Rough properties, advantages, and limits of this approach are also discussed.
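
The "product of successively conditional" cdfs mentioned in this abstract can be read against the exact chain-rule factorization of a joint cdf. The symbols $T_j$, $a_j$, and $p$ below are notation introduced here for illustration, and the identity assumes each conditioning event has positive probability. How the paper approximates each factor by a univariate t-cdf with a first-order Taylor expansion is not given in the abstract and is not reproduced here.

$$
P(T_1 \le a_1, \dots, T_p \le a_p) \;=\; \prod_{j=1}^{p} P\bigl(T_j \le a_j \,\bigm|\, T_1 \le a_1, \dots, T_{j-1} \le a_{j-1}\bigr)
$$

For $j = 1$ the conditioning set is empty, so the first factor is simply the marginal cdf $P(T_1 \le a_1)$.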

Detecting Influential Observations in Multivariate Statistical Analysis of Incomplete Data by PCA

  • 김현정;문승호;신재경
    • The Korean Journal of Applied Statistics / Vol. 13, No. 2 / pp. 383-392 / 2000
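  • Since the late 1970s, influence analysis and sensitivity analysis have been studied for various multivariate methods, including regression analysis, in order to detect influential observations. Such studies are also needed for incomplete data containing missing values. In this regard, Kim et al. (1998) presented a methodological study of sensitivity analysis in multivariate methods for incomplete data, focusing on the maximum likelihood estimates of the mean vector and the variance-covariance matrix. Whereas Kim et al. (1998) used the Cook's D statistic, this paper examines a method for detecting influential observations in multivariate data with missing values using principal components. The missing values are imputed by the EM algorithm to derive the PCA statistics.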

GEOSTATISTICAL INTEGRATION OF HIGH-RESOLUTION REMOTE SENSING DATA IN SPATIAL ESTIMATION OF GRAIN SIZE

  • Park, No-Wook;Chi, Kwang-Hoon;Jang, Dong-Ho
    • Korean Society of Remote Sensing: Conference Proceedings / Korean Society of Remote Sensing 2006, Proceedings of ISRS 2006 PORSEC Volume I / pp. 406-408 / 2006
  • Geological thematic maps such as grain size or groundwater level maps are generated by interpolating sparsely sampled ground survey data. When data are sampled at only a limited number of locations, secondary information that is correlated with the primary variable can help estimate the attribute values of the primary variable at unsampled locations. This paper applies two multivariate geostatistical algorithms to integrate remote sensing imagery with sparsely sampled ground survey data for spatial estimation of grain size: simple kriging with local means and kriging with an external drift. High-resolution IKONOS imagery, which is well correlated with the grain size, is used as secondary information. The algorithms are evaluated in a case study with grain size observations measured at 53 locations on Baramarae beach in Anmyeondo, Korea. Leave-one-out cross validation is used to compare the estimation performance of the two multivariate geostatistical algorithms with that of traditional ordinary kriging.
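
The leave-one-out comparison described in this abstract can be sketched as follows. To stay self-contained, the sketch swaps in inverse-distance weighting as a stand-in predictor; it is not kriging and not the authors' method, and all function names are hypothetical. Any predictor with the same call signature could be evaluated the same way.

```python
import math

def idw_predict(train, target_xy, power=2.0):
    """Inverse-distance-weighted estimate at target_xy from (x, y, value) samples.

    Stand-in predictor for illustration only; a kriging estimator would go here.
    """
    num = 0.0
    den = 0.0
    for x, y, value in train:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value  # exact hit on a sampled location
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def leave_one_out_rmse(samples, predict):
    """Hold out each sample in turn, predict it from the rest, and return the RMSE."""
    squared_errors = []
    for i, (x, y, value) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        estimate = predict(train, (x, y))
        squared_errors.append((estimate - value) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Running `leave_one_out_rmse` once per candidate predictor on the same samples gives directly comparable error figures, which is the kind of comparison the abstract describes across the three kriging variants.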

Multivariate Nonparametric Tests for Grouped and Right Censored Data

  • Park Hyo-Il;Na Jong-Hwa;Hong Seungman
    • International Journal of Reliability and Applications / Vol. 6, No. 1 / pp. 53-64 / 2005
  • In this paper, we propose a nonparametric test procedure for multivariate, grouped, and right-censored data in the two-sample problem. To construct the test statistic, we use linear rank statistics for each component and apply the permutation principle to obtain the null distribution. For the large-sample case, the asymptotic distribution is derived under the null hypothesis with the additional assumption that the two censoring distributions are also equal. Finally, we illustrate our procedure with an example and offer some concluding remarks. In the appendices, we derive the expression for the covariance matrix and prove the asymptotic distribution.
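
The permutation principle mentioned in this abstract can be illustrated in a much simpler setting than the paper's. The sketch below assumes univariate, uncensored, ungrouped data with no ties and uses the rank sum of the first sample as the linear rank statistic. These simplifications and the function names are assumptions for illustration, not the proposed procedure.

```python
from itertools import combinations

def ranks_without_ties(values):
    """Ranks 1..n of the values, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks

def permutation_p_value(sample_a, sample_b):
    """Exact one-sided (upper-tail) permutation p-value for the rank sum of sample_a.

    Enumerates every reassignment of the pooled observations to the two groups,
    so it is only practical for small samples.
    """
    pooled = list(sample_a) + list(sample_b)
    ranks = ranks_without_ties(pooled)
    m = len(sample_a)
    observed = sum(ranks[:m])
    total = 0
    at_least_as_large = 0
    for indices in combinations(range(len(pooled)), m):
        total += 1
        if sum(ranks[i] for i in indices) >= observed:
            at_least_as_large += 1
    return at_least_as_large / total
```

With two samples of three observations there are 20 possible group assignments, so the smallest attainable p-value is 1/20.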

A Resetting Scheme for Process Parameters using the Mahalanobis-Taguchi System

  • Park, Chang-Soon
    • The Korean Journal of Applied Statistics / Vol. 25, No. 4 / pp. 589-603 / 2012
  • The Mahalanobis-Taguchi system (MTS) is a statistical tool for classifying normal and abnormal groups in multivariate data structures. In addition to the classification itself, the MTS provides a method for selecting variables useful for the classification. This method is especially efficient when the abnormal group data are scattered without a specific directionality. When a feedback adjustment procedure that controls process input variables through measurements of the process output is not practically possible, a reset procedure can be an alternative. This article proposes a reset procedure using the MTS, together with a method that uses contributions to identify the input variables to reset. Identifying root-cause parameters with the existing dimension-reduced contribution tends to be difficult because of the variety of correlation relationships in multivariate data structures. However, an improved decision becomes possible when it is used together with the location-centered contribution and the individual-parameter contribution.