• Title/Summary/Keyword: multivariate statistics (다변량통계)

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia from multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of change in the heart state over time can be important information for arrhythmia prediction. We therefore investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data if an appropriate time series distance function is selected.
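
As a rough illustration of the 1-nearest neighbor idea in the abstract above, the sketch below classifies a multivariate time series by the label of its closest training series. The abstract does not name the distance function, so dynamic time warping (DTW) is used here purely as an assumption; the function names `dtw_distance` and `predict_1nn` and the toy data layout are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two multivariate series.

    Each series is a list of feature vectors (one vector per time point);
    all vectors must have the same dimension. DTW is an assumed choice of
    time series distance, not necessarily the one used in the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between vectors
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def predict_1nn(train_series, train_labels, query, distance=dtw_distance):
    """Return the label of the training series closest to the query."""
    best = min(range(len(train_series)),
               key=lambda k: distance(train_series[k], query))
    return train_labels[best]
```

Passing a different `distance` callable is one way to compare candidate distance functions under the same classifier.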

Detection of the Change in Blogger Sentiment using Multivariate Control Charts (다변량 관리도를 활용한 블로거 정서 변화 탐지)

  • Moon, Jeounghoon;Lee, Sungim
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.903-913 / 2013
  • Social network services generate a considerable amount of social data every day on personal feelings and thoughts. These social data reveal changing patterns of information production and consumption and are also a tool that reflects social phenomena. We analyze negative emotional words from daily blogs to detect changes in blogger sentiment using multivariate control charts. We used all the blogs produced between 1 January 2008 and 31 December 2009. Hotelling's T-square control chart is commonly used to monitor multivariate quality characteristics; however, it assumes that the quality characteristics follow a multivariate normal distribution. Because the performance of a multivariate control chart is affected by this assumption, we introduce the support vector data description and its extension (the K-control chart) suggested by Sun and Tsung (2003) and apply them to detect the change in blogger sentiment.
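
For readers unfamiliar with the Hotelling's T-square statistic mentioned above, here is a minimal sketch restricted to two quality characteristics so that the covariance matrix can be inverted by hand. The function names and the user-supplied `limit` are assumptions for illustration; a real chart would derive the control limit from an F or chi-square quantile, which is not shown.

```python
def mean_and_cov_2d(data):
    """Sample mean and sample covariance (n - 1 divisor) of 2-D observations."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def hotelling_t2(point, mean, cov):
    """T^2 = d' S^{-1} d for a single 2-D observation, with d = point - mean."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # must be nonzero (nonsingular covariance)
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Closed-form inverse of the 2x2 covariance matrix applied to d.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def flag_signals(points, mean, cov, limit):
    """Indices of observations whose T^2 exceeds a user-supplied control limit."""
    return [i for i, p in enumerate(points) if hotelling_t2(p, mean, cov) > limit]
```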

KCYP data analysis using Bayesian multivariate linear model (베이지안 다변량 선형 모형을 이용한 청소년 패널 데이터 분석)

  • Insun, Lee;Keunbaik, Lee
    • The Korean Journal of Applied Statistics / v.35 no.6 / pp.703-724 / 2022
  • Although longitudinal studies mainly produce multivariate longitudinal data, most existing statistical models analyze univariate longitudinal data and are therefore limited in explaining complex correlations properly. This paper describes various methods of modeling the covariance matrix to explain the complex correlations: the modified Cholesky decomposition, the modified Cholesky block decomposition, and the hypersphere decomposition. We review these methods and analyze the Korean children and youth panel (KCYP) data using the Bayesian method. The KCYP data are multivariate longitudinal data with three response variables: school adaptation, academic achievement, and dependence on mobile phones. Several models are compared under the assumption that the correlation structure and the innovation standard deviation structure are different. In the most suitable model, all explanatory variables are significant for school adaptation and academic achievement, and only household income is insignificant when mobile phone dependence is the response variable.
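
For orientation, the modified Cholesky decomposition named above is commonly written in the following form for a single response measured at $n$ time points. This is a sketch in assumed notation ($T$, $D$, $\phi_{tj}$, $\sigma_t^2$, $\mu_t$ and $\varepsilon_t$ do not appear in the abstract), and the paper's multivariate block version may differ in detail.

```latex
\begin{align*}
T \Sigma T^{\top} &= D, \qquad
D = \operatorname{diag}(\sigma_1^2, \dots, \sigma_n^2), \\
T &= \begin{pmatrix}
1 & & & \\
-\phi_{21} & 1 & & \\
\vdots & \ddots & \ddots & \\
-\phi_{n1} & \cdots & -\phi_{n,n-1} & 1
\end{pmatrix}, \\
y_t - \mu_t &= \sum_{j=1}^{t-1} \phi_{tj}\,(y_j - \mu_j) + \varepsilon_t,
\qquad \operatorname{Var}(\varepsilon_t) = \sigma_t^2 .
\end{align*}
```

Under this parameterization the $\phi_{tj}$ act as autoregressive coefficients and the $\sigma_t$ as innovation standard deviations, and both are unconstrained apart from $\sigma_t > 0$, so the implied $\Sigma$ is positive definite.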

Maximum likelihood estimation in multivariate structural model (다변량구조모형에서 최대우도추정)

  • 김기영
    • The Korean Journal of Applied Statistics / v.1 no.1 / pp.39-44 / 1987
  • For obtaining the m.l.e. of $\Sigma$ from a p-variate nonsingular normal parent, $N_p(\mu, \Sigma)$, Anderson's procedure based on the invariance property of the m.l.e. seems to be generally preferred for its simplicity. This paper shows that his approach, taken with respect to $\Sigma^{-1}$ rather than $\Sigma$ itself, is further applicable to deriving the m.l.e. of the parameters involved in the common factor model and the simplex model as well.
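
The standard textbook argument behind this approach can be sketched as follows. The symbols $\Psi$, $A$, $n$ and $\bar{x}$ are notation assumed here, and this is not a reproduction of the paper's own derivation. The derivative treats the entries of $\Psi$ as unconstrained; imposing symmetry leads to the same solution.

```latex
\begin{align*}
\ell(\mu, \Psi) &= \text{const} + \frac{n}{2}\log|\Psi|
  - \frac{1}{2}\operatorname{tr}(\Psi A)
  - \frac{n}{2}(\bar{x}-\mu)^{\top}\Psi(\bar{x}-\mu),
  \qquad \Psi = \Sigma^{-1},\quad
  A = \sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^{\top}, \\
\hat{\mu} &= \bar{x}, \qquad
\left.\frac{\partial \ell}{\partial \Psi}\right|_{\mu=\bar{x}}
  = \frac{n}{2}\Psi^{-1} - \frac{1}{2}A = 0
  \;\Longrightarrow\; \hat{\Psi}^{-1} = \frac{1}{n}A, \\
\hat{\Sigma} &= \hat{\Psi}^{-1} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^{\top}
  \quad \text{(by invariance of the m.l.e.).}
\end{align*}
```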

Bayesian control problem in multivariate mixture model (다변량 혼합모형에서 통계적 제어문제의 베이지안적 고찰)

  • 이석훈;박래현;최종석
    • The Korean Journal of Applied Statistics / v.3 no.2 / pp.27-37 / 1990
  • We consider the statistical control problem for the mixture model, in which one can choose the values of the independent variables so that the resulting values of the dependent variables are as close to the target values as possible. The theory suggested for the problem is reviewed, and an extended model with respect to the assumption on the variance and the number of dependent variables is suggested. A Bayesian treatment of the above problem is studied, with an example as an illustration.

DD-plot for Detecting the Out-of-Control State in Multivariate Process (다변량공정에서 이상상태를 탐지하기 위한 DD-plot)

  • Jang, Dae-Heung;Yi, Seongbaek;Kim, Youngil
    • The Korean Journal of Applied Statistics / v.26 no.2 / pp.281-290 / 2013
  • It is well known that the DD-plot is a useful graphical tool for non-parametric classification. In this paper, we propose another use of the DD-plot: detecting the out-of-control state in a multivariate process. For this case, we suggest a dynamic version of the DD-plot and an accompanying quality index plot.

Application of Multivariate Statistical Analysis Technique in Landfill Investigation (매립물 특성 조사를 위한 다변량 통계분석 기법의 응용)

  • Kwon, Byung-Doo;Kim, Cha-Soup
    • Journal of the Korean Earth Science Society / v.18 no.6 / pp.515-521 / 1997
  • To investigate the nature of the waste materials in the Nanjido Landfill, we conducted a multivariate statistical analysis of a geophysical data set comprising magnetic, gravity, LandSat TM thermal band and surface depression measurement data. Because these data sets respond differently to depth, we transformed the observed total field magnetic data and gravity data into residual reduced-to-pole (RTP) magnetic anomalies and three-dimensional density anomalies, respectively, and used only the information about the upper shallow part of the landfill in the subsequent processing. For the statistical analysis at the points of depression measurement, the magnetic, density and LandSat data values at these points were determined by interpolation. Since the multivariate statistical analysis technique uses a clustering algorithm to classify the data set, and the dissimilarity between objects was measured by Euclidean distance, standardization was applied prior to the distance calculation in order to eliminate scaling effects due to the different measurement units of each data set. The hierarchical grouping technique was used to construct the dendrogram. The optimum number of statistical groups (clusters), classified on the basis of geophysical and geotechnical characteristics, appeared to be six on the resulting dendrogram. The result of this study suggests that the dimension and nature of multicomponent waste landfills can be identified by applying the multivariate statistical analysis technique to integrated geophysical data sets.
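
The preprocessing and grouping steps described above (standardize each variable, measure Euclidean dissimilarity, merge hierarchically) can be sketched as below. The abstract does not state the linkage rule, so average linkage is an assumption here, and the function names are hypothetical. The sketch stops at a requested number of clusters instead of drawing a dendrogram.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def agglomerate(points, n_clusters):
    """Agglomerative clustering with Euclidean distance and average linkage.

    Returns a list of clusters, each a list of row indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between members of the two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, p, q = min((linkage(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

Without the `standardize` step, a variable recorded in large units would dominate the Euclidean distance, which is the scaling effect the abstract refers to.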

An Alternating Approach of Maximum Likelihood Estimation for Mixture of Multivariate Skew t-Distribution (치우친 다변량 t-분포 혼합모형에 대한 최우추정)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics / v.27 no.5 / pp.819-831 / 2014
  • The Exact-EM algorithm can conventionally fit a mixture of multivariate skew t-distributions. However, it suffers from the high computational cost of calculating the moments of the multivariate truncated t-distribution in the E-step. This paper proposes a new SPU-EM method that adopts the AECM algorithm principle proposed by Meng and van Dyk (1997) to circumvent the multi-dimensionality of the moments. This method offers a shorter execution time than the conventional Exact-EM algorithm. Some experiments are provided to show its effectiveness.

Choice of frequency via principal component in high-frequency multivariate volatility models (주성분을 이용한 다변량 고빈도 실현 변동성의 주기 선택)

  • Jin, M.K.;Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.30 no.5 / pp.747-757 / 2017
  • We investigate multivariate volatilities based on high frequency time series. The PCA (principal component analysis) method is employed to achieve a dimension reduction in multivariate volatility. Multivariate realized volatilities (RV) with various frequencies are calculated from high frequency data, and an "optimum" frequency is suggested using PCA. Specifically, RVs with various frequencies are compared with existing daily volatilities such as Cholesky, EWMA and BEKK after dimension reduction via PCA. An analysis of high frequency stock prices of KOSPI, Samsung Electronics and Hyundai Motor Company is presented as an illustration.
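
To make "realized volatilities with various frequencies" concrete, the sketch below computes realized variance and realized covariance from intraday prices sampled every `step` observations. It uses the usual sum-of-squared-log-returns definition; the function names and the `step` parameter are assumptions, and the PCA comparison from the paper is not shown.

```python
import math


def log_returns(prices, step=1):
    """Log returns of a price series sampled every `step` observations.

    Trailing observations that do not fall on the sampling grid are ignored.
    """
    sampled = prices[::step]
    return [math.log(b / a) for a, b in zip(sampled, sampled[1:])]


def realized_variance(prices, step=1):
    """Sum of squared log returns at the chosen sampling frequency."""
    return sum(r * r for r in log_returns(prices, step))


def realized_covariance(prices_x, prices_y, step=1):
    """Sum of cross products of log returns of two synchronously observed assets."""
    rx = log_returns(prices_x, step)
    ry = log_returns(prices_y, step)
    return sum(a * b for a, b in zip(rx, ry))
```

Evaluating these functions over a grid of `step` values gives one realized covariance matrix per candidate frequency, which is the kind of input a frequency comparison would start from.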

A Comparative Study on the Multivariate Thomas-Fiering and Matalas Model (다변량 Thomas-Fiering 모형과 Matalas 모형의 비교연구)

  • 이주헌;이은태
    • Water for future / v.24 no.4 / pp.59-66 / 1991
  • The purpose of synthesizing monthly river flows from short-term observed data by means of multivariate stochastic models is to provide abundant input data to water resources systems whose performance and operation policy are to be determined beforehand. In this study, the multivariate Thomas-Fiering and Matalas models for synthetic generation based on stream flows in a neighboring basin were employed to check whether they can be applied to the modeling of monthly flows. Statistical parameters estimated by the method of moments and by Fourier series analysis, respectively, were reproduced for statistical features. For comparison, the statistical parameters of the monthly flows generated by each model were compared with those of the observed monthly flows. The results of this study suggest that the Matalas model can be adopted for the synthetic generation of monthly river flows.
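
As a simplified illustration of synthetic flow generation, the sketch below implements the classical single-site Thomas-Fiering recursion, in which each month's flow is regressed on the previous month's flow plus a random component. The paper uses multivariate versions, so this univariate form, the function name and the parameter layout are all assumptions for illustration. Generated flows can be negative and would need separate handling in practice.

```python
import math
import random


def thomas_fiering(means, sds, corrs, n_years, seed=0):
    """Generate 12 * n_years synthetic monthly flows (single site).

    means[j], sds[j]: mean and standard deviation of flow in month j (0..11).
    corrs[j]: lag-one correlation between month j and month (j + 1) % 12.
    The series starts at the mean flow of month 0.
    """
    rng = random.Random(seed)
    flows = [means[0]]
    for t in range(1, 12 * n_years):
        j = (t - 1) % 12  # previous month
        k = t % 12        # current month
        slope = corrs[j] * sds[k] / sds[j]  # regression of month k on month j
        noise = rng.gauss(0.0, 1.0) * sds[k] * math.sqrt(1.0 - corrs[j] ** 2)
        flows.append(means[k] + slope * (flows[-1] - means[j]) + noise)
    return flows
```

Comparing the monthly means, standard deviations and lag-one correlations of the generated series with the observed ones mirrors the kind of check the abstract describes.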
