• Title/Summary/Keyword: Multivariate statistical analysis (다변량통계분석)

Search Result 476, Processing Time 0.029 seconds

A Study on the Use of Cluster Analysis for Multivariate and Multipurpose Stratification (군집분석을 이용한 다목적 조사의 층화에 관한 연구)

  • Park, Jin-Woo;Yun, Seok-Hoon;Kim, Jin-Heum;Jeong, Hyeong-Chul
    • The Korean Journal of Applied Statistics
    • /
    • v.20 no.2
    • /
    • pp.387-394
    • /
    • 2007
  • This paper considers several stratification strategies for multivariate and multipurpose surveys with several quantitative stratification variables. We propose three methods of stratification based on, respectively, the cumulative square root of frequency method, which is the most popular one in univariate stratification; cluster analysis; and factor analysis followed by cluster analysis. We then compare the efficiency of these methods using Dong-Eup-Myun data on the number of farming machines held, extracted from the 2001 Agricultural Census. The method based on factor analysis followed by cluster analysis turned out to be a relatively satisfactory strategy.
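
The cumulative square root of frequency rule mentioned in the abstract above can be sketched for a single stratification variable as follows. This is a minimal illustration, not the authors' implementation: the function name `cum_sqrt_f_boundaries`, the equal-width binning, the default of 20 bins, and the choice of the first bin edge at or past each target are all assumptions made for the example.

```python
import math


def cum_sqrt_f_boundaries(values, n_strata, n_bins=20):
    """Return n_strata - 1 stratum boundaries for one variable.

    Assumes max(values) > min(values). Bin count and cut rule are
    illustrative choices, not taken from the paper.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins

    # Histogram of the stratification variable on equal-width bins.
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum
        counts[idx] += 1

    # Running sum of sqrt(frequency) across the bins.
    cum = []
    total = 0.0
    for c in counts:
        total += math.sqrt(c)
        cum.append(total)

    # Cut at the first bin edge where the running sum reaches each
    # equal share of the total.
    boundaries = []
    for k in range(1, n_strata):
        target = total * k / n_strata
        for i, c in enumerate(cum):
            if c >= target:
                boundaries.append(lo + (i + 1) * width)
                break
    return boundaries
```

For a right-skewed variable the boundaries fall below the midrange, so the long upper tail is split into its own strata rather than being lumped with the bulk of the units.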

Analysis of Accident Statistics Using Multivariate Analysis Techniques (다변량 분석기법을 이용한 재해통계 분석)

  • 고병인;임현교
    • Proceedings of the Korean Institute of Industrial Safety Conference
    • /
    • 1999.06a
    • /
    • pp.133-136
    • /
    • 1999
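  • In Korea, industrial accident statistics are compiled only from those accidents recognized as work-related among the medical care claims submitted by injured workers, and cause analysis of industrial accidents is limited to simple frequency counts of accident type, causative agent, managerial causes, unsafe acts, and unsafe conditions. This is the result of focusing on reducing the number of accidents; it is a reason why effective safety management is not being carried out, it falls short of that purpose, and it limits the identification of the root causes of accidents. (remainder omitted)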


Feature Extraction of CNN-GRU based Multivariate Time Series Data for Regional Clustering (지역 군집화를 위한 CNN-GRU 기반 다변량 시계열 데이터의 특성 추출)

  • Kim, Jinah;Lee, Ji-Hoon;Choi, Dong-Wook;Moon, Nammee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.10a
    • /
    • pp.950-951
    • /
    • 2019
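  • Research on clustering time series data has relied mainly on statistical analysis, which limits how fully the characteristics of the data can be reflected. This paper proposes a neural network model based on CNN-GRU (Convolutional Neural Network - Gated Recurrent Unit) that extracts, for each variable, changes and features over time in order to cluster multivariate data. The CNN is used to capture the characteristics of each variable, and the GRU to derive the consumption trend over the entire time span. Two years of real card usage data, broken down by region and business category, were used, and the model was applied to cluster regions showing similar consumption trends. The work is significant in that it derives patterns that reflect the overall flow of multivariate time series data.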

Rapid discrimination system of Chinese cabbage (Brassica rapa) at metabolic level using Fourier transform infrared spectroscopy (FT-IR) based on multivariate analysis (배추 대사체 추출물의 FT-IR 스펙트럼 및 다변량 통계분석을 통한 계통 신속 식별 체계)

  • Ahn, Myung Suk;Lim, Chan Ju;Song, Seung Yeob;Min, Sung Ran;Lee, In Ho;Nou, Ill-Sup;Kim, Suk Weon
    • Journal of Plant Biotechnology
    • /
    • v.43 no.3
    • /
    • pp.383-390
    • /
    • 2016
  • To determine whether FT-IR spectral analysis based on multivariate analysis could be used to discriminate Chinese cabbage breeding lines at the metabolic level, whole cell extracts of nine different breeding lines (three paternal, three maternal, and three $F_1$ lines) were subjected to Fourier transform infrared spectroscopy (FT-IR). FT-IR spectral data of Chinese cabbage plants were analyzed by principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and hierarchical clustering analysis (HCA). The hierarchical dendrograms based on PLS-DA from two of the three cross combinations showed that samples of the paternal, maternal, and progeny $F_1$ lines were perfectly separated into three branches in a breeding-line-dependent manner. However, one cross combination could not be fully discriminated into three branches. Thus, hierarchical dendrograms based on PLS-DA of FT-IR spectral data of Chinese cabbage breeding lines could be used to represent the most probable chemotaxonomical relationship among maternal, paternal, and $F_1$ plants. Furthermore, these metabolic discrimination systems could be applied for rapid selection and classification of useful Chinese cabbage cultivars.

VaR Estimation of Multivariate Distribution Using Copula Functions (Copula 함수를 이용한 이변량분포의 VaR 추정)

  • Hong, Chong-Sun;Lee, Jae-Hyung
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.3
    • /
    • pp.523-533
    • /
    • 2011
  • Most of the preferred financial methods for market risk management estimate VaR. In many real cases, the VaRs of univariate as well as multivariate distributions need to be obtained from multivariate data. In this work, copula functions are used to explore the dependence of non-normal random variables and to generate the corresponding multivariate distribution functions. We estimate Archimedean copula functions, including the Clayton, Gumbel, and Frank copulas, that are fitted to the multivariate earning rate distribution, and then obtain their VaRs. With these copula functions, we estimate the VaRs of both a certain integrated industry and individual industries. The parameters of the three kinds of copula functions are estimated for illustrative stock data of two Korean industries to obtain the VaR of the bivariate distribution and those of the corresponding univariate distributions. These VaRs are compared with those obtained from other methods to discuss the accuracy of the estimations.
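
As a rough illustration of how a VaR can be read off a copula-based bivariate distribution, the sketch below simulates from a Clayton copula (one of the three Archimedean families named in the abstract) and takes an empirical quantile of a two-asset portfolio return. Everything specific here is an assumption for the example and not taken from the paper: the normal marginals and their parameters, the equal weights, the Monte Carlo approach, and the function names `sample_clayton` and `portfolio_var`. The paper itself targets non-normal variables, so the marginals would be replaced in practice.

```python
import random
from statistics import NormalDist


def sample_clayton(theta, n, rng):
    """Draw n pairs (u1, u2) from a Clayton copula with theta > 0.

    Uses the Marshall-Olkin construction: a shared Gamma(1/theta, 1)
    variable V and independent unit exponentials E1, E2 give
    U_i = (1 + E_i / V) ** (-1 / theta).
    """
    pairs = []
    for _ in range(n):
        v = rng.gammavariate(1.0 / theta, 1.0)
        e1 = rng.expovariate(1.0)
        e2 = rng.expovariate(1.0)
        u1 = (1.0 + e1 / v) ** (-1.0 / theta)
        u2 = (1.0 + e2 / v) ** (-1.0 / theta)
        pairs.append((u1, u2))
    return pairs


def portfolio_var(theta, alpha=0.05, n=20000, weights=(0.5, 0.5), seed=0):
    """Monte Carlo VaR (as a positive loss) at tail level alpha.

    Marginals are hypothetical normal return distributions.
    """
    rng = random.Random(seed)
    m1 = NormalDist(0.0, 0.02)  # assumed marginal of asset 1
    m2 = NormalDist(0.0, 0.03)  # assumed marginal of asset 2
    eps = 1e-12                 # keep inv_cdf arguments strictly in (0, 1)
    returns = []
    for u1, u2 in sample_clayton(theta, n, rng):
        x1 = m1.inv_cdf(min(max(u1, eps), 1.0 - eps))
        x2 = m2.inv_cdf(min(max(u2, eps), 1.0 - eps))
        returns.append(weights[0] * x1 + weights[1] * x2)
    returns.sort()
    # The alpha-quantile of returns is a loss threshold; negate it.
    return -returns[int(alpha * n)]
```

Because the Clayton copula concentrates dependence in the lower tail, a larger `theta` makes joint losses more likely and raises the simulated portfolio VaR relative to a near-independent setting.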

Analysis of biodiesel quality based on infrared spectroscopy and multivariate statistics (적외선 분광분석과 다변량 통계에 기반한 바이오디젤 품질분석)

  • Kim, Hye-Sil;Cho, Hyun-Woo;Liu, J. Jay
    • Analytical Science and Technology
    • /
    • v.25 no.4
    • /
    • pp.214-222
    • /
    • 2012
  • ASTM (American Society for Testing and Materials) D6751-10 suggests analytical methods as well as specifications for biodiesel quality. However, it is expensive and time-consuming to follow the ASTM testing methods to analyze biodiesel and various impurities. This paper develops a quantitative analysis system for biodiesel and impurities based on Infrared spectroscopy and a multivariate statistical method, PLS (partial least squares). In addition, four different pre-processing techniques were compared for spectrum correction and noise reduction. Savitzky-Golay pre-processing showed the best performance.
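
Savitzky-Golay pre-processing, which the abstract above reports as the best of the four techniques compared, can be sketched in its simplest form as a fixed-weight moving filter. The example below uses the standard 5-point quadratic smoothing weights (-3, 12, 17, 12, -3)/35. The window length, the polynomial order, and leaving the two points at each end unchanged are assumptions for illustration; the abstract does not state the settings the authors used.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights.
SG5_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol5(y):
    """Smooth a spectrum with a 5-point Savitzky-Golay filter.

    The first and last two points are returned unchanged (an
    assumption made here to keep the sketch short).
    """
    out = list(y)
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / SG5_NORM
    return out
```

Unlike a plain moving average, this filter reproduces locally quadratic signals exactly, which is why it tends to preserve peak shapes in spectra while damping noise.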

Multivariate Region Growing Method with Image Segments (영상분할단위 기반의 다변량 영역확장기법)

  • 이종열
    • Proceedings of the Korean Association of Geographic Information Studies Conference
    • /
    • 2004.03a
    • /
    • pp.273-278
    • /
    • 2004
  • Feature identification is one of the major issues in high spatial resolution satellite imagery. A popular approach to feature identification is image segmentation, which produces image segments that are more likely to correspond to features of interest. Here, a combination of edge extraction and region growing methods is proposed to improve the result of image segmentation. In the initial step, an image was segmented by an edge detection method. The segments were assigned IDs, and the polygon topology of the segments was built. Based on the topology, each segment was tested for similarity with its adjacent segments using multivariate analysis. Segments with similar spectral characteristics were merged into a region. The test application shows that segments making up individual large, spectrally homogeneous structures, such as buildings and roads, were merged into shapes that more closely resemble those structures.


Missing Value Estimation and Sensor Fault Identification using Multivariate Statistical Analysis (다변량 통계 분석을 이용한 결측 데이터의 예측과 센서이상 확인)

  • Lee, Changkyu;Lee, In-Beum
    • Korean Chemical Engineering Research
    • /
    • v.45 no.1
    • /
    • pp.87-92
    • /
    • 2007
  • Recently, the development of process monitoring systems to detect and diagnose process abnormalities has attracted attention in process systems engineering. Normal data obtained from processes provide useful information on process characteristics for modeling, monitoring, and control. Since modern chemical and environmental processes have high dimensionality, strong correlation, severe dynamics, and nonlinearity, it is not easy to analyze a process through a model-based approach. To overcome the limitations of the model-based approach, many system engineers and academic researchers have focused on statistical approaches combined with multivariate analysis such as principal component analysis (PCA) and partial least squares (PLS). Several multivariate analysis methods have been modified for application to chemical processes with specific characteristics such as dynamics and nonlinearity. This paper discusses missing value estimation and sensor fault identification based on process variable reconstruction using dynamic PCA and canonical variate analysis.

Enhancing Visualization in Self-Organizing Maps (SOM에서 개체의 시각화)

  • Um Ick-Hyun;Huh Myung-Hoe
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.1
    • /
    • pp.83-98
    • /
    • 2005
  • Exploring distributional patterns of multivariate data is essential for understanding the characteristics of a given data set, as well as for building plausible models for the data. For that purpose, low-dimensional visualization methods have been developed by many researchers along various directions. Among these methods, Kohonen's SOM (Self-Organizing Map) is prominent. SOM compresses the volume of the data, yields abstraction from the data, and offers visual display on low-dimensional grids. Although it has proven quite effective, it has one undesirable property: SOM's display is discrete. In this study, we propose two techniques for enhancing the quality of SOM's display, so that the display becomes continuous. The proposed methods are demonstrated in two numerical examples.

Book Review: 윤기중, Mathematical Statistics (수리통계학), Seoul: 박영사, 1974

  • 백운붕
    • Journal of the Korean Statistical Society
    • /
    • v.3 no.1
    • /
    • pp.65-66
    • /
    • 1974
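  • At a time when few books by Korean authors develop the mathematical theory of statistics, Professor 윤기중's "Mathematical Statistics" (수리통계학) has been published by 박영사. The book is written in a calm, step-by-step manner so that it can be read through with a knowledge of calculus. Starting from the concepts of set theory, it carefully explains the fundamentals of probability theory and covers statistical theory thoroughly across a range of topics: distributions of continuous random variables, random samples, point estimation, the multivariate normal distribution, distributions of various statistics, statistical hypothesis testing, interval estimation, and finally regression and correlation analysis.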
