• Title/Abstract/Keywords: multivariate data analysis

Principal Component Analysis of Compositional Data using Box-Cox Contrast Transformation

  • 최병진;김기영
    • 응용통계연구 / Vol. 14, No. 1 / pp. 137-148 / 2001
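  • Compositional data, whose components represent proportions, are constrained so that each row sums to 1, which makes them difficult to handle statistically. Moreover, because the structure of the data is not linear, applying linear multivariate techniques such as principal component analysis to compositional data can lead to incorrect interpretation and inference. To address the problems of existing methods for principal component analysis of compositional data, this paper proposes a new method of analysis based on the Box-Cox contrast transformation. The proposed method is compared with the method of Aitchison (1983) through the analysis of real data and a simulation study.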
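The idea of transforming compositions before a linear analysis can be sketched as follows. This is a minimal illustration, not the paper's procedure: `clr` is the centered log-ratio transform commonly associated with Aitchison's log-ratio approach, while `box_cox_contrast` is a hypothetical analogue (Box-Cox each part, then subtract the row mean) that is assumed here only for illustration; the abstract does not give the paper's exact definition. The function names and the sample data are invented.

```python
import math

def clr(row):
    """Centered log-ratio transform of one composition (all parts must be > 0)."""
    logs = [math.log(x) for x in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def box_cox_contrast(row, lam):
    """Hypothetical Box-Cox analogue of clr (an assumption, not the paper's formula):
    apply the Box-Cox power transform to each part, then center by the row mean.
    As lam approaches 0 the result approaches clr(row)."""
    if lam == 0:
        return clr(row)
    transformed = [(x ** lam - 1.0) / lam for x in row]
    mean_t = sum(transformed) / len(transformed)
    return [v - mean_t for v in transformed]

# Invented compositions: each row sums to 1.
compositions = [
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]

clr_rows = [clr(r) for r in compositions]
bc_rows = [box_cox_contrast(r, 0.5) for r in compositions]
```

After either transform each row sums to zero rather than one, and the transformed rows could then be passed to an ordinary principal component routine.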

Multivariate Analysis of Water Quality Data at 14 Stations in the Geum-River Watershed

  • 임창수
    • 한국환경과학회지 / Vol. 8, No. 3 / pp. 331-336 / 1999
  • The monthly water quality data measured at 14 stations located in the Geum-River watershed were clustered into 2 to 7 clusters. Furthermore, factor analyses were conducted on Gabcheon and Yugucheon to characterize the water quality, based on the information obtained from the results of the cluster analysis. The results of the cluster analysis show that the water quality characteristics of the main stream of the Geum-River are somewhat different from those of its substreams. Gabcheon, which is expected to have the most serious water quality problems in the Geum-River watershed, differs most from Yugucheon in its water quality characteristics. Based on the factor loadings in each factor, Gabcheon and Yugucheon have their own water quality characteristics. This is mainly because of composite factors such as different population density, industrial activities, and land use conditions in the Gabcheon and Yugucheon subwatersheds.

Issues Related to the Use of Time Series in Model Building and Analysis: Review Article

  • Wei, William W.S.
    • Communications for Statistical Applications and Methods / Vol. 22, No. 3 / pp. 209-222 / 2015
  • Time series are used in many studies for model building and analysis. We must be very careful to understand the kind of time series data used in the analysis. In this review article, we will begin with some issues related to the use of aggregate and systematic sampling time series. Since several time series are often used in a study of the relationship of variables, we will also consider vector time series modeling and analysis. Although the basic procedures of model building for univariate time series and vector time series are the same, there are some important phenomena which are unique to vector time series. Therefore, we will also discuss some issues related to vector time series models. Understanding these issues is important when we use time series data in modeling and analysis, regardless of whether it is a univariate or multivariate time series.
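The distinction between aggregate and systematic sampling time series mentioned above can be illustrated with a short sketch. It assumes the usual definitions (aggregation sums non-overlapping blocks of a flow variable; systematic sampling keeps every m-th observation of a stock variable). The function names and the monthly values are invented for illustration.

```python
def temporal_aggregate(series, m):
    """Sum non-overlapping blocks of length m (e.g. monthly -> quarterly totals).
    Trailing observations that do not fill a whole block are dropped."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) for i in range(n_blocks)]

def systematic_sample(series, m):
    """Keep every m-th observation (the last value of each block of length m)."""
    return series[m - 1::m]

# Invented monthly observations for one year.
monthly = [3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13]

quarterly_total = temporal_aggregate(monthly, 3)    # [12, 21, 30, 39]
quarterly_sampled = systematic_sample(monthly, 3)   # [4, 7, 10, 13]
```

The two quarterly series come from the same monthly data yet differ in level and dynamics, which is why the kind of series in hand matters for model building.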

Principal selected response reduction in multivariate regression

  • 유재근
    • 응용통계연구 / Vol. 34, No. 4 / pp. 659-669 / 2021
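  • Multivariate regression is a statistical methodology used frequently in many fields, including longitudinal and functional data analysis. Because of the dimension of the responses as well as that of the predictors, multivariate regression is affected by the curse of dimensionality more severely than univariate regression. To address this problem, three model-based response dimension reduction methods were recently proposed in Yoo (2018) and Yoo (2019a). In simulations, the default method of Yoo (2019a) is the least sensitive to the model, but it does not give better estimation results than the better of the other two methods. To overcome this drawback, this paper proposes a selection algorithm that compares the result of the default method with the results of the other two methods and chooses the best method for the data at hand; the approach is called principal selected response reduction. Various simulation studies show that principal selected response reduction reduces the dimension more accurately than the default method of Yoo (2019a) and selects the more desirable method in all cases. These results confirm the practical usefulness of the proposed principal selected response reduction.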

Fault Detection Method for Multivariate Process using ICA

  • 정승환;김민석;이한수;김종근;김성신
    • 한국정보통신학회논문지 / Vol. 24, No. 2 / pp. 192-197 / 2020
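  • Multivariate processes such as large power plants and chemical processes operate in highly hazardous environments, so a fault can cause serious human and material losses. Online monitoring technology that can detect system faults in advance is therefore essential. In this paper, ICA is applied to three different multivariate process datasets to perform fault detection, and its performance is compared with that of PCA. The ICA-based fault detection procedure consists of an offline phase and an online phase. In the offline phase, a threshold for fault discrimination is set using data measured while the system is in its normal state. In the online phase, a statistic is computed for each query vector measured in real time and compared with the predefined threshold to decide whether a fault has occurred. In experiments on the three multivariate process datasets, the ICA-based method detected system faults in advance and showed better fault detection performance than PCA.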
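The offline/online split described above can be sketched as follows. This is only an outline under stated assumptions: the ICA step itself is omitted, the monitoring statistic is a simple stand-in (sum of squared standardized deviations), and the threshold is taken as an empirical quantile of that statistic on normal data. None of these choices, nor the function names or simulated data, come from the paper.

```python
import math
import random

def monitoring_statistic(x, means, stds):
    """Stand-in statistic (assumption): sum of squared standardized deviations."""
    return sum(((xj - m) / s) ** 2 for xj, m, s in zip(x, means, stds))

def fit_offline(normal_data, quantile=0.99):
    """Offline phase: estimate mean/std per variable from normal-operation data
    and set the threshold to an empirical quantile of the statistic."""
    n = len(normal_data)
    d = len(normal_data[0])
    means = [sum(row[j] for row in normal_data) / n for j in range(d)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in normal_data) / (n - 1))
        for j in range(d)
    ]
    stats = sorted(monitoring_statistic(row, means, stds) for row in normal_data)
    threshold = stats[min(n - 1, int(quantile * n))]
    return means, stds, threshold

def is_fault(x, model):
    """Online phase: flag a query vector whose statistic exceeds the threshold."""
    means, stds, threshold = model
    return monitoring_statistic(x, means, stds) > threshold

# Simulated normal-operation data (invented): two sensor variables.
random.seed(0)
normal_data = [[random.gauss(0.0, 1.0), random.gauss(5.0, 2.0)] for _ in range(500)]
model = fit_offline(normal_data)

far_from_normal = is_fault([10.0, 25.0], model)   # large deviation in both variables
at_the_mean = is_fault(list(model[0]), model)     # statistic is exactly 0
```

In an ICA- or PCA-based monitor, the query vector would first be projected onto the extracted components and the statistic computed in that space; the thresholding logic stays the same.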

Classification of Bodytype on Adult Male for the Apparel Sizing System (Part 3) -Bodytype of Trunk from the Photographic Data-

  • 김구자
    • 한국의류학회지 / Vol. 19, No. 6 / pp. 924-932 / 1995
  • The concept of comfort and fit has become a major concern in the basic function of ready-made clothes. Until now, ready-made clothes have been made on the basis of body size only, not bodytype. This research was performed to classify and characterize the bodytypes of Korean adult males. The sample comprised 1290 subjects aged 19 to 54 years. 25 variables from the photographic data were used to analyze the bodytype of the trunk. Data were analyzed by multivariate methods, especially factor and cluster analysis. The groups forming a cluster can be subdivided into 5 sets by crosstabulation extracted by the hierarchical cluster analysis. The 5 bodytypes classified from the photographic sources could be combined with the anthropometric data and were illustrated with 5 silhouettes. Types 3 and 4 of the trunk were dominant, together accounting for 55.6% of the subjects. Bodytypes of Korean males were influenced by the degree of posture erectness and by the curvature of the front of the body at the waist and abdomen.

A Performance Comparison of Cluster Validity Indices based on K-means Algorithm

  • 심요성;정지원;최인찬
    • Asia pacific journal of information systems / Vol. 16, No. 1 / pp. 127-144 / 2006
  • The K-means algorithm is widely used at the initial stage of data analysis in the data mining process, partly because of its low time complexity and the simplicity of practical implementation. Cluster validity indices are used along with the algorithm in order to determine the number of clusters as well as to evaluate the clustering results of datasets. In this paper, we present a performance comparison of sixteen indices, which are selected from forty indices in the literature, while considering their applicability to nonhierarchical clustering algorithms. Data sets used in the experiment are generated based on the multivariate normal distribution. In particular, four error types including standardization, outlier generation, error perturbation, and noise dimension addition are considered in the comparison. Through the experiment, the effects of varying the number of points, attributes, and clusters on the performance are analyzed. The result of the simulation experiment shows that the Calinski and Harabasz index performs the best across all datasets and that the Davies and Bouldin index becomes a strong competitor as the number of points in the dataset increases.
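As an illustration of what a cluster validity index computes, the sketch below implements the Calinski and Harabasz index in its standard form, the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom: CH = [B / (k - 1)] / [W / (n - k)]. The helper names and the six sample points are invented, and this is not the experimental code of the paper.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    d = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(d)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def calinski_harabasz(points, labels):
    """CH = [B / (k - 1)] / [W / (n - k)]; larger values indicate
    compact, well-separated clusters. Assumes W > 0."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")
    overall = centroid(points)
    between = sum(len(m) * sq_dist(centroid(m), overall) for m in clusters.values())
    within = sum(sq_dist(p, centroid(m)) for m in clusters.values() for p in m)
    return (between / (k - 1)) / (within / (n - k))

# Invented data: two tight groups far apart.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
ch_good = calinski_harabasz(points, [0, 0, 0, 1, 1, 1])  # labels match the groups
ch_bad = calinski_harabasz(points, [0, 1, 0, 1, 0, 1])   # labels mix the groups
```

To choose the number of clusters, one would run K-means for several values of k and keep the k with the largest index value.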

Development of an Anthropometric Data Manager(ADaM) based on the 1992 National Anthropometric survey

  • 김진호;윤정선;박수찬;김창범
    • 산업공학 / Vol. 8, No. 1 / pp. 15-21 / 1995
  • Since anthropometric data are essential to the design of industrial products, the national anthropometric survey has been performed three times in Korea. An Anthropometric Data Manager system (ADaM) was implemented based on the 1992 national anthropometric survey to promote the utilization of the data. The system provides a graphical user interface to facilitate usability. Anthropometric information can be obtained in various ways through the following statistical analyses: multivariate features analysis, correlation analysis, and regression. In addition, the system provides recommendations for design parameters of industrial products.

Estimation of water quality distribution in freshing reservoir by satellite images

  • Torii, Kiyoshi;You, Jenn-Ming;Chiba, Satoshi;Cheng, Ke-Sheng
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2003 Proceedings of ACRS 2003 ISRS / pp. 1227-1229 / 2003
  • Kojima Lake in Okayama prefecture is a freshing reservoir constructed adjacent to the oldest reclaimed land in Japan. This lake has a serious water quality problem because two urban rivers flow into it. In the present study, unsupervised classification was performed at intervals of several years using Landsat MSS data from the past 15 years. After geometric correction of these data, MSS data corresponding geographically to the field observation data were extracted and subjected to multivariate analysis. Water quality distribution in the lake was estimated using the regression equation obtained as a result. In addition, two-dimensional and three-dimensional numerical simulations were performed and compared with the distribution obtained from the satellite images. The behavior of the reservoir flows is complicated, and the water quality distribution varies greatly with the flows. Here, I report the results of analysis on three components: field observation, numerical simulation, and satellite images.

Analysis of Risk Factors to Predict Intensive Care Unit Transfer in Medical in-Patients

  • 이주리;최혜란
    • Journal of Korean Biological Nursing Science / Vol. 16, No. 4 / pp. 259-266 / 2014
  • Purpose: The purpose of this study was to analyze risk factors for predicting the transfer of medical patients from the general ward to the Intensive Care Unit (ICU). Methods: We retrospectively reviewed the clinical data of 120 medical patients on the general ward and compared the Modified Early Warning Score (MEWS) between the ICU group and the general ward group. Data were analyzed with multivariate logistic regression and the area under the receiver operating characteristic curve using the SPSS/WIN 18.0 program. Results: Fifty-two ICU patients and 68 general ward patients were included. In multivariate logistic regression, the MEWS (Odds Ratio [OR], 1.91; 95% confidence interval [CI], 1.32-2.76), sequential organ failure assessment score (OR, 1.28; 95% CI, 1.10-1.72), $PaO_2/FiO_2$ ratio (OR, 0.98; 95% CI, 0.98-0.99), and saturation (OR, 0.93; 95% CI, 0.88-0.99) were predictive of ICU transfer. With a cut-off value of six, the sensitivity and specificity of the MEWS for ICU transfer were 80.8% and 70.6%, respectively. Conclusion: These findings suggest that early prediction and treatment of patients at high risk of ICU transfer may improve the prognosis of patients.
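The reported sensitivity and specificity can be reproduced with a small worked example. The abstract gives only the group sizes (52 and 68) and the percentages; the cell counts below are an assumption, chosen as the integers consistent with those figures (42 of 52 and 48 of 68), and are not stated in the source.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positives (ICU transfers) flagged by the cut-off."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of actual negatives (general ward patients) not flagged."""
    return true_neg / (true_neg + false_pos)

# Assumed counts, back-calculated from the reported percentages.
icu_flagged, icu_missed = 42, 10      # 52 ICU patients in total
ward_cleared, ward_flagged = 48, 20   # 68 general ward patients in total

sens_pct = round(100 * sensitivity(icu_flagged, icu_missed), 1)    # 80.8
spec_pct = round(100 * specificity(ward_cleared, ward_flagged), 1) # 70.6
```

Under these assumed counts, 10 of the 52 transferred patients would have been missed at the cut-off of six, and 20 of the 68 ward patients would have been flagged unnecessarily.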