• Title/Abstract/Keywords: Multivariate Statistical Method


The Use of Local Outlier Factor(LOF) for Improving Performance of Independent Component Analysis(ICA) based Statistical Process Control(SPC) (LOF를 이용한 ICA 기반 통계적 공정관리의 성능 개선 방법론)

  • Lee, Jae-Shin; Kang, Bok-Young; Kang, Suk-Ho
    • Journal of the Korean Operations Research and Management Science Society / v.36 no.1 / pp.39-55 / 2011
  • Process monitoring has been emphasized for complex systems such as the chemical processing industries in order to achieve efficiency enhancement, quality management, and safety improvement. Recently, ICA (Independent Component Analysis) based MSPC (Multivariate Statistical Process Control) has been widely used for process monitoring. Moreover, DICA (Dynamic ICA) has been introduced to account for system dynamics. However, the existing approaches are limited in that their performance depends strongly on the statistical distributions of the control variables. To overcome this limitation, we propose a novel approach to process monitoring that integrates DICA and LOF (Local Outlier Factor), with the aim of improving the fault detection rate. LOF detects local outliers by using the density of the surrounding space, so its performance does not depend on the data distribution. Therefore, the proposed method not only considers the system dynamics but also ensures robust performance regardless of the statistical distributions of the control variables. Comparison experiments were conducted on a widely used benchmark dataset, the Tennessee Eastman process (TE process), and showed improved performance over existing approaches.
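
The density-based idea behind LOF mentioned in this abstract can be illustrated with a minimal brute-force sketch. It is not the authors' DICA and LOF monitoring scheme: the function name `lof_scores`, the toy points, and the simplification of using exactly `k` neighbours (ignoring distance ties) are assumptions made for illustration only.

```python
import math

def lof_scores(points, k):
    """Local Outlier Factor of every point (brute force, O(n^2)).

    Assumes distinct points and k <= len(points) - 1. Ties at the
    k-th distance are ignored, so each point has exactly k neighbours.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_distance = [], []
    for i in range(n):
        # Indices of the other points, nearest first.
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

# Toy data: a tight cluster plus one isolated point (illustrative values).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = lof_scores(points, k=3)
```

Scores near 1 indicate a point whose local density matches that of its neighbours; in the toy data the isolated point receives by far the largest score. No distributional assumption enters the computation, which is the property the abstract relies on.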

A Bayesian Model-based Clustering with Dissimilarities

  • Oh, Man-Suk; Raftery, Adrian
    • Proceedings of the Korean Statistical Society Conference / 2003.10a / pp.9-14 / 2003
  • A Bayesian model-based clustering method is proposed for clustering objects on the basis of dissimilarities. This combines two basic ideas. The first is that the objects have latent positions in a Euclidean space, and that the observed dissimilarities are measurements of the Euclidean distances with error. The second idea is that the latent positions are generated from a mixture of multivariate normal distributions, each one corresponding to a cluster. We estimate the resulting model in a Bayesian way using Markov chain Monte Carlo. The method carries out multidimensional scaling and model-based clustering simultaneously, and yields good object configurations and good clustering results with reasonable measures of clustering uncertainties. In the examples we studied, the clustering results based on low-dimensional configurations were almost as good as those based on high-dimensional ones. Thus the method can be used as a tool for dimension reduction when clustering high-dimensional objects, which may be useful especially for visual inspection of clusters. We also propose a Bayesian criterion for choosing the dimension of the object configuration and the number of clusters simultaneously. This is easy to compute and works reasonably well in simulations and real examples.
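
The two ideas in this abstract can be written compactly as follows. The notation is assumed here for illustration and is not taken from the abstract: $\delta_{ij}$ is the observed dissimilarity between objects $i$ and $j$, $x_i$ is the latent position of object $i$ in $p$ dimensions, $e_{ij}$ is a measurement error whose distribution the abstract does not specify, and $G$ is the number of clusters.

$$
\delta_{ij} = d_{ij} + e_{ij}, \qquad d_{ij} = \lVert x_i - x_j \rVert = \sqrt{\sum_{m=1}^{p} (x_{im} - x_{jm})^2}
$$

$$
x_i \sim \sum_{k=1}^{G} \varepsilon_k \, N_p(\mu_k, T_k), \qquad \varepsilon_k \ge 0, \quad \sum_{k=1}^{G} \varepsilon_k = 1
$$

The first line is the multidimensional scaling part and the second is the model-based clustering part; estimating both jointly is what lets the method choose $p$ and $G$ together.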


Multivariate SPC Charts for On-line Monitoring the Batch Processes (배치 공정의 온라인 모니터링을 위한 다변량 관리도)

  • Lee Bae Jin; Kang Chang Wook
    • Proceedings of the Society of Korea Industrial and System Engineering Conference / 2002.05a / pp.387-396 / 2002
  • Batch processes are a significant class of processes in the process industry and play an important role in the production of high-quality speciality materials. Examples include the production of semiconductors, chemicals, pharmaceuticals, and biochemicals. With on-line sensors connected to most batch processes, massive amounts of data are collected routinely during the batch on easily measured process variables such as temperatures, pressures, and flowrates. In this paper, multivariate SPC charts for on-line monitoring of the progress of new batches are developed, which use the information in the on-line measurements in real time. We propose building a statistical model that describes the normal operation of a batch at each time interval during the batch operation. An on-line monitoring scheme based on the proposed method can handle both cross-correlation among process variables at any one time and auto-correlation over time. The control limits for the monitoring charts are established from a sound statistical framework, unlike previous research that relies on an external reference distribution. The proposed charts perform real-time, on-line monitoring to ensure that the batch is progressing in a manner that will lead to a high-quality product, or to detect and indicate faults that can be corrected prior to completion of the batch. This approach is capable of tracking the progress of new batch runs, identifying the time periods in which a fault occurred, and detecting the underlying cause.
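
As a rough illustration of modelling normal operation at each time interval, the sketch below fits a mean vector and covariance matrix per time index from reference batches and scores a new observation with Hotelling's T² statistic. This is an assumed stand-in, not the chart proposed in the paper: it is restricted to two process variables, it ignores auto-correlation over time, and the function names and toy data are hypothetical.

```python
from statistics import mean

def fit_time_slices(reference_batches):
    """Per time index, estimate mean and 2x2 covariance of two variables.

    reference_batches: list of batches; each batch is a list of (x, y)
    observations, one per time index, recorded under normal operation.
    """
    models = []
    for t in range(len(reference_batches[0])):
        xs = [batch[t][0] for batch in reference_batches]
        ys = [batch[t][1] for batch in reference_batches]
        mx, my, n = mean(xs), mean(ys), len(xs)
        sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
        syy = sum((y - my) ** 2 for y in ys) / (n - 1)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
        models.append((mx, my, sxx, syy, sxy))
    return models

def t2_statistic(model, observation):
    """Hotelling's T^2 = d' S^{-1} d using the closed-form 2x2 inverse."""
    mx, my, sxx, syy, sxy = model
    dx, dy = observation[0] - mx, observation[1] - my
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Four reference batches observed at a single time index (toy values).
reference = [[(1, 2)], [(2, 1)], [(3, 4)], [(4, 3)]]
models = fit_time_slices(reference)
against_correlation = t2_statistic(models[0], (4.5, 0.5))
along_correlation = t2_statistic(models[0], (4.5, 4.5))
```

Both test observations lie equally far from the mean, but the one that violates the positive correlation between the variables gets the larger statistic, which is the cross-correlation effect the abstract refers to. A new batch would be flagged at a time index when its statistic exceeds a control limit chosen for that index.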


Application of metabolic profiling for biomarker discovery

  • Hwang, Geum-Sook
    • Proceedings of the Korean Society of Applied Pharmacology / 2007.11a / pp.19-27 / 2007
  • An important potential of the metabolomics-based approach is the possibility of developing fingerprints of diseases or of cellular responses to classes of compounds with a known common biological effect. Such fingerprints have the potential to allow classification of disease states or compounds, to provide mechanistic information on cellular perturbations and pathways, and to identify biomarkers specific for disease severity and drug efficacy. Metabolic profiles of biological fluids contain a vast array of endogenous metabolites. Changes in those profiles resulting from perturbations of the system can be observed using analytical techniques such as NMR and MS. $^1$H NMR was used to generate a molecular fingerprint of serum or urine samples, and pattern recognition techniques were then applied to identify molecular signatures associated with specific diseases or drug efficacy. Several metabolites that differentiate disease samples from controls were thoroughly characterized by NMR spectroscopy. We investigated the metabolic changes in normal and clinical human samples using $^1$H NMR. Targeted profiling and spectral binning were applied to the spectral data, and multivariate statistical data analysis (MVDA) was then used to examine in detail the modulation of small-molecule candidate biomarkers. We show that targeted profiling produces robust models, generates accurate metabolite concentration data, and provides data that can help explain metabolic differences between healthy and diseased populations. Such metabolic signatures could provide diagnostic markers for a disease state or biomarkers for drug response phenotypes.
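
Spectral binning, as mentioned in this abstract, reduces a spectrum to a fixed-length vector that multivariate methods can consume. The sketch below is one simple way to do it under stated assumptions: the function name `bin_spectrum`, the 0.04 ppm default bin width, the 0 to 10 ppm range, and the normalisation to total intensity are illustrative choices, not details given in the abstract.

```python
def bin_spectrum(ppm, intensity, bin_width=0.04, lo=0.0, hi=10.0):
    """Sum intensities into equal-width chemical-shift bins on [lo, hi).

    Returns the binned intensities normalised to sum to 1 (an assumed
    normalisation), or all zeros if no signal falls inside the range.
    """
    n_bins = round((hi - lo) / bin_width)
    bins = [0.0] * n_bins
    for shift, value in zip(ppm, intensity):
        if lo <= shift < hi:
            # Clamp to guard against floating-point rounding at the top edge.
            idx = min(int((shift - lo) / bin_width), n_bins - 1)
            bins[idx] += value
    total = sum(bins)
    return [b / total for b in bins] if total else bins

# Tiny made-up spectrum with 1 ppm bins, just to show the mechanics.
binned = bin_spectrum([0.5, 0.6, 1.5, 3.2], [1, 1, 2, 4], bin_width=1.0, hi=4.0)
```

Each sample then becomes one row of a samples-by-bins matrix, which is the usual input for the pattern recognition step.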


Classification of Forest Cover Types in the Baekdudaegan, South Korea

  • Chung, Sang Hoon; Lee, Sang Tae
    • Journal of Forest and Environmental Science / v.37 no.4 / pp.269-279 / 2021
  • This study was carried out to introduce the forest cover types of the Baekdudaegan, which is home to a number of native tree species. In order to understand the vegetation distribution characteristics of the Baekdudaegan, a vegetation survey was conducted on its 20 major mountains. The vegetation data were collected from 3,959 sample points by the point-centered quarter method. Each mountain was classified into 4-7 forests by using various multivariate statistical methods such as cluster analysis, indicator species analysis, multiple discriminant analysis, and species composition analysis. The forests were classified mainly according to the relative abundance of Quercus mongolica. A total of 111 forests were classified, and these were integrated into the following nine forest cover types using the percentage similarity index and by clustering according to vegetation type: 1) Mongolian oak, 2) Mongolian oak and other deciduous, 3) oaks (mixed Quercus spp.), 4) Korean red pine, 5) Korean red pine and oaks, 6) ash, 7) mixed mesophytic, 8) subalpine zone coniferous, and 9) miscellaneous forest. The subalpine zone coniferous type groups forests characterized by similar environmental conditions, and the miscellaneous type groups forests that did not fit any other category.
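
The abstract does not define the percentage similarity index it uses. A common form (often attributed to Renkonen) sums, over species, the smaller of the two relative abundances; the sketch below assumes that form. The function name and the abundance counts are hypothetical.

```python
def percentage_similarity(abund_a, abund_b):
    """Percentage similarity (0-100) between two forests.

    abund_a, abund_b: dicts mapping species name to abundance. Each is
    converted to relative abundance, and the species-wise minima are summed.
    """
    total_a = sum(abund_a.values())
    total_b = sum(abund_b.values())
    species = set(abund_a) | set(abund_b)
    shared = sum(
        min(abund_a.get(s, 0) / total_a, abund_b.get(s, 0) / total_b)
        for s in species
    )
    return 100.0 * shared

# Made-up abundance counts for two forests (illustrative only).
forest_a = {"Quercus mongolica": 60, "Pinus densiflora": 40}
forest_b = {"Quercus mongolica": 30, "Fraxinus rhynchophylla": 70}
similarity = percentage_similarity(forest_a, forest_b)
```

Forests with identical composition score 100 and forests with no species in common score 0, so a matrix of pairwise values can feed directly into a clustering step.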

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh; Cho, HyungJun; Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data with a nonlinear relationship between the explanatory variables and the response variables. Furthermore, as the complexity of the data increases, multiple response variables need to be analysed simultaneously with more sophisticated interpretations. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method without significant selection bias. We investigate the performance of our proposed method with both simulation and real data studies.
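
Quantile regression rests on the check (pinball) loss, whose minimiser over constants is a sample quantile. The sketch below shows only that building block for a single response, for example the value a tree leaf would predict; it is not the split variable selection algorithm proposed in the paper, and the function names are assumptions.

```python
def check_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def best_constant(values, tau):
    """Constant minimising the total check loss.

    A minimiser is always attained at one of the observed values, so it
    is enough to search over the data points themselves.
    """
    return min(values, key=lambda c: sum(check_loss(v - c, tau) for v in values))

responses = [1, 2, 3, 4, 100]
median_fit = best_constant(responses, tau=0.5)
upper_fit = best_constant(responses, tau=0.9)
```

With `tau=0.5` the loss is half the absolute error, so the fit is the median and is unaffected by the extreme value; raising `tau` penalises under-prediction more heavily and moves the fit toward the upper tail.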

Bayesian inference on multivariate asymmetric jump-diffusion models (다변량 비대칭 라플라스 점프확산 모형의 베이지안 추론)

  • Lee, Youngeun; Park, Taeyoung
    • The Korean Journal of Applied Statistics / v.29 no.1 / pp.99-112 / 2016
  • Asymmetric jump-diffusion models are effectively used to model the dynamic behavior of asset prices with abrupt asymmetric upward and downward changes. However, estimation of their extension to the multivariate asymmetric jump-diffusion model has been hampered by an analytically intractable likelihood function. This article addresses the problem with a data augmentation method and proposes a new Bayesian method for a multivariate asymmetric Laplace jump-diffusion model. Unlike previous models, the proposed model is rich enough to incorporate all possible correlated jumps as well as individual and common jumps. The proposed model and methodology are illustrated with a simulation study and applied to daily returns of the KOSPI, S&P500, and Nikkei225 indices from January 2005 to September 2015.
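
For orientation, one standard univariate jump-diffusion with asymmetric double-exponential jumps (in the style of Kou's model) is sketched below. It is an assumed illustration of the model class, not the multivariate specification of the paper, and all symbols are introduced here: $r_t$ is the log return over an interval $\Delta t$, $N_t$ counts the jumps in that interval, and $Y_i$ are the jump sizes.

$$
r_t = \left(\mu - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\sqrt{\Delta t}\, Z_t + \sum_{i=1}^{N_t} Y_i, \qquad Z_t \sim N(0,1), \quad N_t \sim \mathrm{Poisson}(\lambda \Delta t)
$$

$$
f_Y(y) = p\,\eta_1 e^{-\eta_1 y}\,\mathbf{1}\{y \ge 0\} + (1-p)\,\eta_2 e^{\eta_2 y}\,\mathbf{1}\{y < 0\}, \qquad \eta_1, \eta_2 > 0, \quad 0 \le p \le 1
$$

Different values of $\eta_1$ and $\eta_2$ give upward and downward jumps different average sizes. Because $N_t$ and the $Y_i$ are unobserved, the likelihood of $r_t$ involves a sum over all possible jump counts, which is why treating them as latent variables through data augmentation is attractive.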

PROCESS ANALYSIS OF AUTOMOTIVE PARTS USING GRAPHICAL MODELLING

  • IRIKURA Norio; KUZUYA Kazuyoshi; NISHINA Ken
    • Proceedings of the Korean Society for Quality Management Conference / 1998.11a / pp.295-300 / 1998
  • Recently, graphical modelling has been studied as a useful process analysis tool for exploratory causal analysis. Graphical modelling is a presentation method that uses graphs to describe statistical models of the structure of multivariate data. This paper describes an application of graphical modelling to two cases from the automotive parts industry. The first case is the unbalance problem of the pulley, an automotive generator part. Multivariate data on the product are available from each of the processes, which are connected in series. Through exploratory causal analysis between the variables using graphical modelling, the key processes that cause the variation in the final characteristics, and the mechanism of the causal relationship, have become clear. The second case is also an unbalance problem, this time in automotive starter parts, which consist of many components and are manufactured by complex machining and assembly processes. Using a similar technique, the key processes are identified easily, and the results are reasonable in light of technical knowledge.


Estimation of anthropometric body dimensions and joint strengths of a worker performing manual materials handling tasks using a multivariate normal simulation model (다변량 정규분포 모의모형을 이용한 물자운반작업을 수행하는 작업자의 인체 치수 및 관절염력의 예측에 관한 연구)

  • 변승남
    • Journal of the Ergonomics Society of Korea / v.12 no.2 / pp.63-83 / 1993
  • The primary objective of the research is to develop a mathematical method to incorporate the variability of anthropometric body dimensions and joint strengths of individuals in a biomechanical analysis. A multivariate normal simulation model estimated the anthropometric body dimensions and joint strengths of the random link-person, based on the assumptions that the variables of body dimensions and joint strengths are correlated and follow normal distributions. Statistical comparative analysis demonstrated that the random link-person represented a more realistic human-like form in an anthropometric sense than the proportional link-person, whose body dimensions were estimated proportionally. Estimated joint strengths for the random link-person, however, did not match the measured joint strengths as closely as the estimated body dimensions did. The random link-person will allow biomechanical analysis of manual materials handling tasks to be individualized with respect to anthropometry and static strength.
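
The core of a multivariate normal simulation is drawing correlated normal variables. The sketch below does this for just two variables, one body dimension and one joint strength, using the Cholesky factor of a 2x2 correlation matrix. The function name, the means, standard deviations, and correlation are made-up illustrative values, not figures from the study.

```python
import math
import random

def sample_link_person(mu, sigma, rho, rng):
    """Draw one (dimension, strength) pair from a bivariate normal.

    mu, sigma: (mean, sd) pairs for the two variables; rho: correlation.
    Two independent standard normals are mixed so that the second
    variable has correlation rho with the first.
    """
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    dimension = mu[0] + sigma[0] * z1
    strength = mu[1] + sigma[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return dimension, strength

rng = random.Random(0)  # fixed seed so the simulation is repeatable
people = [sample_link_person((170.0, 60.0), (6.0, 15.0), 0.6, rng)
          for _ in range(20000)]
```

Sampling the variables jointly, instead of scaling everything from a single dimension, is what lets a simulated person be tall but relatively weak or short but relatively strong in realistic proportions.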


Multivariate Gamma-Poisson Model and Parameter Estimation for Polytomous Data : Application to Defective Pixels of LCD (다가자료에 적합한 다변수 감마-포아송 모델과 파라미터 추정방법 : LCD 화소불량 응용)

  • Ha, Jung-Hoon
    • Journal of Korean Society of Industrial and Systems Engineering / v.34 no.1 / pp.42-51 / 2011
  • The Poisson model and the Gamma-Poisson model are widely used to analyze the statistical behavior of defect data. These methods are based on a binary criterion, that is, good or defective. However, manufacturing industries prefer polytomous criteria for classifying manufactured products because of the flexibility they offer in marketing. In this paper, I introduce two multivariate Gamma-Poisson (MGP) models that can handle polytomous data, together with methods for estimating their parameters. The models and estimators are verified on defective pixels in LCD manufacturing. Experimental results show that both the independent MGP model and the multinomial MGP model perform excellently in terms of mean absolute deviation, and that the choice of method depends on the purpose of use.
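
The univariate Gamma-Poisson model that this abstract builds on can be sketched as follows: a Poisson count whose rate follows a Gamma distribution has a negative binomial marginal distribution. The code covers only this univariate building block, not the multivariate MGP models of the paper; the function name and the shape and rate parameterisation are assumptions.

```python
import math

def gamma_poisson_pmf(k, shape, rate):
    """P(X = k) when X | lam ~ Poisson(lam) and lam ~ Gamma(shape, rate).

    Integrating out lam gives a negative binomial with success
    probability p = rate / (1 + rate). Computed in log space for stability.
    """
    p = rate / (1.0 + rate)
    log_pmf = (
        math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
        + shape * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)

# Defect-count distribution for illustrative parameters shape=2, rate=1.
probabilities = [gamma_poisson_pmf(k, shape=2.0, rate=1.0) for k in range(200)]
```

The mean count is `shape / rate` and the variance is `shape * (1 + rate) / rate**2`, which always exceeds the mean. That extra spread relative to a plain Poisson model is the usual reason for using a Gamma-Poisson model on defect counts.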