• Title/Summary/Keyword: Hotelling's $T^2$


Notes on identifying source of out-of-control signals in phase II multivariate process monitoring (다변량 공정 모니터링에서 이상신호 발생시 원인 식별에 관한 연구)

  • Lee, Sungim
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.1-11 / 2018
  • Multivariate process control has become important in various applied fields. For instance, manufacturing often requires the simultaneous monitoring of several quality characteristics. Despite its importance, multivariate process control is not convenient to use in practice because it is difficult to identify the source of an out-of-control signal in a multivariate control chart. In this paper, we show how to identify the source of an out-of-control signal using confidence intervals for new observations, and we discuss the identification and interpretation of the out-of-control variable through simulation studies.
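As a reference point for the abstract above, the standard Phase II Hotelling's $T^2$ statistic for a new observation, together with one common way of forming per-variable intervals, can be written as follows. This is a textbook sketch rather than the paper's own procedure: the symbols $n$ (Phase I sample size), $p$ (number of variables), $\bar{\mathbf{x}}$ and $\mathbf{S}$ (Phase I sample mean and covariance), $s_j$ (Phase I standard deviation of variable $j$), and the Bonferroni-adjusted interval are assumptions introduced here for illustration.

$$
T^2 = (\mathbf{x}_{\text{new}} - \bar{\mathbf{x}})^\top \mathbf{S}^{-1} (\mathbf{x}_{\text{new}} - \bar{\mathbf{x}}),
\qquad
\mathrm{UCL} = \frac{p(n+1)(n-1)}{n(n-p)}\, F_{\alpha;\, p,\, n-p}
$$

$$
\bar{x}_j \pm t_{\alpha/(2p);\, n-1}\; s_j \sqrt{1 + \frac{1}{n}}, \qquad j = 1, \dots, p
$$

Under multivariate normality, a signal is raised when $T^2 > \mathrm{UCL}$. A variable whose new value falls outside its interval is then a candidate source of the signal.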

Identification of Differentially Expressed Genes Using Tests Based on Multiple Imputations

  • Kim, Sang Cheol;Yu, Donghyeon
    • Quantitative Bio-Science / v.36 no.1 / pp.23-31 / 2017
  • Datasets from DNA microarray experiments, which take the form of large matrices of gene expression levels, often have missing values. However, existing statistical methods, including principal component analysis (PCA) and Hotelling's $T^2$ test, are not directly applicable to datasets with missing values because they generally assume that the observed dataset is complete. Many methods have been proposed in the literature to impute the missing values in the observed data. Troyanskaya et al. [1] study k-nearest neighbor (kNN) imputation, Kim et al. [2] propose the local least squares (LLS) method, and Rubin [3] proposes multiple imputation (MI) for missing values. To identify differentially expressed genes, we propose a new testing procedure for observed data with missing values. The proposed procedure uses Stouffer's z-scores and combines the test results of the individual imputed samples, which are dependent on each other. We show numerically, by comparing ROC curves, that the proposed test procedure based on MI performs better than existing test procedures based on single imputation (SI). We apply the proposed method to a public microarray dataset.
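The combination step mentioned above can be illustrated with a minimal sketch of Stouffer's method. It is not the authors' procedure: the function name `stouffer_combine`, the use of one-sided p-values, and the optional correlation matrix `corr` used to adjust for dependence among the imputed-sample tests are all assumptions made here for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def stouffer_combine(p_values, corr=None):
    """Combine one-sided p-values (each strictly between 0 and 1).

    corr is an optional k x k correlation matrix (nested lists) of the
    z-scores; if omitted, the tests are treated as independent.
    Returns (combined z-score, combined one-sided p-value).
    """
    # Convert each p-value to a z-score: small p gives large positive z.
    z = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    k = len(z)
    if corr is None:
        variance = float(k)  # Var(sum of k independent N(0,1)) = k
    else:
        # Var(sum z_i) = sum of all entries of the correlation matrix.
        variance = sum(corr[i][j] for i in range(k) for j in range(k))
    z_comb = sum(z) / sqrt(variance)
    p_comb = 1.0 - _STD_NORMAL.cdf(z_comb)
    return z_comb, p_comb
```

With perfectly correlated tests (all entries of `corr` equal to 1), the combined z-score reduces to the single-test z-score, which shows why ignoring dependence among imputed samples would overstate the evidence.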

Analysis on Effect of Health Promotion Program for the Patients with Rheumatoid Arthritis (환자를 위한 건강증진 프로그램의 효과 분석 연구)

  • Oh, Hyun-Soo;Kim, Young-Ran;Park, Won;Song, Jeong-Soo
    • Journal of Korean Academy of Nursing / v.30 no.2 / pp.342-353 / 2000
  • This study used a quasi-experimental design to examine the effect of a 7-week comprehensive health promotion program on pain, depression, and disability. The subjects were regular outpatients of an RA clinic at a university hospital in Inchon from November 11, 1998 to December 24, 1998. The experimental group included 18 patients who participated in an arthritis health promotion program, and the control group included 18 patients who did not. The 7-week health promotion program, which aimed to enhance health promoting skills, was provided to the patients. The effects of this program on the patients' pain, depression, and functional disability were examined. A significant group difference was found on these dependent variables (Hotelling's T = .30, F = 3.11, p = .04). To examine which dependent variables showed significant effects, one-way ANOVAs were performed. There were significant group differences in pain (F = 4.35, p = .05) and in depression (F = 4.22, p = .05). However, no significant group difference in functional disability (F = .04, p = .84) was found. In conclusion, the arthritis health promotion program, which was designed to enhance 11 health promoting skills, can be evaluated as successfully achieving the ultimate goal of enhancing the patients' quality of life. It can also be argued that the improvement in the patients' quality of life was achieved by relieving pain and reducing depression.


Multiple Group Testing Procedures for Analysis of High-Dimensional Genomic Data

  • Ko, Hyoseok;Kim, Kipoong;Sun, Hokeun
    • Genomics & Informatics / v.14 no.4 / pp.187-195 / 2016
  • In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required to identify disease- or trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on individual tests suffer from multiple testing issues such as control of the family-wise error rate and dependence among tests. Moreover, the main interest in genetic association studies is detecting only a few genes associated with a phenotype outcome among tens of thousands of genes. For this reason, regularization procedures, in which a phenotype outcome is regressed on all genomic markers and the regression coefficients are estimated from a penalized likelihood, have been considered a good alternative approach to the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies in which commonly used group testing procedures such as principal component analysis, Hotelling's $T^2$ test, and the permutation test are compared with group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated from the Illumina Infinium HumanMethylation27K BeadChip. We found a large discrepancy between the genes selected by the multiple group testing procedures and those selected by group lasso.

Bearing fault detection through multiscale wavelet scalogram-based SPC

  • Jung, Uk;Koh, Bong-Hwan
    • Smart Structures and Systems / v.14 no.3 / pp.377-395 / 2014
  • Vibration-based fault detection and condition monitoring of rotating machinery, using statistical process control (SPC) combined with statistical pattern recognition methodology, has been widely investigated. In particular, the discrete wavelet transform (DWT) is considered a powerful tool for feature extraction when detecting faults in rotating machinery. Although the DWT significantly reduces the dimensionality of the data, the number of retained wavelet features can still be large. In that case, standard multivariate SPC techniques are not advised, because the sample covariance matrix is likely to be singular, so that the common multivariate statistics cannot be calculated. Although many feature-based SPC methods have been introduced to address this deficiency, most require a parametric distributional assumption that restricts their feasibility to specific process control problems and thus limits their application. This study proposes a nonparametric multivariate control chart method, based on multiscale wavelet scalogram (MWS) features, that overcomes the limitation posed by the parametric assumption in existing SPC methods. The presented approach takes advantage of multi-resolution analysis using the DWT and obtains MWS features with significantly lower dimensionality. We calculate a Hotelling's $T^2$-type monitoring statistic from the MWS features, which has sufficient damage-discrimination ability. A bootstrap approach is used to determine the upper control limit of the monitoring statistic without any distributional assumption. Numerical simulations demonstrate the performance of the proposed control charting method under various damage-level scenarios for a bearing system.
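A bootstrap control limit of the kind described above can be sketched as follows. This is one common percentile-bootstrap scheme, not necessarily the one used in the paper: the function name `bootstrap_ucl`, the averaging of per-resample quantiles, and the default values of `alpha`, `n_boot`, and `seed` are assumptions made here. The input is a list of monitoring-statistic values (for example $T^2$-type values) computed from in-control data.

```python
import random


def bootstrap_ucl(stat_values, alpha=0.01, n_boot=1000, seed=0):
    """Estimate an upper control limit without a distributional assumption.

    Each bootstrap resample of the in-control statistic values yields an
    empirical (1 - alpha) quantile; the UCL is the mean of those quantiles.
    """
    rng = random.Random(seed)  # local generator keeps results reproducible
    n = len(stat_values)
    # Position of the empirical (1 - alpha) quantile, clamped to the last item.
    idx = min(n - 1, int((1.0 - alpha) * n))
    quantiles = []
    for _ in range(n_boot):
        resample = sorted(rng.choices(stat_values, k=n))  # draw with replacement
        quantiles.append(resample[idx])
    return sum(quantiles) / n_boot
```

A new observation would then be flagged when its monitoring statistic exceeds the returned limit. Because the limit is built from resampled in-control values, it can never exceed the largest in-control value observed.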

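A Study on a Self-Organizing-Map-Based Method for Anomaly Detection and Classification of Semiconductor Process Signals (반도체 공정 신호의 이상탐지 및 분류를 위한 자기구상지도 기반 기법에 관한 연구)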

  • Yun, Jae-Jun;Park, Jeong-Sul;Baek, Jun-Geol
    • Proceedings of the Korean Vacuum Society Conference / 2011.02a / pp.36-36 / 2011
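  • Semiconductor process signals are divided into periodic and aperiodic signals. For periodic signals, which have a specific pattern, research is under way on managing them by performing pattern matching on the relevant parameter. Pattern matching, by contrast, cannot be applied to aperiodic signal data. Moreover, because both types of data obtained from semiconductor processes involve a very large number of parameters, the approach currently used in actual processes, which builds a separate control chart for each parameter, wastes considerable cost and time. Research is therefore needed on analysis techniques that can monitor multiple parameters of both data types simultaneously and account for the inherent correlations among the parameters. Existing studies on anomaly detection for periodic signals have proposed dividing the signal into segments and applying an SPC chart to each segment, or treating the value measured at each time point as a variable and applying multivariate statistical analyses such as Hotelling's T-square, PCA, and PLS. These methods have many limitations in analyzing periodic signals with diverse characteristics and in detecting anomalies. This paper therefore proposes a technique, based on the self-organizing map, that reflects the characteristics of signals of various shapes in order to classify signals and detect process anomalies. The proposed technique uses the self-organizing map to map complex (high-dimensional, time-series) signals onto nodes in two dimensions, thereby extracting the features of each signal, and then applies logistic regression to the newly represented features to detect anomalies. The anomaly detection performance of the proposed technique was evaluated using semiconductor process signals containing various anomalous conditions.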


Region Based Image Similarity Search using Multi-point Relevance Feedback (다중점 적합성 피드백방법을 이용한 영역기반 이미지 유사성 검색)

  • Kim, Deok-Hwan;Lee, Ju-Hong;Song, Jae-Won
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.857-866 / 2006
  • The performance of an image retrieval system is usually very low because of the semantic gap between the low-level features and the high-level concept in a query image. Semantically relevant images may exhibit very different visual characteristics and may be scattered across several clusters. In this paper, we propose a content-based image retrieval approach that combines region-based image retrieval with a new relevance feedback method using adaptive clustering. Our main goal is to find semantically related clusters in order to narrow the semantic gap. Our method consists of region-based clustering processes and a cluster-merging process. All segmented regions of relevant images are organized into semantically related hierarchical clusters, and clusters are merged by finding the number of latent clusters. In the cluster-merging process, the method applies $T_v^2$, computed using $v$ principal components, instead of the classical Hotelling's $T^2$ [1] to find the unknown number of clusters and to resolve the singularity problem in high dimensions, and we demonstrate that there is little difference between the performance of $T^2$ and that of $T_v^2$. Experiments demonstrate that the proposed approach is effective in improving the performance of an image retrieval system.

Detection of the Change in Blogger Sentiment using Multivariate Control Charts (다변량 관리도를 활용한 블로거 정서 변화 탐지)

  • Moon, Jeounghoon;Lee, Sungim
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.903-913 / 2013
  • Social network services generate a considerable amount of social data every day on personal feelings or thoughts. These social data reveal changing patterns of information production and consumption and also serve as a tool that reflects social phenomena. We analyze negative emotional words from daily blogs to detect changes in blogger sentiment using multivariate control charts. We used all the blogs produced between 1 January 2008 and 31 December 2009. Hotelling's T-square control chart is commonly used to monitor multivariate quality characteristics; however, it assumes that the quality characteristics follow a multivariate normal distribution. The performance of a multivariate control chart is affected by this assumption; consequently, we introduce the support vector data description and its extension (the K-control chart) suggested by Sun and Tsung (2003) and apply them to detect the change in blogger sentiment.

Fault Detection in LDPE Process using Machine Learning Techniques (머신러닝 기법을 활용한 LDPE 공정의 이상 감지)

  • Lee, Changsong;Lee, Kyu-Hwang;Lee, Hokyung
    • Korean Chemical Engineering Research / v.58 no.2 / pp.224-229 / 2020
  • We propose a machine learning-based method for proactively detecting faults in LDPE processes and predicting equipment lifespan. It is important to detect and prevent unexpected faults in chemical processes in order to maximize safety and productivity. Since the LDPE process is a high-pressure process operating at up to 3,000 kg/㎠g or more, an ESD, once it occurs, can cause productivity loss due to longer maintenance periods. By collecting operation data for the key variables of the process and using unsupervised machine learning methods, we developed a fault detection model that detected 4 ESDs 2.4 days before they occurred. In addition, we confirmed that the life expectancy of a hyper compressor can be predicted by using the physically significant key variables.

A Study on Fault Detection of Cycle-based Signals using Wavelet Transform (웨이블릿을 이용한 주기 신호 데이터의 이상 탐지에 관한 연구)

  • Lee, Jae-Hyun;Kim, Ji-Hyun;Hwang, Ji-Bin;Kim, Sung-Shick
    • Journal of the Korea Society for Simulation / v.16 no.4 / pp.13-22 / 2007
  • Fault detection of cycle-based signals is typically performed using statistical approaches. Univariate SPC using a few representative statistics and multivariate analysis methods such as PCA and PLS are the most popular methods for analyzing cycle-based signals. However, such approaches are limited when dealing with information-rich cycle-based signals. In this paper, a process fault detection method based on wavelet analysis is proposed. Using the Haar wavelet, coefficients that reflect the process condition well are selected. Next, a Hotelling's $T^2$ chart using the selected coefficients is constructed to assess the process condition. To enhance the overall efficiency of fault detection, two steps are suggested: a denoising method based on the wavelet transform and coefficient selection methods using variance difference. For performance evaluation, various types of abnormal process conditions are simulated and the proposed algorithm is compared with other methodologies.
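The Haar decomposition underlying the method above can be sketched in a few lines. This shows only the standard orthonormal Haar transform. The function name `haar_dwt` and its output layout are assumptions made here, and the paper's denoising and variance-based coefficient selection steps are not reproduced.

```python
from math import sqrt


def haar_dwt(signal, levels):
    """Orthonormal Haar DWT of one cycle of a signal.

    Returns [approx, detail_coarsest, ..., detail_finest]. The signal
    length must be divisible by 2**levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail = scaled difference of neighbours; coarser levels go first.
        details.insert(0, [(a - b) / sqrt(2) for a, b in pairs])
        # Approximation = scaled sum of neighbours, passed to the next level.
        approx = [(a + b) / sqrt(2) for a, b in pairs]
    return [approx] + details
```

Flattening the returned lists gives one coefficient vector per cycle. A subset of those coefficients could then serve as the variables of a Hotelling's $T^2$ chart.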
