• Title/Summary/Keyword: Multivariate Statistical Method


Detecting cell cycle-regulated genes using Self-Organizing Maps with statistical Phase Synchronization (SOMPS) algorithm

  • Kim, Chang Sik;Tcha, Hong Joon;Bae, Cheol-Soo;Kim, Moon-Hwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.1 no.2 / pp.39-50 / 2008
  • Developing computational methods for identifying cell cycle-regulated genes has been one of the important topics in systems biology. Most previous methods rely on the periodic characteristics of expression signals to identify cell cycle-regulated genes. We instead assume that cell cycle-regulated genes are relatively active and have relatively many interactions with one another in the underlying cellular network. This motivates us to apply the theory of multivariate phase synchronization to cell cycle expression analysis. In this study, we apply the method known as "Self-Organizing Maps with statistical Phase Synchronization (SOMPS)", a combination of the self-organizing map and multivariate phase synchronization that produces several subsets of genes whose members are expected to interact with each other within their subset (Kim, 2008). Our evaluation experiments show that the SOMPS algorithm detects about as many cell cycle-regulated genes as a recently reported method that performs better than most existing methods.


Modified partial least squares method implementing mixed-effect model

  • Kyunga Kim;Shin-Jae Lee;Soo-Heang Eo;HyungJun Cho;Jae Won Lee
    • Communications for Statistical Applications and Methods / v.30 no.1 / pp.65-73 / 2023
  • Contemporary biomedical data often pose an ill-posed problem owing to a small sample size and a large number of multicollinear variables. The partial least squares (PLS) method can be a plausible alternative to ill-conditioned ordinary least squares. However, how to handle a random effect or mixed effects in a PLS model remains a wide-open question worth further investigation. In the present study, we propose a modified multivariate PLS method implementing a mixed-effect model (PLSM). The advantage of PLSM is its versatility in handling serial longitudinal data and its ability to take a random effect into account. We conduct simulations to investigate the statistical properties of PLSM and showcase a real clinical application: predicting the treatment outcome of esthetic surgical procedures on human faces. The proposed PLSM seemed to be particularly beneficial when 1) the random effect is conspicuous; 2) the number of predictors is relatively large compared to the sample size; 3) the multicollinearity is weak or moderate; and/or 4) the random error is considerable.

A Comparison of Multivariate R-Techniques in SAS, SPSS, Minitab and S-plus (SAS, SPSS, MINITAB, S-PLUS에서 다변량 R-기법의 비교)

  • 최용석;문희정
    • The Korean Journal of Applied Statistics / v.17 no.1 / pp.153-164 / 2004
  • In this study, we compare multivariate R-techniques in the up-to-date versions of SAS, SPSS, Minitab and S-plus. The direct input method of typing commands is considered for SAS, while the menu-driven method is considered for SPSS, Minitab and S-plus. The comparison is made in terms of input data format, input options, charts, and outputs.

FAULT DETECTION, MONITORING AND DIAGNOSIS OF SEQUENCING BATCH REACTOR FOR INTEGRATED WASTEWATER TREATMENT MANAGEMENT SYSTEM

  • Yoo, Chang-Kyoo;Vanrolleghem, Peter A.;Lee, In-Beum
    • Environmental Engineering Research / v.11 no.2 / pp.63-76 / 2006
  • Multivariate analysis and batch monitoring of a pilot-scale sequencing batch reactor (SBR) are described for an integrated wastewater treatment management system, in which a batch-wise multiway independent component analysis (MICA) method is used to extract meaningful hidden information from non-Gaussian wastewater treatment data. The three-way batch data of the SBR are unfolded batch-wise, and a non-Gaussian multivariate monitoring method is then used to capture the non-Gaussian characteristics of normal batches in a biological wastewater treatment plant (WWTP). The method is successfully applied to an 80 L SBR for biological wastewater treatment, which is characterized by a variety of error sources with non-Gaussian characteristics. On this WWTP application, the batch-wise multivariate monitoring of the pilot-scale SBR showed more powerful monitoring performance than the conventional method, because it can extract independent non-Gaussian source signals in addition to the cross-correlation among variables.
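
The batch-wise unfolding step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each batch is stored as a list of time points, each holding one value per variable; the function names `unfold_batchwise` and `center_columns` are hypothetical, and the mean-centering is a common preprocessing choice rather than something the abstract specifies. The ICA step itself is not shown.

```python
from typing import List

Batch = List[List[float]]  # one batch: K time points, each with J variable values


def unfold_batchwise(batches: List[Batch]) -> List[List[float]]:
    """Unfold I batches of shape (K, J) into an I x (K*J) matrix.

    Each batch becomes one row: the J variables at time 1, then at time 2, and so on.
    """
    if not batches:
        return []
    n_times = len(batches[0])
    n_vars = len(batches[0][0])
    rows = []
    for batch in batches:
        if len(batch) != n_times or any(len(obs) != n_vars for obs in batch):
            raise ValueError("all batches must share the same (time, variable) shape")
        rows.append([value for obs in batch for value in obs])
    return rows


def center_columns(matrix: List[List[float]]) -> List[List[float]]:
    """Subtract each column's mean over batches (assumed preprocessing step)."""
    n_rows = len(matrix)
    means = [sum(col) / n_rows for col in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]
```

With two batches of two time points and two variables, `unfold_batchwise([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])` gives the rows `[1, 2, 3, 4]` and `[5, 6, 7, 8]`, so each batch is one observation for the downstream multivariate model.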

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia with machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia from multivariate data consisting of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of how the heart state changes over time can be important information for arrhythmia prediction. We therefore investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data, provided that an appropriate time series distance function is selected.
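
As a rough illustration of 1-nearest neighbor classification on multivariate time series, the sketch below uses dynamic time warping (DTW) as the distance. DTW is an assumption chosen for illustration; the abstract only says that an appropriate time series distance function must be selected. The names `dtw_distance` and `predict_1nn` and the labels in the usage note are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Series = Sequence[Sequence[float]]  # T time points, each a feature vector


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(s: Series, t: Series) -> float:
    """Dynamic time warping distance with Euclidean local cost."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning s[:i] with t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(s[i - 1], t[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def predict_1nn(
    train: List[Tuple[Series, str]],
    query: Series,
    distance: Callable[[Series, Series], float] = dtw_distance,
) -> str:
    """Return the label of the training series closest to the query."""
    _, label = min(train, key=lambda item: distance(item[0], query))
    return label
```

For example, with `train = [([[0.0], [0.0], [1.0]], "normal"), ([[5.0], [5.0], [6.0]], "arrhythmia")]`, the call `predict_1nn(train, [[0.0], [1.0], [1.0]])` returns `"normal"`, because DTW can align the query with the first series at zero cost. Passing a different `distance` function lets the same classifier be compared across distance choices.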

A Simple Nonparametric Test of Complete Independence

  • Park, Cheol-Yong
    • Communications for Statistical Applications and Methods / v.5 no.2 / pp.411-416 / 1998
  • A simple nonparametric test of complete or total independence is suggested for continuous multivariate distributions. This procedure first discretizes the original variables based on their order statistics, and then tests the hypothesis of complete independence for the resulting contingency table. Under the hypothesis of independence, the chi-squared test statistic has an asymptotic chi-squared distribution. We present a simulation study to illustrate the accuracy in finite samples of the limiting distribution of the test statistic. We compare our method to another nonparametric test of complete independence via a simulation study. Finally, we apply our method to the residuals from a real data set.
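
The two steps described in the abstract above (discretize by order statistics, then test independence on the resulting contingency table) can be sketched as follows. This is a minimal illustration under stated assumptions: each variable is split into `n_classes` groups of near-equal size by rank, ties are broken by input order, and only the Pearson chi-squared statistic is computed. The function names are hypothetical, and the reference distribution (degrees of freedom) should be taken from the paper rather than from this sketch.

```python
from collections import Counter
from itertools import product
from typing import List, Sequence


def rank_classes(values: Sequence[float], n_classes: int) -> List[int]:
    """Assign each value to one of n_classes near-equal groups by its rank."""
    n = len(values)
    if n < n_classes:
        raise ValueError("need at least as many observations as classes")
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, index in enumerate(order):
        classes[index] = rank * n_classes // n
    return classes


def independence_chi2(data: Sequence[Sequence[float]], n_classes: int = 2) -> float:
    """Pearson chi-squared statistic for complete independence of the columns."""
    n = len(data)
    n_vars = len(data[0])
    columns = [[row[j] for row in data] for j in range(n_vars)]
    coded = [rank_classes(col, n_classes) for col in columns]
    cells = Counter(zip(*coded))            # observed counts per cell
    margins = [Counter(c) for c in coded]   # one-way marginal counts
    stat = 0.0
    for cell in product(range(n_classes), repeat=n_vars):
        # expected count under complete independence: n times product of marginal proportions
        expected = float(n)
        for j, level in enumerate(cell):
            expected *= margins[j][level] / n
        observed = cells.get(cell, 0)
        stat += (observed - expected) ** 2 / expected
    return stat
```

For eight bivariate observations with `y = x`, the two-class version puts four points in each diagonal cell and none off the diagonal, so the statistic is 8.0; a design with two points in each of the four cells gives 0.0.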


A General Mixed Linear Model with Left-Censored Data

  • Ha, Il-Do
    • Communications for Statistical Applications and Methods / v.15 no.6 / pp.969-976 / 2008
  • Mixed linear models have been widely used for various types of correlated data, including multivariate survival data. In this paper, we extend the hierarchical-likelihood (h-likelihood) approach for mixed linear models with right-censored data to left-censored data. We also allow a general random-effect structure and propose an estimation procedure. The proposed method is illustrated using a numerical data set and is compared with the marginal likelihood method.

Variable Selection Based on Direction Vectors

  • Kyungmee Choi
    • Communications for Statistical Applications and Methods / v.5 no.1 / pp.25-33 / 1998
  • We review a multivariate version of Kendall's tau based on direction vectors of observations. Using this statistic, we propose an analog of the forward variable selection method, which selects a set of independent variables for further study in building the eventual predictive model. This method assumes neither a distribution for the observations nor a linear model, and it is robust to outliers, with high asymptotic efficiency relative to the parametric Pearson correlation coefficient.


Development of MKDE-ebd for Estimation of Multivariate Probabilistic Distribution Functions (다변량 확률분포함수의 추정을 위한 MKDE-ebd 개발)

  • Kang, Young-Jin;Noh, Yoojeong;Lim, O-Kaung
    • Journal of the Computational Structural Engineering Institute of Korea / v.32 no.1 / pp.55-63 / 2019
  • In engineering problems, many random variables are correlated, and the correlation of input random variables has a great influence on the reliability analysis results of mechanical systems. However, correlated variables are often treated as independent or modeled by specific parametric joint distributions because joint distributions are difficult to model. In particular, when correlated data are insufficient, it becomes even more difficult to model the joint distribution correctly. In this study, multivariate kernel density estimation with bounded data is proposed to estimate various types of highly nonlinear joint distributions. Since it combines the given data with bounded data, which are generated from confidence intervals of uniform distribution parameters for the given data, it is less sensitive to the quality and amount of data. Thus, it yields conservative statistical modeling and reliability analysis results, and its performance is verified through statistical simulation and engineering examples.

Order-Restricted Inference with Linear Rank Statistics in Microarray Data

  • Kang, Moon-Su
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.137-143 / 2011
  • The classification of subjects with an unknown distribution in a small sample often involves order-restricted constraints in multivariate parameter setups. Such problems make optimal inference based on the conventional likelihood ratio infeasible. Fortunately, Roy (1953) introduced the union-intersection principle (UIP), which provides an alternative avenue. Multivariate linear rank statistics, combined with that principle, yield an appropriate and robust testing procedure. Furthermore, a conditionally distribution-free test based on exact permutation theory is used to generate p-values, even in a small sample. Applications of this method are illustrated with a real microarray data example (Lobenhofer et al., 2002).
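
To make the idea of permutation-based p-values from a linear rank statistic concrete, here is a minimal sketch for the simplest case: two small groups, a rank-sum statistic, and a one-sided alternative that the treated group tends to have larger values. This is an illustrative simplification, not the multivariate union-intersection procedure of the paper; the function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_rank_sum_pvalue(control: Sequence[float], treated: Sequence[float]) -> float:
    """One-sided exact permutation p-value for the treated group's rank sum.

    Enumerates every way of choosing which pooled observations are 'treated',
    so it is only practical for small samples.
    """
    pooled = list(control) + list(treated)
    ranks = midranks(pooled)
    observed = sum(ranks[len(control):])
    extreme = total = 0
    for subset in combinations(range(len(pooled)), len(treated)):
        total += 1
        # small tolerance guards against float round-off when midranks are used
        if sum(ranks[i] for i in subset) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With `control = [1, 2, 3]` and `treated = [4, 5, 6]`, only one of the 20 possible group assignments has a rank sum as large as the observed 15, so the p-value is 1/20 = 0.05. Because the p-value comes from enumerating assignments, it needs no distributional assumption, which is what makes the approach usable in small samples.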