• Title/Summary/Keyword: Multivariate Statistical Methods

Search Result 463, Processing Time 0.02 seconds

Selection probability of multivariate regularization to identify pleiotropic variants in genetic association studies

  • Kim, Kipoong;Sun, Hokeun
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.5
    • /
    • pp.535-546
    • /
    • 2020
  • In genetic association studies, pleiotropy is a phenomenon where a variant or a genetic region affects multiple traits or diseases. There have been many studies identifying cross-phenotype genetic associations. But, most of statistical approaches for detection of pleiotropy are based on individual tests where a single variant association with multiple traits is tested one at a time. These approaches fail to account for relations among correlated variants. Recently, multivariate regularization methods have been proposed to detect pleiotropy in analysis of high-dimensional genomic data. However, they suffer a problem of tuning parameter selection, which often results in either too many false positives or too small true positives. In this article, we applied selection probability to multivariate regularization methods in order to identify pleiotropic variants associated with multiple phenotypes. Selection probability was applied to individual elastic-net, unified elastic-net and multi-response elastic-net regularization methods. In simulation studies, selection performance of three multivariate regularization methods was evaluated when the total number of phenotypes, the number of phenotypes associated with a variant, and correlations among phenotypes are different. We also applied the regularization methods to a wild bean dataset consisting of 169,028 variants and 17 phenotypes.

Optimal Designs for Multivariate Nonparametric Kernel Regression with Binary Data

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods
    • /
    • v.2 no.2
    • /
    • pp.243-248
    • /
    • 1995
  • The problem of optimal design for a nonparametric regression with binary data is considered. The aim of the statistical analysis is the estimation of a quantal response surface in two dimensions. Bias, variance and IMSE of kernel estimates are derived. The optimal design density with respect to asymptotic IMSE is constructed.

  • PDF

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.1
    • /
    • pp.1-13
    • /
    • 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We showed that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrated that the efficiency of multivariate regression estimator can be improved by using Ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to the biomarker data collected in China Health and Nutrition Survey (CHNS 2009) data.

A Comparison of Methods for the Detection of Outliers in Multivariate Data

  • Hadi, Ali-S.;Joo, Hye-Seon;Son, Mun-S.
    • Communications for Statistical Applications and Methods
    • /
    • v.3 no.2
    • /
    • pp.53-67
    • /
    • 1996
  • Numerous classical as well as robust methods have been proposed in the literature for the detection of multiple outlier in multivariate data. The effectiveness and power of each of these methods have not been thoroughly investigated. In this paper we first reduce the vast number of outlier detection methods to a small number of viable ones. This reduction is based on previous work of other researches and on some theoretical arguments. Then we design and implement a Monte Carlo experiment for comparing these methods. The main goal of our study is to determine which methods are most powerful in the detection of multiple outlier and in dealing with the masking and swamping problems. The results of the Monte Carlo study indicate that two of the methods seem to hace better performances than the others for the detection of multiple outlier in multivariate data.

  • PDF

Influence Analysis of the Liklihood Ratio Test in Multivariate Behrens-Fisher Problem

  • Jung, Kang-Mo;Kim, Myung-Geun
    • Communications for Statistical Applications and Methods
    • /
    • v.6 no.3
    • /
    • pp.939-946
    • /
    • 1999
  • We propose methods for detecting influential observations that have a large influence on the likelihood ratio test statistic for the multivariate Behrens-Fisher problem. For this purpose we derive the influence curve and the derivative influence of the likelihood ratio test statistic. An illustrative example is given to show the effectiveness of the proposed methods on the identification of influential observations.

  • PDF

Statistical Methods for Multivariate Missing Data in Health Survey Research (보건조사연구에서 다변량결측치가 내포된 자료를 효율적으로 분석하기 위한 통계학적 방법)

  • Kim, Dong-Kee;Park, Eun-Cheol;Sohn, Myong-Sei;Kim, Han-Joong;Park, Hyung-Uk;Ahn, Chae-Hyung;Lim, Jong-Gun;Song, Ki-Jun
    • Journal of Preventive Medicine and Public Health
    • /
    • v.31 no.4 s.63
    • /
    • pp.875-884
    • /
    • 1998
  • Missing observations are common in medical research and health survey research. Several statistical methods to handle the missing data problem have been proposed. The EM algorithm (Expectation-Maximization algorithm) is one of the ways of efficiently handling the missing data problem based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations. Especially, we adopted the EM algorithm to handle the multivariate missing observations. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We applied the proposed statistical method to analyze data from a health survey. The data set we used came from a physician survey on Resource-Based Relative Value Scale(RBRVS). In addition to the EM algorithm, we applied the complete case analysis, which uses only completely observed cases, and the available case analysis, which utilizes all available information. The residual and normal probability plots were evaluated to access the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those of the complete-case and the available-case analyses.

  • PDF

Asymptotic Relative Efficiencies of Chaudhuri′s Estimators for the Multivariate One Sample Location Problem

  • Park, Kyungmee
    • Communications for Statistical Applications and Methods
    • /
    • v.8 no.3
    • /
    • pp.875-883
    • /
    • 2001
  • We derive the asymptotic relative efficiencies in two special cases of Chaudhuri's estimators for the multivariate one sample problem. And we compare those two when observations are independent and identically distributed from a family of spherically symmetric distributions including normal distributions.

  • PDF

On the Distribution of the Scaled Residuals under Multivariate Normal Distributions

  • Cheolyong Park
    • Communications for Statistical Applications and Methods
    • /
    • v.5 no.3
    • /
    • pp.591-597
    • /
    • 1998
  • We prove (at least empirically) that some forms of the scaled residuals calculated from i.i.d. multivariate normal random vectors are ancillary. We further show that, if the scaled residuals are ancillary, then they have the same distribution whatever form of rotation is rosed to remove sample correlations.

  • PDF

Robust Estimation and Outlier Detection

  • Myung Geun Kim
    • Communications for Statistical Applications and Methods
    • /
    • v.1 no.1
    • /
    • pp.33-40
    • /
    • 1994
  • The conditional expectation of a random variable in a multivariate normal random vector is a multiple linear regression on its predecessors. Using this fact, the least median of squares estimation method developed in a multiple linear regression is adapted to a multivariate data to identify influential observations. The resulting method clearly detect outliers and it avoids the masking effect.

  • PDF