• Title/Summary/Keyword: Multi-statistical Methods

Unrelated Question Model in Sensitive Multi-Character Surveys

  • Sidhu, Sukhjinder Singh;Bansal, Mohan Lal;Kim, Jong-Min;Singh, Sarjinder
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.1
    • /
    • pp.169-183
    • /
    • 2009
  • The simplicity and wide applicability of the Greenberg et al. (1971) model prompt us to propose a set of alternative estimators of the population total for multi-character surveys that elicit simultaneous information on many sensitive study variables. The proposed estimators take into account the already known rough value of the correlation coefficient between Y (the characteristic under study) and p (the measure of size). These estimators are biased, but the extent of the bias is expected to be smaller, since the proposed estimators suit situations in between those optimal for the usual estimators and those optimal for the multi-character estimators that assume no correlation. The relative efficiency of the proposed estimators has been studied empirically under a super population model. A simulation study shows that the choice of an unrelated variable in the Greenberg et al. (1971) model could be based on its correlation with the auxiliary variable used at the estimation stage in multi-character surveys.
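
As a rough illustration of the randomized-response idea behind the unrelated question model, the sketch below assumes the simplest setting: each respondent reports the sensitive value with a known probability and an unrelated value otherwise, and the mean of the unrelated variable is known. The names `simulate_responses`, `estimate_sensitive_mean`, `p_sensitive`, and `unrelated_mean` are hypothetical, and this is not the set of estimators proposed in the paper.

```python
import random


def simulate_responses(sensitive_values, unrelated_values, p_sensitive, rng):
    """Each respondent privately reports the sensitive value with
    probability p_sensitive, and the unrelated value otherwise."""
    return [
        s if rng.random() < p_sensitive else u
        for s, u in zip(sensitive_values, unrelated_values)
    ]


def estimate_sensitive_mean(responses, p_sensitive, unrelated_mean):
    """Estimate the mean of the sensitive variable from scrambled responses.

    Assumption: unrelated_mean is known in advance. Then
    E[response] = p * mu_sensitive + (1 - p) * mu_unrelated,
    which is solved here for mu_sensitive.
    """
    if not 0.0 < p_sensitive <= 1.0:
        raise ValueError("p_sensitive must lie in (0, 1]")
    if not responses:
        raise ValueError("responses must not be empty")
    mean_response = sum(responses) / len(responses)
    return (mean_response - (1.0 - p_sensitive) * unrelated_mean) / p_sensitive


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic populations, purely for illustration.
    sensitive = [rng.gauss(50.0, 10.0) for _ in range(5000)]
    unrelated = [rng.gauss(20.0, 5.0) for _ in range(5000)]
    responses = simulate_responses(sensitive, unrelated, 0.7, rng)
    print(estimate_sensitive_mean(responses, 0.7, 20.0))
```

The interviewer never learns which question a given respondent answered; only the aggregate mean is recoverable, and its variance grows as `p_sensitive` shrinks.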

Selection probability of multivariate regularization to identify pleiotropic variants in genetic association studies

  • Kim, Kipoong;Sun, Hokeun
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.5
    • /
    • pp.535-546
    • /
    • 2020
  • In genetic association studies, pleiotropy is a phenomenon where a variant or a genetic region affects multiple traits or diseases. Many studies have identified cross-phenotype genetic associations. However, most statistical approaches for detecting pleiotropy are based on individual tests, in which the association of a single variant with multiple traits is tested one variant at a time. These approaches fail to account for relations among correlated variants. Recently, multivariate regularization methods have been proposed to detect pleiotropy in the analysis of high-dimensional genomic data. However, they suffer from the problem of tuning parameter selection, which often results in either too many false positives or too few true positives. In this article, we apply selection probability to multivariate regularization methods in order to identify pleiotropic variants associated with multiple phenotypes. Selection probability was applied to the individual elastic-net, unified elastic-net, and multi-response elastic-net regularization methods. In simulation studies, the selection performance of the three multivariate regularization methods was evaluated while varying the total number of phenotypes, the number of phenotypes associated with a variant, and the correlations among phenotypes. We also applied the regularization methods to a wild bean dataset consisting of 169,028 variants and 17 phenotypes.

PCA vs. ICA for Face Recognition

  • Lee, Oyoung;Park, Hyeyoung;Park, Seung-Jin
    • Proceedings of the IEEK Conference
    • /
    • 2000.07b
    • /
    • pp.873-876
    • /
    • 2000
  • The information-theoretic approach to face recognition is based on compact coding, in which face images are decomposed into a small set of basis images. The most popular method for compact coding is arguably principal component analysis (PCA), on which eigenface methods are based. PCA-based methods exploit only the second-order statistical structure of the data, so higher-order statistical dependencies among pixels are not considered. Independent component analysis (ICA) is a signal processing technique whose goal is to express a set of random variables as linear combinations of statistically independent component variables. ICA exploits the higher-order statistical structure of the data, which contains important information. In this paper, we employ ICA for efficient feature extraction from face images and show that ICA outperforms PCA in the task of face recognition. Experimental results using a simple nearest classifier and a multilayer perceptron (MLP) are presented to illustrate the performance of the proposed method.

Applications of response dimension reduction in large p-small n problems

  • Minjee Kim;Jae Keun Yoo
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.2
    • /
    • pp.191-202
    • /
    • 2024
  • The goal of this paper is to show how multivariate regression analysis with high-dimensional responses is facilitated by response dimension reduction. Multivariate regression, characterized by multi-dimensional response variables, is increasingly prevalent across diverse fields such as repeated measures, longitudinal studies, and functional data analysis. One of the key challenges in analyzing such data is managing the response dimensions, which can complicate the analysis due to an exponential increase in the number of parameters. Although response dimension reduction methods have been developed, there is no practically useful illustration for various types of data such as so-called large p-small n data. This paper aims to fill this gap by showcasing how response dimension reduction can enhance the analysis of high-dimensional response data, thereby providing significant assistance to statistical practitioners and contributing to advancements in multiple scientific domains.

Implementation of simple statistical pattern recognition methods for harmful gases classification using gas sensor array fabricated by MEMS technology (MEMS 기술로 제작된 가스 센서 어레이를 이용한 유해가스 분류를 위한 간단한 통계적 패턴인식방법의 구현)

  • Byun, Hyung-Gi;Shin, Jeong-Suk;Lee, Ho-Jun;Lee, Won-Bae
    • Journal of Sensor Science and Technology
    • /
    • v.17 no.6
    • /
    • pp.406-413
    • /
    • 2008
  • We have implemented simple statistical pattern recognition methods for classifying harmful gases using a gas sensor array fabricated by MEMS (Micro Electro Mechanical System) technology. The performance of a pattern recognition method as a gas classifier depends heavily on the choice of pre-processing techniques for the individual sensor and sensor array signals, and on the selection of an optimal classification algorithm among the various classification techniques. We pre-processed each sensor's signal as well as the sensor array signals to extract features for each gas. To classify harmful gases, we adopted simple statistical pattern recognition algorithms: PCA (Principal Component Analysis) for visualizing pattern clustering and MLR (Multi-Linear Regression) for real-time system implementation. Experimental results for the adopted pattern recognition methods with the pre-processing techniques showed good clustering performance, and the methods are expected to be easy to implement in a real-time sensing system.

Solving Multi-class Problem using Support Vector Machines (Support Vector Machines을 이용한 다중 클래스 문제 해결)

  • Ko, Jae-Pil
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.12
    • /
    • pp.1260-1270
    • /
    • 2005
  • Support Vector Machines (SVM) are well known as a representative learner among the kernel methods. SVM, which is based on statistical learning theory, shows good generalization performance and has been applied to various pattern recognition problems. However, SVM is fundamentally designed for two-class classification, so a multi-class problem cannot be solved directly with a binary SVM. One-Per-Class (OPC) and All-Pairs have been applied with SVM to face recognition, which is a multi-class problem. Both are output coding methods, a general approach to solving a multi-class problem with multiple binary classifiers: a complex multi-class problem is decomposed into a set of binary problems, and the outputs of the binary classifiers for these problems are then recombined. In this paper, we introduce the output coding methods as an approach for extending binary SVM to multi-class SVM and propose new output coding schemes based on Error-Correcting Output Codes (ECOC), a dominant theoretical foundation of the output coding methods. From face recognition experiments, we give empirical results on the properties of output coding methods, including our proposed ones.
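
The output coding idea can be sketched independently of the binary learner. The minimal example below builds One-Per-Class and All-Pairs code matrices and decodes a vector of binary classifier outputs by Hamming distance. The function names and the 7-bit code in `EXHAUSTIVE_4_CLASS` are illustrative assumptions, not the coding schemes proposed in the paper, and the binary classifiers themselves (SVMs in the paper) are left out.

```python
def one_per_class_code(n_classes):
    """Row r is the codeword of class r: +1 for its own classifier, -1 elsewhere."""
    return [[1 if r == c else -1 for c in range(n_classes)]
            for r in range(n_classes)]


def all_pairs_code(n_classes):
    """One column per class pair (i, j): +1 for i, -1 for j, 0 for unused classes."""
    pairs = [(i, j) for i in range(n_classes) for j in range(i + 1, n_classes)]
    matrix = [[0] * len(pairs) for _ in range(n_classes)]
    for col, (i, j) in enumerate(pairs):
        matrix[i][col] = 1
        matrix[j][col] = -1
    return matrix


def hamming_decode(code_matrix, outputs):
    """Return the class whose codeword disagrees with the binary outputs
    in the fewest positions; zero entries are ignored."""
    def distance(codeword):
        return sum(1 for bit, out in zip(codeword, outputs)
                   if bit != 0 and bit != out)
    distances = [distance(row) for row in code_matrix]
    return min(range(len(code_matrix)), key=distances.__getitem__)


# Illustrative 4-class, 7-bit code with minimum Hamming distance 4,
# so any single wrong binary classifier output is still decoded correctly.
EXHAUSTIVE_4_CLASS = [
    [1, 1, 1, 1, 1, 1, 1],
    [-1, -1, -1, -1, 1, 1, 1],
    [-1, -1, 1, 1, -1, -1, 1],
    [-1, 1, -1, 1, -1, 1, -1],
]

if __name__ == "__main__":
    noisy = list(EXHAUSTIVE_4_CLASS[2])
    noisy[0] = -noisy[0]  # simulate one misfiring binary classifier
    print(hamming_decode(EXHAUSTIVE_4_CLASS, noisy))
```

One-Per-Class codewords differ in only two positions, so a single wrong output can already create a tie; longer codes with larger row separation are what give ECOC its error-correcting property.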

Estimating Parameters in Multivariate Normal Mixtures

  • Ahn, Sung-Mahn;Baik, Sung-Wook
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.3
    • /
    • pp.357-365
    • /
    • 2011
  • This paper investigates a penalized likelihood method for estimating the parameters of normal mixtures in multivariate settings with full covariance matrices. The proposed model estimates the number of components by adding a penalty term to the usual likelihood function to construct a penalized likelihood function. We prove the consistency of the estimator and present simulation results on multi-dimensional normal mixtures of up to eight dimensions.

Multivariate Normality Tests Based on Principal Components

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.3
    • /
    • pp.765-777
    • /
    • 2003
  • In this paper, we investigate some measures as tests of multivariate normality based on principal components. The idea was proposed by Srivastava and Hui (1987), who generalized the Shapiro-Wilk statistic to multivariate cases. We show that the null distributions of the statistics do not depend on the unknown parameters and mention the asymptotic null distributions. The power performance of the tests is also assessed in a Monte Carlo study.

Multi-Level Skip-Lot Sampling Plan-Average Fraction Inspected Properties

  • In-Suk Lee;Gyo-Young Cho;Hae-Rim Kim
    • Communications for Statistical Applications and Methods
    • /
    • v.3 no.2
    • /
    • pp.151-159
    • /
    • 1996
  • General formulas for the average fraction inspected, average sample number, and average outgoing quality in an n-level skip-lot sampling plan are derived. The average sample number and average outgoing quality of a reference plan and of three-level, five-level, and ten-level skip-lot sampling plans are compared.
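
For orientation, the single-level case can be sketched as follows. This assumes the commonly cited single-level skip-lot (SkSP-2) relation F = f / (f + (1 - f) P^i), where P is the acceptance probability of the reference plan, i is the clearance number, and f is the fraction of lots inspected while skipping. It does not reproduce the n-level formulas derived in the paper, and the function and parameter names are hypothetical.

```python
def average_fraction_inspected(p_accept, skip_fraction, clearance):
    """Average fraction of lots inspected under a single-level skip-lot plan.

    p_accept      -- probability that a lot is accepted by the reference plan
    skip_fraction -- fraction f of lots inspected during skipping, 0 < f <= 1
    clearance     -- number i of consecutive accepted lots needed to start skipping
    """
    if not 0.0 <= p_accept <= 1.0:
        raise ValueError("p_accept must lie in [0, 1]")
    if not 0.0 < skip_fraction <= 1.0:
        raise ValueError("skip_fraction must lie in (0, 1]")
    if clearance < 1:
        raise ValueError("clearance must be at least 1")
    denominator = skip_fraction + (1.0 - skip_fraction) * p_accept ** clearance
    return skip_fraction / denominator


def average_sample_number(asn_reference, p_accept, skip_fraction, clearance):
    """Average sample number of the skip-lot plan: the reference plan's
    average sample number scaled by the average fraction inspected."""
    return asn_reference * average_fraction_inspected(
        p_accept, skip_fraction, clearance
    )


if __name__ == "__main__":
    # Illustrative values only: good quality (P close to 1) pushes F toward f.
    for p in (0.5, 0.9, 0.99):
        print(p, average_fraction_inspected(p, skip_fraction=0.25, clearance=4))
```

When every lot is accepted (P = 1) the plan inspects only the fraction f of lots, and when no lot is accepted (P = 0) every lot is inspected, which matches the intent of skipping inspection only while quality is good.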

Multi-target Classification Method Based on Adaboost and Radial Basis Function (아이다부스트(Adaboost)와 원형기반함수를 이용한 다중표적 분류 기법)

  • Kim, Jae-Hyup;Jang, Kyung-Hyun;Lee, Jun-Haeng;Moon, Young-Shik
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.47 no.3
    • /
    • pp.22-28
    • /
    • 2010
  • Adaboost is well known as a representative learner among the kernel methods. Adaboost, which is based on statistical learning theory, shows good generalization performance and has been applied to various pattern recognition problems. However, Adaboost is fundamentally designed for two-class classification, so a multi-class problem cannot be solved directly with Adaboost. One-Vs-All and Pair-Wise have been applied to solve the multi-class classification problem. Both are output coding methods, a general approach to solving a multi-class problem with multiple binary classifiers: a complex multi-class problem is decomposed into a set of binary problems, and the outputs of the binary classifiers for these problems are then recombined. However, these two methods do not perform well. In this paper, we propose a method for solving the multi-target classification problem that uses a radial basis function as the Adaboost weak classifier.
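
To make the One-Vs-All baseline concrete, the sketch below trains one binary Adaboost ensemble per class and predicts the class with the highest ensemble score. It assumes one-dimensional inputs and threshold stumps as weak classifiers purely for brevity; the paper's proposal uses radial basis function weak classifiers, which are not reproduced here. All function names are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: returns polarity if x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def train_adaboost(xs, ys, rounds):
    """Binary Adaboost with labels in {-1, +1}; returns (alpha, threshold, polarity) triples."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_score(ensemble, x):
    """Weighted vote of the weak classifiers (before taking the sign)."""
    return sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)


def train_one_vs_all(xs, labels, rounds):
    """Train one binary ensemble per class: that class (+1) against the rest (-1)."""
    return {c: train_adaboost(xs, [1 if y == c else -1 for y in labels], rounds)
            for c in sorted(set(labels))}


def predict(models, x):
    """Return the class whose ensemble gives the highest score."""
    return max(models, key=lambda c: ensemble_score(models[c], x))
```

A usage example would call `train_one_vs_all(xs, labels, rounds=5)` on labeled samples and then `predict(models, x)` for a new point. Comparing raw scores across separately trained ensembles is a known weakness of One-Vs-All, since the scores are not calibrated against each other.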