• Title/Summary/Keyword: EM algorithm

Nonignorable Nonresponse Imputation and Rotation Group Bias Estimation on the Rotation Sample Survey (무시할 수 없는 무응답을 가지고 있는 교체표본조사에서의 무응답 대체와 교체그룹 편향 추정)

  • Choi, Bo-Seung;Kim, Dae-Young;Kim, Kee-Whan;Park, You-Sung
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.361-375 / 2008
  • We propose methods for imputing item nonresponse in a 4-8-4 rotation sample survey. We consider a nonignorable nonresponse mechanism, which can arise when a survey asks sensitive questions (e.g., income, labor force). We use a model-based imputation method built on a Bayesian approach to avoid the boundary solution problem. We also estimate the interview time bias from the imputed data and, after removing the estimated bias, calculate cell expectations and marginal probabilities at a fixed time. Simulation studies compare the mean squared error and bias of the maximum likelihood and Bayesian methods.

Reanalysis of 2002 Donation Frequency Data: Corrections and Supplements (2002년 기부횟수 자료의 재분석: 수정 및 보완)

  • Kim, Byung Soo;Lee, Juhyung;Kim, Inyoung;Park, Su-Bum;Park, Tae-Kyu
    • The Korean Journal of Applied Statistics / v.27 no.5 / pp.743-753 / 2014
  • Kim et al. (2006) and Kim et al. (2009) reported a set of explanatory variables affecting donation frequency when they analyzed nationwide survey data on donations collected in 2002 by Volunteer 21, a nonprofit organization in Korea. The primary purpose of this paper is to correct computational errors found in Kim et al. (2006) and Kim et al. (2009), to rectify major results in the Tables and Figures and to supplement Kim et al. (2009) by providing new results. We add two logistic regressions to the ZIP and a mixture of two Poisson regressions of Kim et al. (2009). Through these two logistic regressions we could detect a set of explanatory variables affecting donation activity (0 or 1) and another set of explanatory variables, in which the volunteer (0, 1) variable is common, discriminating the infrequent donor group from the frequent donor group.
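
The abstract above mentions a mixture of two Poisson regressions. As a minimal sketch of how such a mixture can be fitted with the EM algorithm, the following Python code handles the simpler case of two Poisson components without covariates. This is an illustration only: the abstract does not describe the authors' fitting procedure, and the function names, starting values, and toy data are assumptions made for the example.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass at k, computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def em_poisson_mixture(counts, lam1=0.5, lam2=5.0, pi=0.5, n_iter=200, tol=1e-10):
    """Fit pi * Poisson(lam1) + (1 - pi) * Poisson(lam2) to count data by EM.

    Hypothetical helper: no covariates, fixed starting values.
    Returns (pi, lam1, lam2, ll), where ll is the log-likelihood evaluated
    at the parameters used in the last E-step.
    """
    n = len(counts)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation.
        resp = []
        ll = 0.0
        for k in counts:
            a = pi * poisson_pmf(k, lam1)
            b = (1.0 - pi) * poisson_pmf(k, lam2)
            resp.append(a / (a + b))
            ll += math.log(a + b)
        # M-step: weighted proportion and weighted means.
        w1 = sum(resp)
        w2 = n - w1
        pi = w1 / n
        lam1 = sum(r * k for r, k in zip(resp, counts)) / w1
        lam2 = sum((1.0 - r) * k for r, k in zip(resp, counts)) / w2
        # Stop once the log-likelihood no longer improves noticeably.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, lam1, lam2, ll


if __name__ == "__main__":
    # Toy data (invented): mostly infrequent counts plus a few frequent ones.
    data = [0, 0, 1, 0, 1, 0, 0, 1, 8, 9, 7, 10]
    print(em_poisson_mixture(data))
```

In a regression setting, each component mean would depend on explanatory variables, and the M-step would become a weighted Poisson regression for each component rather than a weighted average.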

Analysis of Missing Data Using an Empirical Bayesian Method (경험적 베이지안 방법을 이용한 결측자료 연구)

  • Yoon, Yong Hwa;Choi, Boseung
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.1003-1016 / 2014
  • Proper imputation of missing data is an important step toward obtaining sound results from analyses based on survey data. This paper deals with both a model-based imputation method and a model estimation method. We used a Bayesian method to solve the boundary solution problem that arises when maximum likelihood estimation is applied. We also address the problem of selecting a missing-data mechanism model using forecasting results and a comparison of model accuracies. We used the MWPE (modified within precinct error) (Bautista et al., 2007) to measure prediction accuracy. We applied the proposed ML and Bayesian methods to exit poll data from the 2012 Korean presidential election. In this analysis, the missing at random mechanism gave better predictions than the missing not at random mechanism.

A Comparison of Bayesian and Maximum Likelihood Estimations in a SUR Tobit Regression Model (SUR 토빗회귀모형에서 베이지안 추정과 최대가능도 추정의 비교)

  • Lee, Seung-Chun;Choi, Byongsu
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.991-1002 / 2014
  • Both Bayesian and maximum likelihood methods are efficient for estimating the regression coefficients of various Tobit regression models (see, e.g., Chib, 1992; Greene, 1990; Lee and Choi, 2013); however, some researchers have recognized that the maximum likelihood method tends to underestimate the disturbance variance, which has implications for the estimation of marginal effects and the asymptotic standard errors of the estimates. We examine this underestimation by the maximum likelihood estimate in a seemingly unrelated Tobit regression model. A Bayesian method based on an objective noninformative prior is shown to provide proper estimates of the disturbance variance as well as of the other regression parameters.

A Study on Shape Variability in Canonical Correlation Biplot with Missing Values (결측값이 있는 정준상관 행렬도의 형상변동 연구)

  • Hong, Hyun-Uk;Choi, Yong-Seok;Shin, Sang-Min;Ka, Chang-Wan
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.955-966 / 2010
  • The canonical correlation biplot is a useful tool for graphically describing a data matrix in terms of the association between two sets of variables, for detecting patterns, and for displaying results found by more formal methods of analysis. However, most biplots are not directly applicable when some data values are missing. To address this, we impute the missing data using the median, the mean, the EM algorithm, and MCMC imputation at several missing rates. The resulting biplots differ in shape depending on the imputation method and the missing rate. We therefore use the RMS (root mean square) proposed by Shin et al. (2007) and the PS (Procrustes statistic) to measure and compare the shape variability between the original biplots and the estimated biplots.

Generalized Linear Mixed Model for Multivariate Multilevel Binomial Data (다변량 다수준 이항자료에 대한 일반화선형혼합모형)

  • Lim, Hwa-Kyung;Song, Seuck-Heun;Song, Ju-Won;Cheon, Soo-Young
    • The Korean Journal of Applied Statistics / v.21 no.6 / pp.923-932 / 2008
  • We are likely to face complex multivariate data characterized by a non-trivial correlation structure. For instance, omitted covariates may simultaneously affect more than one count in clustered data; hence, modeling the correlation structure is important for the efficiency of the estimator and for computing correct standard errors, i.e., for valid inference. A standard way to introduce dependence among counts is to assume that they share some common unobservable variables. Under this assumption, we fitted correlated random effect models within a multilevel framework. Estimation was carried out with a semiparametric approach based on a finite mixture EM algorithm, without parametric assumptions on the distribution of the random coefficients.

Factored MLLR Adaptation for HMM-Based Speech Synthesis in Naval-IT Fusion Technology (인자화된 최대 공산선형회귀 적응기법을 적용한 해양IT융합기술을 위한 HMM기반 음성합성 시스템)

  • Sung, June Sig;Hong, Doo Hwa;Jeong, Min A;Lee, Yeonwoo;Lee, Seong Ro;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.38C no.2 / pp.213-218 / 2013
  • One of the most popular approaches to parameter adaptation in hidden Markov model (HMM) based systems is the maximum likelihood linear regression (MLLR) technique. In our previous study, we proposed factored MLLR (FMLLR), in which each MLLR parameter is defined as a function of a control vector. We presented a method for training the FMLLR parameters based on the general framework of the expectation-maximization (EM) algorithm. With the proposed algorithm, supplementary information that cannot be included in the models is effectively reflected in the adaptation process. In this paper, we apply the FMLLR algorithm to the pitch sequence as well as to the spectrum parameters. In a series of experiments on the artificial generation of expressive speech, we evaluate the performance of the FMLLR technique and compare it with other approaches to parameter adaptation in HMM-based speech synthesis.

High Speed AES Implementation on 64 bits Processors (64-비트 프로세서에서 AES 고속 구현)

  • Jung, Chang-Ho;Park, Il-Hwan
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.6A / pp.51-61 / 2008
  • This paper proposes a new way to implement high-speed AES on Intel Core2 and AMD Athlon64 processors, which are in widespread use today. On Core2 processors of the EM64T architecture, memory-access instructions are processed less efficiently than arithmetic instructions, so earlier AES implementation techniques, which rely heavily on memory-access instructions, can cause a memory bottleneck. To address this problem, we present a partial round key technique that reduces the proportion of memory-access instructions. On an Intel Core2Duo 3.0 GHz processor, the result is 185 cycles/block and a throughput of 2.0 Gbps in ECB mode. This is 35 cycles/block faster than Bernstein's software, which is known as the fastest implementation. On processors of the AMD64 architecture, removing bottlenecks that occur during decoding improved the speed, and the Athlon64 processor reached 170 cycles/block. This matches the performance of Matsui's unpublished software.

Statistical Issues in SNP and Haplotype Analysis (SNP과 Haplotype 분석의 통계적 문제점들)

  • Kim, Ho;Jo, Seong-Il;Seo, Yu-Sin;Hyeon, Sun-Ju;No, Jae-Jeong;Lee, Bok-Ju
    • Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.203-207 / 2002
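  • With the arrival of the post-genome era, sequence information for the entire human genome has become available. Research on SNPs (single nucleotide polymorphisms) is being actively pursued to use this information to explain the diversity observed among humans. However, because the chromosomes of human somatic cells come in pairs, one must consider in which combination across the pair (haplotype) this information appears. Because handling this experimentally is currently subject to various constraints, efforts are being made to model it statistically (in silico haplotyping). In this paper, we introduce statistical genomics through a review of representative algorithms for determining haplotypes statistically, such as Clark's algorithm and the E-M algorithm.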
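
To make the idea of in silico haplotyping with the E-M algorithm concrete, the sketch below estimates haplotype frequencies for two biallelic SNPs from unphased genotypes. It is an illustrative toy, not the method reviewed in the paper: it assumes Hardy-Weinberg equilibrium, codes each genotype as the number of alternative alleles (0, 1, or 2) at each SNP, and uses hypothetical function names and invented data.

```python
from itertools import combinations_with_replacement

# The four possible haplotypes over two biallelic SNPs (0 = reference allele).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Unordered haplotype index pairs whose allele counts sum to the genotype."""
    return [
        (i, j)
        for i, j in combinations_with_replacement(range(len(HAPLOTYPES)), 2)
        if all(HAPLOTYPES[i][s] + HAPLOTYPES[j][s] == genotype[s] for s in range(2))
    ]


def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-12):
    """Estimate the four haplotype frequencies by EM (assumes Hardy-Weinberg)."""
    freqs = [0.25] * len(HAPLOTYPES)
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        expected = [0.0] * len(HAPLOTYPES)
        for g in genotypes:
            pairs = compatible_pairs(g)
            # E-step: probability of each compatible pair given current freqs.
            weights = [(2 if i != j else 1) * freqs[i] * freqs[j] for i, j in pairs]
            total = sum(weights)
            for (i, j), w in zip(pairs, weights):
                expected[i] += w / total
                expected[j] += w / total
        # M-step: new frequencies are the expected haplotype counts, normalized.
        new_freqs = [c / n_chrom for c in expected]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


if __name__ == "__main__":
    # Invented sample: only the double heterozygote (1, 1) is phase-ambiguous.
    sample = [(0, 0)] * 4 + [(2, 2)] * 3 + [(1, 1)] * 3
    print(dict(zip(HAPLOTYPES, em_haplotype_freqs(sample))))
```

With two SNPs, only the double heterozygote is ambiguous: it can arise from the haplotype pair (0,0)/(1,1) or from (0,1)/(1,0), and the E-step splits each such individual between the two explanations in proportion to their current probabilities.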

Evaluation of DEM Accuracy from ASTER Data (ASTER 데이터에 의해 추출한 DEM 의 정확도 평가)

  • Na, Sang-Il;Park, Jong-Hwa;Shin, Hyoung-Sub
    • Proceedings of the Korea Water Resources Association Conference / 2008.05a / pp.2028-2032 / 2008
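  • DEMs (digital elevation models) are widely used in many fields, including urban planning, road construction planning, and the prediction of flood-damage areas. In particular, they serve as important input data when runoff models are applied to analyze and manage river disasters such as floods. However, the DEM production methods currently used in Korea involve complicated procedures, require unavoidable data conversion, and are limited in their range of application. Research and technology development have therefore been carried out on using satellite imagery as an accurate DEM generation method that can overcome the limitations of existing methods. In this study, we evaluate the accuracy of a DEM extracted by applying a widely used DEM generation algorithm to ASTER satellite imagery. The USGS DEM was used for the accuracy assessment. The RMSE of the orthorectification converged to 2 pixels with 6 GCPs, and elevation values in cloud-covered areas were higher than the actual terrain. In addition, the elevations of ridges running to the northeast of the Geum River basin were underestimated by the ASTER DEM relative to the USGS DEM, whereas the basin on the left side of the image is a flat area where the ASTER DEM and the USGS DEM showed almost no difference. Likewise, ridge elevations in areas such as forests were underestimated by the ASTER DEM relative to the USGS DEM, while the DEMs of flat areas such as basins showed almost no difference.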