• Title/Summary/Keyword: Bayesian statistics (베이지안통계)

Bayesian estimation for frequency using resampling methods (재표본 방법론을 활용한 베이지안 주파수 추정)

  • Pak, Ro Jin
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.877-888 / 2017
  • Spectral analysis is used to determine the frequency of time series data. We first determine the frequency of the series through the power spectrum or the periodogram and then calculate the period of a cycle that may exist in the time series. Bayesian techniques for estimating the frequency have been developed and proven useful; however, the Bayesian estimator of the frequency cannot be obtained analytically and is instead handled numerically or computationally. In this paper, we make inferences on the frequency in a Bayesian framework by both resampling a parameter with Markov chain Monte Carlo (MCMC) methods and resampling the data with bootstrap methods for time series. We take the Korean real estate price index as an example of Bayesian frequency estimation. We found a difference in the periods between the sale price index and the long-term rental price index, but the difference is not statistically significant.

Multiple imputation and synthetic data (다중대체와 재현자료 작성)

  • Kim, Joungyoun;Park, Min-Jeong
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.83-97 / 2019
  • As society develops, the dissemination of microdata has increased in response to the diverse analytical needs of users. Analysis of microdata for policy making, academic purposes, etc. is highly desirable in terms of value creation. However, providing microdata whose usefulness is guaranteed carries a risk of exposing personal information. Several methods have been considered to protect personal information while preserving the usefulness of the data. One such method is to generate and use synthetic data. This paper aims to give an understanding of synthetic data by exploring the related methodologies and precautions. To this end, we first explain multiple imputation, the Bayesian predictive model, and the Bayesian bootstrap, which are the basic foundations of synthetic data. We then link these concepts to the construction of fully and partially synthetic data. To illustrate the creation of synthetic data, we review a real longitudinal synthetic data example based on sequential regression multivariate imputation.

Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.131-146 / 2022
  • Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented with the Metropolis-Hastings algorithm. However, the algorithm suffers from slow convergence and from the difficulty of choosing an adequate proposal distribution. Therefore, we use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity. As a result, the logit model can be easily implemented by Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.

Variational Bayesian multinomial probit model with Gaussian process classification on mice protein expression level data (가우시안 과정 분류에 대한 변분 베이지안 다항 프로빗 모형: 쥐 단백질 발현 데이터에의 적용)

  • Donghyun Son;Beom Seuk Hwang
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.115-127 / 2023
  • The multinomial probit model is a popular model for multiclass classification and choice modeling. The Markov chain Monte Carlo (MCMC) method is widely used for estimating the multinomial probit model, but its computational cost is high. In contrast, variational Bayesian approximation is well known to be more computationally efficient than MCMC, because it uses subsets of samples. In this study, we describe the multinomial probit model with Gaussian process classification and how to apply variational Bayesian approximation to the model. This study also compares the results of the variational Bayesian multinomial probit model with those of naive Bayes, K-nearest neighbors and support vector machines on the UCI mice protein expression level data.

A Study for Forecasting Methods of ARMA-GARCH Model Using MCMC Approach (MCMC 방법을 이용한 ARMA-GARCH 모형에서의 예측 방법 연구)

  • Chae, Wha-Yeon;Choi, Bo-Seung;Kim, Kee-Whan;Park, You-Sung
    • The Korean Journal of Applied Statistics / v.24 no.2 / pp.293-305 / 2011
  • Volatility is one of the most important parameters in pricing financial derivatives and in measuring the risks arising from a sudden change in economic circumstances. We propose a Bayesian approach to estimate time-varying volatility under a linear model with ARMA(p, q)-GARCH(r, s) errors. This Bayesian estimate of the volatility is compared with the ML estimate. We also present the probability that a unit root exists in the GARCH model.

Bayesian Detection of Multiple Change Points in a Piecewise Linear Function (구분적 선형함수에서의 베이지안 변화점 추출)

  • Kim, Joungyoun
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.589-603 / 2014
  • When consecutive data follow different distributions depending on the time interval, change-point detection first infers where the changes occur and then makes further inferences for each sub-interval. In this paper, we investigate the Bayesian detection of multiple change points. Using reversible jump MCMC, we can explore parameter spaces of unknown dimension. In particular, we consider a model in which the signal is a piecewise linear function. For the Bayesian inference, we propose a new Bayesian structure and build our own MCMC algorithm. Through a simulation study and a real data analysis, we verified the performance of our method.

An Improved Bayesian Spam Mail Filter based on Chi-square Statistics (카이제곱 통계량을 이용한 개선된 베이지안 스팸메일 필터)

  • Kim Jin-Sang;Choe Sang-Yeol
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2005.04a / pp.403-414 / 2005
  • Most currently used spam filters are based on a Bayesian classification technique, which has serious problems such as limited precision and recall rates and false positive errors. This paper addresses these problems with a modified Bayesian classifier based on chi-square statistics. The resulting spam filter is more accurate and flexible than traditional Bayesian spam filters, and it can be personalized by supplying some parameters when the filter is learned from training data.

Semiparametric Bayesian Hierarchical Selection Models with Skewed Elliptical Distribution (왜도 타원형 분포를 이용한 준모수적 계층적 선택 모형)

  • 정윤식;장정훈
    • The Korean Journal of Applied Statistics / v.16 no.1 / pp.101-115 / 2003
  • Lately there has been much theoretical and applied interest in linear models with non-normal, heavy-tailed error distributions. Starting with Zellner's (1976) study, many authors have explored the consequences of non-normality and heavy-tailed error distributions. We consider hierarchical models, including selection models, under a skewed heavy-tailed error distribution originally proposed by Chen, Dey and Shao (1999) and Branco and Dey (2001), with a Dirichlet process prior (Ferguson, 1973), for use in meta-analysis. A general class of skewed elliptical distributions is reviewed and developed. We also consider the detailed computational scheme under the skew normal and skew t distributions using the MCMC method. Finally, we introduce an example using real data from Johnson (1993) and apply our proposed methodology.

Bayesian Analysis of Dose-Effect Relationship of Cadmium for Benchmark Dose Evaluation (카드뮴 반응용량 곡선에서의 기준용량 평가를 위한 베이지안 분석연구)

  • Lee, Minjea;Choi, Taeryon;Kim, Jeongseon;Woo, Hae Dong
    • The Korean Journal of Applied Statistics / v.26 no.3 / pp.453-470 / 2013
  • In this paper, we consider a Bayesian analysis of the dose-effect relationship of cadmium to evaluate a benchmark dose (BMD). For this purpose, two dose-response curves commonly used in toxicity studies are fitted by Bayesian methods to data collected from the scientific literature on cadmium toxicity. Specifically, Bayesian meta-analysis and hierarchical modeling are used to build an overall dose-effect relationship based on a piecewise linear model and a Hill model, accounting for inter-study heterogeneity and for inter-individual variability of dose and effect due to factors such as gender, age and ethnicity. The unknown parameters are estimated with a Markov chain Monte Carlo algorithm using the user-friendly software WinBUGS. Benchmark dose estimates are evaluated for various cut-offs and compared across the tested subpopulations defined by gender, age and ethnicity, based on these two Bayesian hierarchical models.