• Title/Summary/Keyword: Markov chain Monte Carlo sampling

A nonparametric Bayesian seemingly unrelated regression model (비모수 베이지안 겉보기 무관 회귀모형)

  • Jo, Seongil; Seok, Inhae; Choi, Taeryon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.627-641 / 2016
  • In this paper, we consider a seemingly unrelated regression (SUR) model and propose a nonparametric Bayesian approach to SUR with a Dirichlet process mixture of normals for modeling an unknown error distribution. Posterior distributions are derived for the proposed model, and posterior inference is performed via Markov chain Monte Carlo methods using the collapsed Gibbs sampler of a Dirichlet process mixture model. We present a simulation study to assess the performance of the model, and we apply the model to precipitation data over South Korea.
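
As an illustration of the collapsed Gibbs sampler for a Dirichlet process mixture of normals, here is a minimal sketch for the univariate case with a known component variance (the paper works with the multivariate SUR error distribution; the prior parameters `mu0`, `tau2`, the variance `sigma2`, and the concentration `alpha` below are illustrative assumptions, not values from the paper):

```python
import numpy as np

def dpmm_collapsed_gibbs(x, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=10.0,
                         iters=500, rng=None):
    """Collapsed (CRP) Gibbs sampler for a DP mixture of normals with
    known component variance sigma2 and a N(mu0, tau2) prior on means."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x, dtype=float)
    n = len(x)
    z = np.zeros(n, dtype=int)               # cluster labels, start in one cluster
    for _ in range(iters):
        for i in range(n):
            z[i] = -1                        # remove point i from its cluster
            labels, counts = np.unique(z[z >= 0], return_counts=True)
            logp = []
            for k, nk in zip(labels, counts):
                xs = x[z == k]
                prec = 1.0 / tau2 + nk / sigma2          # posterior precision of cluster mean
                m = (mu0 / tau2 + xs.sum() / sigma2) / prec
                v = 1.0 / prec + sigma2                  # predictive variance for x_i
                logp.append(np.log(nk) - 0.5 * np.log(v)
                            - 0.5 * (x[i] - m) ** 2 / v)
            v0 = tau2 + sigma2                           # predictive variance, new cluster
            logp.append(np.log(alpha) - 0.5 * np.log(v0)
                        - 0.5 * (x[i] - mu0) ** 2 / v0)
            logp = np.array(logp)
            p = np.exp(logp - logp.max()); p /= p.sum()
            c = rng.choice(len(p), p=p)
            z[i] = labels[c] if c < len(labels) else (z.max() + 1)
    return z
```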

A Bayesian state-space production model for Korean chub mackerel (Scomber japonicus) stock

  • Jung, Yuri; Seo, Young Il; Hyun, Saang-Yoon
    • Fisheries and Aquatic Sciences / v.24 no.4 / pp.139-152 / 2021
  • The main purpose of this study is to fit catch-per-unit-effort (CPUE) data on the Korean chub mackerel (Scomber japonicus) stock with a state-space production (SSP) model and to provide stock assessment results. We chose a surplus production model for the chub mackerel data, namely annual yield and CPUE, and added a state-space layer to the production model to account for two sources of variability: unmodelled factors (process error) and noise in the data (observation error). We implemented the model with the ADMB-RE software because it reduces the computational cost of high-dimensional integration and provides the Markov chain Monte Carlo sampling required for Bayesian approaches. To stabilize the numerical optimization, we placed prior distributions on the model parameters. Applying the SSP model to data collected from commercial fisheries from 1999 to 2017, we estimated model parameters and management reference points, as well as the uncertainties of those estimates. We also fitted various production models and compared their parameter estimates and goodness-of-fit statistics. This study presents two significant findings. First, the stock was overexploited in terms of harvest rate from 1999 to 2017. Second, the SSP model yielded the smallest goodness-of-fit statistics among the production models considered, especially for fitting CPUE data with fluctuations.
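
The state-space structure described here, production dynamics with multiplicative process and observation errors, can be sketched as a simulation; the Schaefer form and all parameter values (`r`, `K`, `q`, and the error standard deviations) are illustrative assumptions rather than the paper's estimates:

```python
import numpy as np

def simulate_ssp(catch, r=0.4, K=1000.0, q=0.001,
                 sd_proc=0.1, sd_obs=0.2, rng=None):
    """Simulate a Schaefer state-space production model:
    process:     B[t+1] = (B[t] + r*B[t]*(1 - B[t]/K) - C[t]) * exp(eps_proc)
    observation: I[t]   = q*B[t] * exp(eps_obs)   (I = CPUE index)"""
    rng = rng or np.random.default_rng(0)
    T = len(catch)
    B = np.empty(T); I = np.empty(T)
    B[0] = K                                   # assume an unfished stock at the start
    for t in range(T):
        I[t] = q * B[t] * np.exp(rng.normal(0, sd_obs))
        if t + 1 < T:
            surplus = r * B[t] * (1 - B[t] / K)
            B[t + 1] = max(B[t] + surplus - catch[t], 1e-6) \
                       * np.exp(rng.normal(0, sd_proc))
    return B, I
```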

The Impact of Foreign Ownership on Capital Structure: Empirical Evidence from Listed Firms in Vietnam

  • NGUYEN, Van Diep; DUONG, Quynh Nga
    • The Journal of Asian Finance, Economics and Business / v.9 no.2 / pp.363-370 / 2022
  • The study aims to probe the impact of foreign ownership on the capital structure of Vietnamese listed firms. This study employs panel data on 288 non-financial firms listed on the Ho Chi Minh City stock exchange (HOSE) and Ha Noi stock exchange (HNX) in 2015-2019. We applied a Bayesian linear regression method to provide probabilistic explanations of the model uncertainty and of the effect of foreign ownership on the capital structure of non-financial listed enterprises in Vietnam. The results of the Bayesian linear regression, estimated via the Markov chain Monte Carlo (MCMC) technique with a Gibbs sampler, suggest that foreign ownership has a substantial adverse effect on firms' capital structure. Our findings also indicate that a firm's size, age, and growth opportunities all have a strong positive effect on its debt ratio, while profitability, tangible assets, and liquidity have strong negative effects. Meanwhile, dividends and inflation have a small negative impact on the debt ratio. This research has ramifications for business managers: a firm can strengthen its financial resources by developing a sound capital structure and by considering foreign investment as a source of funding.
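
As a sketch of Bayesian linear regression estimated by MCMC with a Gibbs sampler, the following uses a standard conjugate setup with a normal prior on the coefficients and an inverse-gamma prior on the error variance; the prior parameters are illustrative assumptions, and the paper's exact specification may differ:

```python
import numpy as np

def gibbs_linreg(X, y, tau2=100.0, a0=0.01, b0=0.01, iters=5000, rng=None):
    """Gibbs sampler for y = X @ beta + e, e ~ N(0, sigma2), with
    beta ~ N(0, tau2*I) and sigma2 ~ InvGamma(a0, b0) priors."""
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    beta, sigma2 = np.zeros(p), 1.0
    draws = np.empty((iters, p + 1))
    for it in range(iters):
        # beta | sigma2, y  ~  N(m, V)
        V = np.linalg.inv(XtX / sigma2 + np.eye(p) / tau2)
        m = V @ (Xty / sigma2)
        beta = rng.multivariate_normal(m, V)
        # sigma2 | beta, y  ~  InvGamma(a0 + n/2, b0 + RSS/2)
        rss = np.sum((y - X @ beta) ** 2)
        sigma2 = 1.0 / rng.gamma(a0 + n / 2, 1.0 / (b0 + rss / 2))
        draws[it] = np.append(beta, sigma2)
    return draws
```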

Model-independent Constraints on Type Ia Supernova Light-curve Hyperparameters and Reconstructions of the Expansion History of the Universe

  • Koo, Hanwool; Shafieloo, Arman; Keeley, Ryan E.; L'Huillier, Benjamin
    • The Bulletin of The Korean Astronomical Society / v.45 no.1 / pp.48.4-49 / 2020
  • We reconstruct the expansion history of the universe using type Ia supernovae (SN Ia) in a manner independent of any cosmological model assumptions. To do so, we implement a nonparametric iterative smoothing method on the Joint Light-curve Analysis (JLA) data while exploring the SN Ia light-curve hyperparameter space by Markov chain Monte Carlo (MCMC) sampling. We test how the posteriors of these hyperparameters depend on cosmology, i.e., whether using different dark energy models or reconstructions shifts these posteriors. Our constraints on the SN Ia light-curve hyperparameters from our model-independent analysis are very consistent with the constraints from using different parameterizations of the equation of state of dark energy, namely the flat ΛCDM cosmology, the Chevallier-Polarski-Linder model, and the Phenomenologically Emergent Dark Energy (PEDE) model. This implies that the distance moduli constructed from the JLA data are mostly independent of the cosmological models. We also studied the possibility that the light-curve parameters evolve with redshift, and our results are consistent with no evolution. The reconstructed expansion history of the universe and dark energy properties also appear to be in good agreement with the expectations of the standard ΛCDM model. However, our results also indicate that the data still allow considerable flexibility in the expansion history of the universe. This work is published in ApJ.
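
MCMC exploration of a hyperparameter space like the one above can be illustrated with a generic random-walk Metropolis-Hastings sampler over a user-supplied log-posterior; this is a textbook sketch, not the authors' pipeline, and the step size is an assumption:

```python
import numpy as np

def metropolis(logpost, x0, step=0.1, iters=10000, rng=None):
    """Random-walk Metropolis-Hastings: propose x' = x + step*N(0, I),
    accept with probability min(1, exp(logpost(x') - logpost(x)))."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    lp = logpost(x)
    chain = np.empty((iters, x.size))
    for t in range(iters):
        prop = x + step * rng.normal(size=x.size)
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[t] = x
    return chain
```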

Bayesian Approaches to Zero Inflated Poisson Model (영 과잉 포아송 모형에 대한 베이지안 방법 연구)

  • Lee, Ji-Ho; Choi, Tae-Ryon; Wo, Yoon-Sung
    • The Korean Journal of Applied Statistics / v.24 no.4 / pp.677-693 / 2011
  • In this paper, we consider Bayesian approaches to the zero inflated Poisson model, one of the popular models for analyzing zero inflated count data. To generate posterior samples, we use a Markov chain Monte Carlo method based on a Gibbs sampler and an exact sampling method based on the Inverse Bayes Formula (IBF). The posterior sampling algorithms of the two methods are compared, and convergence checking for the Gibbs sampler is discussed, in particular using posterior samples from IBF sampling. Based on these sampling methods, a real data analysis is performed on the Trajan data (Marin et al., 1993), and our results are compared with existing analyses of the Trajan data. We also discuss model selection between the Poisson model and the zero inflated Poisson model for the Trajan data using various criteria. In addition, we complement the previous work of Rodrigues (2003) via further data analysis using a hierarchical Bayesian model.
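
A minimal sketch of a Gibbs sampler for the zero inflated Poisson model via data augmentation, assuming conjugate Beta and Gamma priors (the latent indicator z_i marks a structural zero; the prior parameters are illustrative, and the paper's IBF sampler is not reproduced here):

```python
import numpy as np

def zip_gibbs(y, a=1.0, b=1.0, c=1.0, d=1.0, iters=5000, rng=None):
    """Gibbs sampler for a zero-inflated Poisson model:
    y_i = 0 with probability p (structural zero), else y_i ~ Poisson(lam).
    Priors: p ~ Beta(a, b), lam ~ Gamma(c, rate=d)."""
    rng = rng or np.random.default_rng(0)
    y = np.asarray(y)
    n = len(y)
    zero = (y == 0)
    p, lam = 0.5, max(y.mean(), 0.1)
    draws = np.empty((iters, 2))
    for it in range(iters):
        # z_i = 1 indicates a structural zero (only possible when y_i = 0)
        prob = p / (p + (1 - p) * np.exp(-lam))
        z = np.zeros(n)
        z[zero] = rng.uniform(size=zero.sum()) < prob
        # conjugate updates given the augmented indicators
        p = rng.beta(a + z.sum(), b + n - z.sum())
        lam = rng.gamma(c + y[z == 0].sum(), 1.0 / (d + (z == 0).sum()))
        draws[it] = p, lam
    return draws
```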

Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee; Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.131-146 / 2022
  • Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented with the Metropolis-Hastings algorithm. However, this algorithm converges slowly, and it is difficult to ensure the adequacy of the proposal distribution. We therefore use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. The method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity; as a result, the logit model can be implemented easily by Gibbs sampling. We applied the method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with the Metropolis-Hastings algorithm. In addition, we showed that the logit model with auxiliary mixture sampling has classification performance comparable to that of machine learning models.
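
The Metropolis-Hastings baseline that the paper compares against can be sketched as follows; this is a generic random-walk sampler with an assumed N(0, tau2*I) prior and step size, not the auxiliary mixture sampler itself, which is more involved:

```python
import numpy as np

def logit_mh(X, y, tau2=100.0, step=0.05, iters=20000, rng=None):
    """Random-walk Metropolis-Hastings for Bayesian logistic regression
    with a N(0, tau2*I) prior on the coefficient vector."""
    rng = rng or np.random.default_rng(0)
    n, p = X.shape

    def logpost(beta):
        eta = X @ beta
        # logit log-likelihood (log1p form via logaddexp) plus Gaussian log-prior
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
        return loglik - 0.5 * np.sum(beta ** 2) / tau2

    beta = np.zeros(p)
    lp = logpost(beta)
    chain = np.empty((iters, p))
    for t in range(iters):
        prop = beta + step * rng.normal(size=p)
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            beta, lp = prop, lp_prop
        chain[t] = beta
    return chain
```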

Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

  • Jeong, Young-Seob; Jin, Sou-Young; Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.81-98 / 2013
  • Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Due to the intractable likelihood of these models, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. While in traditional approximation algorithms each random variable is sampled, or terminated after a single predefined burn-in period, our new method is based on the observation that the random variable nodes in a topic model all have different periods of convergence. During the iterative approximation process, the proposed method allows each random variable node to be terminated, or deactivated, once it has converged. Therefore, compared to traditional approximation schemes, in which every node is usually deactivated concurrently, the proposed method achieves inference efficiency in terms of time and memory. We do not propose a new approximation algorithm but a new process applicable to existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method, and we discuss the tradeoff between the efficiency of the approximation process and parameter consistency.
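
A minimal sketch of the deactivation idea on top of collapsed Gibbs sampling for LDA, at the document level: a document stops being resampled once its topic proportions stabilize. The convergence test (maximum change against `tol`) is an illustrative stand-in for the paper's node-level criterion:

```python
import numpy as np

def lda_gibbs_deactivation(docs, V, K, alpha=0.1, beta=0.01,
                           iters=200, tol=1e-4, rng=None):
    """Collapsed Gibbs sampling for LDA in which each document is
    deactivated (no longer resampled) once its topic proportions
    stop changing between sweeps. docs is a list of word-id lists."""
    rng = rng or np.random.default_rng(0)
    D = len(docs)
    z = [rng.integers(K, size=len(doc)) for doc in docs]   # topic assignments
    ndk = np.zeros((D, K)); nkv = np.zeros((K, V)); nk = np.zeros(K)
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]; ndk[d, k] += 1; nkv[k, w] += 1; nk[k] += 1
    active = np.ones(D, dtype=bool)
    theta = (ndk + alpha) / (ndk + alpha).sum(1, keepdims=True)
    for it in range(iters):
        for d in np.flatnonzero(active):
            for i, w in enumerate(docs[d]):
                k = z[d][i]
                ndk[d, k] -= 1; nkv[k, w] -= 1; nk[k] -= 1
                p = (ndk[d] + alpha) * (nkv[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkv[k, w] += 1; nk[k] += 1
        new_theta = (ndk + alpha) / (ndk + alpha).sum(1, keepdims=True)
        # deactivate documents whose topic proportions have converged
        active &= np.abs(new_theta - theta).max(1) > tol
        theta = new_theta
        if not active.any():
            break
    phi = (nkv + beta) / (nk[:, None] + V * beta)
    return theta, phi
```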

An Application of Dirichlet Mixture Model for Failure Time Density Estimation to Components of Naval Combat System (디리슈레 혼합모형을 이용한 함정 전투체계 부품의 고장시간 분포 추정)

  • Lee, Jinwhan; Kim, Jung Hun; Jung, BongJoo; Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering / v.42 no.4 / pp.194-202 / 2019
  • Reliability analysis of components frequently starts with the data that the manufacturer provides. If enough failure data are collected from field operations, the reliability should be recomputed and updated on the basis of the field failure data. However, when the failure time record for a component contains only a few observations, all statistical methodologies are limited. In this case, where failure records for multiple identical components are available, a valid alternative is to combine the data from each component into one data set with a sufficient sample size and to utilize the information in the censored data. The ROK Navy has been operating multiple Patrol Killer Guided missiles (PKGs) for several years. The Korea Multi-Function Control Console (KMFCC) is one of the key components in the PKG combat system. The maintenance record for the KMFCC contains fewer than ten failure observations and a censored datum. This paper proposes a Bayesian approach with a Dirichlet mixture model to estimate the failure time density for the KMFCC. Trend tests for each component record indicated that the null hypothesis that failure occurrence is a renewal process is not rejected. Since the KMFCCs have been functioning under different operating environments, the failure time distribution may be a composition of a number of unknown distributions, i.e., a mixture distribution, rather than a single distribution. The Dirichlet mixture model was coded as a probabilistic program in Python using PyMC3, and the Markov chain Monte Carlo (MCMC) sampling techniques in PyMC3 were used to estimate the posterior distributions of the model parameters. The simulation results revealed that the mixture models provide superior fits to the combined data set over single-distribution models.
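
Since the abstract names PyMC3, here is a minimal PyMC3 sketch of a Dirichlet-weighted mixture for failure times; the number of components `K`, the Weibull components, the priors, and the data below are illustrative assumptions, and the paper's handling of the censored observation is not reproduced:

```python
import numpy as np
import pymc3 as pm

t = np.array([120., 340., 560., 610., 880., 950., 1300.])  # hypothetical failure times
K = 3                                                      # assumed number of components

with pm.Model() as model:
    w = pm.Dirichlet("w", a=np.ones(K))                    # mixture weights
    shape = pm.HalfNormal("shape", sigma=5.0, shape=K)     # Weibull shape per component
    scale = pm.HalfNormal("scale", sigma=2000.0, shape=K)  # Weibull scale per component
    comps = pm.Weibull.dist(alpha=shape, beta=scale, shape=K)
    obs = pm.Mixture("obs", w=w, comp_dists=comps, observed=t)
    trace = pm.sample(2000, tune=1000, chains=2)           # MCMC sampling
```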

The NHPP Bayesian Software Reliability Model Using Latent Variables (잠재변수를 이용한 NHPP 베이지안 소프트웨어 신뢰성 모형에 관한 연구)

  • Kim, Hee-Cheul; Shin, Hyun-Cheul
    • Convergence Security Journal / v.6 no.3 / pp.117-126 / 2006
  • Bayesian inference and model selection methods for software reliability growth models are studied. Software reliability growth models are used in the testing stages of software development to model the error content and the time intervals between software failures. In this paper, we avoid multiple integration by using Gibbs sampling, a kind of Markov chain Monte Carlo method, to compute the posterior distribution. Bayesian inference for general order statistics models in software reliability with diffuse prior information and a model selection method are studied. For model determination and selection, we explored goodness of fit (the error sum of squares) and trend tests. The methodology developed in this paper is exemplified with a software reliability data set randomly generated from a Weibull distribution (shape 2, scale 5) with the Minitab (version 14) statistical package.
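
The abstract does not spell out the sampler, so the following is a generic Metropolis-within-Gibbs sketch for one common NHPP software reliability model, the Goel-Okumoto form with mean value function m(t) = a(1 - e^(-bt)); the priors and tuning constants are assumptions:

```python
import numpy as np

def nhpp_gibbs(t, T, iters=5000, a0=1.0, c0=0.001, b0=1.0, d0=0.001,
               step=0.2, rng=None):
    """Metropolis-within-Gibbs for a Goel-Okumoto NHPP with intensity
    lambda(u) = a*b*exp(-b*u), failure times t in (0, T].
    a | b is conjugate Gamma; b gets a random-walk step on log b."""
    rng = rng or np.random.default_rng(0)
    t = np.asarray(t, dtype=float)
    n, s = len(t), t.sum()
    a, b = float(n), 1.0 / t.mean()
    draws = np.empty((iters, 2))

    def logpost_b(bb):
        # NHPP likelihood terms involving b, plus a Gamma(b0, rate=d0) prior
        return (n * np.log(bb) - bb * s - a * (1.0 - np.exp(-bb * T))
                + (b0 - 1.0) * np.log(bb) - d0 * bb)

    for it in range(iters):
        # a | b, data ~ Gamma(a0 + n, rate = c0 + 1 - exp(-b*T))
        a = rng.gamma(a0 + n, 1.0 / (c0 + 1.0 - np.exp(-b * T)))
        # Metropolis step for b on the log scale (with log-Jacobian term)
        prop = b * np.exp(step * rng.normal())
        if np.log(rng.uniform()) < (logpost_b(prop) - logpost_b(b)
                                    + np.log(prop) - np.log(b)):
            b = prop
        draws[it] = a, b
    return draws
```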

A Study on derivation of drought severity-duration-frequency curve through a non-stationary frequency analysis (비정상성 가뭄빈도 해석 기법에 따른 가뭄 심도-지속기간-재현기간 곡선 유도에 관한 연구)

  • Jeong, Minsu; Park, Seo-Yeon; Jang, Ho-Won; Lee, Joo-Heon
    • Journal of Korea Water Resources Association / v.53 no.2 / pp.107-119 / 2020
  • This study analyzed past drought characteristics based on observed rainfall data and performed a long-term outlook of future extreme droughts using the Representative Concentration Pathway 8.5 (RCP 8.5) climate change scenario. The Standardized Precipitation Index (SPI), a meteorological drought index, was applied with durations of 1, 3, 6, 9, and 12 months for quantitative drought analysis. A single long-term time series was constructed by combining daily rainfall observation data with the RCP scenario, and the constructed data were used as SPI inputs for each duration. For the analysis of meteorological drought observed in Korea over a relatively long term since 1954, 12 rainfall stations were selected, and 10 general circulation models (GCMs) were applied at the same locations. To analyze drought characteristics under climate change, trend analysis and clustering were performed. For non-stationary frequency analysis using a sampling technique, we adopted DEMC, which combines Bayesian-based differential evolution (DE) and Markov chain Monte Carlo (MCMC); a minimal sketch of this sampler follows this entry. The non-stationary drought frequency analysis was used to derive Severity-Duration-Frequency (SDF) curves for the 12 locations. A quantitative outlook of future droughts was carried out by deriving SDF curves from long-term hydrologic data under the assumption of non-stationarity and by quantitatively identifying potential drought risks. Cluster analysis of the spatial characteristics indicated a high risk of future drought in Jeonju, Gwangju, Yeosun, Mokpo, and Chupyeongryeong (Zones 1-2, 2, and 3-2), but not in Jeju. The derived SDF curves could be utilized efficiently in future drought management policies.
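
A minimal sketch of a DEMC-style sampler (differential evolution Markov chain, in the spirit of ter Braak's DE-MC, which the "DE + MCMC" combination above resembles); the jump scale 2.38/sqrt(2d) is the standard DE-MC default, and the rest is an illustrative assumption rather than the authors' implementation:

```python
import numpy as np

def demc(logpost, init, n_iter=5000, gamma=None, eps=1e-6, rng=None):
    """Differential evolution Markov chain (DE-MC): each chain proposes
    x' = x_i + gamma*(x_a - x_b) + small noise, using two other chains
    a, b picked at random, and accepts by the Metropolis rule."""
    rng = rng or np.random.default_rng(0)
    X = np.array(init, dtype=float)          # (n_chains, d) initial population
    n_chains, d = X.shape
    if gamma is None:
        gamma = 2.38 / np.sqrt(2 * d)        # standard DE-MC jump scale
    lp = np.array([logpost(x) for x in X])
    out = np.empty((n_iter, n_chains, d))
    for t in range(n_iter):
        for i in range(n_chains):
            others = [j for j in range(n_chains) if j != i]
            a, b = rng.choice(others, size=2, replace=False)
            prop = X[i] + gamma * (X[a] - X[b]) + eps * rng.normal(size=d)
            lp_prop = logpost(prop)
            if np.log(rng.uniform()) < lp_prop - lp[i]:
                X[i], lp[i] = prop, lp_prop
        out[t] = X
    return out
```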