• Title/Summary/Keyword: Bayesian model selection

Predicting Financial Success of a Movie Using Bayesian Choice Model (베이지안 선택 모형을 이용한 영화흥행 예측)

  • Lee Gyeong-Jae;Jang U-Jin
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.05a / pp.1851-1856 / 2006
  • Movies are a typical experience good: their value is judged subjectively and their product life cycle is very short, so forecast uncertainty is high and quantitative modeling is difficult. Despite these limitations, predicting the commercial success of a movie is directly tied to revenue for every party involved, including producers, distributors, and theaters, and a variety of statistical models have been proposed. Most of these models, however, either fail to reflect effects that influence box-office performance but cannot be measured, or fail to account for heterogeneity across movies because they assume that the effect of each estimated parameter is the same for every movie. In this study, we therefore propose a Bayesian choice model that reflects the uncertainty of the variables by assigning a vague prior distribution to the estimated parameters and that accounts for heterogeneity across movies. The posterior distribution of the parameters is estimated with the Gibbs sampler, a Markov chain Monte Carlo technique. In addition to movie-specific attributes such as director, actors, and genre, we add variables that capture word-of-mouth effects, such as viewing decisions driven by word of mouth, and the effect of competing releases, which improves the accuracy of the model. We built the model on movies screened in 2005 and the first half of 2006 and compared it with an artificial neural network model. Overall predictive accuracy was similar to that of the neural network, but the Bayesian choice model was better at predicting commercially successful movies. The intensity of competition in the opening week and the number of screens in the first week were the most important variables for box-office performance, and higher pre-release expectations for a movie were associated with better box-office results. Star power, seasonality, and movie ratings were not statistically significant at the aggregate level, where heterogeneity is ignored, but in the model reflecting between-group heterogeneity they emerged as factors to consider in making a moderately successful movie.
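
The Gibbs sampling step mentioned in this abstract can be illustrated on a much simpler problem than the paper's choice model. The sketch below is not the authors' model: it assumes i.i.d. normal data with unknown mean and variance and the vague prior p(mu, sigma^2) proportional to 1/sigma^2, and it alternates draws from the two full conditionals. The function name `gibbs_normal` and the simulated data are hypothetical.

```python
import random
import statistics


def gibbs_normal(data, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for (mu, sigma^2) of i.i.d. normal data.

    Assumed vague prior (an illustration, not taken from the paper):
    p(mu, sigma^2) proportional to 1 / sigma^2.
    """
    rng = random.Random(seed)
    n = len(data)
    ybar = statistics.fmean(data)
    sigma2 = statistics.variance(data)  # starting value
    draws = []
    for it in range(n_iter):
        # mu | sigma^2, y ~ Normal(ybar, sigma^2 / n)
        mu = rng.gauss(ybar, (sigma2 / n) ** 0.5)
        # sigma^2 | mu, y ~ Inverse-Gamma(n / 2, S / 2), S = sum (y_i - mu)^2.
        # Draw the precision from Gamma(shape=n/2, scale=2/S) and invert it.
        s = sum((y - mu) ** 2 for y in data)
        precision = rng.gammavariate(n / 2, 2 / s)
        sigma2 = 1 / precision
        if it >= burn_in:
            draws.append((mu, sigma2))
    return draws


# Simulated data (hypothetical): 200 draws from Normal(5, 2^2).
data_rng = random.Random(1)
data = [data_rng.gauss(5.0, 2.0) for _ in range(200)]

draws = gibbs_normal(data)
post_mu = statistics.fmean(d[0] for d in draws)
post_sigma2 = statistics.fmean(d[1] for d in draws)
print("posterior mean of mu:", post_mu)
print("posterior mean of sigma^2:", post_sigma2)
```

With a vague prior the posterior mean of mu stays close to the sample mean, which is a quick sanity check on the sampler.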

Model selection method for categorical data with non-response (무응답을 가지고 있는 범주형 자료에 대한 모형 선택 방법)

  • Yoon, Yong-Hwa;Choi, Bo-Seung
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.627-641 / 2012
  • We consider model estimation and model selection methods for multi-way contingency table data with non-response or missing values. We also consider a hierarchical Bayesian model to handle the boundary solution problem that can arise in maximum likelihood estimation under a non-ignorable non-response model, and we address model selection to find the best model for the data. We use Bayes factors for model selection under the Bayesian approach. We apply the proposed method to a pre-election survey for the 2004 Korean National Assembly race. The results favor the non-ignorable non-response model, and the voting intention variable was the most suitable.
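
As a rough illustration of how a Bayes factor compares two models, the sketch below uses a deliberately simple binomial setting rather than the paper's contingency-table models. It assumes model M0 fixes the success probability at `theta0` and model M1 places a Beta(a, b) prior on it; both marginal likelihoods are then available in closed form. All function names and the example counts are hypothetical.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal_fixed(k, n, theta):
    """log p(data | M0), omitting the binomial coefficient; theta is fixed."""
    return k * log(theta) + (n - k) * log(1 - theta)


def log_marginal_beta(k, n, a=1.0, b=1.0):
    """log p(data | M1), omitting the same coefficient; theta ~ Beta(a, b)."""
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def bayes_factor_10(k, n, theta0=0.5, a=1.0, b=1.0):
    """Bayes factor of M1 (Beta prior) against M0 (theta fixed at theta0)."""
    return exp(log_marginal_beta(k, n, a, b) - log_marginal_fixed(k, n, theta0))


# Hypothetical data: k successes out of n trials.
print("BF10 for 70/100:", bayes_factor_10(k=70, n=100))
print("BF10 for 50/100:", bayes_factor_10(k=50, n=100))
```

A value above 1 favors M1 and a value below 1 favors M0; the binomial coefficient cancels in the ratio, so it is omitted from both marginal likelihoods.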

The effect investigation of the delirium by Bayesian network and radial graph (베이지안 네트워크와 방사형 그래프를 이용한 섬망의 효과 규명)

  • Lee, Jea-Young;Bae, Jae-Young
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.911-919 / 2011
  • In recent medical analysis, it has become more important to look for risk factors related to mental illness. If we find the risk factors and identify their relevant characteristics, the disease can be prevented in advance. Such studies can also contribute to medical development. Studies of risk factors for mental illness have mainly relied on the logistic regression model. In this paper, however, data mining techniques such as CART, C5.0, logistic regression, neural networks, and Bayesian networks were used to search for the risk factors. Among these methods, the Bayesian network was selected as the best model when applied to delirium data. Bayesian network analysis was then used to find the risk factors, and the relationships among them were identified through a radial graph.

Bayesian Inference for the Zero Inflated Negative Binomial Regression Model (제로팽창 음이항 회귀모형에 대한 베이지안 추론)

  • Shim, Jung-Suk;Lee, Dong-Hee;Jun, Byoung-Cheol
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.951-961 / 2011
  • In this paper, we propose Bayesian inference using a Markov chain Monte Carlo (MCMC) method for the zero-inflated negative binomial (ZINB) regression model. The proposed model allows a regression model for the zero-inflation probability as well as for the mean of the dependent variable. This extends the work of Jang et al. (2010) to the fully defined ZINB regression model. In addition, we apply the proposed method to a real data example and compare its efficiency with that of the zero-inflated Poisson (ZIP) model using the DIC. Since the DIC of the ZINB model is smaller than that of the ZIP model, the ZINB model performs better on zero-inflated count data with overdispersion.
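
The DIC comparison can be made concrete with the standard definition DIC = D(theta-bar) + 2 pD, where D(theta) = -2 log p(y | theta) and pD is the posterior mean deviance minus D(theta-bar). The sketch below computes it from posterior draws for a plain Poisson model, which is far simpler than the ZINB and ZIP regressions in the paper. The counts, the Gamma(0.5, 0.5) prior, and the function names are assumptions made only for illustration.

```python
import random
from math import lgamma, log
from statistics import fmean


def poisson_deviance(rate, counts):
    """D(theta) = -2 * log-likelihood of i.i.d. Poisson counts."""
    loglik = sum(y * log(rate) - rate - lgamma(y + 1) for y in counts)
    return -2.0 * loglik


def dic(samples, counts, deviance=poisson_deviance):
    """Return (DIC, pD) from posterior draws of a scalar parameter."""
    mean_dev = fmean(deviance(s, counts) for s in samples)  # posterior mean deviance
    dev_at_mean = deviance(fmean(samples), counts)          # deviance at posterior mean
    p_d = mean_dev - dev_at_mean                            # effective number of parameters
    return dev_at_mean + 2.0 * p_d, p_d


# Hypothetical count data with many zeros.
counts = [0, 0, 1, 3, 0, 2, 0, 5, 1, 0]

# Conjugate Gamma posterior under an assumed Gamma(0.5, 0.5) prior on the rate.
post_shape = 0.5 + sum(counts)
post_rate = 0.5 + len(counts)
rng = random.Random(0)
samples = [rng.gammavariate(post_shape, 1.0 / post_rate) for _ in range(5000)]

dic_value, p_d = dic(samples, counts)
print("DIC:", dic_value, "pD:", p_d)
```

When two models are fitted to the same data, the one with the smaller DIC is preferred, which is the rule the abstract applies to ZINB versus ZIP.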

Variational Bayesian multinomial probit model with Gaussian process classification on mice protein expression level data (가우시안 과정 분류에 대한 변분 베이지안 다항 프로빗 모형: 쥐 단백질 발현 데이터에의 적용)

  • Donghyun Son;Beom Seuk Hwang
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.115-127 / 2023
  • The multinomial probit model is a popular model for multiclass classification and choice modeling. Markov chain Monte Carlo (MCMC) methods are widely used to estimate it, but their computational cost is high. Variational Bayesian approximation is well known to be more computationally efficient than MCMC, because it uses subsets of samples. In this study, we describe the multinomial probit model with Gaussian process classification and show how to apply variational Bayesian approximation to it. We also compare the results of the variational Bayesian multinomial probit model with those of naive Bayes, K-nearest neighbors, and support vector machines on the UCI mice protein expression level data.

Semi-Supervised Learning by Gaussian Mixtures (정규 혼합분포를 이용한 준지도 학습)

  • Choi, Byoung-Jeong;Chae, Youn-Seok;Choi, Woo-Young;Park, Chang-Yi;Koo, Ja-Yong
    • The Korean Journal of Applied Statistics / v.21 no.5 / pp.825-833 / 2008
  • Discriminant analysis based on Gaussian mixture models, a useful tool for multi-class classification, can be extended to semi-supervised learning. We consider a model selection problem for a Gaussian mixture model in semi-supervised learning. More specifically, we adopt the Bayesian information criterion to determine the number of subclasses in the mixture model. Through simulations, we illustrate the usefulness of the criterion.
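
To show how the Bayesian information criterion can pick the number of mixture components, the sketch below fits one-dimensional Gaussian mixtures by EM and scores each with BIC = -2 log L + p log n, where a k-component mixture has p = 3k - 1 free parameters. This is an unsupervised toy version, not the semi-supervised procedure of the paper; the simulated data, the initialisation, and the function names are all assumptions.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    n = len(xs)
    xs_sorted = sorted(xs)
    # Initialise means at evenly spaced quantiles, with a shared variance.
    means = [xs_sorted[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    variances = [sum((x - centre) ** 2 for x in xs) / n] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) + 1e-300  # guard against underflow
            resp.append([pj / total for pj in p])
        # M-step: update weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid collapse
            weights[j] = nj / n
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in xs
    )


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Simulated data (hypothetical): two well-separated clusters.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(6.0, 1.0) for _ in range(150)]

bics = {k: bic(fit_gmm_1d(xs, k), 3 * k - 1, len(xs)) for k in (1, 2, 3)}
best_k = min(bics, key=bics.get)
print(bics, "selected k:", best_k)
```

The penalty term p log n is what keeps the criterion from always preferring the mixture with the most components.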

Labor Market and Business Cycles in Korea: Bayesian Estimation of a Business Cycle Model with Labor Market Frictions (노동시장과 경기변동: 노동시장 마찰을 도입한 경기변동 모형의 베이지안 추정을 중심으로)

  • Lee, Junhee
    • Economic Analysis / v.26 no.4 / pp.39-64 / 2020
  • Typical business cycle models have difficulties in explaining key macroeconomic labor market variables, such as employment and unemployment, as they usually consider labor hour choices only. In this paper, we introduce labor market search and matching frictions into a New Keynesian nominal rigidity model and estimate it by Bayesian methods to examine the dynamics of the key labor market variables and business cycles in Korea. The results show that unemployment rates are largely explained by technology shocks, which affect the labor demand side, as well as labor supply shocks. In addition, wage bargaining shocks originating from the bargaining process between firms and workers have non-negligible negative effects on output and employment growth, and careful measures need to be taken to limit their adverse effects.

Bayesian Computation for Superposition of MUSA-OKUMOTO and ERLANG(2) processes (MUSA-OKUMOTO와 ERLANG(2)의 중첩과정에 대한 베이지안 계산 연구)

  • 최기헌;김희철
    • The Korean Journal of Applied Statistics / v.11 no.2 / pp.377-387 / 1998
  • A Markov chain Monte Carlo method with data augmentation is developed to compute features of the posterior distribution. For each observed failure epoch, we introduce a latent variable that indicates which component of the superposition model it belongs to. This data augmentation approach facilitates specification of the transition measure in the Markov chain. Metropolis algorithms along with Gibbs steps are proposed to perform Bayesian inference for such models. For model determination, we explore the prequential conditional predictive ordinate (PCPO) criterion, which selects the model with the largest posterior likelihood among models using all possible subsets of the component intensity functions. To relax the monotonic intensity function assumption, we consider the superposition of the Musa-Okumoto and Erlang(2) models. A numerical example with a simulated dataset is given.

The Bayesian Analysis for Software Reliability Models Based on NHPP (비동질적 포아송과정을 사용한 소프트웨어 신뢰 성장모형에 대한 베이지안 신뢰성 분석에 관한 연구)

  • Lee, Sang-Sik;Kim, Hee-Cheul;Kim, Yong-Jae
    • The KIPS Transactions:PartD / v.10D no.5 / pp.805-812 / 2003
  • This paper presents a stochastic model for the software failure phenomenon based on a nonhomogeneous Poisson process (NHPP) and performs Bayesian inference using prior information. The failure process is analyzed to develop a suitable mean value function for the NHPP, and expressions are given for several performance measures. Parametric inference is discussed for the logarithmic Poisson, Crow, and Rayleigh models. Bayesian computation is carried out, and models are selected using the sum of squared errors. Parameters are inferred by Gibbs sampling and the Metropolis algorithm. The models are applied to real software failure data, illustrated with a numerical example on the T1 data (Musa).
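
One way to read model selection by the sum of squared errors is to compare each model's mean value function m(t) with the observed cumulative failure count at every failure time. The abstract does not give the exact definition, so the sketch below is an assumption: it uses the usual textbook forms of the logarithmic Poisson (Musa-Okumoto) and Crow (power-law) mean value functions, made-up failure times, and made-up parameter values standing in for posterior point estimates.

```python
import math


def logarithmic_poisson(t, lam0, theta):
    """Musa-Okumoto mean value function m(t) = ln(1 + lam0 * theta * t) / theta."""
    return math.log(1.0 + lam0 * theta * t) / theta


def crow(t, lam, beta):
    """Crow (power-law) mean value function m(t) = lam * t ** beta."""
    return lam * t ** beta


def sse(failure_times, mean_value):
    """Sum of squared errors between the cumulative count i and m(t_i)."""
    return sum((i - mean_value(t)) ** 2 for i, t in enumerate(failure_times, start=1))


# Made-up cumulative failure times (not the T1 data).
failure_times = [5, 12, 21, 33, 48, 67, 92, 125, 170, 235]

# Made-up parameter values standing in for posterior point estimates.
candidates = {
    "logarithmic Poisson": lambda t: logarithmic_poisson(t, lam0=0.2, theta=0.25),
    "Crow": lambda t: crow(t, lam=0.5, beta=0.55),
}

scores = {name: sse(failure_times, m) for name, m in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "selected:", best)
```

In a Bayesian workflow the parameter values would come from the posterior, for example posterior means from the Gibbs or Metropolis output, before the SSE is computed.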

The Bayesian Inference for Software Reliability Models Based on NHPP (NHPP에 기초한 소프트웨어 신뢰도 모형에 대한 베이지안 추론에 관한 연구)

  • Lee, Sang-Sik;Kim, Hui-Cheol;Song, Yeong-Jae
    • The KIPS Transactions:PartD / v.9D no.3 / pp.389-398 / 2002
  • Software reliability growth models are used in the testing stages of software development to model the error content and the time intervals between software failures. This paper presents a stochastic model for the software failure phenomenon based on a nonhomogeneous Poisson process (NHPP) and performs Bayesian inference using prior information. The failure process is analyzed to develop a suitable mean value function for the NHPP, and expressions are given for several performance measures. Actual software failure data are compared across several models with respect to the constant reflecting the quality of testing. The performance measures and parametric inference of the suggested models, which use the Rayleigh and Laplace distributions, are discussed. The suggested models are applied to real software failure data and compared with the Goel model. Point estimates and 95% credible intervals of the parameters are obtained by Gibbs sampling. Model selection uses the sum of squared errors. A numerical example using the NTDS data is given.