• Title/Summary/Keyword: Gibbs Sampling method

Analysis on Topic Trends and Topic Modeling of KSHSM Journal Papers using Text Mining (텍스트마이닝을 활용한 보건의료산업학회지의 토픽 모델링 및 토픽트렌드 분석)

  • Cho, Kyoung-Won;Bae, Sung-Kwon;Woo, Young-Woon
    • The Korean Journal of Health Service Management / v.11 no.4 / pp.213-224 / 2017
  • Objectives: The purpose of this study was to analyze representative topics and topic trends of papers in the Korean Society and Health Service Management (KSHSM) Journal. Methods: We collected the English abstracts and keywords of 516 papers published in the KSHSM Journal from 2007 to 2017. We used Python web scraping programs to collect the papers from the Korea Citation Index website, and RStudio for topic analysis based on the latent Dirichlet allocation algorithm. Results: Perplexity analysis indicated 9 as the best number of topics, and the resulting 9 topics for all the papers were extracted using the Gibbs sampling method. We refined the 9 topics to 5 by considering the meaning of each topic and analyzing the intertopic distance map. In the topic trend analysis from 2007 to 2017, 'Health Management' and 'Hospital Service' were the two representative topics; 'Hospital Service' was the prevalent topic until 2011, but the ratios of the two topics became similar from 2012. Conclusions: We found that 5 was the best number of topics and that the topic trends reflected the main issues of the KSHSM Journal, such as the name revision of the society in 2012.
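The abstract above names latent Dirichlet allocation fitted by Gibbs sampling but gives no implementation details. As a rough illustration of the technique only, here is a minimal collapsed Gibbs sampler for LDA in pure Python. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy corpus are all assumptions made for this sketch; it is not the authors' RStudio workflow.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids in
    range(vocab_size). Returns (theta, phi): per-document topic
    proportions and per-topic word distributions.
    """
    rng = random.Random(seed)
    topics = list(range(n_topics))
    ndk = [[0] * n_topics for _ in docs]                # document-topic counts
    nkw = [[0] * vocab_size for _ in range(n_topics)]   # topic-word counts
    nk = [0] * n_topics                                 # tokens assigned to each topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        zd = [rng.randrange(n_topics) for _ in doc]
        z.append(zd)
        for w, k in zip(doc, zd):
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Full conditional of this token's topic given all the others.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vocab_size * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                # Record the new assignment.
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in ndk[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (nk[k] + vocab_size * beta) for c in nkw[k]]
        for k in topics
    ]
    return theta, phi


# Toy corpus (hypothetical): word ids 0-1 and 2-3 tend to co-occur.
toy_docs = [[0, 1, 0, 1, 0], [2, 3, 3, 2, 2], [0, 1, 1, 0]]
theta, phi = lda_gibbs(toy_docs, n_topics=2, vocab_size=4, n_iter=50)
```

In practice the number of topics would be chosen by refitting for several values and comparing a held-out score such as perplexity, as the abstract describes.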

An analysis of indoor environment research trends in Korea using topic modeling : Case study on abstracts from the journal of the Korean society for indoor environment (토픽모델링을 활용한 실내환경 분야 연구동향 파악 : 실내환경학회지 초록 사례연구)

  • Jeon, Hyung Jin;Kim, Do Youn;Han, Kook Jin;Kim, Dong Woo;Son, Seung Woo;Lee, Cheol Min
    • Journal of odor and indoor environment / v.17 no.4 / pp.322-329 / 2018
  • The objective of this study is to identify research trends in the field of indoor environment in Korea. We collected 419 papers published in the Journal of the Korean Society for indoor environment between 2004 and 2018, and attempted to produce datasets using a topic modeling technique, Latent Dirichlet Allocation (LDA). The topic modeling showed that 8 topics ("VOCs investigation", "Subway environment", "Building thermal environment", "School health", "Building particulate matter", "Asbestos risk", "Radon risk", "Air cleaner and treatment") could be extracted using the Gibbs sampling method. In terms of topic trends, investigation of volatile organic compounds, subway environment, school health, and building particulate matter showed a decreasing tendency, while building thermal environment, asbestos risk, radon risk, air cleaners, and air treatment showed an increasing tendency. The results of this topic modeling could help us understand current trends related to the indoor environment and provide valuable information for developing future research and policy frameworks.

A Nonstationary Frequency Analysis of Extreme Wind Speed in Jeju using Bayesian Approach (베이지안 기법을 이용한 제주지역 극치풍속의 비정상성 빈도해석)

  • Kim, Kyoungmin;Kwon, Hyun-Han;Kwon, Soon-Duck
    • KSCE Journal of Civil and Environmental Engineering Research / v.39 no.6 / pp.667-673 / 2019
  • Global warming may accelerate climate change and may increase disasters caused by strong winds. This research studied a method for nonstationary frequency analysis that considers a linear trend over time. The Bayesian method was used to estimate the posterior distribution of the parameters of the extreme value distribution of the annual maximum wind speed at Jeju Airport. The nonstationary frequency analysis was performed using Markov chain Monte Carlo simulation with Gibbs sampling. The wind speeds estimated by the nonstationary frequency analysis were larger than those estimated by the stationary analysis. The conventional frequency analysis procedure, which assumes stationarity, is likely to underestimate the future design wind speed in regions where a statistically significant trend exists.
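The abstract above does not state which extreme value distribution was used or which parameter carries the linear trend. Purely as an illustration, assume a generalized extreme value (GEV) distribution whose location parameter varies linearly with time $t$. The symbols $\mu_0$, $\mu_1$, $\sigma$, and $\xi$ are assumptions of this sketch and are not taken from the paper. Under that assumption, the $T$-year design wind speed in year $t$ could be written as:

```latex
\begin{aligned}
\mu(t) &= \mu_0 + \mu_1 t, \\
x_T(t) &= \mu(t) + \frac{\sigma}{\xi}
          \left[ \left( -\ln\!\left( 1 - \tfrac{1}{T} \right) \right)^{-\xi} - 1 \right],
          \qquad \xi \neq 0 .
\end{aligned}
```

With $\mu_1 > 0$, the return level $x_T(t)$ grows over time, whereas a stationary fit ($\mu_1 = 0$) holds it constant. This is consistent with the abstract's remark that assuming stationarity can underestimate future design wind speeds where a significant trend exists.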

Bayesian analysis of latent factor regression model (내재된 인자회귀모형의 베이지안 분석법)

  • Kyung, Minjung
    • The Korean Journal of Applied Statistics / v.33 no.4 / pp.365-377 / 2020
  • We discuss latent factor regression, which constructs a common structure inherent among explanatory variables to resolve multicollinearity and uses the resulting factors as regressors in a linear model of a response variable. Bayesian estimation with a LASSO prior with a large penalty parameter is used to construct a significant factor loading matrix of intrinsic interest among infinite latent structures. The estimated factor loading matrix, together with the other estimated parameters, can be inversely transformed into linear parameters of each explanatory variable and used as a prediction model for new observations. We apply the proposed method to the Product Service Management data of HBAT and observe that, for a fixed number of factors, it constructs the same factors as general common factor analysis. The calculated MSE of the predicted values of the Bayesian latent factor regression model is also smaller than that of the common factor regression model.

A Comparison of Bayesian and Maximum Likelihood Estimations in a SUR Tobit Regression Model (SUR 토빗회귀모형에서 베이지안 추정과 최대가능도 추정의 비교)

  • Lee, Seung-Chun;Choi, Byongsu
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.991-1002 / 2014
  • Both Bayesian and maximum likelihood methods are efficient for estimating the regression coefficients of various Tobit regression models (see, e.g., Chib, 1992; Greene, 1990; Lee and Choi, 2013); however, some researchers have recognized that the maximum likelihood method tends to underestimate the disturbance variance, which has implications for the estimation of marginal effects and the asymptotic standard errors of estimates. The underestimation by the maximum likelihood estimate in a seemingly unrelated Tobit regression model is examined. A Bayesian method based on an objective noninformative prior is shown to provide proper estimates of the disturbance variance as well as the other regression parameters.

A Study on Bayesian Approach of Software Stochastic Reliability Superposition Model using General Order Statistics (일반 순서 통계량을 이용한 소프트웨어 신뢰확률 중첩모형에 관한 베이지안 접근에 관한 연구)

  • Lee, Byeong-Su;Kim, Hui-Cheol;Baek, Su-Gi;Jeong, Gwan-Hui;Yun, Ju-Yong
    • The Transactions of the Korea Information Processing Society / v.6 no.8 / pp.2060-2071 / 1999
  • A complicated software failure system is defined as the superposition of failure points from several component point processes. Because the likelihood function is difficult to compute, we consider a Gibbs sampler, an iterative sampling-based method. For each observed failure epoch, we introduce latent variables that indicate which component of the superposition model it belongs to. For model selection, we explore the posterior Bayesian criterion and the sum of relative errors to compare the simple pattern with the superposition model. A numerical example is given using a simulated NHPP data set generated by the thinning method proposed by Lewis and Shedler[25]; we consider the Goel-Okumoto model and the Weibull model with GOS and study inference on the parameters. Under the posterior Bayesian criterion and the sum of relative errors, as we would expect, the superposition model is the best model under diffuse priors.

Estimation of Genetic Parameters for Calving Ease by Heifers and Cows Using Multi-trait Threshold Animal Models with Bayesian Approach

  • Lee, D.H.
    • Asian-Australasian Journal of Animal Sciences / v.15 no.8 / pp.1085-1090 / 2002
  • Genetic parameters for birth weights (BWT), calving ease scores observed from calves born by heifers (CEH), and calving ease scores observed from calves born by cows (CEC) were estimated using Bayesian methodology with Gibbs sampling in different threshold animal models. Data consisted of 77,458 records for calving ease scores and birth weights in Gelbvieh cattle. Gibbs samplers were used to obtain the parameters of interest for the categorical traits in two univariate threshold animal models, a bivariate threshold animal model, and a three-trait linear-threshold animal model. Samples of heritabilities and genetic correlations were calculated from the posterior means of dispersion parameters. In a univariate threshold animal model with CEH (model 1), the posterior means of heritabilities for calving ease were 0.35 for direct genetic effects and 0.18 for maternal genetic effects. In the other univariate threshold model with CEC (model 2), the posterior means of heritabilities of CEC were 0.28 for direct genetic effects and 0.18 for maternal genetic effects. In a bivariate threshold model with CEH and CEC (model 3), heritability estimates were similar to those in the univariate threshold models. In this model, the genetic correlation between heifer calving ease and cow calving ease was 0.89 and 0.87 for direct genetic effects and maternal genetic effects, respectively. In a three-trait animal model, which contained two categorical traits (CEH and CEC) and one continuous trait (BWT) (model 4), heritability estimates of CEH and CEC for direct (maternal) genetic effects were 0.40 (0.23) and 0.23 (0.13), respectively. In this model, genetic correlation estimates between CEH and CEC were 0.89 and 0.66 for direct genetic effects and maternal effects, respectively. These estimates were greater than the estimates between BWT and CEH (0.82 and 0.34) or BWT and CEC (0.85 and 0.26). This result indicates that CEH and CEC are more highly correlated with each other than calving ease is with birth weight. Genetic correlation estimates between direct genetic effects and maternal effects were -0.29, -0.31 and 0.15 for BWT, CEH and CEC, respectively. The correlation for permanent environmental effects between BWT and CEC was -0.83 in model 4. This study can provide genetic evaluation for calving ease jointly with other continuous traits, under the assumption that calving ease at first calving is the same trait as calving ease at later parities. Further research on the reliability of dispersion parameters is needed, even though including more correlated traits in the model could yield higher reliability, especially in threshold models, where categorical traits carry little information.

Estimation of Environmental Effect and Genetic Parameter on Reproduction Traits for On-farm Test Records (농장검정돈의 번식형질에 미치는 환경효과 및 유전모수의 추정)

  • Jung, D.J.;Kim, B.W.;Roh, S.H.;Kim, H.S.;Moon, W.K.;Kim, H.Y.;Jang, H.G.;Choi, L.S.;Jeon, J.T.;Lee, J.G.
    • Journal of Animal Science and Technology / v.50 no.1 / pp.33-44 / 2008
  • The purpose of this study was to estimate the genetic parameters and trends of Landrace and Yorkshire pigs that were raised on private farms from 1999 to 2005 and tested for reproductive performance by the Korea Animal Improvement Association. Prior to analysis, records without pedigree or with a total number born outside ±3 standard deviations were excluded. The effects of breed and environmental factors were estimated with the least squares method (Harvey, 1979), and breeding values and genetic parameters were estimated on first-litter data only with GIBBSF90 (Misztal, 2001), which implements the Gibbs sampling method based on Bayesian inference by Gianola and Fernando (1986), Jensen (1994) and others. Gibbs sampling was run for 50,000 iterations for each parameter, and the first 5,000 samples were treated as the burn-in period and excluded from post hoc analysis. The breed, farrowing year, farrowing season and parity effects were statistically significant (p<0.01) for the total number born and the total number of accidents, and likewise statistically significant (p<0.01) for the number born alive. No particular trend was observed in the genetic and phenotypic improvement of the total number born and number born alive before 2001, when the piglet registration system started, but since 2001 the total number born and number born alive have tended to increase and the total number of accidents has tended to decrease. The somewhat high heritability estimates in our study seem attributable to the use of first-parity records with poor farrowing performance in the analyses and to the impossibility of obtaining accurate reproductive performance because individual farms lacked record-keeping criteria.
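The burn-in scheme described above (50,000 Gibbs iterations, with the first 5,000 discarded) can be illustrated on a toy target. The sketch below is not the GIBBSF90 animal model. It assumes a standard bivariate normal with correlation `rho`, chosen only because both full conditionals are known in closed form. The function name and the deliberately poor starting point are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_iter=50_000, burn_in=5_000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each sweep draws x | y ~ N(rho * y, 1 - rho^2), then
    y | x ~ N(rho * x, 1 - rho^2). Sweeps before burn_in are discarded.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x, y = 10.0, -10.0  # poor starting values, far from the target's mass
    kept = []
    for it in range(n_iter):
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if it >= burn_in:  # keep only post-burn-in draws
            kept.append((x, y))
    return kept


samples = gibbs_bivariate_normal(rho=0.8)
n = len(samples)
mean_x = sum(x for x, _ in samples) / n
mean_y = sum(y for _, y in samples) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
corr = cov_xy / math.sqrt(var_x * var_y)  # should be close to rho
```

Discarding the early sweeps removes draws that still reflect the arbitrary starting point, so posterior summaries are computed only from the 45,000 retained samples.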

Genetic correlations between behavioural responses and performance traits in laying hens

  • Rozempolska-Rucinska, Iwona;Zieba, Grzegorz;Kibala, Lucyna;Prochniak, Tomasz;Lukaszewicz, Marek
    • Asian-Australasian Journal of Animal Sciences / v.30 no.12 / pp.1674-1678 / 2017
  • Objective: The aim of the study was to evaluate genetic correlations between the behavioural profile and performance in laying hens as an indirect answer to the question whether the observed behavioural responses are associated with increased levels of stress in these birds. Methods: The assessment of birds' temperament was carried out using the novel objects test. The behavioural test was conducted in two successive generations comprising 9,483 Rhode Island White (RIW) birds (approx. 4,700 individuals per generation) and 4,326 Rhode Island Red (RIR) birds (approx. 2,100 individuals per generation). Based on the recorded responses, the birds were divided into two groups: a fearful profile (1,418 RIW hens and 580 RIR hens) and a brave/curious profile (8,065 RIW hens and 3,746 RIR hens). The birds were subjected to standard assessment of their performance traits, including SM, age at sexual maturity; ST, shell thickness; SG, egg specific gravity; EW, mean egg weight; IP, initial egg production; and HC, number of hatched chicks. The pedigree was three generations deep (including two behaviour-recorded generations). Estimation of the (co)variance components was performed with the Gibbs sampling method, which accounts for the discrete character of the behavioural profile denotation. Results: The analyses revealed negative correlations between the performance traits of the laying hens and the behavioural profile defined as fearful. In the group of fearful RIW birds, delayed sexual maturation (0.22) as well as a decrease in the initial egg production (-0.30), egg weight (-0.54), egg specific gravity (-0.331), shell thickness (-0.11), and the number of hatched chicks (-0.24) could be expected. These correlations were less pronounced in the RIR breed, in which the fearful birds exhibited a decline in hatchability (-0.37), egg specific gravity (-0.11), and the number of hatched chicks (-0.18).
There were no correlations in the case of the other traits or they were positive but exhibited a substantial standard error, as for the egg weight. Conclusion: To sum up the results obtained, it can be noted that behavioural responses indicating fearfulness, i.e. escape, avoidance, and approach-avoidance may reflect negative emotions experienced by birds. The negative correlations with performance in the group of fearful hens may indirectly indicate a high level of stress in these birds, especially in the white-feathered birds, where stronger performance-fearfulness correlations were found. Fearful birds should be eliminated from breeding by inclusion of the behavioural profile in the selection criterion in the case of laying hens.

The Analysis of Changes in East Coast Tourism using Topic Modeling (토핑 모델링을 활용한 동해안 관광의 변화 분석)

  • Jeong, Eun-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.6 / pp.489-495 / 2020
  • The amount of data generated by various IT devices is increasing in a hyper-connected society where the fourth industrial revolution is progressing, and new value can be created by analyzing that data. For this paper, a total of 1,526 articles published from 2017 to 2019 in central magazines, economic magazines, regional associations, and major broadcasting companies were collected through Bigkinds with the keyword "(East Coast Tourism or East Coast Travel) and Gangwon-do". Topic modeling using the LDA algorithm implemented in the R language was performed to analyze the collected 1,526 articles. Keywords were extracted for each year from 2017 to 2019, and high-frequency keywords were classified and compared by year. The optimal number of topics was set to 8 using log likelihood and perplexity, and the 8 topics were then inferred using the Gibbs sampling method. The inferred topics were Gangneung and Beach, Goseong and Mt. Geumgang, KTX and Donghae-Bukbu line, weekend sea tour, Sokcho and Unification Observatory, Yangyang and Surfing, experience tour, and transportation network infrastructure. Changes in articles on East Coast tourism were analyzed using the proportions of the eight inferred topics. As a result, in 2018 compared to 2017, the proportions of Unification Observatory and Mt. Geumgang showed no significant change, the proportions of KTX and experience tour increased, and the proportions of the other topics decreased. In 2019, the proportions of KTX and experience tour decreased, while the proportions of the other topics showed no significant change.