• Title/Abstract/Keywords: 베이지언 (Bayesian)

Search Result 52, Processing Time 0.024 seconds

Improving Multinomial Naive Bayes Text Classifier (다항시행접근 단순 베이지안 문서분류기의 개선)

  • 김상범;임해창
    • Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.259-267 / 2003
  • Although naive Bayes text classifiers are widely used because of their simplicity, techniques for improving their performance have rarely been studied. In this paper, we propose and evaluate some general and effective techniques for improving the performance of the naive Bayes text classifier. We suggest document-model-based parameter estimation and document length normalization to alleviate the problems of the traditional multinomial approach to text classification. In addition, a mutual-information-weighted naive Bayes text classifier is proposed to increase the effect of highly informative words. Our techniques are evaluated on the Reuters-21578 and 20 Newsgroups collections, and significant improvements are obtained over the existing multinomial naive Bayes approach.
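
For orientation, the baseline that this abstract sets out to improve can be sketched as follows. This is a minimal multinomial naive Bayes classifier with add-one (Laplace) smoothing, not the authors' improved method; the function names, the toy documents, and the `alpha` parameter are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word likelihoods per class.

    docs is a list of token lists; alpha is the additive smoothing
    constant (an assumption of this sketch, not taken from the paper).
    """
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)

    vocab = {w for counts in word_counts.values() for w in counts}
    n_docs = len(docs)

    log_prior = {c: math.log(k / n_docs) for c, k in class_doc_counts.items()}
    log_like = {}
    for c in class_doc_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_like, vocab


def classify(tokens, model):
    """Return the class with the highest posterior log score."""
    log_prior, log_like, vocab = model
    scores = {
        c: log_prior[c] + sum(log_like[c][w] for w in tokens if w in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
docs = [
    ["oil", "price", "rise"],
    ["oil", "export"],
    ["match", "goal", "win"],
    ["team", "win", "match"],
]
labels = ["energy", "energy", "sport", "sport"]
model = train_multinomial_nb(docs, labels)
```

Because the score is a sum over tokens, longer documents produce more extreme scores, which is the kind of effect that document length normalization is meant to address.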

Mapping the Geographic Variations of the Low Birth Weight cases in South Korea: Bayesian Approaches (우리나라 저체중아 출생의 공간적 변동성 지도화: 베이지언적 접근)

  • Roh, Young-hee;Park, Key-ho
    • Journal of the Korean Geographical Society / v.51 no.3 / pp.367-380 / 2016
  • This study reviewed and compared methods for mapping aggregated low birth weight (LBW) and geographic variations in LBW in South Korea. Based on this review, we produced LBW maps of South Korea. Standardized mortality/morbidity ratios (SMRs) and crude mortality rates have been widely used for many years in epidemiological research. However, SMR-based maps are likely to be affected by the sample size of each unit area. Therefore, this study adopted a model-based approach using Bayesian estimates to reduce noisy variability in the SMR. By using a Bayesian model, we can calculate statistically reliable relative risk (RR) values. We used the full Bayes estimator as well as empirical Bayes estimators. The variations in the two Bayes models were similar, while the SMR-based statistics had the largest variation. The resulting maps can be used to identify regions with a high risk of LBW in South Korea.
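
The instability of the raw SMR in small areas, and the way a Bayesian estimate dampens it, can be illustrated with a small sketch. It assumes a Poisson likelihood with a Gamma(a, b) prior on the relative risk, which is one common smoothing model and not necessarily the one used in the study; the prior values and the area counts are hypothetical.

```python
def smr(observed, expected):
    """Raw standardized ratio: observed cases divided by expected cases."""
    return observed / expected


def gamma_poisson_rr(observed, expected, a, b):
    """Posterior mean of the relative risk theta.

    Assumes observed ~ Poisson(expected * theta) and theta ~ Gamma(a, b)
    with shape a and rate b, so the posterior is
    Gamma(a + observed, b + expected).
    """
    return (observed + a) / (expected + b)


# Hypothetical areas: (observed cases, expected cases).
areas = {"small": (3, 1.0), "large": (300, 200.0)}
a, b = 10.0, 10.0  # assumed prior with mean a / b = 1.0

raw = {name: smr(o, e) for name, (o, e) in areas.items()}
smoothed = {name: gamma_poisson_rr(o, e, a, b) for name, (o, e) in areas.items()}
```

Under these assumptions the small area's estimate is pulled strongly toward the prior mean, while the large area's estimate barely moves, which is the shrinkage behaviour that makes model-based maps less noisy.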

Automatic Text Categorization by Term Weighting and Inverted Category Frequency (용어 가중치와 역범주 빈도에 의한 자동문서 범주화)

  • Lee, Kyung-Chan;Kang, Seung-Shik
    • Annual Conference on Human and Language Technology / 2003.10d / pp.14-17 / 2003
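  • The naive Bayesian probabilistic model is a representative text categorization method that classifies documents automatically using document probabilities. Its basic form computes probabilities from the terms that occur in a document. In actual text categorization, however, terms that do not occur can also strongly affect performance, and applying inverted category frequency or term weights, in addition to the frequencies of occurring terms, can improve the performance of a text categorization system. In this paper, we experimented with applying smoothing techniques for both occurring and non-occurring terms to the naive Bayesian probabilistic model. Newsgroup documents were used for the evaluation, and applying inverted category frequency and weights improved performance by about 7% over the naive Bayesian probabilistic model.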

Stochastic Fatigue Life Assesment based on Bayesian-inference (베이지언 추론에 기반한 확률론적 피로수명 평가)

  • Park, Myong-Jin;Kim, Yooil
    • Journal of the Society of Naval Architects of Korea / v.56 no.2 / pp.161-167 / 2019
  • In general, fatigue analysis is performed using a deterministic model to estimate the optimal parameters. However, a deterministic model cannot clearly describe the physical phenomena of fatigue failure, which involve many uncertainty factors. In this research, the deterministic model is therefore compared with stochastic models. First, one deterministic S-N curve was derived using the ordinary least squares technique, and two P-S-N curves were estimated through a Bayesian linear regression model and Markov chain Monte Carlo simulation. Second, the distributions of long-term fatigue damage and fatigue life were predicted using the parameters obtained from the three methodologies and the long-term stress distribution.
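
The deterministic step mentioned in this abstract can be sketched as an ordinary least squares fit of a straight line in log-log space. The sketch assumes the common linear form log10(N) = intercept + slope * log10(S); the function name and the three data points are synthetic and chosen only so that the fit is easy to check by hand.

```python
import math


def fit_sn_curve(stress, cycles):
    """Fit log10(N) = intercept + slope * log10(S) by least squares."""
    x = [math.log10(s) for s in stress]
    y = [math.log10(n) for n in cycles]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic points lying exactly on log10(N) = 12 - 3 * log10(S).
stress = [50.0, 100.0, 200.0]      # stress range, arbitrary units
cycles = [8.0e6, 1.0e6, 1.25e5]    # cycles to failure
intercept, slope = fit_sn_curve(stress, cycles)
```

A fit like this yields a single curve with no statement of uncertainty; the probabilistic P-S-N curves in the abstract instead treat the intercept and slope as random quantities with posterior distributions.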

Comparing the Impacts of Renewable Energy Policies on the Macroeconomy with Electricity Market Rigidities: A Bayesian DSGE Model (전력시장의 경직성에 따른 국가 재생에너지 정책이 거시경제에 미치는 영향 분석: 베이지언 DSGE 모형 접근)

  • Choi, Bongseok;Kim, Kihwan
    • Environmental and Resource Economics Review / v.31 no.3 / pp.367-391 / 2022
  • We develop an energy-economy Bayesian DSGE model with two electricity generation sectors: traditional (fossil, nuclear) and renewable energy. Under imperfect substitutability between the two sectors, a technological shock to the renewable energy sector is not sufficient to facilitate energy conversion and reduce greenhouse gas emissions. Technological innovation in greenhouse gas emission reduction is also required. More importantly, sufficient investment should be driven by a well-functioning electricity market in which the electricity price acts as a signal for the efficient allocation of resources. Indeed, market rigidities cause reduced consumption.

A Method of Selecting Test Metrics for Certifying Package Software using Bayesian Belief Network (베이지언 사용한 패키지 소프트웨어 인증을 위한 시험 메트릭 선택 기법)

  • Lee, Chong-Won;Lee, Byung-Jeong;Oh, Jae-Won;Wu, Chi-Su
    • Journal of KIISE:Software and Applications / v.33 no.10 / pp.836-850 / 2006
  • Due to the rapidly increasing number of package software products, quality testing of these products has been emphasized. When testing software products, one of the most important tasks is to select the metrics that form the basis for the tests. In this paper, the types of package software are represented as characteristic vectors having probabilistic relationships with metrics. The characteristic vectors can be regarded as indicators of software type. To assign the metrics for each software type, past test metrics are collected and analyzed. Using a Bayesian belief network, the dependency relationship network of the characteristic vectors and metrics is constructed. The dependency relationship network is then used to find the proper metrics for testing new package software products.

A Bayesian Approach to Stereo Matching via Merging Watershed Regions (워터쉐드 영역병합을 이용한 스테레오 정합의 베이지언 접근방법)

  • Kil, Woo-Sung;Kim, Shin-Hyung;Jang, Jong-Whan
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.809-812 / 2005
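  • This paper proposes a method for minimizing the mismatches that occur when matching complex scenes in segmentation-based stereo matching. To this end, watershed image segmentation is applied to the left image of the stereo pair to generate features for matching, and a Bayesian framework is then applied to merge regions that have similar disparity information. The resulting matching patches have low matching ambiguity, so mismatches are significantly reduced and a reliable disparity map is produced even for images with low contrast between regions.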

Development of an Adaptive e-Learning System for Engineering Mathematics using Computer Algebra and Bayesian Inference Network (컴퓨터 대수와 베이지언 추론망을 이용한 이공계 수학용 적응적 e-러닝 시스템 개발)

  • Park, Hong-Joon;Jun, Young-Cook
    • The Journal of the Korea Contents Association / v.8 no.5 / pp.276-286 / 2008
  • In this paper, we introduce an adaptive e-Learning system for engineering mathematics that is based on a computer algebra system (Mathematica) and an on-line authoring environment. The system provides an assessment tool for individual diagnosis using a Bayesian inference network. Using this system, an instructor can easily develop mathematical web content via a web interface. Examples of such content development are illustrated in the areas of linear algebra, differential equations, and discrete mathematics. The diagnostic module traces a student's knowledge level based on statistical inference using conditional probability and a Bayesian updating algorithm via Netica. As part of a formative evaluation, we brought this system into real university settings and analyzed students' feedback using a survey.

The Risk Assessment and Prediction for the Mixed Deterioration in Cable Bridges Using a Stochastic Bayesian Modeling (확률론적 베이지언 모델링에 의한 케이블 교량의 복합열화 리스크 평가 및 예측시스템)

  • Cho, Tae Jun;Lee, Jeong Bae;Kim, Seong Soo
    • Journal of the Korea institute for structural maintenance and inspection / v.16 no.5 / pp.29-39 / 2012
  • The main objective is to predict the future degradation and maintenance budget for a suspension bridge system. Bayesian inference is applied to find the posterior probability density function of the source parameters (damage indices and serviceability), given ten years of maintenance data. The posterior distribution of the parameters is sampled using a Markov chain Monte Carlo method. The simulated risk predictions for decreased serviceability conditions are posterior distributions based on the prior distribution and the likelihood of data updated from annual maintenance tasks. Compared with a conventional linear prediction model, the proposed quadratic model provides highly improved convergence and closeness to measured data in terms of serviceability, risk factors, and maintenance budget for bridge components. This allows the future performance and financial management of complex infrastructure to be forecast based on the proposed quadratic stochastic regression model.

A Clinical Nomogram Construction Method Using Genetic Algorithm and Naive Bayesian Technique (유전자 알고리즘과 나이브 베이지언 기법을 이용한 의료 노모그램 생성 방법)

  • Lee, Keon-Myung;Kim, Won-Jae;Yun, Seok-Jung
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.6 / pp.796-801 / 2009
  • In medical practice, diagnosis or prediction models that require complicated computations are not widely accepted, because their course of reasoning is difficult to interpret and their computations are complex. Medical personnel have instead used nomograms, graphical representations of numerical relationships that make it easy to compute a complicated function without the help of a computing machine. Nomograms have received wide attention in diagnosing diseases and predicting their progress. A nomogram is constructed from a set of clinical data containing various attributes such as symptoms, lab experiment results, therapy history, disease progress, or disease identification. It is important to select effective attributes from those available, sometimes along with the parameters accompanying the attributes. This paper introduces a nomogram construction method that uses a naive Bayesian technique to construct a nomogram and a genetic algorithm to select effective attributes and parameters. The proposed method has been applied to the construction of a nomogram for a real clinical data set.