• Title/Summary/Keyword: Bayesian classification (베이지안 분류)

Search Result 200, Processing Time 0.021 seconds

Recognition of Korean Vowels using Bayesian Classification with Mouth Shape (베이지안 분류 기반의 입 모양을 이용한 한글 모음 인식 시스템)

  • Kim, Seong-Woo;Cha, Kyung-Ae;Park, Se-Hyun
    • Journal of Korea Multimedia Society / v.22 no.8 / pp.852-859 / 2019
  • With the development of information technology and smart devices, various applications that use image information are being developed. To provide an intuitive interface for pronunciation recognition, there is a growing need for research on pronunciation recognition using mouth feature values. In this paper, we propose a system that distinguishes Korean vowel pronunciations by detecting feature points of the lip region in images and applying a Bayesian learning model. Because the recognition system is based on Bayes' theorem, its accuracy can be improved by accumulating input data, whether it is used in a speaker-independent or speaker-dependent manner, even with a small amount of training data. Experimental results show that Korean vowels can be distinguished effectively by applying probability-based Bayesian classification using only visual information such as mouth shape features.
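The idea of a Bayes' theorem classifier that improves as input data accumulates can be sketched as follows. This is a minimal illustration, not the authors' system: it assumes Gaussian class-conditional features and naive independence between them, and the class name `IncrementalGaussianNB`, the variance floor, the two features (lip width and height), and all measurements are hypothetical.

```python
import math
from collections import defaultdict


class IncrementalGaussianNB:
    """Gaussian naive Bayes whose per-class statistics are updated one sample at a time."""

    def __init__(self, var_floor=1e-3):
        self.var_floor = var_floor      # assumed floor; avoids zero variance with few samples
        self.count = defaultdict(int)   # samples seen per class
        self.mean = {}                  # per-class running feature means
        self.m2 = {}                    # per-class running sums of squared deviations

    def update(self, features, label):
        """Fold one labelled sample into the running statistics (Welford's method)."""
        if label not in self.mean:
            self.mean[label] = [0.0] * len(features)
            self.m2[label] = [0.0] * len(features)
        self.count[label] += 1
        n = self.count[label]
        for i, x in enumerate(features):
            delta = x - self.mean[label][i]
            self.mean[label][i] += delta / n
            self.m2[label][i] += delta * (x - self.mean[label][i])

    def log_posterior(self, features):
        """Return the unnormalised log P(class | features) for every class seen so far."""
        total = sum(self.count.values())
        scores = {}
        for label, n in self.count.items():
            score = math.log(n / total)  # log prior from class frequencies
            for i, x in enumerate(features):
                var = max(self.m2[label][i] / n, self.var_floor)
                score += -0.5 * math.log(2 * math.pi * var)
                score += -((x - self.mean[label][i]) ** 2) / (2 * var)
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_posterior(features)
        return max(scores, key=scores.get)


# Made-up (lip width, lip height) measurements for two vowel classes.
model = IncrementalGaussianNB()
for sample in [(0.30, 0.60), (0.32, 0.58), (0.28, 0.62)]:
    model.update(sample, "a")
for sample in [(0.60, 0.20), (0.62, 0.18), (0.58, 0.22)]:
    model.update(sample, "i")

print(model.predict((0.31, 0.59)))
print(model.predict((0.61, 0.21)))
```

Because `update` only adjusts running counts, means, and squared deviations, new samples from any speaker can be added at any time without retraining from scratch.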

Pattern Classification Using Hybrid Monte Carlo Neural Networks (변종 몬테 칼로 신경망을 이용한 패턴 분류)

  • Jeon, Seong-Hae;Choe, Seong-Yong;O, Im-Geol;Lee, Sang-Ho;Jeon, Hong-Seok
    • The KIPS Transactions:PartB / v.8B no.3 / pp.231-236 / 2001
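  • The error backpropagation method, commonly used as the weight update algorithm in multilayer neural networks, fixes the result of each weight update to a single value. Because it collapses the many possible updates into one fixed value, it cannot accommodate the full range of possibilities. This problem is resolved by introducing an update algorithm that expresses all possibilities as a probability distribution. Bayesian neural network models, which use such an algorithm, compute the output for a given input by passing it through each layer of a black-box-like network structure. The predicted result for given input data can then be computed as the expectation (mean) of the posterior distribution. The posterior, computed from a given prior distribution and the likelihood function of the training data, has a very complex structure, which makes the integral for the expectation difficult to evaluate. Therefore, instead of numerical-analysis methods, Monte Carlo simulation, an approximation method based on stochastic estimation, can be used. Among such methods, the Hybrid Monte Carlo algorithm gives good results (Neal 1996). In this paper, we show that a neural network using the Hybrid Monte Carlo algorithm gives better results than existing classification algorithms such as CHAID, CART, and QUEST.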
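The step from an intractable posterior expectation to a Monte Carlo average can be illustrated with a small Hybrid Monte Carlo sampler. This is only a sketch under strong assumptions: the target is a one-dimensional toy density (a Gaussian centred at 2.0) standing in for a posterior over network weights, and the function name `hmc_sample`, the step size, and the number of leapfrog steps are illustrative choices, not values from the paper.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, start, n_samples,
               step=0.2, n_leapfrog=20, seed=0):
    """Draw samples from a 1-D density known up to a constant using Hybrid Monte Carlo."""
    rng = random.Random(seed)
    q = start
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                    # resample the auxiliary momentum
        current_h = -log_prob(q) + 0.5 * p * p     # potential plus kinetic energy
        q_new, p_new = q, p
        # Leapfrog integration of the Hamiltonian dynamics.
        p_new += 0.5 * step * grad_log_prob(q_new)
        for i in range(n_leapfrog):
            q_new += step * p_new
            if i < n_leapfrog - 1:
                p_new += step * grad_log_prob(q_new)
        p_new += 0.5 * step * grad_log_prob(q_new)
        proposed_h = -log_prob(q_new) + 0.5 * p_new * p_new
        # Metropolis accept/reject step corrects the discretisation error.
        if rng.random() < math.exp(min(0.0, current_h - proposed_h)):
            q = q_new
        samples.append(q)
    return samples


# Toy "posterior": unnormalised Gaussian with mean 2.0 and unit variance (assumed).
def log_prob(q):
    return -0.5 * (q - 2.0) ** 2


def grad_log_prob(q):
    return -(q - 2.0)


samples = hmc_sample(log_prob, grad_log_prob, start=0.0, n_samples=5000)
posterior_mean = sum(samples) / len(samples)   # Monte Carlo estimate of the expectation
print(posterior_mean)
```

In a Bayesian neural network, `q` would be the full weight vector and the averaged quantity would be the network output for a given input, but the sampling loop has the same shape.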

A Study of using Emotional Features for Information Retrieval Systems (감정요소를 사용한 정보검색에 관한 연구)

  • Kim, Myung-Gwan;Park, Young-Tack
    • The KIPS Transactions:PartB / v.10B no.6 / pp.579-586 / 2003
  • In this paper, we propose a novel approach that applies emotional features to document retrieval systems. Fine-grained emotional features, such as HAPPY, SAD, ANGRY, FEAR, and DISGUST, are used to represent Korean documents. Users can use these features to retrieve documents. The retrieved documents are then learned by classification methods such as cohesion factor, naive Bayesian, and k-nearest neighbor approaches, and a voting method is used to combine these approaches. In addition, k-means clustering was used in our experiments. Our approach achieved higher accuracy than other methods and performed better on short texts than on long documents.

Software Quality Classification using Bayesian Classifier (베이지안 분류기를 이용한 소프트웨어 품질 분류)

  • Hong, Euy-Seok
    • Journal of Information Technology Services / v.11 no.1 / pp.211-221 / 2012
  • Many metric-based classification models have been proposed to predict the fault-proneness of software modules. This paper presents two prediction models using the Bayesian classifier, one of the most popular modern classification algorithms. A Bayesian model, grounded in Bayesian probability theory, is a promising technique for software quality prediction because it can represent uncertainty using probabilities and can partly incorporate expert knowledge into the training data. The two models, Naïve Bayes (NB) and Bayesian Belief Network (BBN), are constructed, and dimensionality reduction is performed on the training and test data before model evaluation. Prediction accuracy is evaluated using two prediction error measures, Type I error and Type II error, and compared with two well-known prediction models, a backpropagation neural network and a support vector machine. The results show that the prediction performance of the BBN model is slightly better than that of NB. For the data set with ambiguity, the BBN model's prediction accuracy is not as good as that of the compared models, but for the data set without ambiguity it outperforms them.
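The two error measures named in the abstract can be computed as in the following sketch. The abstract does not define them, so this assumes a common convention in fault-proneness studies: a Type I error is a non-fault-prone module flagged as fault-prone, a Type II error is a fault-prone module that is missed, and each rate is taken over the modules of the corresponding actual class. The paper may use a different denominator. The function name `type_errors` and the labels are hypothetical.

```python
def type_errors(actual, predicted):
    """Return (type_i, type_ii) error rates for binary fault-proneness labels.

    Labels: True = fault-prone, False = not fault-prone.
    Assumed convention: each rate is relative to the size of the actual class.
    """
    pairs = list(zip(actual, predicted))
    false_pos = sum(1 for a, p in pairs if not a and p)   # flagged but actually fine
    false_neg = sum(1 for a, p in pairs if a and not p)   # fault-prone but missed
    n_neg = sum(1 for a in actual if not a)
    n_pos = sum(1 for a in actual if a)
    type_i = false_pos / n_neg if n_neg else 0.0
    type_ii = false_neg / n_pos if n_pos else 0.0
    return type_i, type_ii


# Made-up labels for ten modules.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, True, False, False, False, False]
type_i, type_ii = type_errors(actual, predicted)
print(type_i, type_ii)
```

Reporting the two rates separately matters here because a missed fault-prone module (Type II) usually costs more than an unnecessary inspection (Type I), which a single accuracy figure would hide.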

Nearest-neighbor Rule based Prototype Selection Method and Performance Evaluation using Bias-Variance Analysis (최근접 이웃 규칙 기반 프로토타입 선택과 편의-분산을 이용한 성능 평가)

  • Shim, Se-Yong;Hwang, Doo-Sung
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.10 / pp.73-81 / 2015
  • The paper proposes a prototype selection method and evaluates the generalization performance of standard algorithms and prototype-based classification learning. The proposed prototype classifier defines multidimensional spheres with variable radii within class areas and generates a small set of training data. The nearest-neighbor classifier uses this new training set to predict the class of test data. By decomposing the mean expected error into bias and variance, we compare the generalization errors of the k-nearest neighbor classifier, the Bayesian classifier, prototype selection using a fixed radius, and the proposed prototype selection method. In experiments, the bias-variance trends of the proposed prototype classifier are similar to those of nearest-neighbor classifiers trained on all training data, and the prototype selection rates are under 27.0% on average.

Design and Implementation of Web Mail Filtering Agent for Personalized Classification (개인화된 분류를 위한 웹 메일 필터링 에이전트)

  • Jeong, Ok-Ran;Cho, Dong-Sub
    • The KIPS Transactions:PartB / v.10B no.7 / pp.853-862 / 2003
  • Many people use e-mail for purely personal purposes, and the pool of e-mail users is growing daily. The volume of mail transmitted in electronic commerce is also increasing. Because e-mail is so convenient, a mass of spam floods in every day, yet automated techniques for learning to filter e-mail have not significantly affected the e-mail market. This paper proposes a Web Mail Filtering Agent for Personalized Classification, which automatically manages mail in a way that adapts to the user. It is based on web mail, which can be accessed at any time and from any place and is not restricted to a particular system. When new mail arrives, the agent first derives personal rules from its observations; based on those rules, it automatically classifies the mail into categories according to content, then saves the classified mail in the relevant folders or deletes unnecessary mail and spam. To improve the system's accuracy, we applied a Bayesian algorithm with a dynamic threshold.

Exploring the Feature Selection Method for Effective Opinion Mining: Emphasis on Particle Swarm Optimization Algorithms

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.25 no.11 / pp.41-50 / 2020
  • Sentiment analysis begins with the search for words that determine the sentiment inherent in data. Managers can understand market sentiment by analyzing the relevant sentiment words that consumers tend to use. In this study, we explore the performance of feature selection methods embedded with Particle Swarm Optimization Multi Objectives Evolutionary Algorithms. The performance of the feature selection methods was benchmarked with machine learning classifiers such as Decision Tree, Naive Bayesian Network, Support Vector Machine, Random Forest, Bagging, Random Subspace, and Rotation Forest. Our opinion mining experiments showed that the number of features was significantly reduced without hurting performance. Specifically, the Support Vector Machine showed the highest accuracy, and Random Subspace produced the best AUC results.

Classification Accuracy by Deviation-based Classification Method with the Number of Training Documents (학습문서의 개수에 따른 편차기반 분류방법의 분류 정확도)

  • Lee, Yong-Bae
    • Journal of Digital Convergence / v.12 no.6 / pp.325-332 / 2014
  • It is generally accepted that classification accuracy is affected by the number of training documents, but few studies show how this number influences automatic text classification. This study evaluates the deviation-based classification model, recently developed for genre-based classification, and compares it with other classification algorithms as the number of training documents changes. Experimental results show that the deviation-based classification model achieves an accuracy of 0.8 when categorizing 7 genres with only 21 training documents, exceeding the accuracy of the Bayesian and SVM classifiers. The deviation-based classification model retains strong feature selection capability even with a small number of training documents because it learns subject information within each genre, whereas the other methods use a different learning process.

A Machine Learning Approach to Web Image Classification (기계학습 기반의 웹 이미지 분류)

  • Cho, Soo-Sun;Lee, Dong-Woo;Han, Dong-Won;Hwang, Chi-Jung
    • The KIPS Transactions:PartB / v.9B no.6 / pp.759-764 / 2002
  • Although images play a large and important part in Web documents, there has not been much research on analyzing and understanding them. Many Web images carry important information, but others do not. In this paper, we classify Web images from currently served Web sites into erasable and non-erasable classes using machine learning methods. For this research, we extracted 16 special and rich features of Web images and experimented with Bayesian and decision tree methods. As a result, F-measures of 87.09% and 82.72% were achieved for the respective methods. In particular, experiments comparing the effects of feature groups showed that the features added in this study are very useful for Web image classification.

A Study of Short-Term Load Forecasting System Using Data Mining (데이터 마이닝을 이용한 단기 부하 예측 시스템 연구)

  • Joo, Young-Hoon;Jung, Keun-Ho;Kim, Do-Wan;Park, Jin-Bae
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.2 / pp.130-135 / 2004
  • This paper presents a new design method for a short-term load forecasting system (STLFS) using data mining. The proposed STLFS is divided into two parts: a Takagi-Sugeno (T-S) fuzzy model-based classifier and a predictor. The proposed classifier is composed of Gaussian fuzzy sets in the premise part and a linearized Bayesian classifier in the consequent part. The related parameters of the classifier are easily obtained from the statistical information of the training set. The proposed predictor takes the form of a convex combination of linear time series predictors for each input. Estimating the consequent parameters is formulated as a convex optimization problem that minimizes the norm distance between the real load and the output of the linear time series estimator. Estimating the premise parameters amounts to finding the parameter values that minimize the error between the real load and the overall output. Finally, to show the feasibility of the proposed method, this paper provides a short-term load forecasting example.