• Title/Summary/Keyword: Naive

Search Result 703, Processing Time 0.022 seconds

An Efficient Algorithm for NaiveBayes with Matrix Transposition (행렬 전치를 이용한 효율적인 NaiveBayes 알고리즘)

  • Lee, Jae-Moon
    • The KIPS Transactions:PartB
    • /
    • v.11B no.1
    • /
    • pp.117-124
    • /
    • 2004
  • This paper proposes an efficient algorithm of NaiveBayes without loss of its accuracy. The proposed method uses the transposition of category vectors, and minimizes the computation of the probability of NaiveBayes. The proposed method was implemented on the existing framework of the text categorization, so called, AI::Categorizer and it was compared with the conventional NaiveBayes with the well-known data, Router-21578. The comparisons show that the proposed method outperforms NaiveBayes about two times with respect to the executing time.

An Information-theoretic Approach for Value-Based Weighting in Naive Bayesian Learning (나이브 베이시안 학습에서 정보이론 기반의 속성값 가중치 계산방법)

  • Lee, Chang-Hwan
    • Journal of KIISE:Databases
    • /
    • v.37 no.6
    • /
    • pp.285-291
    • /
    • 2010
  • In this paper, we propose a new paradigm of weighting methods for naive Bayesian learning. We propose more fine-grained weighting methods, called value weighting method, in the context of naive Bayesian learning. While the current weighting methods assign a weight to an attribute, we assign a weight to an attribute value. We develop new methods, using Kullback-Leibler function, for both value weighting and feature weighting in the context of naive Bayesian. The performance of the proposed methods has been compared with the attribute weighting method and general naive bayesian. The proposed method shows better performance in most of the cases.

A Study on Incremental Learning Model for Naive Bayes Text Classifier (Naive Bayes 문서 분류기를 위한 점진적 학습 모델 연구)

  • 김제욱;김한준;이상구
    • Proceedings of the Korea Database Society Conference
    • /
    • 2001.06a
    • /
    • pp.331-341
    • /
    • 2001
  • 본 논문에서는 Naive Bayes 문서 분류기를 위한 새로운 학습모델을 제안한다. 이 모델에서는 라벨이 없는 문서들의 집합으로부터 선택한 적은 수의 학습 문서들을 이용하여 문서 분류기를 재학습한다. 본 논문에서는 이러한 학습 방법을 따를 경우 작은 비용으로도 문서 분류기의 정확도가 크게 향상될 수 있다는 사실을 보인다. 이와 같이, 알고리즘을 통해 라벨이 없는 문서들의 집합으로부터 정보량이 큰 문서를 선택한 후, 전문가가 이 문서에 라벨을 부여하는 방식으로 학습문서를 결정하는 것을 selective sampling이라 한다. 본 논문에서는 이러한 selective sampling 문제를 Naive Bayes 문서 분류기에 적용한다. 제안한 학습 방법에서는 라벨이 없는 문서들의 집합으로부터 재학습 문서를 선택하는 기준 측정치로서 평균절대편차(Mean Absolute Deviation), 엔트로피 측정치를 사용한다. 실험을 통해서 제안한 학습 방법이 기존의 방법인 신뢰도(Confidence measure)를 이용한 학습 방법보다 Naive Bayes 문서 분류기의 성능을 더 많이 향상시킨다는 사실을 보인다.

  • PDF

A Novel Posterior Probability Estimation Method for Multi-label Naive Bayes Classification

  • Kim, Hae-Cheon;Lee, Jaesung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.6
    • /
    • pp.1-7
    • /
    • 2018
  • A multi-label classification is to find multiple labels associated with the input pattern. Multi-label classification can be achieved by extending conventional single-label classification. Common extension techniques are known as Binary relevance, Label powerset, and Classifier chains. However, most of the extended multi-label naive bayes classifier has not been able to accurately estimate posterior probabilities because it does not reflect the label dependency. And the remaining extended multi-label naive bayes classifier has a problem that it is unstable to estimate posterior probability according to the label selection order. To estimate posterior probability well, we propose a new posterior probability estimation method that reflects the probability between all labels and labels efficiently. The proposed method reflects the correlation between labels. And we have confirmed through experiments that the extended multi-label naive bayes classifier using the proposed method has higher accuracy then the existing multi-label naive bayes classifiers.

Naive Bayes classifiers boosted by sufficient dimension reduction: applications to top-k classification

  • Yang, Su Hyeong;Shin, Seung Jun;Sung, Wooseok;Lee, Choon Won
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.5
    • /
    • pp.603-614
    • /
    • 2022
  • The naive Bayes classifier is one of the most straightforward classification tools and directly estimates the class probability. However, because it relies on the independent assumption of the predictor, which is rarely satisfied in real-world problems, its application is limited in practice. In this article, we propose employing sufficient dimension reduction (SDR) to substantially improve the performance of the naive Bayes classifier, which is often deteriorated when the number of predictors is not restrictively small. This is not surprising as SDR reduces the predictor dimension without sacrificing classification information, and predictors in the reduced space are constructed to be uncorrelated. Therefore, SDR leads the naive Bayes to no longer be naive. We applied the proposed naive Bayes classifier after SDR to build a recommendation system for the eyewear-frames based on customers' face shape, demonstrating its utility in the top-k classification problem.

Learning Distribution Graphs Using a Neuro-Fuzzy Network for Naive Bayesian Classifier (퍼지신경망을 사용한 네이브 베이지안 분류기의 분산 그래프 학습)

  • Tian, Xue-Wei;Lim, Joon S.
    • Journal of Digital Convergence
    • /
    • v.11 no.11
    • /
    • pp.409-414
    • /
    • 2013
  • Naive Bayesian classifiers are a powerful and well-known type of classifiers that can be easily induced from a dataset of sample cases. However, the strong conditional independence assumptions can sometimes lead to weak classification performance. Normally, naive Bayesian classifiers use Gaussian distributions to handle continuous attributes and to represent the likelihood of the features conditioned on the classes. The probability density of attributes, however, is not always well fitted by a Gaussian distribution. Another eminent type of classifier is the neuro-fuzzy classifier, which can learn fuzzy rules and fuzzy sets using supervised learning. Since there are specific structural similarities between a neuro-fuzzy classifier and a naive Bayesian classifier, the purpose of this study is to apply learning distribution graphs constructed by a neuro-fuzzy network to naive Bayesian classifiers. We compare the Gaussian distribution graphs with the fuzzy distribution graphs for the naive Bayesian classifier. We applied these two types of distribution graphs to classify leukemia and colon DNA microarray data sets. The results demonstrate that a naive Bayesian classifier with fuzzy distribution graphs is more reliable than that with Gaussian distribution graphs.

Naive Bayes Learning Algorithm based on Map-Reduce Programming Model (Map-Reduce 프로그래밍 모델 기반의 나이브 베이스 학습 알고리즘)

  • Kang, Dae-Ki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2011.10a
    • /
    • pp.208-209
    • /
    • 2011
  • In this paper, we introduce a Naive Bayes learning algorithm for learning and reasoning in Map-Reduce model based environment. For this purpose, we use Apache Mahout to execute Distributed Naive Bayes on University of California, Irvine (UCI) benchmark data sets. From the experimental results, we see that Apache Mahout' s Distributed Naive Bayes algorithm is comparable to WEKA' s Naive Bayes algorithm in terms of performance. These results indicates that in the future Big Data environment, Map-Reduce model based systems such as Apache Mahout can be promising for machine learning usage.

  • PDF

Gradient Descent Approach for Value-Based Weighting (점진적 하강 방법을 이용한 속성값 기반의 가중치 계산방법)

  • Lee, Chang-Hwan;Bae, Joo-Hyun
    • The KIPS Transactions:PartB
    • /
    • v.17B no.5
    • /
    • pp.381-388
    • /
    • 2010
  • Naive Bayesian learning has been widely used in many data mining applications, and it performs surprisingly well on many applications. However, due to the assumption that all attributes are equally important in naive Bayesian learning, the posterior probabilities estimated by naive Bayesian are sometimes poor. In this paper, we propose more fine-grained weighting methods, called value weighting, in the context of naive Bayesian learning. While the current weighting methods assign a weight to each attribute, we assign a weight to each attribute value. We investigate how the proposed value weighting effects the performance of naive Bayesian learning. We develop new methods, using gradient descent method, for both value weighting and feature weighting in the context of naive Bayesian. The performance of the proposed methods has been compared with the attribute weighting method and general Naive bayesian, and the value weighting method showed better in most cases.

Accurate Intrusion Detection using n-Gram Augmented Naive Bayes (N-Gram 증강 나이브 베이스를 이용한 정확한 침입 탐지)

  • Kang, Dae-Ki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.10a
    • /
    • pp.285-288
    • /
    • 2008
  • In many intrusion detection applications, n-gram approach has been widely applied. However, n-gram approach has shown a few problems including double counting of features. To address those problems, we applied n-gram augmented Naive Bayes directly to classify intrusive sequences and compared performance with those of Naive Bayes and Support Vector Machines (SVM) with n-gram features by the experiments on host-based intrusion detection benchmark data sets. Experimental results on the University of New Mexico (UNM) benchmark data sets show that the n-gram augmented method, which solves the problem of independence violation that happens when n-gram features are directly applied to Naive Bayes (i.e. Naive Bayes with n-gram features), yields intrusion detectors with higher accuracy than those from Naive Bayes with n-gram features and shows comparable accuracy to those from SVM with n-gram features.

  • PDF

Naive Theories about Gravity among Korean Students (한국 학생들의 중력현상에 관한 유년적 사고)

  • Chae, Dong-Hyun
    • Journal of The Korean Association For Science Education
    • /
    • v.12 no.2
    • /
    • pp.67-79
    • /
    • 1992
  • 학생들은 수업전 그들의 일상적인 경험, 직접적인 관찰, 문화적인 배경으로 인하여 자연 현상에 관하여 과학자가 지니는 과학적 사고(Scientific theories)와는 다른 유년적 사고(Naive theories)을 지니고 있다는 사실이 연구에 의하여 보고되어 왔다. 본 연구는 중력현상에 관한 학생들의 유년적 사고(Naive theories)를 조사한 것이다. 이의 대상은 국민학생 49명, 중학생 53명, 고동학교 49명으로 하였다. 연구 방법으로는 질문지법(Open-ended written questions)과 면접법(Interview)을 이용하였다. 연구 결과는 국민학생에서 고등학생에 이르기까지 중력현상에 관하여 유년적 사고(Naive theories)를 지니고 있는 것으로 밝혀졌다. 중력현상에 관한 유년적 사고(Naive theories)들은 중력 현상이 공기의 유무, 대기압, 마개의 유무, 물의 상태변화, 온도의 영향등으로 지배를 받는다고 설명하였으며, 이러한 생각은 고학년으로 갈수록 줄어들고 있음을 볼 수 있다. 질문지법(Open-ended written question)과 면접법(Interview)의 결과가 매우 일치하는 것으로 밝혀졌다.

  • PDF