• Title/Summary/Keyword: Bayesian Classification


An Automatic Classification System of Korean Documents Using Weight for Keywords of Document and Word Cluster (문서의 주제어별 가중치 부여와 단어 군집을 이용한 한국어 문서 자동 분류 시스템)

  • Hur, Jun-Hui;Choi, Jun-Hyeog;Lee, Jung-Hyun;Kim, Joong-Bae;Rim, Kee-Wook
    • The KIPS Transactions:PartB / v.8B no.5 / pp.447-454 / 2001
  • Automatic document classification assigns unlabeled documents to existing classes. It can be applied to classifying newsgroup articles and web documents, and to producing more precise information retrieval results by learning from users. In this paper, we use a weighted Bayesian classifier that weights the keywords of a document to improve classification accuracy. If the system cannot classify a document properly because the document has too few words to serve as features, it uses relevant word clusters to supplement the document's features. The clusters are built by automatic word clustering from the corpus. As a result, the proposed system outperformed the existing classification system in classification accuracy on Korean documents.

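The keyword-weighted Bayesian classifier described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `train` and `classify`, the `keyword_weight` dictionary, and the use of Laplace smoothing are all assumptions made for illustration. The idea shown is only that a keyword's log-likelihood is multiplied by a weight greater than 1 so that it counts more than an ordinary word.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label) pairs. Returns the counts a naive Bayes model needs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter()            # label -> number of documents
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def classify(tokens, model, keyword_weight=None):
    """Score each class as log P(c) + sum over words of weight(w) * log P(w | c).

    keyword_weight is a hypothetical mapping word -> weight; words not listed
    get weight 1.0, which reduces to plain multinomial naive Bayes.
    """
    word_counts, class_counts, vocab = model
    keyword_weight = keyword_weight or {}
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for w in tokens:
            # Laplace (add-one) smoothing so unseen words do not zero out a class.
            p = (word_counts[label][w] + 1) / (total_words + len(vocab))
            score += keyword_weight.get(w, 1.0) * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

training = [
    (["match", "goal", "team"], "sports"),
    (["team", "coach", "goal"], "sports"),
    (["vote", "party", "election"], "politics"),
    (["party", "policy", "vote"], "politics"),
]
model = train(training)
print(classify(["goal", "team"], model))
print(classify(["vote", "goal"], model, keyword_weight={"goal": 3.0}))
```

In the second call the document contains one word from each class, and the extra weight on the keyword "goal" is what tips the decision.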

Intelligence Package Development for UT Signal Pattern Recognition and Application to Classification of Defects in Austenitic Stainless Steel Weld (UT 신호형상 인식을 위한 Intelligence Package 개발과 Austenitic Stainless Steel Welding부 결함 분류에 관한 적용 연구)

  • Lee, Kang-Yong;Kim, Joon-Seob
    • Journal of the Korean Society for Nondestructive Testing / v.15 no.4 / pp.531-539 / 1996
  • This research classifies artificial defects in welded parts using ultrasonic signal pattern recognition. A signal pattern recognition package, including user-defined functions, was developed to perform digital signal processing, feature extraction, feature selection, and classifier selection. A neural network classifier is compared with statistical classifiers, namely the linear discriminant function classifier and the empirical Bayesian classifier. The pattern recognition technique is applied to the classification of artificial defects such as notches and a hole. If appropriately trained, the neural network classifier is concluded to be better than the statistical classifiers at classifying the artificial defects.


A Study on the Digital Signal Processing for the Pattern Recognition of Weld Flaws (용접결함의 패턴인식을 위한 디지털 신호처리에 관한 연구)

  • 김재열;송찬일;김병현
    • Proceedings of the Korean Society of Precision Engineering Conference / 1995.10a / pp.393-396 / 1995
  • In this study, artificial and natural flaws in welded parts are classified using smart pattern recognition technology. For this purpose, a smart signal pattern recognition package including user-defined functions was developed, and the whole procedure, including digital signal processing, feature extraction, feature selection, and classifier selection, is handled together. In particular, the neural network classifier is compared and discussed with statistical classifiers such as the linear discriminant function classifier and the empirical Bayesian classifier. The smart pattern recognition technology is also applied to natural flaw classification problems: the multi-class problem (crack, lack of penetration, lack of fusion, porosity, and slag inclusion) and the planar versus volumetric flaw classification problem. According to the results, the neural network classifier, if appropriately trained, is better than the statistical classifiers at classifying natural flaws. A recognition rate above 80% can be achieved, although it varies somewhat with the domain from which the features are extracted and with the classifier.


Land Cover Classification Techniques for Large Area using Digital Satellite Data (수치위성자료를 이용한 광역의 토지피복분류 기법)

  • 박병욱
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.14 no.1 / pp.39-47 / 1996
  • This paper presents land cover classification techniques for a large area spanning different paths, by classifying Landsat TM data of Jeonnam province. The analysis proceeded scene by scene because the acquisition dates differ between paths. In this process, problems arose, such as variation between the two scenes in the classes that could be classified and a choice problem for the overlapped area. Since spatial effects over a large area affect data values, it was difficult to select classes and training fields. We addressed these problems by trial and error, and found that Bayesian maximum likelihood classification and majority filtering were effective in improving classification accuracy.

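Majority filtering, which the abstract above reports as effective, is a post-processing step on a classified label map. The following is a minimal sketch under stated assumptions, not the paper's procedure: the window size of 3, the strict-majority rule, and the function name `majority_filter` are all chosen here for illustration.

```python
from collections import Counter

def majority_filter(grid, size=3):
    """Relabel each cell with the most common label in its size x size window.

    grid is a list of rows of class labels. Windows are clipped at the border.
    A cell is only relabeled when one label holds a strict majority of the
    window (an assumption of this sketch); otherwise it keeps its own label.
    """
    rows, cols = len(grid), len(grid[0])
    r = size // 2
    out = [row[:] for row in grid]  # copy so the filter reads only original labels
    for i in range(rows):
        for j in range(cols):
            window = [
                grid[y][x]
                for y in range(max(0, i - r), min(rows, i + r + 1))
                for x in range(max(0, j - r), min(cols, j + r + 1))
            ]
            label, count = Counter(window).most_common(1)[0]
            if count * 2 > len(window):
                out[i][j] = label
    return out

# A single isolated pixel of class 2 inside a region of class 1 is smoothed away.
labels = [
    [1, 1, 1],
    [1, 2, 1],
    [1, 1, 1],
]
print(majority_filter(labels))
```

The filter removes isolated misclassified pixels ("salt-and-pepper" noise) while leaving the interior of homogeneous regions unchanged.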

Comments Classification System using Topic Signature (Topic Signature를 이용한 댓글 분류 시스템)

  • Bae, Min-Young;Cha, Jeong-Won
    • Journal of KIISE:Software and Applications / v.35 no.12 / pp.774-779 / 2008
  • In this work, we describe a comment classification system using topic signatures. Topic signatures are widely used for feature selection in document classification and summarization. Comments are short and contain many word-spacing errors and special characters. We first convert comments into 7-grams and treat each 7-gram as a sentence. We then convert each 7-gram into 3-grams and treat each 3-gram as a word. We select key features using topic signatures and classify new inputs with the Naive Bayesian method. The experimental results show that the proposed method outperforms previous methods.
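The two-level n-gram conversion in the abstract above (7-grams treated as sentences, then 3-grams treated as words) can be sketched as follows. The helper names `char_ngrams` and `comment_to_features` are hypothetical, and removing whitespace before slicing is an assumption of this sketch, motivated by the abstract's remark that comments contain many spacing errors; the paper may handle spacing differently.

```python
def char_ngrams(text, n):
    """Return all overlapping character n-grams of text (empty if text is shorter than n)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def comment_to_features(comment, sentence_n=7, word_n=3):
    """Turn a comment into a list of pseudo-sentences, each a list of pseudo-words.

    Each 7-gram plays the role of a sentence and each 3-gram inside it plays
    the role of a word.
    """
    compact = "".join(comment.split())  # assumption: drop unreliable whitespace
    return [char_ngrams(s, word_n) for s in char_ngrams(compact, sentence_n)]

features = comment_to_features("abc defgh")
print(features[0])
```

The resulting 3-gram "words" could then be passed to feature selection and a Naive Bayes classifier in place of ordinary tokens.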

Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics

  • Chen, YongHeng;Zhang, Fuquan;Zuo, WanLi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.392-412 / 2018
  • Due to the semantic gap across different modalities, automatic retrieval from multimedia information still faces a major challenge. It is desirable to provide an effective joint model to bridge the gap and organize the relationships between modalities. In this work, we develop a deep image annotation and classification by fusing multi-modal semantic topics (DAC_mmst) model, which can find visual and non-visual topics by jointly modeling the image and loosely related text for deep image annotation while simultaneously learning and predicting the class label. More specifically, DAC_mmst depends on a non-parametric Bayesian model for estimating the best number of visual topics that can perfectly explain the image. To evaluate the effectiveness of our proposed algorithm, we collect a real-world dataset and conduct various experiments. The experimental results show that DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared to several state-of-the-art methods.

Molecular Phylogeny of the Subfamily Tephritinae (Diptera: Tephritidae) Based on Mitochondrial 16S rDNA Sequences

  • Han, Ho-Yeon;Ro, Kyung-Eui;McPheron, Bruce A.
    • Molecules and Cells / v.22 no.1 / pp.78-88 / 2006
  • The phylogeny of the subfamily Tephritinae (Diptera: Tephritidae) was reconstructed from mitochondrial 16S ribosomal RNA gene sequences using 53 species representing 11 currently recognized tribes of the Tephritinae and 10 outgroup species. The minimum evolution and Bayesian trees suggested the following phylogenetic relationships: (1) monophyly of the Tephritinae was strongly supported; (2) a sister group relationship between the Tephritinae and Plioreocepta was supported by the Bayesian tree; (3) the tribes Tephrellini, Myopitini, and Terelliini (excluding Neaspilota) were supported as monophyletic groups; (4) the non-monophyletic nature of the tribes Dithrycini, Eutretini, Noeetini, Tephritini, Cecidocharini, and Xyphosiini; and (5) recognition of 10 putative tribal groups, most of which were supported strongly by the statistical tests of the interior branches. Our results, therefore, convincingly suggest that an extensive rearrangement of the tribal classification of the Tephritinae is necessary. Since our sampling of taxa relied heavily on the currently accepted classification, some lineages identified by the present study were severely under-sampled and other possible major lineages of the Tephritinae were probably not even represented in our dataset. We believe that our results provide baseline information for a more rigorous sampling of additional taxa representing all possible major lineages of the subfamily, which is essential for a comprehensive revision of the tephritine tribal classification.

New Inference for a Multiclass Gaussian Process Classification Model using a Variational Bayesian EM Algorithm and Laplace Approximation

  • Cho, Wanhyun;Kim, Sangkyoon;Park, Soonyoung
    • IEIE Transactions on Smart Processing and Computing / v.4 no.4 / pp.202-208 / 2015
  • In this study, we propose a new inference algorithm for a multiclass Gaussian process classification model using a variational EM framework and the Laplace approximation (LA) technique. This is performed in two steps, called expectation and maximization. First, in the expectation step (E-step), using Bayes' theorem and the LA technique, we derive the approximate posterior distribution of the latent function, which indicates the possibility that each observation belongs to a certain class in the Gaussian process classification model. In the maximization step (M-step), we compute the maximum likelihood estimators for the hyper-parameters of the covariance matrix that defines the prior distribution of the latent function, using the posterior distribution derived in the E-step. These steps are repeated iteratively until a convergence condition is satisfied. We also conducted experiments using synthetic data and the Iris data to verify the performance of the proposed algorithm. The experimental results show that the proposed algorithm performs well on these datasets.
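For readers unfamiliar with the Laplace approximation used in the E-step above, its standard textbook form for Gaussian process classification is sketched below. The notation is an assumption of this sketch and does not come from the abstract: $\mathbf{f}$ is the vector of latent function values, $\mathbf{y}$ the labels, $X$ the inputs, $\theta$ the hyper-parameters, and $K_\theta$ the prior covariance matrix, with prior $\mathbf{f} \mid X, \theta \sim \mathcal{N}(\mathbf{0}, K_\theta)$. The paper's own derivation may differ in detail.

```latex
\begin{align}
\hat{\mathbf{f}} &= \arg\max_{\mathbf{f}} \left[ \log p(\mathbf{y} \mid \mathbf{f}) + \log p(\mathbf{f} \mid X, \theta) \right], \\
W &= -\nabla\nabla \log p(\mathbf{y} \mid \mathbf{f}) \,\big|_{\mathbf{f} = \hat{\mathbf{f}}}, \\
p(\mathbf{f} \mid X, \mathbf{y}, \theta) &\approx q(\mathbf{f}) = \mathcal{N}\!\left(\mathbf{f} \,\middle|\, \hat{\mathbf{f}},\; \left(K_\theta^{-1} + W\right)^{-1}\right).
\end{align}
```

In words, the posterior is replaced by a Gaussian centered at its mode $\hat{\mathbf{f}}$, with covariance given by the inverse of the negative Hessian of the log posterior at that mode. The M-step would then update $\theta$ using this Gaussian in place of the exact posterior.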

Classification of Very High Concerns HRCT Images using Extended Bayesian Networks (확장 베이지안망을 적용한 고위험성 HRCT 영상 분류)

  • Lim, Chae-Gyun;Jung, Yong-Gyu
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.2 / pp.7-12 / 2012
  • Recently, applications of various data mining techniques, including decision trees, neural networks, and Bayesian networks, have been investigated in the medical field to process vast amounts of information efficiently. In addition, it has become common to collect additional information such as MRI and HRCT images, alongside basic personal information, patient history, and family history, and to use it in diagnosing disease in order to improve diagnostic accuracy. In real-world situations, however, many variables affect the results, so the information that can be obtained through any particular data mining technique is fairly limited. Medical images likewise affect diagnosis, but the growing share of subjective judgment they involve makes them difficult for automated systems to handle. To deal with this complexity, extended models have been proposed that improve on the Bayesian network, a multivariate probabilistic model relatively well suited to such situations, through search algorithms such as K2 and TAN. Because the performance of an extended Bayesian network depends significantly on the type of search algorithm applied, the performance and suitability of each technique need to be evaluated. In this paper, we apply extended Bayesian networks to disease diagnosis on the same data and measure how classification accuracy changes with search algorithms such as K2 and TAN. A 10-fold cross-validation experiment was performed to compare performance, and the analysis indicates that high-risk data can be identified by classifying the HRCT images of patients at high risk of onset.
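The 10-fold cross-validation protocol mentioned in the abstract above can be sketched generically. This is an illustration only: the function names, the fixed shuffle seed, and the `fit` and `predict` callables are hypothetical placeholders for whichever classifier is being evaluated, and nothing here reflects the paper's data or tooling.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds over n samples.

    Assumes n >= k so that no fold is empty.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

def cross_validate(samples, labels, fit, predict, k=10):
    """Return the mean accuracy over k folds.

    fit(train_samples, train_labels) returns a model and
    predict(model, sample) returns a label; both are supplied by the caller.
    """
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

# Toy usage with a majority-class "classifier" standing in for a real model.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

print(cross_validate(list(range(20)), ["a"] * 20, fit_majority, predict_majority))
```

Each sample is used for testing exactly once, so comparing two search algorithms on the same folds gives accuracy estimates that are directly comparable.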

Pattern Classification Using Hybrid Monte Carlo Neural Networks (변종 몬테 칼로 신경망을 이용한 패턴 분류)

  • Jeon, Seong-Hae;Choe, Seong-Yong;O, Im-Geol;Lee, Sang-Ho;Jeon, Hong-Seok
    • The KIPS Transactions:PartB / v.8B no.3 / pp.231-236 / 2001
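  • The error backpropagation method used as the weight update algorithm in typical multilayer neural networks determines the result of each weight update as a single fixed value. Because this fixes the many possible updates to just one value, it cannot accommodate the full range of possibilities. This problem is resolved by introducing an update algorithm that expresses all possibilities as a probability distribution. Bayesian neural network models that use such an algorithm compute, for a given input, the output obtained by passing through each layer of a black-box-like network structure. The predicted value for the given input data can then be computed as the expected value (mean) of the posterior distribution. Because the posterior probability function computed from the given prior distribution and the likelihood function of the training data has a very complex structure, the integral for the expected value is difficult to compute. Therefore, instead of numerical methods, Monte Carlo simulation, an approximation method based on stochastic estimation, can be used. As one such method, the Hybrid Monte Carlo algorithm gives good results (Neal 1996). In this paper, we show that a neural network applying the Hybrid Monte Carlo algorithm gives better results than several existing classification algorithms such as CHAID, CART, and QUEST.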

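The Hybrid Monte Carlo algorithm named in the abstract above (also known as Hamiltonian Monte Carlo) can be illustrated on a toy problem. The sketch below samples a one-dimensional standard normal distribution rather than a neural network posterior, so it shows only the sampling mechanism: leapfrog integration followed by a Metropolis accept/reject step. The function name `hmc_sample` and the step size, trajectory length, and seed are assumptions chosen for illustration, not values from the paper.

```python
import math
import random

def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a 1-D density given its log-density and gradient."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample an auxiliary momentum each iteration
        x_new, p_new = x, p
        # Leapfrog integration of the Hamiltonian dynamics.
        p_new += 0.5 * step * grad_log_prob(x_new)   # initial half step for momentum
        for i in range(n_leapfrog):
            x_new += step * p_new                    # full step for position
            if i < n_leapfrog - 1:
                p_new += step * grad_log_prob(x_new)  # full step for momentum
        p_new += 0.5 * step * grad_log_prob(x_new)   # final half step for momentum
        # Metropolis correction with H = -log_prob(x) + p^2 / 2.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples

# Toy target: standard normal, log p(x) = -x^2 / 2 up to a constant.
draws = hmc_sample(lambda x: -0.5 * x * x, lambda x: -x, x0=0.0, n_samples=3000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)
```

In a Bayesian neural network, `x` would be the full weight vector and `log_prob` the log posterior over weights; the prediction for an input would then be approximated by averaging the network output over the sampled weights, which is the Monte Carlo estimate of the posterior mean the abstract refers to.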