• Title/Summary/Keyword: rule-based classifier

The Design of Pattern Classification based on Fuzzy Combined Polynomial Neural Network (퍼지 결합 다항식 뉴럴 네트워크 기반 패턴 분류기 설계)

  • Rho, Seok-Beom;Jang, Kyung-Won;Ahn, Tae-Chon
    • The Transactions of The Korean Institute of Electrical Engineers, v.63 no.4, pp.534-540, 2014
  • In this paper, we propose a fuzzy combined Polynomial Neural Network (PNN) for pattern classification. The fuzzy combined PNN is derived from the generic TSK fuzzy model, which has several linear polynomials as its consequent part, and is an expanded version of that fuzzy model. The proposed pattern classifier uses polynomial neural networks as the consequent part instead of the general linear polynomial. PNNs are implemented by dynamically stacking simple polynomials. Various types of simple polynomials are used to implement each layer, which gives PNNs flexibility and versatility. Although the implemented PNNs are structurally complex, each PNN ultimately reduces to a high-order, multi-input polynomial. The coefficients of a polynomial neuron are estimated by weighted linear discriminant analysis. The output of the fuzzy rule system with PNNs as the consequent part is a linear combination of the outputs of the individual PNNs. To evaluate the classification ability of the proposed pattern classifier, we conduct experiments on several machine learning data sets.
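The combination step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian membership function, the normalized weighting, and the two hand-written polynomials standing in for trained PNNs are all assumptions made for illustration, as are the names `gaussian_membership` and `fuzzy_combined_output`.

```python
import math

def gaussian_membership(x, center, width):
    """Firing strength of one fuzzy rule for input x (Gaussian shape is an assumption)."""
    dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist2 / (2.0 * width ** 2))

def fuzzy_combined_output(x, rules):
    """Combine the rule consequents, weighted by normalized firing strengths."""
    strengths = [gaussian_membership(x, r["center"], r["width"]) for r in rules]
    total = sum(strengths)
    return sum(w * r["consequent"](x) for w, r in zip(strengths, rules)) / total

# Hypothetical rule base: each consequent is a fixed polynomial standing in for a PNN.
rules = [
    {"center": (0.0, 0.0), "width": 1.0,
     "consequent": lambda x: 1.0 + x[0] + x[1]},              # linear polynomial
    {"center": (2.0, 2.0), "width": 1.0,
     "consequent": lambda x: x[0] * x[1] - 0.5 * x[0] ** 2},  # second-order polynomial
]

# The point (1, 1) is equidistant from both rule centers, so both rules fire equally.
print(fuzzy_combined_output((1.0, 1.0), rules))
```

In a classifier, the sign or magnitude of this combined output would typically be mapped to a class label; that mapping is not specified in the abstract and is left out here.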

Intelligent and Robust Face Detection

  • Park, Min-sick;Park, Chang-woo;Kim, Won-ha;Park, Mignon
    • Journal of the Korean Institute of Intelligent Systems, v.11 no.7, pp.641-648, 2001
  • Face detection in color images is important for many multimedia applications. It is the first step in face recognition and can be used for classifying specific shots. This paper describes a new method for detecting faces in color images based on skin color and hair color. It presents a fuzzy-based method for classifying skin color regions against a complex background under varying illumination. The fuzzy rule bases of the fuzzy system are generated using a training method such as a genetic algorithm (GA). We find the skin color region and the hair color region using the fuzzy system, apply the convex hull to each region, and locate the face from their intersection relationship. To validate the effectiveness of the proposed method, we conduct experiments on various cases.

Fuzzy Training Based on Segmentation Using Spatial Region Growing

  • Lee Sang-Hoon
    • Korean Journal of Remote Sensing, v.20 no.5, pp.353-359, 2004
  • This study proposes an unsupervised approach for estimating the number of classes and the parameters that define them in order to train the classifier. In the proposed method, the image is segmented using spatial region growing based on hierarchical clustering, and fuzzy training is then employed to find the sample classes that best represent the ground truth. For cluster validation, the approach iteratively estimates the class parameters in the fuzzy training for the sample classes and continuously computes the log-likelihood ratio of two consecutive class numbers. The maximum ratio rule is applied to determine the optimal number of classes. The experimental results show that the proposed scheme can be used to select regions with different characteristics in the scene of the observed image, as an alternative to costly field surveys.

A Study on the Rule-Based Selection of Training Set for the Classification of Satellite Imagery (위성 영상 분류를 위한 규칙 기반 훈련 집합 선택에 관한 연구)

  • Um, Gi-Mun;Lee, Kwae-Hi
    • The Transactions of the Korea Information Processing Society, v.3 no.7, pp.1763-1772, 1996
  • Conventional training set selection methods for satellite image classification usually depend on manual selection using data from direct ground measurements or ground maps. However, this task is time-consuming and costly, and some feature values vary over wide ranges even within the same class. Such feature values can increase the robustness of the neural net, but the learning time becomes longer. In this paper, we propose a new training set selection algorithm using a rule-based method. With the proposed technique, SPOT multispectral imagery is classified in 3 bands, and the pixels that satisfy the rule are used as the training sets for the neural net classifier. The experimental results show faster initial convergence and almost the same or better classification accuracy. We also show an improvement in classification accuracy from using texture features and NDVI.
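As a rough illustration of rule-based training set selection, the sketch below keeps only pixels that satisfy exactly one class rule. The abstract does not state the actual rules, so the NDVI thresholds, the class names, the (green, red, near-infrared) band order, and the pixel values are all hypothetical. The NDVI formula itself is the standard one, (NIR - Red) / (NIR + Red).

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom

def select_training_pixels(pixels, rules):
    """Return {class_name: [pixel indices]} for pixels that satisfy exactly one rule."""
    selected = {name: [] for name in rules}
    for idx, px in enumerate(pixels):
        matches = [name for name, rule in rules.items() if rule(px)]
        if len(matches) == 1:  # ambiguous or unmatched pixels are left out
            selected[matches[0]].append(idx)
    return selected

# Hypothetical 3-band pixels as (green, red, nir) digital numbers.
pixels = [(40, 30, 120), (60, 70, 40), (50, 55, 60), (35, 25, 110)]

# Hypothetical rules; the thresholds are illustrative, not taken from the paper.
rules = {
    "vegetation": lambda p: ndvi(p[2], p[1]) > 0.5,
    "water": lambda p: ndvi(p[2], p[1]) < -0.2,
}

print(select_training_pixels(pixels, rules))
```

The selected indices would then be passed, together with their class names, to whatever classifier is being trained.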

Hierarchical Automatic Classification of News Articles based on Association Rules (연관규칙을 이용한 뉴스기사의 계층적 자동분류기법)

  • Joo, Kil-Hong;Shin, Eun-Young;Lee, Joo-Il;Lee, Won-Suk
    • Journal of Korea Multimedia Society, v.14 no.6, pp.730-741, 2011
  • With the development of the internet and computer technology, the amount of information available through the internet is increasing rapidly, and it is managed in document form. For this reason, research into methods for managing large numbers of documents effectively is necessary. Conventional document categorization methods used only the keywords of related documents for document classification. In contrast, this paper proposes a keyword extraction method based on association rules. This method extracts a set of related keywords associated with a document's category and classifies representative keywords using the classification rule proposed in this paper. In addition, this paper proposes a preprocessing method for efficient keyword creation and predicts the category of a new document. We design the classifier and measure its performance through experiments in order to improve the classification performance of the profile. When predicting a category, applying all the classification rules one by one is the main cause of reduced processing performance in a profile. Finally, this paper suggests an automatic categorization scheme that extends from a simple category architecture to a hierarchical category architecture.

A Two-Dimensional Binary Prefix Tree for Packet Classification (패킷 분류를 위한 이차원 이진 프리픽스 트리)

  • Jung, Yeo-Jin;Kim, Hye-Ran;Lim, Hye-Sook
    • Journal of KIISE:Information Networking, v.32 no.4, pp.543-550, 2005
  • Demand for better services in the Internet has been increasing due to the rapid growth of the Internet, and hence next-generation routers are required to perform intelligent packet classification. For a given classifier defining packet attributes or contents, packet classification is the process of identifying the highest-priority rule to which a packet conforms. A notable characteristic of real classifiers is that a packet matches only a small number of distinct source-destination prefix pairs. Therefore, many schemes have been proposed to filter rules based on source and destination prefix pairs. However, most of these schemes are based on sequential one-dimensional searches using tries, which require huge amounts of memory. In this paper, we propose a memory-efficient two-dimensional search scheme using source and destination prefix pairs. By constructing a binary prefix tree, the source prefix search and the destination prefix search are performed simultaneously in a single binary tree. Moreover, the proposed two-dimensional binary prefix tree does not include any empty internal nodes, and hence the memory waste of previous trie-based structures is completely eliminated.
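The definition of packet classification given in this abstract (find the highest-priority rule whose source and destination prefixes both match) can be made concrete with a naive linear scan. This is only a reference baseline for the problem statement, not the two-dimensional binary prefix tree the paper proposes. The `Rule` type, the bit-string address representation, and the convention that a smaller number means higher priority are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    src_prefix: str  # bit string; "" matches every address
    dst_prefix: str
    priority: int    # smaller number = higher priority (assumed convention)
    action: str

def classify(src_bits, dst_bits, rules):
    """Return the highest-priority rule whose two prefixes both match, else None."""
    matching = [r for r in rules
                if src_bits.startswith(r.src_prefix)
                and dst_bits.startswith(r.dst_prefix)]
    return min(matching, key=lambda r: r.priority, default=None)

# Hypothetical rule set over 8-bit addresses.
rules = [
    Rule("", "", 9, "default"),
    Rule("10", "0", 2, "forward"),
    Rule("101", "01", 1, "drop"),
]

print(classify("10110000", "01000000", rules).action)
print(classify("10010000", "00000000", rules).action)
```

The scan costs time proportional to the number of rules per packet, which is the kind of cost that tree-based schemes such as the one in this paper aim to avoid.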

Utilizing Various Natural Language Processing Techniques for Biomedical Interaction Extraction

  • Park, Kyung-Mi;Cho, Han-Cheol;Rim, Hae-Chang
    • Journal of Information Processing Systems, v.7 no.3, pp.459-472, 2011
  • The vast body of biomedical literature is an important source for discovering biomedical interaction information. However, obtaining interaction information from it is complicated because most of it is not easily machine-readable. In this paper, we present a method for extracting biomedical interaction information, assuming that the biomedical Named Entities (NEs) are already identified. The proposed method labels all possible pairs of the given biomedical NEs as INTERACTION or NO-INTERACTION using a Maximum Entropy (ME) classifier. The features used for the classifier are obtained by applying various NLP techniques such as POS tagging, base phrase recognition, parsing, and predicate-argument recognition. In particular, specific verb predicates (activate, inhibit, diminish, etc.) and their biomedical NE arguments are very useful features for identifying interacting NE pairs. Based on this, we devised a two-step method: 1) an interaction verb extraction step to find biomedically salient verbs, and 2) an argument relation identification step to generate partial predicate-argument structures between the extracted interaction verbs and their NE arguments. In the experiments, we analyzed how much each applied NLP technique improves performance. The proposed method improves performance by more than 2% compared to the baseline method. The use of external contextual features, which are obtained from outside the NEs, is crucial for the performance improvement. We also compare the performance of the proposed method against co-occurrence-based and rule-based methods. The results demonstrate that the proposed method considerably improves performance.
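The candidate generation step described here (every pair of already-identified NEs becomes one instance to label) can be sketched as follows. The sketch deliberately replaces the Maximum Entropy classifier with a single hand-written feature, whether an interaction verb occurs between the two entities, purely to show the shape of the data. The verb list, the entity names, and the helper names `candidate_pairs` and `verb_between` are hypothetical.

```python
from itertools import combinations

# Hypothetical list of interaction verbs, inflected forms included for simplicity.
INTERACTION_VERBS = {"activate", "activates", "inhibit", "inhibits",
                     "diminish", "diminishes"}

def candidate_pairs(entities):
    """All unordered pairs of named entities in one sentence."""
    return list(combinations(entities, 2))

def verb_between(tokens, e1, e2):
    """Feature: does an interaction verb occur between the two entities?"""
    i, j = sorted((tokens.index(e1), tokens.index(e2)))
    return any(t.lower() in INTERACTION_VERBS for t in tokens[i + 1:j])

tokens = ["ProtA", "inhibits", "ProtB", "and", "ProtC"]
entities = ["ProtA", "ProtB", "ProtC"]

for e1, e2 in candidate_pairs(entities):
    print(e1, e2, verb_between(tokens, e1, e2))
```

In the method summarized above, features like this one would be inputs to a trained classifier rather than a decision rule on their own.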

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems, v.24 no.3, pp.21-44, 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data. These text data are produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has drawn the interest of many researchers seeking to manage this huge amount of information. It also calls for professionals capable of classifying relevant information, and hence text classification was introduced. Text classification is a challenging task in modern data analysis in which a text document must be assigned to one or more predefined categories or classes. Various techniques are available in the text classification field, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary with the type of words used in the corpus and the type of features created for classification. Most previous attempts have proposed a new algorithm or modified an existing one, and this line of research can be said to have reached its limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data on which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from it.
In this study, we consider that data from different domains, that is, heterogeneous data, may have noise-like characteristics that can be utilized in the classification process. Machine learning algorithms build a classifier on the assumption that the characteristics of the training data and the target data are the same or very similar. However, for unstructured data such as text, the features are determined by the vocabulary contained in the documents. If the viewpoints of the training data and the target data differ, the features of the two may also differ. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely to be formatted differently. This causes difficulties for traditional machine learning algorithms, which were not developed to recognize different types of data representation at once and combine them in the same generalization. Therefore, to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. However, unlabeled data may degrade the performance of the document classifier. We therefore propose a method called Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA), which selects only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision.
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.

A Text Detection Method Using Wavelet Packet Analysis and Unsupervised Classifier

  • Lee, Geum-Boon;Odoyo Wilfred O.;Kim, Kuk-Se;Cho, Beom-Joon
    • Journal of information and communication convergence engineering, v.4 no.4, pp.174-179, 2006
  • In this paper, we present a text detection method based on wavelet packet analysis and an improved fuzzy clustering algorithm (IAFC). The approach treats text and non-text regions as two different texture regions. Text detection is achieved by using wavelet packet analysis for feature analysis. Wavelet packet analysis is a method of wavelet decomposition that offers a richer range of possibilities for document images. To these multi-scale features, we apply the improved fuzzy clustering algorithm, which is based on an unsupervised learning rule. The results show that our text detection method is effective for document images scanned from newspapers and journals.

A study of broad board classification of Korean digits using symbol processing (심볼을 이용한 한국어 숫자음의 광역 음소군 분류에 관한 연구)

  • Lee, Bong-Gu;Lee, Guk;Hhwang, Hee-Yoong
    • Proceedings of the KIEE Conference, 1989.07a, pp.481-485, 1989
  • The objective of this paper is the design of a broad phoneme-group classifier for connected Korean digits. Many approaches have been applied in speech recognition systems: parametric vector quantization, dynamic programming, and hidden Markov models. In the 1980s, the neural network method, which is expected to solve complex speech recognition problems, came back. We have chosen a rule-based system for our model. The phoneme groups that we wish to classify are vowel_like, plosive_like, fricative_like, and stop_like. The data used are 1380 connected digits spoken by three untrained male speakers. We obtained a classification rate of 91.5%.
