• Title/Summary/Keyword: 규칙정확도 (rule accuracy)

Search Result 289, Processing Time 0.025 seconds

Statistical Information-Based Hierarchical Fuzzy-Rough Classification Approach (통계적 정보기반 계층적 퍼지-러프 분류기법)

  • Son, Chang-S.;Seo, Suk-T.;Chung, Hwan-M.;Kwon, Soon-H.
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.6 / pp.792-798 / 2007
  • In this paper, we propose a hierarchical fuzzy-rough classification method based on statistical information that maximizes pattern classification performance and reduces the number of rules without learning approaches such as neural networks or genetic algorithms. In the proposed method, statistical information is used to extract the partition intervals of the antecedent fuzzy sets at each layer of the hierarchical fuzzy-rough classification system, and rough sets are used to minimize the number of fuzzy if-then rules associated with those partition intervals. To show the effectiveness of the proposed method, we compared its classification results (e.g., the classification accuracy and the number of rules) with those of conventional methods on Fisher's IRIS data. The experimental results confirm that the proposed method, which considers only statistical information of the given data, achieves classification performance similar to that of the conventional methods.

Learning Rules for Identifying Hypernyms in Machine Readable Dictionaries (기계가독형사전에서 상위어 판별을 위한 규칙 학습)

  • Choi Seon-Hwa;Park Hyuk-Ro
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.171-178 / 2006
  • Most approaches for extracting hypernyms of a noun from its definitions in an MRD rely on lexical patterns compiled by human experts. Not only do these approaches incur a high cost for compiling lexical patterns, but it is also very difficult for human experts to compile a set of lexical patterns with broad coverage, because natural languages offer various expressions that represent the same concept. To alleviate these problems, this paper proposes a new method for extracting hypernyms of a noun from its definitions in an MRD. In the proposed approach, we use only syntactic (part-of-speech) patterns instead of lexical patterns to identify hypernyms, which reduces the number of patterns while keeping their coverage broad. Our experiment shows that the classification accuracy of the proposed method is 92.37%, which is significantly better than that of previous approaches.

Deriving rules for identifying diabetic among individuals with metabolic syndrome (대사증후군 환자 가운데 당뇨환자를 찾기 위한 규칙 도출)

  • Choi, Jinwook;Suh, Yongmoo
    • Journal of Digital Convergence / v.16 no.11 / pp.363-372 / 2018
  • The objective of this study is to derive specific classification rules that could be used to prevent individuals with Metabolic Syndrome (MS) from developing diabetes. Specifically, we aim to identify rules that classify individuals with MS into those without diabetes (class 0) and those with diabetes (class 1). In this study, we collected data from the Korean National Health and Nutrition Examination Survey and built a decision tree after data pre-processing. The decision tree yields five useful rules, and their average classification accuracy is quite high (75.8%). In addition, the decision tree showed that high blood pressure and waist circumference are the most influential factors in classifying the two groups. Our research results will serve as good guidelines for clinicians to provide better treatment for patients with MS, so that they do not develop diabetes.

Rule-based Normalization of Relative Temporal Information

  • Jeong, Young-Seob;Lim, Chaegyun;Lee, SeungDong;Mswahili, Medard Edmund;Ndomba, Goodwill Erasmo;Choi, Ho-Jin
    • Journal of the Korea Society of Computer and Information / v.27 no.12 / pp.41-49 / 2022
  • Documents often contain relative time expressions, and it is important to define a schema for relative time information and to develop a system that extracts such information from a corpus. In this study, to deal with relative time expressions, we propose seven additional attributes of timex3: year, month, day, week, hour, minute, and second. We propose a way to represent normalized values of relative time expressions such as before, after, and count, and also design a set of rules to extract relative time information from texts. With a new corpus constructed using the new attributes, consisting of dialog, news, and history documents, we observed that our rule-set generally achieved 70% accuracy on the 1,041 documents. In particular, for the most frequently appearing attributes such as year, day, and week, we obtained higher accuracies than for other attributes. The results of this study, our proposed timex3 attributes and the rule-set, will be useful in the development of services such as question-answer systems and chatbots.

Efficient Image Retrieval using Minimal Spatial Relationships (최소 공간관계를 이용한 효율적인 이미지 검색)

  • Lee, Soo-Cheol;Hwang, Een-Jun;Byeon, Kwang-Jun
    • Journal of KIISE:Databases / v.32 no.4 / pp.383-393 / 2005
  • Retrieval of images from image databases by spatial relationship can be effectively performed through visual interface systems. In these systems, representing an image with 2D strings, which are derived from symbolic projections, provides an efficient and natural way to construct an image index and is also an ideal representation for the visual query. With this approach, retrieval is reduced to matching two symbolic strings. However, with 2D-string representations, spatial relationships between the objects in the image might not be exactly specified, and ambiguities arise when retrieving images of 3D scenes. To remove ambiguous descriptions of object spatial relationships, in this paper we refer to images by considering spatial relationships using the spatial location algebra for the 3D image scene. We also remove repetitive spatial relationships using several reduction rules. A reduction mechanism using these rules can be used in query processing systems that retrieve images by content. This could give better precision and flexibility in image retrieval.

Predicting Plant Biological Environment Using Intelligent IoT (지능형 사물인터넷을 이용한 식물 생장 환경 예측)

  • Ko, Sujeong
    • Journal of Digital Contents Society / v.19 no.7 / pp.1423-1431 / 2018
  • IoT (Internet of Things) is applied to fields such as agriculture and dairy farming, making it possible to cultivate crops easily in cities. In particular, IoT technology that intelligently judges and controls the growth environment of cultivated crops is being developed in the agricultural field. In this paper, we propose a method of predicting the growth environment of plants by learning the moisture supply cycle of plants using intelligent IoT. The proposed system determines the soil moisture level by mapping learning and finds rules that indicate when moisture supply is required, based on the measured moisture level. Based on these rules, we predicted the moisture supply cycle and output it using media, so that it is convenient for users. In addition, to reduce the error of the values measured by the sensors, the plants exchange their information with each other, so that a value is compensated when there is an error and the accuracy of the prediction improves. To evaluate the performance of the growth environment prediction system, experiments were conducted in summer and winter, and it was verified that the accuracy was high.

Classification of Ovarian Cancer Microarray Data based on Intelligent Systems with Marker gene (선별 시스템 기반 표지 유전자를 포함한 난소암 마이크로어레이 데이터 분류)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.3 / pp.747-752 / 2011
  • Microarray classification typically possesses two striking attributes: (1) classifier design and error estimation are based on remarkably small samples and (2) cross-validation error estimation is employed in the majority of the papers. Microarray data of ovarian cancer consist of the expressions of tens of thousands of genes, and there is no systematic procedure to analyze this information instantaneously. In this paper, marker genes are selected by ranking genes according to statistics, and popular classification rules (linear discriminant analysis, k-nearest-neighbor, and decision trees) are applied to compare the classification accuracy of data with and without marker gene selection. Applying linear classification analysis to the microarray data set including marker genes selected by the ANOVA method gave the highest classification accuracy, 97.78%, and the lowest prediction error estimate.

Implementation and Experimental Results of Neural Network and Genetic Algorithm based Spam Filtering Technique (신경망과 유전자 알고리즘을 이용한 스팸 메일 필터링 기법에 구현과 성능평가)

  • Kim Bum-Bae;Choi Hyoung-Kee
    • The KIPS Transactions:PartC / v.13C no.2 s.105 / pp.259-266 / 2006
  • As the volume of spam has increased to extreme levels, many anti-spam filtering techniques have been proposed. Among these, machine-learning filtering is one of the most popular. In this paper, we propose a machine-learning spam filtering technique based on the neural network, the genetic algorithm, and the $\chi^2$-statistic. The proposed filtering technique is designed to overcome the problems in existing filtering techniques and to achieve high spam filtering accuracy. It is able to classify spam and legitimate email with 95.25 percent and 95.31 percent accuracy, respectively. The spam filtering accuracy is 7.75 percent and 12.44 percent higher than that of rule-based filtering and Bayesian filtering, respectively.

Smarter Classification for Imbalanced Data Set and Its Application to Patent Evaluation (불균형 데이터 집합에 대한 스마트 분류방법과 특허 평가에의 응용)

  • Kwon, Ohbyung;Lee, Jonathan Sangyun
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.15-34 / 2014
  • Overall accuracy as a performance measure does not fully consider modular accuracy: the accuracy of classifying 1 (or true) as 1 is not the same as that of classifying 0 (or false) as 0. A smarter classification algorithm would optimize the classification rules to match the goals for the modular accuracies according to the nature of the problem. Correspondingly, smarter algorithms must be both more generalized with respect to the nature of problems and free from discretization, which may distort the real performance. Hence, in this paper, we propose a novel vertical boosting algorithm that improves modular accuracies. Rather than discretizing items, we use simple classifiers such as a regression model that accepts continuous data types. To improve generalization and to select a classification model that is well suited to the nature of the problem domain, we developed a model selection algorithm with smartness. To show the soundness of the proposed method, we performed an experiment with a real-world application: predicting the intellectual properties of e-transaction technology, which had a 47,000+ record data set.

Korean Probabilistic Syntactic Model using Head Co-occurrence (중심어 간의 공기정보를 이용한 한국어 확률 구문분석 모델)

  • Lee, Kong-Joo;Kim, Jae-Hoon
    • The KIPS Transactions:PartB / v.9B no.6 / pp.809-816 / 2002
  • Since natural language has inherent structural ambiguities, one of the difficulties of parsing is resolving them. Recently, the probabilistic approach to this disambiguation problem has received considerable attention because of attractions such as automatic learning, wide coverage, and robustness. In this paper, we focus on a Korean probabilistic parsing model using head co-occurrence. Because head co-occurrence is lexical, we are apt to meet the data sparseness problem when using it, so handling this problem is more important than other issues. To lighten the problem, we used a restricted and simplified phrase-structure grammar and a back-off model for smoothing. The proposed model achieved an accuracy of about 84%.