• Title/Summary/Keyword: rule-discovery

The Method of Rule Discovery for Time Series Data (시 계열 데이터에서의 연관성 발견을 위한 기법)

  • 이준호;차재혁
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.607-609 / 2004
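  • This paper describes a technique for discovering associations in time series data while effectively reducing complexity and the amount of computation. Existing sequence segmentation methods for time series data rely on complex clustering techniques and therefore require considerable time and resources. To address this limitation, the paper proposes a method that uses an increase/decrease table for effective sequence segmentation.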
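
The abstract does not spell out how the increase/decrease table is built, so the following is only a minimal sketch of one plausible reading: each consecutive step of the series is encoded as an increase, a decrease, or no change, and the sequence is cut wherever that direction changes. The function names `direction_table` and `split_by_direction` and the tolerance parameter `tol` are assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple


def direction_table(series: List[float], tol: float = 0.0) -> List[int]:
    """Encode each consecutive step as +1 (increase), -1 (decrease), or 0 (flat).

    `tol` is a hypothetical tolerance: changes no larger than `tol` count as flat.
    """
    table = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        table.append(1 if diff > tol else -1 if diff < -tol else 0)
    return table


def split_by_direction(series: List[float], tol: float = 0.0) -> List[Tuple[int, int, int]]:
    """Cut the series into runs that share one direction.

    Returns (start_index, end_index, direction) triples. Both indexes refer to
    positions in `series` and are inclusive, so neighbouring runs share a point.
    """
    table = direction_table(series, tol)
    segments = []
    start = 0
    for i in range(1, len(table) + 1):
        # Close the current run at the end of the table or when direction changes.
        if i == len(table) or table[i] != table[start]:
            segments.append((start, i, table[start]))
            start = i
    return segments


if __name__ == "__main__":
    for seg in split_by_direction([1, 2, 3, 3, 2, 1, 4]):
        print(seg)
```

A single pass over the series is enough here, which is the kind of saving over clustering-based segmentation that the abstract points to; whether the paper segments exactly this way is not stated.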

Association Rule Search Considering Web Page Visit Time (웹 페이지 방문 시간을 고려한 연관 규칙 탐색)

  • Gang, Hyeong-Chang;Kim, Ik-Chan;Kim, Cheol-Su
    • Proceedings of the Korean Statistical Society Conference / 2005.05a / pp.263-269 / 2005
  • Users of a Web site want to obtain information conveniently. To provide differentiated services to those users, Web site operators must analyze each user's usage patterns. Association rules are one of the data mining techniques used for pattern discovery. Finding per-user patterns makes it possible to offer services differentiated by user. The results of an association rule search reveal per-user patterns, and taking web page visiting time into account in that search enables a differentiated web structure service and a recommendation service.

An Empirical Study of Qualities of Association Rules from a Statistical View Point

  • Dorn, Maryann;Hou, Wen-Chi;Che, Dunren;Jiang, Zhewei
    • Journal of Information Processing Systems / v.4 no.1 / pp.27-32 / 2008
  • Minimum support and confidence have been used as criteria for generating association rules in all association rule mining algorithms. These criteria have natural appeal, such as simplicity, and few researchers have questioned the quality of the generated rules. In this paper, we examine the rules from a more rigorous point of view by conducting statistical tests. Specifically, we use contingency tables and the chi-square test to analyze the data. Experimental results show that one third of the association rules derived from the support and confidence criteria are not significant; that is, the antecedent and consequent of those rules are not correlated. This indicates that minimum support and minimum confidence alone do not provide adequate discovery of meaningful associations. The chi-square test can be considered an enhancement or an alternative solution.
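
To make the contrast concrete, here is a small sketch (not the paper's code) that computes support, confidence, and the Pearson chi-square statistic for a single rule X → Y from its 2×2 contingency table. The function name `rule_stats` and the toy transactions are assumptions for illustration; 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
from typing import Dict, FrozenSet, Iterable, Set, Union

ItemSet = Union[Set[str], FrozenSet[str]]

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_05 = 3.841


def rule_stats(transactions: Iterable[ItemSet],
               antecedent: ItemSet,
               consequent: ItemSet) -> Dict[str, float]:
    """Support, confidence, and chi-square statistic for the rule X -> Y."""
    n = a = b = c = d = 0
    for t in transactions:
        has_x = antecedent <= t   # transaction contains all of X
        has_y = consequent <= t   # transaction contains all of Y
        n += 1
        if has_x and has_y:
            a += 1                # X and Y
        elif has_x:
            b += 1                # X only
        elif has_y:
            c += 1                # Y only
        else:
            d += 1                # neither
    if n == 0:
        raise ValueError("no transactions given")
    support = a / n
    confidence = a / (a + b) if (a + b) else 0.0
    # Pearson chi-square for a 2x2 table; defined as 0 if any margin is empty.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return {"support": support, "confidence": confidence, "chi2": chi2}


if __name__ == "__main__":
    # Toy data: X and Y co-occur exactly as often as chance predicts.
    independent = [{"x", "y"}, {"x"}, {"y"}, set()]
    stats = rule_stats(independent, {"x"}, {"y"})
    print(stats, "significant:", stats["chi2"] > CHI2_CRITICAL_05)
```

In the toy data the rule reaches 25% support and 50% confidence, yet its chi-square statistic is 0, so a support/confidence filter with thresholds at or below those values would keep a rule whose two sides are statistically independent.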

Association Rule Discovery Considering Strategic Importance: WARM (전략적 중요도를 고려한 연관규칙의 발견: WARM)

  • Choi, Doug-Won
    • The KIPS Transactions:PartD / v.17D no.4 / pp.311-316 / 2010
  • This paper presents a weight adjusted association rule mining algorithm (WARM). The key ideas of the algorithm are assigning a weight to each strategic factor and normalizing raw scores within each strategic factor. It extends the earlier algorithm TSAA (transitive support association Apriori), and it reflects strategic importance by considering factors such as the profit, marketing value, and customer satisfaction of each item. A performance analysis on a real-world database was carried out, and the mining outcomes of three association rule mining algorithms (Apriori, TSAA, and WARM) are compared. The results indicate that each algorithm exhibits distinct, characteristic behavior in association rule mining.
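
The two ideas named in the abstract, per-factor weights and per-factor normalization of raw scores, can be sketched as follows. This is an illustrative assumption, not the WARM algorithm itself: min-max normalization, the weighted-sum combination, the function names, and the sample scores are all hypothetical choices, since the abstract does not give the formulas.

```python
from typing import Dict


def normalize_scores(raw: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalize one factor's raw item scores into [0, 1] (assumed scheme)."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        # All items score the same on this factor; treat them as equally important.
        return {item: 1.0 for item in raw}
    return {item: (score - lo) / (hi - lo) for item, score in raw.items()}


def item_importance(factor_scores: Dict[str, Dict[str, float]],
                    factor_weights: Dict[str, float]) -> Dict[str, float]:
    """Combine normalized factor scores into one importance value per item.

    Factor weights are rescaled to sum to 1, so the result also lies in [0, 1].
    """
    total = sum(factor_weights.values())
    importance: Dict[str, float] = {}
    for factor, raw in factor_scores.items():
        weight = factor_weights[factor] / total
        for item, score in normalize_scores(raw).items():
            importance[item] = importance.get(item, 0.0) + weight * score
    return importance


if __name__ == "__main__":
    # Hypothetical raw scores on two of the factors the abstract mentions.
    scores = {
        "profit": {"A": 10, "B": 20, "C": 30},
        "customer_satisfaction": {"A": 5, "B": 3, "C": 1},
    }
    weights = {"profit": 3, "customer_satisfaction": 1}
    print(item_importance(scores, weights))
```

Normalizing within each factor keeps a factor measured on a large scale (such as profit in currency units) from dominating one measured on a small scale (such as a five-point satisfaction rating); the weights then express strategic priority explicitly.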

Prediction of Implicit Protein - Protein Interaction Using Optimal Associative Feature Rule (최적 연관 속성 규칙을 이용한 비명시적 단백질 상호작용의 예측)

  • Eom, Jae-Hong;Zhang, Byoung-Tak
    • Journal of KIISE:Software and Applications / v.33 no.4 / pp.365-377 / 2006
  • Proteins are known to perform biological functions by interacting with other proteins or compounds. Since protein interaction is intrinsic to most cellular processes, predicting protein interaction is an important issue in post-genomic biology, where abundant interaction data have been produced by many research groups. In this paper, we present an associative feature mining method to predict implicit protein-protein interactions of Saccharomyces cerevisiae from public protein interaction data. We discretized continuous-valued features using a maximal interdependence-based discretization approach. We also employed a feature dimension reduction filter (FDRF) method, based on information theory, to select optimal informative features, to boost prediction accuracy and overall mining speed, and to overcome the dimensionality problem of conventional data mining approaches. We used an association rule discovery algorithm for associative feature and rule mining to predict protein interactions. Using the discovered associative features, we predicted implicit protein interactions that had not been observed in the training data. According to the experimental results, the proposed method achieved about 96.5% prediction accuracy with reduced computation time, running about 29.4% faster than the conventional method with no feature filter in association rule mining.

Learning of Adaptive Behavior of artificial Ant Using Classifier System (분류자 시스템을 이용한 인공개미의 적응행동의 학습)

  • 정치선;심귀보
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.10a / pp.361-367 / 1998
  • The two main applications of Genetic Algorithms (GA) are optimization and machine learning. Machine learning has two objectives: making a complex system learn its environment, and producing the proper output of the system. Machine learning using genetic algorithms is called GA machine learning or genetic-based machine learning (GBML). Machine learning differs from optimization in that it must find a rule set. In optimization problems, the GA population should converge to the best individual, because the objective is to produce an individual near the optimal solution. In contrast, machine learning systems need to find a set of cooperative rules. There are two methods in GBML, the Michigan method and the Pittsburgh method. In the former, each rule is expressed as a string; in the latter, the set of rules is coded into a string. Holland's classifier system is the representative model of the Michigan method. A classifier system adjusts the strengths of the classifiers in its classifier list using the message list. With this method, real-time processing and on-line learning are possible because the rule set is adjusted on-line. A classifier system has three major components: a performance system, an apportionment of credit system, and a rule discovery system. In this paper, we solve the food search problem through the learning and evolution of an artificial ant using the learning classifier system.

DISCOVERY TEMPORAL FREQUENT PATTERNS USING TFP-TREE

  • Jin Long;Lee Yongmi;Seo Sungbo;Ryu Keun Ho
    • Proceedings of the KSRS Conference / 2005.10a / pp.454-457 / 2005
  • Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are prolific patterns and/or long patterns. Calendar-based temporal association rule mining discovers association rules along with their temporal patterns in terms of calendar schemas, but this approach also adopts Apriori-like candidate set generation. In this paper, we propose an efficient temporal frequent pattern mining method using the TFP-tree (Temporal Frequent Pattern tree). This approach has three advantages: (1) it separates the data into many partitions according to the maximum size domain and scans the transactions only once, reducing I/O cost; (2) it maintains all transactions using FP-trees; (3) it keeps FP-trees only for the I-star patterns, and the other star pattern nodes simply link to them step by step, for efficient mining and memory savings. Our performance study shows that the TFP-tree is efficient and scalable for mining, about an order of magnitude faster than the Apriori algorithm, and also faster than calendar-based temporal frequent pattern mining methods.

Active Learning Environment for the Heritage of Korean Modern Architecture: a Blended-Space Approach

  • Jang, Sun-Young;Kim, Sung-Ah
    • International Journal of Contents / v.12 no.4 / pp.8-16 / 2016
  • This research proposes the composition logic of an Active Learning Environment (ALE) that enables discovery learning through experience while increasing knowledge about modern architectural heritage. Linking information to historical heritage using Information and Communication Technology (ICT) helps overcome the limits of previous learning methods by providing rich learning resources on site. Existing field trips to cultural heritage sites deliver limited experience content from web resources, or deliver content at a specific place through a humanities Geographic Information System (GIS). Therefore, on the basis of blended space theory, an augmented space experience method was devised to overcome these shortcomings. An ALE space framework is proposed to enable discovery through learning in an expanded space. Operating the ALE space requires full coordination, such as that provided by a Content Management System (CMS), and involves a relation network that supplies knowledge to the rule engine of the CMS. The application is illustrated with the example of Seokjojeon Hall at Deoksugung Palace, through a user experience scenario.

Association Rule Mining Algorithm and Analysis of Missing Values

  • Lee, Jae-Wan;Bobby D. Gerardo;Kim, Gui-Tae;Jeong, Jin-Seob
    • Journal of information and communication convergence engineering / v.1 no.3 / pp.150-156 / 2003
  • This paper explores an algorithm for data mining and a method for handling missing data, which generated enhanced association patterns on the data illustrated here. The evaluations showed that more association patterns are generated in the second analysis, which suggests more meaningful rules than in the first. The model offers more precise and important association rules, which are more valuable when applied to business decision making. With the discovery of accurate association rules or business patterns, strategies can be planned and implemented efficiently to improve marketing schemes. This investigation raises a number of interesting issues that could be explored further, such as the effect of outliers and missing data on detecting fraud and devious database entries.

Adaptation Methods for a Probabilistic Fuzzy Rule-based Learning System (확률적 퍼지 룰 기반 학습 시스템의 적응 방법)

  • Lee, Hyeong-Uk;Byeon, Jeung-Nam
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.223-226 / 2007
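  • From the viewpoint of knowledge discovery, when learning data patterns acquired over a short period, inconsistent patterns in the data can be handled effectively by using a probabilistic fuzzy rule-based knowledge representation and an appropriate learning algorithm. However, when dealing with data patterns obtained continuously over a long period, data with time-varying characteristics make it difficult to apply previously extracted knowledge to the changed data. A learning system that handles such data therefore requires continuous adaptivity that can keep up with changes in the patterns. Taking this adaptivity into account, this paper describes two types of adaptation methods that can be applied to a probabilistic fuzzy rule-based learning system from the viewpoint of life-long learning.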
