• Title/Summary/Keyword: association rule discovery (연관 규칙 탐사)

Search Result 132, Processing Time 0.031 seconds

Discovering Temporal Relation Considering the Weight of Events in Multidimensional Stream Data Environment (다차원 스트림 데이터 환경에서 이벤트 가중치를 고려한 시간 관계 탐사)

  • Kim, Jae-In;Kim, Dae-In;Song, Myung-Jin;Han, Dae-Young;Hwang, Bu-Hyun
    • The Journal of the Korea Contents Association / v.10 no.2 / pp.99-110 / 2010
  • An event is a flow with a time attribute, such as a patient's symptom. In a multiple stream data environment, stream data collected by sensors can be summarized as interval events, each spanning the time between a start point and an end point. Most temporal mining techniques consider only frequent events, so they may ignore an infrequent event even when it is important. In this paper, we propose a new temporal data mining method that finds association rules for significant temporal relations based on interval events in a multidimensional stream data environment. Our method considers the weight of events and the stream data at the time points where abnormal events are sensed. As a result, it can discover association rules on significant temporal relations regardless of how frequently the events occur. The experimental analysis shows that our method provides more useful knowledge than conventional methods.
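
The summarization of stream data into interval events described above can be sketched roughly as follows. This is only an illustration, not the authors' method: the function names, the threshold rule for deciding when an event is active, the sample readings, and the coarse three-way temporal relation are all assumptions, and the event weighting proposed in the paper is not modeled.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start time, end time), both inclusive


def to_interval_events(samples: List[Tuple[int, float]], threshold: float) -> List[Interval]:
    """Collapse runs of consecutive above-threshold samples into intervals.

    The threshold rule is a hypothetical stand-in for whatever condition
    marks a sensor reading as part of an event.
    """
    intervals: List[Interval] = []
    start = None
    last = None
    for t, value in samples:
        if value >= threshold:
            if start is None:
                start = t          # a new interval event begins here
            last = t               # extend the current interval
        elif start is not None:
            intervals.append((start, last))
            start = None
    if start is not None:          # close an interval still open at the end
        intervals.append((start, last))
    return intervals


def relation(a: Interval, b: Interval) -> str:
    """Coarse temporal relation between two interval events (assumed scheme)."""
    if a[1] < b[0]:
        return "before"
    if b[1] < a[0]:
        return "after"
    return "overlaps"


# Hypothetical body-temperature readings as (time, value) pairs.
readings = [(0, 36.5), (1, 38.2), (2, 38.9), (3, 36.8), (4, 39.1), (5, 39.0)]
fever = to_interval_events(readings, threshold=38.0)
print(fever)
print(relation(fever[0], fever[1]))
```

A weighting scheme would sit on top of this step, for example by attaching a weight to each interval before counting temporal relations.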

Data Mining Techniques for Analyzing Promoter Sequences (프로모터 염기서열 분석을 위한 데이터 마이닝 기법)

  • 김정자;이도헌
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2000.10a / pp.328-332 / 2000
  • As DNA sequences have become known through the Genome Project, techniques for handling molecule-level gene information are being actively researched. Given the vast amount of known sequence information, it is also urgent to develop new computer algorithms for building databases and analyzing them efficiently. In this respect, this paper studies association rule search algorithms for finding the characteristics revealed by the association between promoter sequences and genes, one of the important research areas in molecular biology. Whereas previous search algorithms used transaction data, this paper treats biological data, so we design a transformed association rule algorithm that covers its data types and biological properties. These results will help reduce the time and cost of biological experiments by minimizing the number of experimental candidates.


Data Mining Techniques for Analyzing Promoter Sequences (프로모터 염기서열 분석을 위한 데이터 마이닝 기법)

  • 김정자;이도헌
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.4 / pp.739-744 / 2000
  • As DNA sequences have become known through the Genome Project, techniques for handling molecule-level gene information are being actively researched. Given the vast amount of known sequence information, it is also urgent to develop new computer algorithms for building databases and analyzing them efficiently. In this respect, this paper studies association rule search algorithms for finding the characteristics revealed by the association between promoter sequences and genes, one of the important research areas in molecular biology. Whereas previous search algorithms used transaction data, this paper treats biological data, so we design a transformed association rule algorithm that covers its data types and biological properties. These results will help reduce the time and cost of biological experiments by minimizing the number of experimental candidates.


Development and Application of An Adaptive Web Site Construction Algorithm (적응형 웹 사이트 구축을 위한 연관규칙 알고리즘 개발과 적용)

  • Choi, Yun-Hee;Jun, Woo-Chun
    • The KIPS Transactions:PartD / v.16D no.3 / pp.423-432 / 2009
  • Advances in information and communication technologies are changing our society greatly. In a knowledge-based society, information can be obtained easily via communication tools such as the web and e-mail. However, despite this abundance, obtaining correct and up-to-date information remains difficult. The concept of the adaptive web site has recently been introduced; its purpose is to provide, out of the large volume of data gathered, only the information users want. In this paper, an algorithm is developed for adaptive web site construction. The proposed algorithm is based on association rules, a major principle in adaptive web site construction. It analyzes log data on the web server and extracts meaningful documents by finding users' behavior patterns. The proposed algorithm has the following characteristics. First, it is superior in time complexity to existing algorithms that use association rules, and this superiority is proved theoretically. Second, it is efficient in space complexity, because it needs no intermediate products other than a linked list, which is essential for finding frequent item sets.

An Interpretation of Interoperability Definitions Using Association Rules Discovery (연관성 규칙 탐사를 이용한 상호운용성 정의의 해석)

  • Heo, Hwan;Kim, Ja-Hee
    • The Journal of Society for e-Business Studies / v.16 no.2 / pp.39-71 / 2011
  • Developing systems that are fully interoperable with others has lately come to be considered essential for successful projects, as e-commerce becomes ubiquitous and the distributed systems paradigm spreads. However, since definitions of interoperability vary by viewpoint, it is still difficult to reach a shared understanding of interoperability and shared evaluation criteria for it. For instance, various interoperability parties in the military use different definitions of interoperability, and its T&E is conducted not according to the definition but only according to levels of information exchange. In this paper, we propose a new definition of interoperability as follows. First, we collected various existing definitions of interoperability and extracted the key components of each. Second, we statistically analyzed those components and applied association rule discovery from data mining. We then compared existing interoperability definitions with ours. Through this research, we found associations among the components of the various definitions by applying market-basket analysis, and redefined interoperability accordingly. The key findings can contribute to a unified viewpoint on the definition, levels, and evaluation items of interoperability.

Discovering Sequence Association Rules for Protein Structure Prediction (단백질 구조 예측을 위한 서열 연관 규칙 탐사)

  • Kim, Jeong-Ja;Lee, Do-Heon;Baek, Yun-Ju
    • The KIPS Transactions:PartD / v.8D no.5 / pp.553-560 / 2001
  • Bioinformatics is a discipline that supports biological experiment projects by storing and managing data arising from genome research. It can also guide experimental design for genome function prediction and regulation. Among the various approaches to genome research, proteomics has been drawing increasing attention because it deals directly with the final products of genomes, i.e., proteins. This paper proposes a data mining technique to predict the structural characteristics of a given protein group, which are among the dominant factors in its functions. After explaining associations among amino acid subsequences in the primary structures of proteins, which can provide important clues for determining their secondary or tertiary structures, it defines a sequence association rule to represent the relationships between subsequences. It also provides support and confidence measures newly designed to evaluate the usefulness of sequence association rules. After proposing a method to discover useful sequence association rules from a given protein group, it evaluates the performance of the proposed method with protein sequence data from the SWISS-PROT protein database.


Mining Association Rules in Multiple Databases using Links (복수 데이터베이스에서 링크를 이용한 연관 규칙 탐사)

  • Bae, Jin-Uk;Sin, Hyo-Seop;Lee, Seok-Ho
    • Journal of KIISE:Software and Applications / v.26 no.8 / pp.939-954 / 1999
  • Much research in data mining has focused on finding association rules from a single database, such as a large transaction database. However, with the emergence of retailers that use customer cards, such as warehouse discount stores and department stores, analysis is required not only of transactions but also of the relationship between transactions and customers. That is, association rules need to be mined from two databases. This paper proposes an efficient algorithm that finds associated itemsets by constructing links between the two databases. Our experiments show that the link-based algorithm is useful for temporal analysis when the customer database is small enough to reside in memory.
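
As a loose illustration of mining across two linked databases, the sketch below follows a card id from each transaction to an in-memory customer record and counts itemsets per customer attribute. It is not the algorithm from the paper: the table layouts, the `age_group` attribute, the function name, and the brute-force pair counting are all assumptions chosen for brevity.

```python
from collections import Counter
from itertools import combinations

# Hypothetical memory-resident customer database keyed by card id.
customers = {
    "c1": {"age_group": "20s"},
    "c2": {"age_group": "30s"},
}

# Hypothetical transaction database: (card id, purchased items).
transactions = [
    ("c1", {"milk", "bread"}),
    ("c2", {"milk", "beer"}),
    ("c1", {"milk", "bread", "beer"}),
]


def linked_itemset_counts(transactions, customers, attr, size=2):
    """Count itemsets of the given size, grouped by a linked customer attribute."""
    counts = Counter()
    for card_id, items in transactions:
        customer = customers.get(card_id)   # follow the link to the customer record
        if customer is None:
            continue                        # transaction without a known customer
        for itemset in combinations(sorted(items), size):
            counts[(customer[attr], itemset)] += 1
    return counts


counts = linked_itemset_counts(transactions, customers, "age_group")
for key, n in sorted(counts.items()):
    print(key, n)
```

Keeping the customer table in a dictionary mirrors the abstract's condition that the customer database fits in memory; a disk-resident table would need a different lookup strategy.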

The application for predictive similarity measures of binary data in association rule mining (이분형 예측 유사성 측도의 연관성 평가 기준 적용 방안)

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.3 / pp.495-503 / 2011
  • Finding association rules is the most widely used data mining technique. Association rule mining quantifies the relationship between sets of items in a very large database based on association thresholds. Several basic association thresholds are used to explore meaningful association rules: support, confidence, lift, etc. Among them, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The net confidence and the attributably pure confidence were developed to compensate for this drawback, but they have other drawbacks. In this paper we consider some predictive similarity measures for binary data, used in cluster analysis and multi-dimensional analysis, as association thresholds to compensate for these drawbacks. Comparative studies of net confidence, attributably pure confidence, and some predictive similarity measures are shown through a numerical example.
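
For readers unfamiliar with the basic thresholds named above, the following minimal sketch computes support, confidence, and lift for a rule A → B using their standard textbook definitions. The basket data and the function name are made up for illustration, and the net confidence, attributably pure confidence, and predictive similarity measures studied in the paper are not implemented here.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)                  # baskets containing A
    n_b = sum(1 for t in transactions if consequent <= t)                  # baskets containing B
    n_ab = sum(1 for t in transactions if (antecedent | consequent) <= t)  # baskets containing both
    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    lift = confidence / (n_b / n) if n_b else 0.0
    return support, confidence, lift


# Hypothetical market-basket data.
baskets = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

support, confidence, lift = rule_measures(baskets, {"milk"}, {"bread"})
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

In this toy data the confidence of milk → bread is about 0.67, yet the lift is below 1, meaning bread is slightly less likely in baskets that contain milk. Confidence alone does not reveal that negative direction, which is the drawback the abstract refers to.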

Design and Implementation of Mining System for Audit Data Analysis (감사데이터 분석을 위한 마이닝 시스템 설계 및 구현)

  • 김은희;문호성;신문선;류근호;김기영
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.4-6 / 2002
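  • With the expansion of networks and the emergence of new attack types, the addition of new sequences and the manual aspects of building intrusion detection models have become a problem for intrusion detection systems. In particular, existing intrusion detection systems rely on separate dedicated systems to collect and analyze network information from large-scale network infrastructures. Intrusion detection systems therefore need to analyze the growing volume of audit data so that they can respond proactively to diverse attack types. Recently, there has been active research on building proactive intrusion detection systems by applying data mining techniques. This paper designs and implements a mining system for analyzing large volumes of audit data accurately and efficiently. Because audit data has characteristics different from those of a transaction database, the mining system was designed with these differences in mind. The implemented system uses association rule techniques to discover associations among audit data attributes. It also applies frequent episode techniques to group events that occur in relation to one another within a given time window, which makes it possible to discover events that occur frequently within consecutive time intervals and to generate rules that predict or describe sequence behavior from known events. The rules produced by mining the audit data can be used to build proactive security policies. The resulting reduction in data volume is also expected to help minimize intrusion detection time.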
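
The idea of grouping related events within a time window can be illustrated with a heavily simplified sketch that counts ordered event pairs occurring within a fixed gap of each other. This is not the system described in the paper and is much cruder than full frequent episode mining: the audit log, event names, `window`, and `min_count` values are all hypothetical.

```python
from collections import Counter


def frequent_pairs(events, window, min_count):
    """Count ordered pairs (a, b) where b follows a within `window` time units.

    Returns only the pairs seen at least `min_count` times.
    """
    events = sorted(events)                   # order by timestamp
    counts = Counter()
    for i, (t_i, e_i) in enumerate(events):
        for t_j, e_j in events[i + 1:]:
            if t_j - t_i > window:
                break                         # later events are even further away
            counts[(e_i, e_j)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}


# Hypothetical audit log as (timestamp, event type) records.
log = [
    (0, "login_fail"),
    (1, "login_fail"),
    (2, "root_shell"),
    (10, "login_fail"),
    (11, "root_shell"),
    (30, "logout"),
]

result = frequent_pairs(log, window=3, min_count=2)
print(result)
```

A pair that passes the threshold, such as a failed login followed shortly by a root shell, is the kind of pattern that could be turned into a candidate detection rule.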


Utilization of similarity measures by PIM with AMP as association rule thresholds (모든 주변 비율을 고려한 확률적 흥미도 측도 기반 유사성 측도의 연관성 평가 기준 활용 방안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.117-124 / 2013
  • Association rule mining, a data mining technique, quantifies the relationship between sets of items in a huge database and has been applied in various fields such as internet shopping malls, healthcare, insurance, and education. There are three primary interestingness measures for association rules: support, confidence, and lift. Confidence is the most important of these, and association rules are generated using it. However, it is an asymmetric measure and takes only positive values, which can cause difficulties in generating association rules. In this paper we apply similarity measures by probabilistic interestingness measure (PIM) with all marginal proportions (AMP) to solve this problem. Comparative studies of support, confidence, lift, chi-square statistics, and some similarity measures by PIM with AMP are shown through a numerical example. As a result, we found that the similarity measures by PIM with AMP indicate the degree of association in the same way as confidence. Because their values are signed, we could also confirm the direction of association and select the best similarity measure by PIM with AMP.