• Title/Summary/Keyword: 빈발 패턴 네트워크

Search Result 23, Processing Time 0.042 seconds

빈발 패턴 네트워크에서 연관 규칙 발견을 위한 아이템 클러스터링

  • O, Gyeong-Jin;Jeong, Jin-Guk;Jo, Geun-Sik
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2007.05a
    • /
    • pp.321-328
    • /
    • 2007
  • 데이터마이닝은 대용량의 데이터에 숨겨진 의미있고 유용한 패턴과 상관관계를 추출하여 의사결정에 활용하는 작업이다. 그 중에서도 고객 트랜잭션의 데이터베이스에서 아이템 사이에 존재하는 연관규칙을 찾는 것은 중요한 일이 되었다. Apriori 알고리즘 이후 연관규칙을 찾기 위해 대용량 데이터베이스로부터 압축된 의미있는 정보를 저장하기 위한 데이터 구조와 알고리즘들이 제안되어 왔다. 본 논문에서는 정점으로 아이템을 표현하고, 간선으로 두 아이템집합을 표현하는 빈발 패턴 네트워크(FPN)이라 불리는 새 자료 구조를 제안한다. 빈발 패턴 네트워크에서 아이템 사이의 연관 관계를 발견하기 위해 이 구조를 어떻게 효율적으로 사용 하느냐에 초점을 두고 있다. 구조의 효율적인 사용을 위하여 한 아이템이 클러스터 내의 아이템과는 유사도가 높고, 다른 클러스터의 아이템과는 유사도가 낮도록 네트워크의 정점을 클러스터링하는 방법을 사용한다. 실험은 신뢰도, 상관관계 그리고 간선 가중치 유사도를 이용하여 네트워크에서 아이템 클러스터링의 정확도를 보여준다. 본 논문의 실험 결과를 통해 신뢰도 유사도가 네트워크의 정점을 클러스터링할 때 클러스터의 정확성에 가장 많은 영향을 미친다는 것을 알 수 있었다.

  • PDF

Constructing Gene Regulatory Networks using Frequent Gene Expression Pattern and Chain Rules (빈발 유전자 발현 패턴과 연쇄 규칙을 이용한 유전자 조절 네트워크 구축)

  • Lee, Heon-Gyu;Ryu, Keun-Ho;Joung, Doo-Young
    • The KIPS Transactions:PartD
    • /
    • v.14D no.1 s.111
    • /
    • pp.9-20
    • /
    • 2007
  • Groups of genes control the functioning of a cell by complex interactions. Such interactions of gene groups are tailed Gene Regulatory Networks(GRNs). Two previous data mining approaches, clustering and classification, have been used to analyze gene expression data. Though these mining tools are useful for determining membership of genes by homology, they don't identify the regulatory relationships among genes found in the same class of molecular actions. Furthermore, we need to understand the mechanism of how genes relate and how they regulate one another. In order to detect regulatory relationships among genes from time-series Microarray data, we propose a novel approach using frequent pattern mining and chain rules. In this approach, we propose a method for transforming gene expression data to make suitable for frequent pattern mining, and gene expression patterns we detected by applying the FP-growth algorithm. Next, we construct a gene regulatory network from frequent gene patterns using chain rules. Finally, we validate our proposed method through our experimental results, which are consistent with published results.

시퀀스 패턴 마이닝 기법을 적용한 침입탐지 시스템의 경보데이터 패턴분석

  • Shin, Moon-Sun
    • Proceedings of the KAIS Fall Conference
    • /
    • 2010.05a
    • /
    • pp.451-454
    • /
    • 2010
  • 침입탐지란 컴퓨터와 네트워크 자원에 대한 유해한 침입 행동을 식별하고 대응하는 과정이다. 점차적으로 시스템에 대한 침입의 유형들이 복잡해지고 전문적으로 이루어지면서 빠르고 정확한 대응을 할 수 있는 시스템이 요구되고 있다. 이에 대용량의 데이터를 분석하여 의미 있는 정보를 추출하는 데이터 마이닝 기법을 적용하여 지능적이고 자동화된 탐지 및 경보데이터 패턴 분석에 이용할 수 있다. 본 논문에서는 경보데이터 패턴 분석을 위해 시퀀스패턴기법을 적용한 경보데이터 마이닝 엔진을 구축한다. 구현된 경보데이터 마이닝 시스템은 기존의 시퀀스 패턴 알고리즘인 PrefixSpan 알고리즘을 확장 구현하여 경보데이터의 빈발 경보시퀀스 분석과 빈발 공격시퀀스 분석에 활용할 수 있다.

  • PDF

Discovering Association Rules using Item Clustering on Frequent Pattern Network (빈발 패턴 네트워크에서 아이템 클러스터링을 통한 연관규칙 발견)

  • Oh, Kyeong-Jin;Jung, Jin-Guk;Ha, In-Ay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.14 no.1
    • /
    • pp.1-17
    • /
    • 2008
  • Data mining is defined as the process of discovering meaningful and useful pattern in large volumes of data. In particular, finding associations rules between items in a database of customer transactions has become an important thing. Some data structures and algorithms had been proposed for storing meaningful information compressed from an original database to find frequent itemsets since Apriori algorithm. Though existing method find all association rules, we must have a lot of process to analyze association rules because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. In order to utilize FPN, We constitute FPN using item's frequency. And then we use a clustering method to group the vertices on the network into clusters so that the intracluster similarity is maximized and the intercluster similarity is minimized. We generate association rules based on clusters. Our experiments showed accuracy of clustering items on the network using confidence, correlation and edge weight similarity methods. And We generated association rules using clusters and compare traditional and our method. From the results, the confidence similarity had a strong influence than others on the frequent pattern network. And FPN had a flexibility to minimum support value.

  • PDF

Incremental Frequent Pattern Detection Scheme Based on Sliding Windows in Graph Streams (그래프 스트림에서 슬라이딩 윈도우 기반의 점진적 빈발 패턴 검출 기법)

  • Jeong, Jaeyun;Seo, Indeok;Song, Heesub;Park, Jaeyeol;Kim, Minyeong;Choi, Dojin;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.2
    • /
    • pp.147-157
    • /
    • 2018
  • Recently, with the advancement of network technologies, and the activation of IoT and social network services, many graph stream data have been generated. As the relationship between objects in the graph streams changes dynamically, studies have been conducting to detect or analyze the change of the graph. In this paper, we propose a scheme to incrementally detect frequent patterns by using frequent patterns information detected in previous sliding windows. The proposed scheme calculates values that represent whether the frequent patterns detected in previous sliding windows will be frequent in how many future silding windows. By using the values, the proposed scheme reduces the overall amount of computation by performing only necessary calculations in the next sliding window. In addition, only the patterns that are connected between the patterns are recognized as one pattern, so that only the more significant patterns are detected. We conduct various performance evaluations in order to show the superiority of the proposed scheme. The proposed scheme is faster than existing similar scheme when the number of duplicated data is large.

User Clustering based on Genre Pattern for Efficient Collaborative Filtering System (효율적인 협업적 여과 시스템을 위한 장르 패턴 기반의 사용자 클러스터링)

  • Choi, Ja-Hyun;Ha, In-Ay;Hong, Myung-Duk;Jo, Geun-Sik
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2011.06a
    • /
    • pp.171-172
    • /
    • 2011
  • 협업적 여과 시스템은 사용자에 대한 클러스터링을 구축한 후, 구축된 클러스터를 기반으로 사용자에게 영화를 추천한다. 하지만 사용자 클러스터링 구축에 많은 시간이 소요되고, 사용자가 평가한 영화가 피드백이 되었을 경우 재구축이 쉽지 않다. 본 논문에서는 사용자 클러스터링의 재구축을 용이하게 하기 위해 빈발패턴 네트워크를 이용하여 클러스터링을 구축하고, 이를 협업적 여과 시스템에 적용하여 영화를 추천한다. 구축된 클러스터를 통해 사용자 클러스터를 재구축시 소요되는 시간 비용을 줄이면서, 전통적인 협업적 여과 시스템과 유사한 성능의 추천이 가능하게 되었다.

  • PDF

Clustering Algorithm using the DFP-Tree based on the MapReduce (맵리듀스 기반 DFP-Tree를 이용한 클러스터링 알고리즘)

  • Seo, Young-Won;Kim, Chang-soo
    • Journal of Internet Computing and Services
    • /
    • v.16 no.6
    • /
    • pp.23-30
    • /
    • 2015
  • As BigData is issued, many applications that operate based on the results of data analysis have been developed, typically applications are products recommend service of e-commerce application service system, search service on the search engine service and friend list recommend system of social network service. In this paper, we suggests a decision frequent pattern tree that is combined the origin frequent pattern tree that is mining similar pattern to appear in the data set of the existing data mining techniques and decision tree based on the theory of computer science. The decision frequent pattern tree algorithm improves about problem of frequent pattern tree that have to make some a lot's pattern so it is to hard to analyze about data. We also proposes to model for a Mapredue framework that is a programming model to help to operate in distributed environment.

In-memory Compression Scheme Based on Incremental Frequent Patterns for Graph Streams (그래프 스트림 처리를 위한 점진적 빈발 패턴 기반 인-메모리 압축 기법)

  • Lee, Hyeon-Byeong;Shin, Bo-Kyoung;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.1
    • /
    • pp.35-46
    • /
    • 2022
  • Recently, with the development of network technologies, as IoT and social network service applications have been actively used, a lot of graph stream data is being generated. In this paper, we propose a graph compression scheme that considers the stream graph environment by applying graph mining to the existing compression technique, which has been focused on compression rate and runtime. In this paper, we proposed Incremental frequent pattern based compression technique for graph streams. Since the proposed scheme keeps only the latest reference patterns, it increases the storage utilization and improves the query processing time. In order to show the superiority of the proposed scheme, various performance evaluations are performed in terms of compression rate and processing time compared to the existing method. The proposed scheme is faster than existing similar scheme when the number of duplicated data is large.

An Efficient Method for Mining Frequent Patterns based on Weighted Support over Data Streams (데이터 스트림에서 가중치 지지도 기반 빈발 패턴 추출 방법)

  • Kim, Young-Hee;Kim, Won-Young;Kim, Ung-Mo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.10 no.8
    • /
    • pp.1998-2004
    • /
    • 2009
  • Recently, due to technical developments of various storage devices and networks, the amount of data increases rapidly. The large volume of data streams poses unique space and time constraints on the data mining process. The continuous characteristic of streaming data necessitates the use of algorithms that require only one scan over the stream for knowledge discovery. Most of the researches based on the support are concerned with the frequent itemsets, but ignore the infrequent itemsets even if it is crucial. In this paper, we propose an efficient method WSFI-Mine(Weighted Support Frequent Itemsets Mine) to mine all frequent itemsets by one scan from the data stream. This method can discover the closed frequent itemsets using DCT(Data Stream Closed Pattern Tree). We compare the performance of our algorithm with DSM-FI and THUI-Mine, under different minimum supports. As results show that WSFI-Mine not only run significant faster, but also consume less memory.

GGenre Pattern based User Clustering for Performance Improvement of Collaborative Filtering System (협업적 여과 시스템의 성능 향상을 위한 장르 패턴 기반 사용자 클러스터링)

  • Choi, Ja-Hyun;Ha, In-Ay;Hong, Myung-Duk;Jo, Geun-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.11
    • /
    • pp.17-24
    • /
    • 2011
  • Collaborative filtering system is the clustering about user is built and then based on that clustering results will recommend the preferred item to the user. However, building user clustering is time consuming and also once the users evaluate and give feedback about the film then rebuilding the system is not simple. In this paper, genre pattern of movie recommendation systems is being used and in order to simplify and reduce time of rebuilding user clustering. A Frequent pattern networks is used and then extracts user preference genre patterns and through that extracted patterns user clustering will be built. Through built the clustering for all neighboring users to collaborative filtering is applied and then recommends movies to the user. When receiving user information feedback, traditional collaborative filtering is to rebuild the clustering for all neighbouring users to research and do the clustering. However by using frequent pattern Networks, through user clustering based on genre pattern, collaborative filtering is applied and when rebuilding user clustering inquiry limited by search time can be reduced. After receiving user information feedback through proposed user clustering based on genre pattern, the time that need to spent on re-establishing user clustering can be reduced and also enable the possibility of traditional collaborative filtering systems and recommendation of a similar performance.