• Title/Summary/Keyword: frequent pattern (빈발 패턴)


A Weighted Frequent Graph Pattern Mining Approach considering Length-Decreasing Support Constraints (길이에 따라 감소하는 빈도수 제한조건을 고려한 가중화 그래프 패턴 마이닝 기법)

  • Yun, Unil;Lee, Gangin
    • Journal of Internet Computing and Services / v.15 no.6 / pp.125-132 / 2014
  • Since frequent pattern mining was proposed to discover hidden, useful pattern information in large-scale databases, various mining approaches and applications have been studied. In particular, frequent graph pattern mining was introduced to deal effectively with recent data, which have become increasingly complex, and a variety of efficient graph mining algorithms have been developed. Graph patterns obtained from graph databases differ in importance and characteristics according to their constituent elements and their lengths. However, traditional frequent graph pattern mining approaches do not take these differences into account. That is, existing methods apply a single minimum support threshold regardless of the lengths of the extracted graph patterns and do not use the patterns' weight factors; as a result, a large number of graph patterns that are of no practical use may be generated. Small graph patterns with few vertices and edges tend to be interesting when their weighted supports are relatively high, while large ones with many elements can be useful even if their weighted supports are relatively low. For this reason, we propose a weight-based frequent graph pattern mining algorithm that considers length-decreasing support constraints. Comprehensive experimental results presented in this paper show that the proposed method outperforms a state-of-the-art graph mining algorithm in terms of pattern generation, runtime, and memory usage.
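The idea of a length-decreasing support constraint combined with weights can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the linear threshold function, the parameter names (`base`, `floor`, `step`), and the definition of weighted support as support times average element weight are all assumptions made for illustration.

```python
def min_support_for_length(length, base=50, floor=10, step=10):
    """Hypothetical length-decreasing threshold (in transaction counts).

    Starts at `base` for patterns of length 1, drops by `step` for each
    additional element, and never falls below `floor`.
    """
    return max(floor, base - step * (length - 1))


def weighted_support(support, weights):
    """Assumed definition: raw support scaled by the average element weight."""
    return support * sum(weights) / len(weights)


def is_interesting(support, weights):
    """Keep a pattern if its weighted support meets the threshold for its length."""
    length = len(weights)
    return weighted_support(support, weights) >= min_support_for_length(length)


# A short pattern and a long pattern with the same raw support of 30:
short_kept = is_interesting(30, [1.0, 1.0])   # threshold for length 2 is 40
long_kept = is_interesting(30, [1.0] * 5)     # threshold for length 5 is 10
```

With these assumed parameters the short pattern is pruned while the long one is kept, which mirrors the abstract's observation that large patterns can be useful even at relatively low weighted support.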

Frequent Pattern Tree based XML Stream Mining (빈발 패턴 트리 기반 XML 스트림 마이닝)

  • Hwang, Jeong-Hee
    • The KIPS Transactions:PartD / v.16D no.5 / pp.673-682 / 2009
  • XML data are widely used for data representation and exchange on the Web, and in ubiquitous environments such data arrive as a continuous stream. Consequently, there has been mining research on extracting frequent structures from XML stream data and on processing queries over them efficiently. In this paper, we propose a sliding-window-based mining method that extracts frequent structures from the XML stream data in the most recent window. XML stream data are modeled as a set of trees, called the XFP_tree, from which we quickly extract the frequent structures over recent XML data.

Incremental Frequent Pattern Detection Scheme Based on Sliding Windows in Graph Streams (그래프 스트림에서 슬라이딩 윈도우 기반의 점진적 빈발 패턴 검출 기법)

  • Jeong, Jaeyun;Seo, Indeok;Song, Heesub;Park, Jaeyeol;Kim, Minyeong;Choi, Dojin;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.18 no.2 / pp.147-157 / 2018
  • Recently, with the advancement of network technologies and the spread of IoT and social network services, large volumes of graph stream data have been generated. Because the relationships between objects in graph streams change dynamically, studies have been conducted to detect or analyze changes in the graph. In this paper, we propose a scheme that incrementally detects frequent patterns by using information on the frequent patterns detected in previous sliding windows. For each frequent pattern detected in previous sliding windows, the proposed scheme calculates a value representing how many future sliding windows the pattern will remain frequent in. Using these values, the scheme reduces the overall amount of computation by performing only the necessary calculations in the next sliding window. In addition, only patterns that are connected to one another are recognized as a single pattern, so that only the more significant patterns are detected. We conduct various performance evaluations to show the superiority of the proposed scheme. The proposed scheme is faster than an existing similar scheme when the amount of duplicated data is large.

Constructing Gene Regulatory Networks using Frequent Gene Expression Pattern and Chain Rules (빈발 유전자 발현 패턴과 연쇄 규칙을 이용한 유전자 조절 네트워크 구축)

  • Lee, Heon-Gyu;Ryu, Keun-Ho;Joung, Doo-Young
    • The KIPS Transactions:PartD / v.14D no.1 s.111 / pp.9-20 / 2007
  • Groups of genes control the functioning of a cell through complex interactions. Such interactions among gene groups are called Gene Regulatory Networks (GRNs). Two earlier data mining approaches, clustering and classification, have been used to analyze gene expression data. Although these mining tools are useful for determining the membership of genes by homology, they do not identify the regulatory relationships among genes found in the same class of molecular actions. Furthermore, we need to understand the mechanism by which genes relate to and regulate one another. To detect regulatory relationships among genes from time-series microarray data, we propose a novel approach using frequent pattern mining and chain rules. In this approach, we propose a method for transforming gene expression data to make it suitable for frequent pattern mining, and gene expression patterns are detected by applying the FP-growth algorithm. Next, we construct a gene regulatory network from the frequent gene patterns using chain rules. Finally, we validate the proposed method through experimental results, which are consistent with published results.

Border-based HSFI Algorithm for Hiding Sensitive Frequent Itemsets (민감한 빈발항목집합을 숨기기 위한 경계기반 HSFI 알고리즘)

  • Lee, Dan-Young;An, Hyoung-Keun;Koh, Jae-Jin
    • Journal of Korea Multimedia Society / v.14 no.10 / pp.1323-1334 / 2011
  • This paper proposes the border-based HSFI algorithm for hiding sensitive frequent itemsets. Unlike previous approaches, the FP-Tree node structure holds sensitive and border information together with all transactions, and the border is used to minimize the impact of the hiding process on nonsensitive frequent itemsets. Applying the HSFI algorithm to an example transaction database shows that it significantly reduces the number of lost items, and that it is therefore more effective than the existing algorithm at maintaining database quality.

Feature selection and frequent pattern analysis in protein motif sequence (모티프 서열에서의 특징추출 및 빈발패턴 분석)

  • Kim, Dae-Sung;Lee, Bum-Ju;Ryu, Keun-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2007.05a / pp.10-13 / 2007
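  • A motif is a region of a protein sequence that has remained highly conserved in part through the course of evolution. Motifs are used to predict the function and structure of proteins and to describe the common characteristics of biologically related proteins. In addition, the correlation between motifs and protein sequences is essential for predicting biological function; for this prediction problem, protein sequences can be analyzed through the frequent sequence patterns and structural patterns that motif search finds in them. In this paper, we search for secondary-structure features and frequent patterns present in protein sequences and aim to use the extracted information for protein function classification.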


Spatial-Temporal Moving Sequence Pattern Mining (시공간 이동 시퀀스 패턴 마이닝 기법)

  • Han, Seon-Young;Yong, Hwan-Seung
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.599-617 / 2006
  • Recently, many location-based service (LBS) systems have emerged in mobile computing. Spatial-Temporal Moving Sequence Pattern Mining is a new mining method that mines user moving patterns from the histories of users' moving paths in a sensor network environment. Conventional frequent pattern mining concerns the items that customers buy, whereas our mining method concerns users' moving sequence paths. In this paper, we consider the sequence of moving paths, and therefore handle repeated moving paths. We also consider the duration that a user spends at each location. We propose a new algorithm, Apriori_msp, based on the Apriori algorithm, and evaluate its performance.

Optimal Moving Pattern Mining using Frequency of Sequence and Weights (시퀀스 빈발도와 가중치를 이용한 최적 이동 패턴 탐사)

  • Lee, Yon-Sik;Park, Sung-Sook
    • Journal of Internet Computing and Services / v.10 no.5 / pp.79-93 / 2009
  • To develop location-based services that are individualized and specialized according to the characteristics of users, spatio-temporal pattern mining is needed to extract meaningful and useful patterns from the various movement patterns of mobile objects in a spatio-temporal area. In this paper, as a practical step toward location-based services that can be applied to real life through pattern mining over large volumes of historical mobile-object data, we propose STOMP (using Frequency of sequence and Weight), a new mining method that extracts patterns under spatial and temporal constraints, based on the optimal moving pattern mining problems defined in STOMP(F)[25]. Whereas our previous work (STOMP(F)[25]) used only pattern frequency, the proposed method combines pattern frequency with weights (distance, time, cost, etc.). That is, among the moving patterns of an object, it selects as the optimal path the pattern whose frequency is above a given threshold and whose required weight is smallest. It can also find the optimal path more accurately and faster than existing methods ($A^*$, Dijkstra's algorithm) or a method using only pattern frequency, because the heuristic use of moving history reduces the number of node accesses.
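The selection rule described in the abstract (frequency above a threshold, then smallest weight) can be sketched as follows. This is only an illustration of that rule, not the STOMP method itself: the `MovingPattern` type, its fields, and the function name `select_optimal` are hypothetical, and the weight is assumed to be a single pre-combined number.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class MovingPattern(NamedTuple):
    """Hypothetical record for one candidate moving pattern."""
    path: Tuple[str, ...]   # ordered sequence of visited locations
    frequency: int          # how often the pattern occurs in the history
    weight: float           # assumed combined cost (distance, time, ...)


def select_optimal(patterns: Sequence[MovingPattern],
                   min_frequency: int) -> Optional[MovingPattern]:
    """Among patterns meeting the frequency threshold, return the lowest-weight one."""
    candidates = [p for p in patterns if p.frequency >= min_frequency]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.weight)


history = [
    MovingPattern(("A", "B", "D"), frequency=12, weight=7.5),
    MovingPattern(("A", "C", "D"), frequency=9, weight=6.0),
    MovingPattern(("A", "E", "D"), frequency=3, weight=4.0),  # cheap but rare
]
best = select_optimal(history, min_frequency=5)
```

Here the cheapest path is excluded because it is too rare, so the rule returns the cheapest of the remaining frequent paths.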


Analysis and Performance Evaluation of Pattern Condensing Techniques used in Representative Pattern Mining (대표 패턴 마이닝에 활용되는 패턴 압축 기법들에 대한 분석 및 성능 평가)

  • Lee, Gang-In;Yun, Un-Il
    • Journal of Internet Computing and Services / v.16 no.2 / pp.77-83 / 2015
  • Frequent pattern mining, one of the major areas actively studied in data mining, is a method for extracting useful pattern information hidden in large data sets or databases. Frequent pattern mining approaches have been actively employed in a variety of application fields because their results allow us to analyze important characteristics of databases more easily and automatically. However, traditional frequent pattern mining methods, which simply extract all possible frequent patterns whose support values are not smaller than a user-given minimum support threshold, have the following problems. First, depending on the features of a given database and the threshold setting, traditional approaches have to generate an enormous number of patterns, and this number can grow geometrically. In addition, such methods waste runtime and memory resources. Furthermore, the excessive number of patterns they generate makes the mining results difficult to analyze. To solve these issues, the concept of representative pattern mining and various related works have been proposed. In contrast to traditional approaches, which find all possible frequent patterns in a database, representative pattern mining approaches selectively extract a smaller number of patterns that represent the general frequent patterns. In this paper, we describe the details and characteristics of pattern condensing techniques that consider the maximality or closure property of the generated frequent patterns, and we compare and analyze these techniques.
Given a frequent pattern, satisfying maximality means that all possible supersets of the pattern have support values smaller than the user-specified minimum support threshold; satisfying the closure property means that, among all possible supersets, there is none whose support is equal to that of the pattern. By mining maximal or closed frequent patterns, we can achieve effective pattern compression and perform mining operations with much less time and space. In addition, compressed patterns can be converted back into the original frequent patterns if necessary; in particular, the closed frequent pattern notation can convert representative patterns back into the original ones without any information loss. That is, we can obtain the complete set of original frequent patterns from the closed frequent ones. Although the maximal frequent pattern notation does not guarantee complete recovery in the conversion process, it has the advantage of extracting a smaller number of representative patterns more quickly than the closed frequent pattern notation. In this paper, we present the performance results and characteristics of these techniques in terms of pattern generation, runtime, and memory usage, based on a performance evaluation over various real-world data sets. For a more exact comparison, we also run the algorithms implementing these techniques on the same platform and at the same implementation level.
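The maximality and closure definitions above can be checked on a toy example. The following Python sketch uses brute-force enumeration rather than any of the algorithms evaluated in the paper; the function names and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: return {itemset: support count} for all frequent itemsets."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_patterns(freq):
    """Closed: no proper superset has the same support."""
    return {p: s for p, s in freq.items()
            if not any(p < q and s == freq[q] for q in freq)}


def maximal_patterns(freq):
    """Maximal: no proper superset is frequent."""
    return {p: s for p, s in freq.items() if not any(p < q for q in freq)}


def support_from_closed(pattern, closed):
    """Recover a pattern's support as the largest support among its closed supersets."""
    supports = [s for q, s in closed.items() if pattern <= q]
    return max(supports) if supports else None  # None: pattern is not frequent


TRANSACTIONS = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("a")]
freq = frequent_itemsets(TRANSACTIONS, min_support=2)
closed = closed_patterns(freq)
maximal = maximal_patterns(freq)
```

On this data, five itemsets are frequent, three of them are closed, and two are maximal. Every frequent itemset's support can be recovered from the closed set via `support_from_closed`, whereas the maximal set only tells us which itemsets are frequent, not their exact supports.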

Designing OLAP Cube Structures for Market Basket Analysis (장바구니 분석용 OLAP 큐브 구조의 설계)

  • Yu, Han-Ju;Choi, In-Soo
    • Journal of the Korea Society of Computer and Information / v.12 no.4 / pp.179-189 / 2007
  • Every purchase a customer makes builds patterns about how products are purchased together. The process of finding these patterns, called market basket analysis, consists of two steps in the Microsoft Association Algorithm. The first step is to find frequent item-sets. The second step, which requires much less time than the first, is to generate association rules based on the frequent item-sets. Although the first step, finding frequent item-sets, is the core of market basket analysis, applying it to Online Analytical Processing (OLAP) cubes raises several problems: longitudinal analysis becomes impossible, and many impractical transactions are generated. In this paper, we propose a new method for designing OLAP cube structures for market basket analysis that makes longitudinal analysis possible and identifies only real customers' purchase patterns.
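The second step described above, generating association rules from frequent item-sets that have already been found, can be sketched in a few lines of Python. This is a generic confidence-based illustration, not the Microsoft Association Algorithm's implementation; the function name, the `supports` dictionary layout, and the sample counts are assumptions.

```python
from itertools import combinations


def association_rules(supports, min_confidence):
    """Generate rules (lhs, rhs, confidence) from {frozenset: support count}.

    Assumes every non-empty subset of a frequent item-set is also present
    in `supports`, which holds for any complete set of frequent item-sets.
    """
    rules = []
    for itemset, support in supports.items():
        for k in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), k):
                lhs = frozenset(lhs_items)
                confidence = support / supports[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical output of step one: support counts over some set of baskets.
supports = {
    frozenset({"bread"}): 4,
    frozenset({"milk"}): 2,
    frozenset({"bread", "milk"}): 2,
}
rules = association_rules(supports, min_confidence=0.8)
```

With these counts, only the rule from milk to bread passes the confidence threshold, because every basket containing milk also contains bread while only half of the bread baskets contain milk.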
