• Title/Summary/Keyword: frequent pattern (빈발 패턴)

Clustering Algorithm using the DFP-Tree based on the MapReduce (맵리듀스 기반 DFP-Tree를 이용한 클러스터링 알고리즘)

  • Seo, Young-Won;Kim, Chang-soo
    • Journal of Internet Computing and Services / v.16 no.6 / pp.23-30 / 2015
  • With the rise of big data, many applications that act on the results of data analysis have been developed. Typical examples are product recommendation in e-commerce systems, search on search engines, and friend recommendation in social network services. In this paper, we propose a decision frequent pattern tree (DFP-tree), which combines the original frequent pattern tree, a data mining technique for finding patterns that recur in a data set, with the decision tree from computer science theory. The DFP-tree algorithm addresses a weakness of the frequent pattern tree, which generates so many patterns that the data becomes hard to analyze. We also propose a model of the algorithm for the MapReduce framework, a programming model for operating in a distributed environment.
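The abstract above does not describe how the DFP-tree is laid out on MapReduce, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It shows the item-counting pass that frequent pattern tree construction typically starts with, written as a map step per data partition and a reduce step that merges the partial counts. The function names `map_phase`, `reduce_phase`, and `frequent_items`, and the sample data, are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_phase(transactions):
    """Map step: count items locally within one data partition."""
    counts = Counter()
    for transaction in transactions:
        # Count each item at most once per transaction.
        counts.update(set(transaction))
    return counts


def reduce_phase(partial_counts):
    """Reduce step: merge the per-partition counts into global counts."""
    return reduce(lambda a, b: a + b, partial_counts, Counter())


def frequent_items(partitions, min_support):
    """Keep the items whose global count reaches min_support."""
    totals = reduce_phase(map_phase(p) for p in partitions)
    return {item: n for item, n in totals.items() if n >= min_support}


# Hypothetical data: two partitions, as if stored on two worker nodes.
partitions = [
    [["a", "b", "c"], ["a", "c"]],
    [["a", "d"], ["b", "c"]],
]
result = frequent_items(partitions, min_support=3)
print(result)
```

In a real MapReduce deployment the map calls would run on separate nodes; here they run in one process only to keep the sketch self-contained.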

Efficient Mining of Dynamic Weighted Sequential Patterns (동적 가중치를 이용한 효율적인 순차 패턴 탐사 기법)

  • Choi, Pilsun;Kang, Donghyun;Kim, Hwan;Kim, Daein;Hwang, Buhyun
    • Proceedings of the Korea Information Processing Society Conference / 2012.11a / pp.1365-1368 / 2012
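  • Sequential pattern mining finds the patterns that occur frequently among a set of ordered patterns. Within this field, dynamic weighted sequential pattern mining targets computing environments in which weights change over time. Because it applies dynamic changes in importance to mining, it can be used in a variety of environments. In this paper, we propose a new sequence data mining technique that mines sequential patterns by applying dynamic weights to various kinds of sequential data. The proposed technique uses relative dynamic weights based on time order to reduce the number of candidate patterns that must be searched, so frequent sequence patterns can be found quickly. Compared with the existing approach to applying weights, it reduces memory usage and processing time, making it highly efficient.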

Discovery of Frequent Sequence Pattern in Moving Object Databases (이동 객체 데이터베이스에서 빈발 시퀀스 패턴 탐색)

  • Vu, Thi Hong Nhan;Lee, Bum-Ju;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.15D no.2 / pp.179-186 / 2008
  • The convergence of location-aware devices, GIS functionality, and increasingly accurate and available positioning technologies paves the way for a range of new location-based services. Spatiotemporal data mining, in which relationships are defined by the spatial and temporal aspects of the data, faces major challenges because of the enlarged search space. In this paper, we therefore propose algorithms for mining spatiotemporal patterns in mobile environments. Moving patterns are generated by two algorithms, All_MOP and Max_MOP: the first mines all frequent patterns, and the second discovers only maximal frequent patterns. In comparison with the DFS_MINE algorithm, our approach reduces the time consumed. It is also applicable to location-based services such as tourist and traffic services.
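The difference between mining all frequent patterns and mining only maximal ones can be illustrated with a small sketch. This is not the All_MOP or Max_MOP algorithm. It assumes, for simplicity, that a moving pattern is a contiguous sub-path of a trajectory, and the names `all_frequent`, `maximal_only`, and the sample trajectories are hypothetical.

```python
from collections import Counter


def contiguous_subpaths(path):
    """Yield every contiguous sub-path of a trajectory as a tuple."""
    for start in range(len(path)):
        for end in range(start + 1, len(path) + 1):
            yield tuple(path[start:end])


def all_frequent(trajectories, min_support):
    """Return every sub-path contained in at least min_support trajectories."""
    counts = Counter()
    for path in trajectories:
        # A sub-path is counted once per trajectory, however often it repeats.
        counts.update(set(contiguous_subpaths(path)))
    return {p: n for p, n in counts.items() if n >= min_support}


def is_subpath(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def maximal_only(frequent):
    """Keep the frequent patterns that have no frequent proper super-pattern."""
    return {
        p: n
        for p, n in frequent.items()
        if not any(p != q and is_subpath(p, q) for q in frequent)
    }


# Hypothetical trajectories over locations A to D.
trajectories = [["A", "B", "C"], ["A", "B", "D"], ["B", "C"], ["A", "B", "C"]]
frequent = all_frequent(trajectories, min_support=2)
maximal = maximal_only(frequent)
print(len(frequent), maximal)
```

With this data, six sub-paths are frequent but only A, B, C is maximal. That shows why a maximal-pattern miner returns a much smaller result, at the cost of the support values of the shorter patterns.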

Frequent Pattern Mining By using a Completeness for BigData (빅데이터에 대한 Completeness를 이용한 빈발 패턴 마이닝)

  • Park, In-Kyu
    • Journal of Korea Game Society / v.18 no.2 / pp.121-130 / 2018
  • Most studies use frequency, the number of times a pattern appears in a transaction database, as the key measure of pattern interestingness. This presupposes that any interesting pattern occupies a maximal portion of the transactions in which it appears. In real-world scenarios, however, the completeness of a pattern is likely to vary across transactions. Hence, we should also consider the problem of finding qualified patterns with significant values of support weighted by completeness, in order to reduce the loss of information within a pattern in a transaction. In pattern recommendation applications, patterns with higher completeness may lead to higher recall, while patterns with higher frequency lead to higher precision. In this paper, we propose a measure of weighted support and completeness and an algorithm, WSCFPM (weighted support and completeness frequent pattern mining). Our algorithm handles the fact that neither the monotone nor the anti-monotone property holds for completeness. Extensive performance analysis shows that our algorithm is efficient and scalable for word pattern mining.
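The abstract does not define completeness formally. As a rough illustration only, the sketch below assumes that the completeness of a pattern in one transaction is the fraction of that transaction's items covered by the pattern, and reports it alongside ordinary support. The function name `support_and_completeness`, this definition, and the sample data are assumptions, not the WSCFPM measure itself.

```python
def support_and_completeness(pattern, transactions):
    """Return (support, mean completeness) of a pattern.

    Assumption: completeness in one transaction is
    len(pattern) / len(transaction), averaged over the transactions
    that contain the pattern.
    """
    pattern = frozenset(pattern)
    containing = [set(t) for t in transactions if pattern <= set(t)]
    support = len(containing) / len(transactions)
    if not containing:
        return support, 0.0
    mean = sum(len(pattern) / len(t) for t in containing) / len(containing)
    return support, mean


# Hypothetical transaction database.
transactions = [["a", "b"], ["a", "b", "c", "d"], ["a", "c"], ["b", "d"]]
single = support_and_completeness({"a"}, transactions)
pair = support_and_completeness({"a", "b"}, transactions)
print(single, pair)
```

Under this assumed definition, the superset {a, b} has lower support than {a} but higher completeness. A measure like this therefore cannot be pruned with the usual anti-monotone rule, which is the difficulty the abstract mentions.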

A Sequential Pattern Mining based on Dynamic Weight in Data Stream (스트림 데이터에서 동적 가중치를 이용한 순차 패턴 탐사 기법)

  • Choi, Pilsun;Kim, Hwan;Kim, Daein;Hwang, Buhyun
    • KIPS Transactions on Software and Data Engineering / v.2 no.2 / pp.137-144 / 2013
  • Sequential pattern mining finds frequent patterns in a data set ordered by time. Within this field, dynamic weighted sequential pattern mining applies to computing environments that change over time, and it can be used in a variety of environments by applying changes in dynamic weight. In this paper, we propose a new sequence data mining method that mines stream data by applying dynamic weights. The method reduces the number of candidate patterns to be searched by using dynamic weights based on relative time order, and it finds frequent sequence patterns quickly as data arrives and leaves by using a hash structure. It uses less memory and processing time than existing methods. We show the importance of dynamic weighted mining by comparing it with sequential pattern mining techniques that use different weighting schemes.

A Comparison of Performance between STMP/MST and Existing Spatio-Temporal Moving Pattern Mining Methods (STMP/MST와 기존의 시공간 이동 패턴 탐사 기법들과의 성능 비교)

  • Lee, Yon-Sik;Kim, Eun-A
    • Journal of Internet Computing and Services / v.10 no.5 / pp.49-63 / 2009
  • The performance of spatio-temporal moving pattern mining depends on how the huge volume of spatio-temporal data is analyzed and processed. Several methods have been presented to solve the problems of existing spatio-temporal moving pattern mining methods[1-10], such as increased execution time and memory requirements during mining, but they have not solved them adequately. In preceding research, we therefore proposed the STMP/MST method[11] to extract sequential and/or periodic frequent moving patterns effectively from huge sets of spatio-temporal moving data. The proposed method reduces pattern mining execution time by using a moving sequence tree based on a hash tree. To minimize the required memory, it also generalizes detailed historical data, including spatio-temporal attributes, into real-world scopes of space and time by using a spatio-temporal concept hierarchy. In this paper, to verify the effectiveness of the STMP/MST method, we compare and analyze its performance against existing spatio-temporal moving pattern mining methods in terms of the quantity of mining data and the minimum support factor.

Efficient Dynamic Weighted Frequent Pattern Mining by using a Prefix-Tree (Prefix-트리를 이용한 동적 가중치 빈발 패턴 탐색 기법)

  • Jeong, Byeong-Soo;Farhan, Ahmed
    • The KIPS Transactions:PartD / v.17D no.4 / pp.253-258 / 2010
  • Traditional frequent pattern mining assigns an equal profit/weight value to every item. Weighted frequent pattern (WFP) mining has become an important research issue in data mining and knowledge discovery because it considers different weights for different items. Existing algorithms in this area are based on fixed weights. In real-world scenarios, however, the price/weight/importance of a pattern may vary frequently due to unavoidable circumstances. Tracking these dynamic changes is necessary in application areas such as retail market basket analysis and web click stream management. In this paper, we propose a novel concept of dynamic weight and an algorithm, DWFPM (dynamic weighted frequent pattern mining). Our algorithm can handle situations in which the price/weight of a pattern varies dynamically. It scans the database exactly once and is also suitable for real-time data processing. To our knowledge, this is the first research work to mine weighted frequent patterns using dynamic weights. Extensive performance analyses show that our algorithm is efficient and scalable for WFP mining using dynamic weights.
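To make the idea of a weight that changes over time concrete, here is a minimal sketch. It assumes the database arrives in batches, that each item has its own weight in each batch, and that a pattern's weight in a batch is the average weight of its items. The dynamic weighted support is then the sum over batches of pattern weight times pattern frequency. These definitions, the name `dynamic_weighted_support`, and the data are assumptions for illustration. The abstract does not give the formula, and the sketch does not reproduce the prefix-tree structure of DWFPM.

```python
def dynamic_weighted_support(pattern, batches):
    """Sum over batches of (pattern weight in batch) * (frequency in batch).

    batches: list of (transactions, weights) pairs, where `weights` maps
    each item to its weight during that batch (assumed layout).
    """
    pattern = frozenset(pattern)
    total = 0.0
    for transactions, weights in batches:
        # Number of transactions in this batch that contain the pattern.
        frequency = sum(1 for t in transactions if pattern <= set(t))
        # Assumed pattern weight: average of its item weights in this batch.
        pattern_weight = sum(weights[i] for i in pattern) / len(pattern)
        total += pattern_weight * frequency
    return total


# Hypothetical data: item weights differ between the two batches.
batches = [
    ([["a", "b"], ["a", "c"], ["a", "b", "c"]], {"a": 0.5, "b": 0.25, "c": 1.0}),
    ([["a", "b"], ["b", "c"]], {"a": 1.0, "b": 0.5, "c": 0.25}),
]
dws = dynamic_weighted_support({"a", "b"}, batches)
print(dws)
```

Here the pattern {a, b} occurs twice in the first batch at weight 0.375 and once in the second at weight 0.75, so both batches contribute equally even though the frequencies differ. A pattern would be reported when this value reaches a chosen threshold.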

Extracting Common Structure of Semistructured data Using mining frequent patterns (빈발 패턴 탐사 기법을 이용한 반구조적 데이터로부터의 공통구조 추출)

  • 이영언;문봉희
    • Proceedings of the Korean Information Science Society Conference / 2000.10a / pp.302-304 / 2000
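  • With the growth of the Internet, an enormous amount of data exists on the Web, but most of it is semistructured data with an irregular structure. Such data has no precisely defined structure, yet it contains incomplete and irregular structural information. A need has therefore arisen to raise its value as information by extracting common structure information that can identify the relationships among the data and by structuring it effectively. In addition, to take advantage of databases with well-defined structure during data processing, common structure must be extracted from the incomplete structural information of a semistructured data set. In this study, we aim to extract common structure from a semistructured data set by using a frequent pattern mining technique that does not generate candidate itemsets.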

Evaluation Of Improved Usage Profiles Using Frequency Support Threshold In Clusters (클러스터 내부 빈발 지지도를 이용한 개선된 사용 프로파일 평가)

  • 안계순;이필규
    • Proceedings of the Korean Information Science Society Conference / 2002.10d / pp.277-279 / 2002
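  • Web usage mining based on Web logs can partly solve the problems of collaborative filtering: reliance on explicit ratings, lack of scalability, and poor performance on high-dimensional and sparse data. However, the similar user navigation patterns within clusters produced by k-Means clustering cannot represent the preferences of all users in a cluster, so usage profiles must be derived from the transactions that make up the user navigation patterns. In this paper, we generate improved usage profiles from the data within each cluster by using mean weights and a frequency support threshold, so that the profiles can represent the interests and preferences of users in similar clusters, and we evaluate their predictive power and recommendation performance on experimental data.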

Discovery of Frequent Traversal Patterns from Weighted Traversals and Performance Enhancement by Traversal Split (가중치 순회로부터 빈발 순회패턴의 탐사 및 순회분할을 통한 성능향상)

  • Lee, Seong-Dae;Park, Hyu-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.5 / pp.940-948 / 2007
  • Many real-world problems can be modeled as a graph and traversals on the graph. For example, the structure of Web pages can be represented as a graph, and a user's navigation path through the pages can be modeled as a traversal on the graph. It is interesting to discover valuable patterns, such as frequent patterns, from such traversals. In this paper, we propose an algorithm to discover frequent traversal patterns when a directed graph and weighted traversals on the graph are given. Furthermore, we propose a performance enhancement based on traversal split and verify it through experiments.