• Title/Summary/Keyword: frequent pattern (빈발 패턴)

An Efficient Mining for Closed Frequent Sequences (효율적인 닫힌 빈발 시퀀스 마이닝)

  • Kim, Hyung-Geun;Whang, Whan-Kyu
    • Journal of Industrial Technology / v.25 no.A / pp.163-173 / 2005
  • Recent sequential pattern mining algorithms mine all of the frequent sequences that satisfy a minimum support threshold in a large database. However, when a frequent sequence becomes very long, such mining generates an explosive number of frequent sequences, which is prohibitively expensive in time. In this paper, we propose a novel sequential pattern mining algorithm that uses only closed frequent sequences, which form a small subset of the very large set of frequent sequences. Our algorithm extends sequences using a depth-first search strategy with effective pruning. Using a bitmap representation of the underlying database, it obtains closed frequent sequences considerably faster than currently reported methods.
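To make the notion of a closed frequent sequence concrete, the following is a minimal brute-force sketch, not the depth-first bitmap algorithm of the paper. It assumes a sequence is a tuple of single items and that a frequent sequence is closed when no longer frequent sequence containing it has the same support. All function names and the toy database are hypothetical.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Number of database sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)

def frequent_sequences(database, min_support):
    """Brute-force enumeration; only suitable for tiny illustrative inputs."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    counts = {p: support(p, database) for p in candidates}
    return {p: s for p, s in counts.items() if s >= min_support}

def closed_sequences(frequent):
    """Keep patterns with no longer super-sequence of equal support."""
    closed = {}
    for p, s in frequent.items():
        absorbed = any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
        if not absorbed:
            closed[p] = s
    return closed

# Hypothetical toy database of three sequences.
database = [("a", "b", "c"), ("a", "b", "c"), ("a", "c")]
frequent = frequent_sequences(database, min_support=2)
closed = closed_sequences(frequent)
print(len(frequent), "frequent,", len(closed), "closed")
```

On this toy input the seven frequent sequences collapse to two closed ones, `("a", "c")` and `("a", "b", "c")`, which illustrates why mining only closed sequences shrinks the output.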

High Utility Pattern Mining using a Prefix-Tree (Prefix-Tree를 이용한 높은 유틸리티 패턴 마이닝 기법)

  • Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan;Lee, In-Gi;Yong, Hwan-Seong
    • Journal of KIISE:Databases / v.36 no.5 / pp.341-351 / 2009
  • High utility pattern (HUP) mining has recently become one of the most important research issues in data mining, since it can take the different weight values of items into account. However, existing mining algorithms suffer from performance degradation because they cannot easily apply the Apriori principle to pattern mining. In this paper, we introduce a new high utility pattern mining approach that uses a prefix-tree, as in the FP-Growth algorithm. Our approach stores the weight value of each item in a node and uses these values to prune unnecessary patterns. We compare the performance characteristics of three different prefix-tree structures. Through thorough experimentation, we also show that our approach improves performance to some degree.
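The difficulty with the Apriori principle mentioned above can be illustrated with a small sketch. It assumes the common formulation in which an item's utility in a transaction is quantity times unit profit, and it uses transaction-weighted utility (TWU) as an upper bound for pruning. The abstract does not say which bound the authors use, so TWU, the function names, and the data are assumptions for illustration only.

```python
def transaction_utility(transaction, profit):
    """Total utility of one transaction (item -> quantity)."""
    return sum(qty * profit[item] for item, qty in transaction.items())

def itemset_utility(itemset, database, profit):
    """Exact utility of `itemset`, summed over transactions containing it."""
    total = 0
    for t in database:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total

def twu(itemset, database, profit):
    """Transaction-weighted utility: an upper bound on the itemset's utility.

    Unlike exact utility, TWU never grows when an item is added, so it can
    be used for Apriori-style pruning.
    """
    return sum(
        transaction_utility(t, profit)
        for t in database
        if all(item in t for item in itemset)
    )

# Hypothetical unit profits and transactions.
profit = {"a": 5, "b": 2, "c": 1}
database = [{"a": 1, "b": 2}, {"b": 4, "c": 3}, {"a": 2, "c": 1}]

# Exact utility is not anti-monotone: {"a", "c"} is worth more than {"c"}.
print(itemset_utility(("c",), database, profit))      # 4
print(itemset_utility(("a", "c"), database, profit))  # 11
# TWU is anti-monotone: the superset never exceeds the subset.
print(twu(("c",), database, profit))                  # 22
print(twu(("a", "c"), database, profit))              # 11
```

An itemset whose TWU falls below the utility threshold can be discarded together with all of its supersets, which is the kind of pruning a prefix-tree node can support when it stores an accumulated weight value.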

Optimal Moving Pattern Extraction of the Moving Object for Efficient Resource Allocation (효율적 자원 배치를 위한 이동객체의 최적 이동패턴 추출)

  • Cho, Ho-Seong;Nam, Kwang-Woo;Jang, Min-Seok;Lee, Yon-Sik
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.689-692 / 2021
  • This paper is a preliminary study on improving the efficiency of mobile-agent-based offloading, with the aim of optimizing the allocation of computing resources and reducing latency for application services that run close to users in a Fog/Edge Computing (FEC) environment. We propose an algorithm that effectively reduces the execution time and the amount of memory required when extracting optimal moving patterns from the vast set of spatio-temporal movement history data of moving objects. Through frequency-based optimal path extraction, the proposed algorithm can be useful for distributing and deploying computing resources for computation offloading in future FEC environments.

Dynamic Link Recommendation Based on Anonymous Weblog Mining (익명 웹로그 탐사에 기반한 동적 링크 추천)

  • Yoon, Sun-Hee;Oh, Hae-Seok
    • The KIPS Transactions:PartC / v.10C no.5 / pp.647-656 / 2003
  • In Webspace, mining traversal patterns means understanding users' path traversal patterns. This kind of mining has a unique characteristic: objects (for example, URLs) may be visited because of their positions rather than their contents, since users move to other objects according to the information services provided. As a consequence, it becomes very complex to extract meaningful information from these data. Discovering traversal patterns has recently become an important problem in data mining because of the increasing amount of research on improving the quality of information services. This paper presents a Dynamic Link Recommendation (DLR) algorithm that recommends link sets on a Web site by mining frequent traversal patterns. It can be applied to any Web site with massive amounts of data. Our experiments with two real Weblog data sets clearly show that our method outperforms the traditional method.
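The idea of recommending links from frequent traversal patterns can be sketched as follows. This is not the DLR algorithm itself, whose details the abstract does not give. It is a simplified illustration that assumes sessions are ordered lists of URLs and considers only length-2 patterns (page-to-page transitions); the function names and session data are hypothetical.

```python
from collections import Counter, defaultdict

def mine_transitions(sessions, min_support):
    """Count page-to-page transitions, once per session, and keep frequent ones."""
    counts = defaultdict(Counter)
    for session in sessions:
        seen = set()
        for src, dst in zip(session, session[1:]):
            if (src, dst) not in seen:
                counts[src][dst] += 1
                seen.add((src, dst))
    return {
        src: Counter({dst: c for dst, c in nxt.items() if c >= min_support})
        for src, nxt in counts.items()
    }

def recommend(model, page, k=2):
    """Return up to k links most frequently followed from `page`."""
    return [dst for dst, _ in model.get(page, Counter()).most_common(k)]

# Hypothetical anonymous Weblog sessions.
sessions = [
    ["/", "/news", "/sports"],
    ["/", "/news", "/weather"],
    ["/", "/shop"],
    ["/news", "/sports"],
]
model = mine_transitions(sessions, min_support=2)
print(recommend(model, "/"))      # ['/news']
print(recommend(model, "/news"))  # ['/sports']
```

Because the model is keyed only by the current page, it needs no user identity, which fits the anonymous Weblog setting described in the title.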

Precision Analysis of the STOMP(FW) Algorithm According to the Spatial Conceptual Hierarchy (공간 개념 계층에 따른 STOMP(FW) 알고리즘의 정확도 분석)

  • Lee, Yon-Sik;Kim, Young-Ja;Park, Sung-Sook
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.12 / pp.5015-5022 / 2010
  • Most existing pattern mining techniques can search for patterns that follow the continuous change of an object's spatial information, but they place no constraint on the spatial information that must be included in the extracted pattern. Thus, the existing techniques are not applicable to optimal path search between specific nodes, or to path prediction that considers the nodes a moving object is required to visit during a unit of time. In this paper, the precision of path search according to the spatial hierarchy is analyzed using the Spatial-Temporal Optimal Moving Pattern (with Frequency & Weight) (STOMP(FW)) algorithm, which searches for the optimal moving path by considering the most frequent pattern together with other weighted factors such as time and cost. The analysis shows that database retrieval time is minimized by narrowing the retrieval range through spatial constraints. In addition, the optimal moving pattern is obtained efficiently by checking whether the moving pattern is included in each hierarchical spatial scope of the spatial hierarchy.

Temporal Pattern Mining of Moving Objects for Location based Services (위치 기반 서비스를 위한 이동 객체의 시간 패턴 탐사 기법)

  • Lee, Jun-Uk;Baek, Ok-Hyeon;Ryu, Geun-Ho
    • Journal of KIISE:Databases / v.29 no.5 / pp.335-346 / 2002
  • Location Based Services (LBS) provide location-based information to their mobile users. The primary function of these services is to provide useful information to users at a minimum cost in resources. This function can be implemented through data mining techniques. However, conventional data mining research has not considered the spatial and temporal aspects of data simultaneously. Therefore, these techniques are inappropriate for the objects of LBS, whose spatial attributes change over time. In this paper, we propose a new data mining technique for identifying temporal patterns in the series of locations of moving objects, which have both temporal and spatial dimensions. We use the spatial operation contains to generalize the location of a moving point, and we apply time constraints between the locations of a moving object to form valid moving sequences. The spatio-temporal technique proposed in this paper is a very practical approach, not only providing more useful knowledge to LBS but also improving the quality of the services.
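The two preprocessing steps named in the abstract, generalizing a point with a contains test and applying a time constraint between locations, might look like the sketch below. It assumes regions are axis-aligned rectangles and that a sequence is cut whenever the gap between consecutive observations exceeds a maximum. The region shapes, the `max_gap` parameter, and all names are assumptions, not the paper's definitions.

```python
def contains(region, point):
    """Spatial `contains` test for an axis-aligned rectangle."""
    xmin, ymin, xmax, ymax = region
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax

def generalize(point, regions):
    """Map a raw coordinate to the name of the first region containing it."""
    for name, box in regions.items():
        if contains(box, point):
            return name
    return None

def moving_sequences(track, regions, max_gap):
    """Split a time-ordered track into valid moving sequences of region names."""
    sequences, current, last_time = [], [], None
    for t, point in track:
        area = generalize(point, regions)
        if area is None:
            continue  # point lies outside every known region
        if last_time is not None and t - last_time > max_gap:
            if current:
                sequences.append(current)
            current = []  # time constraint violated: start a new sequence
        if not current or current[-1] != area:
            current.append(area)
        last_time = t
    if current:
        sequences.append(current)
    return sequences

# Hypothetical regions (xmin, ymin, xmax, ymax) and one object's track (time, (x, y)).
regions = {"A": (0, 0, 10, 10), "B": (20, 0, 30, 10)}
track = [(0, (1, 1)), (5, (2, 3)), (10, (25, 5)), (100, (26, 5)), (105, (3, 3))]
print(moving_sequences(track, regions, max_gap=30))  # [['A', 'B'], ['B', 'A']]
```

The resulting region-name sequences can then be passed to any sequential pattern miner to find frequent moving patterns.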

A proper folder recommendation technique using frequent itemsets for efficient e-mail classification (효과적인 이메일 분류를 위한 빈발 항목집합 기반 최적 이메일 폴더 추천 기법)

  • Moon, Jong-Pil;Lee, Won-Suk;Chang, Joong-Hyuk
    • Journal of the Korea Society of Computer and Information / v.16 no.2 / pp.33-46 / 2011
  • Since e-mail has become an important means of communication and information sharing, there has been much effort to classify e-mails efficiently by their contents. E-mails vary in length and style, and the words used in them are usually irregular. In addition, the criteria for e-mail classification are subjective. As a result, it is quite difficult to apply conventional text classification techniques to e-mail classification efficiently. The classification technique in a commercial e-mail program uses simple text filtering in the e-mail client. In previous studies on automatic e-mail classification, the probability-based Naive Bayesian technique has been used to improve classification accuracy, and most of these studies deal with e-mails in English. This paper proposes a personalized folder recommendation technique for e-mails in Korean, using frequent pattern mining. The proposed technique consists of two phases: pre-processing the e-mails in an e-mail folder, and generating a profile for the folder. The generated profile is used to classify an e-mail into the most appropriate folder according to the user's subjective criteria. An e-mail classification system that adopts the proposed technique is also implemented.
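The two phases described above can be sketched roughly as follows. This is only an illustration: it assumes e-mails are already tokenized into word lists (Korean morphological analysis is out of scope here), builds each folder profile from frequent word sets of size one and two, and scores a new e-mail by the summed support of the profile itemsets it contains. The scoring rule, function names, and sample data are assumptions rather than the paper's method.

```python
from collections import Counter
from itertools import combinations

def folder_profile(emails, min_support=0.5, max_size=2):
    """Phase 2: frequent word sets of a folder, mapped to their relative support."""
    counts = Counter()
    for words in emails:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    threshold = min_support * len(emails)
    return {
        itemset: c / len(emails)
        for itemset, c in counts.items()
        if c >= threshold
    }

def recommend_folder(words, profiles):
    """Pick the folder whose profile itemsets best match the e-mail's words."""
    present = set(words)

    def score(profile):
        return sum(
            sup for itemset, sup in profile.items()
            if present.issuperset(itemset)
        )

    return max(profiles, key=lambda name: score(profiles[name]))

# Hypothetical pre-processed (tokenized) e-mails per folder.
folders = {
    "billing": [["invoice", "payment", "due"], ["invoice", "payment"], ["payment", "receipt"]],
    "travel": [["flight", "hotel"], ["flight", "booking"], ["hotel", "booking", "flight"]],
}
profiles = {name: folder_profile(mails) for name, mails in folders.items()}
print(recommend_folder(["please", "send", "invoice", "payment"], profiles))  # billing
print(recommend_folder(["flight", "hotel", "tomorrow"], profiles))           # travel
```

Because each profile is built only from the user's own folders, the recommendation reflects that user's subjective filing habits rather than a global category scheme.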

An Efficient Method to Find Accurate Spot-matching Patterns in Protein 2-DE Image Analysis (단백질 2-DE 이미지 분석에서 정확한 스팟 매칭 패턴 검색을 위한 효과적인 방법)

  • Jin, Yan-Hua;Lee, Won-Suk
    • Journal of KIISE:Computing Practices and Letters / v.16 no.5 / pp.551-555 / 2010
  • In protein 2-DE image analysis, the accuracy of the spot-matching operation, which identifies the spot of the same protein in each 2-DE gel image, is strongly influenced by errors caused by varying experimental conditions. This paper proposes an efficient method for finding more accurate spot-matching patterns based on multiple reference gel images. Additionally, in order to reduce the execution time, which increases exponentially with the number of gel images, a "partition then extension" framework is used to find longer and more accurate spot-matching patterns. Experiments on real 2-DE images of human liver tissue confirm the accuracy and efficiency of the proposed algorithm.

Anomaly Detection using Temporal Association Rules and Classification (시간연관규칙과 분류규칙을 이용한 비정상행위 탐지 기법)

  • Lee, Hohn-Gyu;Lee, Yang-Woo;Kim, Lyong;Seo, Sung-Bo;Ryu, Keun-Ho;Park, Jin-Soo
    • Proceedings of the Korea Information Processing Society Conference / 2003.05c / pp.1579-1582 / 2003
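  • Intrusion attempts on networks are steadily increasing and diversifying, which makes intrusion detection considerably more difficult. Data mining techniques are applied to give systems the ability to detect new intrusions and to analyze large volumes of audit data efficiently. Among intrusion detection methods, anomaly detection treats behavior that deviates from modeled normal behavior as an attack. Association rules and frequent episodes have been applied to model normal behavior in anomaly detection. However, these techniques either exclude the time element or handle only the order in which patterns occur, so they cannot provide accurate and useful information. This paper therefore proposes an anomaly detection model that addresses this problem by using temporal association rules and classification rules. Specifically, a normal-behavior profile is generated from temporal association rules, which carry a useful temporal knowledge representation based on the periodicity of occurring patterns and on calendar expressions. Rules regarded as anomalous according to this profile are then discovered, and a classification technique is applied to decide more accurately whether the behavior is anomalous.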

Mining Frequent Trajectory Patterns in RFID Data Streams (RFID 데이터 스트림에서 이동궤적 패턴의 탐사)

  • Seo, Sung-Bo;Lee, Yong-Mi;Lee, Jun-Wook;Nam, Kwang-Woo;Ryu, Keun-Ho;Park, Jin-Soo
    • Journal of Korea Spatial Information System Society / v.11 no.1 / pp.127-136 / 2009
  • This paper proposes an on-line algorithm for mining moving trajectory patterns in RFID data streams, taking into account both characteristics that change over time and the constraint of a single-pass data scan. As RFID, sensor, and mobile network technologies have developed rapidly, many researchers have recently focused on gathering real-time data from the real world and mining useful patterns from it. Previous research on sequential patterns or moving trajectory patterns in stream data is extremely time-consuming because of multi-pass database scans and tree traversal, and it also does not consider the time-changing characteristics of stream data. The proposed method preserves the sequential strength of length-2 frequent patterns in a binary relationship table, using a time-evolving graph to reflect changes in the RFID data stream accurately over time. In addition, to avoid repetitive data scans, the proposed algorithm infers candidate length-k moving trajectory patterns in advance at time point t, and then extracts the patterns after screening the candidates in only one pass at time point t+1. In experiments, the proposed method outperforms the Apriori-like method in both time and space complexity, with a candidate-set reduction ratio of about 7 percent.
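A much-simplified sketch of the single-pass idea is shown below. It assumes each stream record is a (tag, reader location) pair, keeps a table of counts for length-2 patterns, and infers candidate length-3 patterns by chaining frequent pairs. The time-evolving graph and the screening step at t+1 are not modeled, and the class, method names, and stream data are hypothetical.

```python
from collections import Counter

class PairTable:
    """One-pass table of length-2 trajectory patterns (location -> next location)."""

    def __init__(self):
        self.last_location = {}       # tag id -> most recent reader location
        self.pair_counts = Counter()  # (from, to) -> number of observed moves

    def observe(self, tag_id, location):
        """Process one stream record without revisiting earlier data."""
        prev = self.last_location.get(tag_id)
        if prev is not None and prev != location:
            self.pair_counts[(prev, location)] += 1
        self.last_location[tag_id] = location

    def frequent_pairs(self, min_count):
        return {p for p, c in self.pair_counts.items() if c >= min_count}

    def candidate_triples(self, min_count):
        """Infer length-3 candidates by joining pairs (a, b) and (b, c)."""
        pairs = self.frequent_pairs(min_count)
        return {(a, b, c) for (a, b) in pairs for (b2, c) in pairs if b == b2}

# Hypothetical RFID stream of (tag id, reader location) records.
stream = [
    ("t1", "dock"), ("t2", "dock"), ("t1", "belt"), ("t2", "belt"),
    ("t1", "shelf"), ("t2", "shelf"), ("t3", "dock"), ("t3", "shelf"),
]
table = PairTable()
for tag_id, location in stream:
    table.observe(tag_id, location)

print(table.frequent_pairs(min_count=2))
print(table.candidate_triples(min_count=2))  # {('dock', 'belt', 'shelf')}
```

Generating candidates from the pair table alone means longer patterns can be proposed before the next batch of data arrives, so only one further pass is needed to confirm or discard them.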
