• Title/Summary/Keyword: Frequent Pattern Mining


Adapted Sequential Pattern Mining Algorithms for Business Service Identification (비즈니스 서비스 식별을 위한 변형 순차패턴 마이닝 알고리즘)

  • Lee, Jung-Won
    • Journal of the Korea Society of Computer and Information / v.14 no.4 / pp.87-99 / 2009
  • The top-down method for SOA delivery is recommended as the best way to take advantage of SOA. The core step of SOA delivery is service modeling, which includes ontology-based service analysis and design. Most enterprises know that the top-down approach is best, but they hesitate to adopt it because it demands a great deal of time and money without showing immediate results, particularly when they already use well-defined component-based systems. In this paper, we propose a bottom-up service identification method that makes maximal use of well-defined components. We assume that user inputs generate events on a GUI and that an approximate business process can be obtained by concatenating the event paths. We first find the core GUIs, which have many outgoing event calls, and form event paths by concatenating the event calls between the GUIs. Next, we adapt sequential pattern mining algorithms to find the maximal frequent event paths. In an experiment, we obtained business services of various granularity by applying a cohesion metric to the extracted frequent event paths.

Building Energy Time Series Data Mining for Behavior Analytics and Forecasting Energy consumption

  • Balachander, K;Paulraj, D
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.1957-1980 / 2021
  • The aim of this research is to evaluate mechanisms for efficient, usage-aware energy consumption in home devices, thereby improving the information that smart metering systems provide about the consumption of selected homes and the time of use. Advances in information processing are commonly used to analyze very large volumes of building activity data in order to improve the operating efficiency of building energy systems. Here, data mining models are presented to measure and predict energy time series in order to expose different temporal patterns of energy use. These patterns describe appliance use in relation to time, such as hour of day, time of day, week, month, and year within a household, and are key to capturing and separating the effect of consumer behavior on energy use and on energy prediction. It is necessary to determine the multiple relations among the uses of different appliances from simultaneous information flows. However, relations among interval-based instances, where the use of multiple appliances continues for a certain duration, are difficult to determine. To resolve these difficulties, unsupervised clustering of energy time-series data, frequent pattern mining, and a deep learning technique for estimating energy use are presented. An extensive test was conducted using real data sets rich in smart meter data. The appliance usage patterns recognized by the proposed model were processed by deep Convolutional Neural Networks (CNN) and Recurrent Neural Networks (LSTM and GRU) at each stage, with consolidated accuracy of 94.79%, 97.99%, and 99.61% for 25%, 50%, and 75%, respectively.

Design and Implementation of Sequential Pattern Miner to Analyze Alert Data Pattern (경보데이터 패턴 분석을 위한 순차 패턴 마이너 설계 및 구현)

  • Shin, Moon-Sun;Paik, Woo-Jin
    • Journal of Internet Computing and Services / v.10 no.2 / pp.1-13 / 2009
  • Intrusion detection is a process that identifies attacks and responds to malicious intrusion actions in order to protect computer and network resources. With the rapid development of the Internet, intrusion types have recently become more complex and demand immediate and correct responses, because new intrusion types appear increasingly often. To address these problems in intrusion detection systems, we propose a sequential pattern miner that analyzes alert data to support intelligent and automatic intrusion detection. Sequential pattern mining is a method for finding patterns among extracted items that occur frequently in fixed sequences. We apply the PrefixSpan algorithm to find the alert sequences. This method can be used to predict the actions of the sequential patterns and to create intrusion rules. In this paper, we propose an extended PrefixSpan algorithm designed to account for the specific characteristics of alert data. The extended sequential pattern miner will be used as part of the alert data analyzer of intrusion detection systems. Using the rules created by the sequential pattern miner, the HA (high-level alert analyzer) of the PEP (policy enforcement point), usually called the IDS, predicts sequence behaviors and changing patterns that were not visibly checked.
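
The prefix-projection idea behind PrefixSpan can be sketched as follows. This is a minimal illustration of the standard algorithm for sequences of single items, not the extended miner described in the abstract; the alert names and the `min_support` value are hypothetical.

```python
from collections import Counter


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Each sequence is a list of single items (no itemsets), and support is
    the number of sequences that contain the pattern as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the suffix
        # of each sequence that follows the first match of the prefix.
        counts = Counter()
        for seq_idx, start in projected:
            counts.update(set(sequences[seq_idx][start:]))
        for item in sorted(counts):
            if counts[item] < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = counts[item]
            # Build the projected database for the extended prefix.
            next_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                if item in seq[start:]:
                    next_projected.append((seq_idx, seq.index(item, start) + 1))
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical alert sequences, one list per monitored session.
alerts = [
    ["scan", "login_fail", "root_shell"],
    ["scan", "login_fail"],
    ["login_fail", "scan", "root_shell"],
    ["scan", "root_shell"],
]
patterns = prefixspan(alerts, min_support=3)
```

With these toy sequences, `patterns` contains the three single alerts plus the two-step pattern `("scan", "root_shell")`, which is the kind of ordered regularity a rule generator could build on.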


Analysis of Traffic Card Big Data by Hadoop and Sequential Mining Technique (하둡과 순차패턴 마이닝 기술을 통한 교통카드 빅데이터 분석)

  • Kim, Woosaeng;Kim, Yong Hoon;Park, Hee-Sung;Park, Jin-Kyu
    • Journal of Information Technology Applications and Management / v.24 no.4 / pp.187-196 / 2017
  • It is urgent to prepare countermeasures for traffic congestion in Korea's metropolitan area, where central functions such as economy, society, culture, and education are excessively concentrated. Most users of public transportation in metropolitan areas, including Seoul, use traffic cards. If various information is extracted from the traffic big data produced by these cards, it can provide basic data for transport policies, land use, or facility plans. Therefore, in this study, we extract valuable information such as subway passengers' frequent travel patterns from the traffic big data provided by the Seoul Metropolitan Government Big Data Campus. For this, we use Hadoop to preprocess the big data and store it in a Mongo database, and then analyze it with a sequential pattern mining technique. Since we analyze actual big data, namely the traffic card data provided by the Seoul Metropolitan Government Big Data Campus, the results can serve as important reference data when the Seoul government plans metropolitan traffic policies.

In-memory Compression Scheme Based on Incremental Frequent Patterns for Graph Streams (그래프 스트림 처리를 위한 점진적 빈발 패턴 기반 인-메모리 압축 기법)

  • Lee, Hyeon-Byeong;Shin, Bo-Kyoung;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.22 no.1 / pp.35-46 / 2022
  • Recently, with the development of network technologies and the active use of IoT and social network service applications, a large amount of graph stream data is being generated. In this paper, we propose a graph compression scheme that accounts for the graph stream environment by applying graph mining to existing compression techniques, which have focused on compression rate and runtime. The proposed technique compresses graph streams based on incremental frequent patterns. Because the scheme keeps only the latest reference patterns, it increases storage utilization and improves query processing time. To show the superiority of the proposed scheme, we evaluate its performance in terms of compression rate and processing time against the existing method. The proposed scheme is faster than a similar existing scheme when the amount of duplicated data is large.

Multi-parametric Diagnosis Indexes and Emerging Pattern based Classification Technique for Diagnosing Cardiovascular Disease (심혈관계 질환 진단을 위한 복합 진단 지표와 출현 패턴 기반의 분류 기법)

  • Lee, Heon-Gyu;Noh, Ki-Yong;Ryu, Keun-Ho;Jung, Doo-Young
    • The KIPS Transactions:PartD / v.16D no.1 / pp.11-26 / 2009
  • To diagnose cardiovascular disease, we propose an EP-based (emerging pattern-based) classification technique using multi-parametric diagnosis indexes. We analyzed linear and nonlinear features of HRV for three recumbent postures and extracted four diagnosis indexes from ST-segments to form the multi-parametric diagnosis indexes. A classification model using essential emerging patterns was applied to diagnose disease. This classification technique discovers disease patterns of the patient group; these emerging patterns are frequent in patients with cardiovascular disease but not frequent in the normal group. To evaluate the proposed classification algorithm, data from 120 patients with AP (angina pectoris), 13 patients with ACS (acute coronary syndrome), and 128 normal people were used. When the multi-parametric indexes were used, the accuracy in classifying the three groups was about 88.3%.
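
The notion of a pattern that is frequent in one group but not in another is usually quantified with a growth rate, the ratio of the pattern's support in the target group to its support in the background group. The sketch below assumes that standard definition; it does not reproduce the paper's "essential" emerging pattern selection, and the discretized feature names and toy records are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(itemset <= t for t in transactions) / len(transactions)


def growth_rate(itemset, background, target):
    """Support in the target group divided by support in the background."""
    s_background = support(itemset, background)
    s_target = support(itemset, target)
    if s_background == 0:
        # A pattern absent from the background is a "jumping" pattern.
        return float("inf") if s_target > 0 else 0.0
    return s_target / s_background


# Toy, made-up records: each set lists discretized index values.
patients = [
    {"hrv_low", "st_depressed"},
    {"hrv_low", "st_depressed"},
    {"hrv_low"},
    {"hrv_low", "st_depressed"},
]
normal = [
    {"hrv_low", "st_depressed"},
    {"hrv_normal"},
    {"hrv_normal", "st_depressed"},
    {"hrv_normal"},
]
rate = growth_rate({"hrv_low", "st_depressed"}, normal, patients)
```

A classifier built on this idea would keep itemsets whose growth rate exceeds a chosen threshold and score new records by the patterns they contain.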

A study of relationship between excrement and materia medica in Bangyakhappyeon based on the data mining analysis (데이터 마이닝을 이용한 대변과 약물간의 연관성 분석 -방약합편을 중심으로-)

  • Song, Young-Sup;Yang, Dong-Hoon;Park, Young-Jae;Park, Young-Bae
    • The Journal of the Society of Korean Medicine Diagnostics / v.16 no.2 / pp.33-46 / 2012
  • Purpose: Excrement-related disease marked by alternating constipation and diarrhea is increasing due to changes in diet, lack of exercise, and other factors. We analyzed Bangyakhappyeon to identify the materia medica used for excrement patterns. Methods: We constructed a database from Bangyakhappyeon consisting of disease patterns, the nature of medicinals, and materia medica. We analyzed the nature of medicinals for excrement patterns (or symptoms) by frequency analysis and network analysis, and searched for the main materia medica for excrement patterns (or symptoms) by frequency analysis and rule mining. Results: We analyzed the nature of medicinals for excrement patterns (or symptoms) in Bangyakhappyeon, and identified high-frequency materia medica, high-specificity materia medica, and frequently paired drugs as the main materia medica for these patterns. Conclusion: This study found information about the frequency relationship between excrement patterns (or symptoms) and materia medica.

A Review of Window Query Processing for Data Streams

  • Kim, Hyeon Gyu;Kim, Myoung Ho
    • Journal of Computing Science and Engineering / v.7 no.4 / pp.220-230 / 2013
  • In recent years, progress in hardware technology has made it possible to monitor many events in real time. The volume of incoming data may be so large that monitoring every individual record is intractable. Revisiting any particular record can also be impossible in this environment. Therefore, many database operations, such as aggregation, join, frequent pattern mining, and indexing, become more challenging in this context. This paper surveys previous efforts to resolve these issues in processing data streams. The emphasis is on specifying and processing sliding window queries, which are supported in many stream processing engines. We also review related work on stream query processing, including synopsis structures, plan sharing, operator scheduling, load shedding, and disorder control.
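
A count-based sliding window, one of the window types such surveys cover, can be sketched as below. The class and method names are hypothetical, and the example keeps exact counts over the last `size` items rather than the approximate synopses a real stream engine would typically use.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Exact item counts over the most recent `size` stream elements."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def add(self, item):
        # Admit the new element, then evict the oldest one if the window
        # has grown past its capacity. Each record is seen exactly once.
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.size:
            expired = self.window.popleft()
            self.counts[expired] -= 1
            if self.counts[expired] == 0:
                del self.counts[expired]

    def frequent(self, min_count):
        """Items that occur at least `min_count` times in the window."""
        return {item for item, c in self.counts.items() if c >= min_count}


counter = SlidingWindowCounter(size=4)
for event in ["a", "a", "b", "a", "c", "b", "b", "b"]:
    counter.add(event)
recent_frequent = counter.frequent(min_count=2)
```

After the eight events, the window holds only the last four, so `"a"` has expired and `"b"` is the only item meeting the threshold.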

An Incremental Updating Algorithm of Sequential Patterns (점진적인 순차 패턴 갱신 알고리즘)

  • Kim Hak-Ja;Whang Whan-Kyu
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.5 s.311 / pp.17-28 / 2006
  • In this paper, we investigate the problem of updating sequential patterns when new transactions are added to a database. We present an efficient updating algorithm for sequential pattern mining that incrementally processes the added transactions by reusing previously found frequent patterns. Our performance study shows that this method outperforms both the AprioriAll and PrefixSpan algorithms, which recompute patterns from scratch, because it can efficiently exploit the reduced candidate sets that result from the incremental updating technique.

An Implementation and Performance Characteristics of the FP-tree Association Rules Mining Algorithm (FP-tree 연관 규칙 탐사 알고리즘의 구현 및 성능 특성)

  • Lee, Hyung-Bong
    • Proceedings of the Korea Information Processing Society Conference / 2006.11a / pp.337-340 / 2006
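  • The FP-tree (Frequent Pattern Tree) association rule mining algorithm was proposed to improve overall performance by dramatically reducing the burden of database scans. However, because the FP-tree must store in the tree every transaction in the database that contains frequent items, it requires a correspondingly large amount of memory. In this paper, we used a Unix system, a general-purpose operating system, to examine the feasibility of the FP-tree algorithm in terms of memory usage and the resulting performance characteristics. The results show that the FP-tree algorithm poses no difficulty on systems with 512 MB to 1 GB of main memory, which is common in modern computers, but that memory consumption grows sharply with database characteristics such as transaction length rather than with database size or the number of frequent itemsets.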
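
To make the memory concern concrete, the sketch below builds an FP-tree with the usual two passes over the data and counts its nodes. It is a minimal illustration of the standard construction, not the implementation evaluated in the paper; the header table and the mining step are omitted, and the transactions and `min_support` value are made up.

```python
from collections import Counter


class FPNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree: one scan to count items, one scan to insert."""
    freq = Counter(item for t in transactions for item in set(t))
    freq = {item: c for item, c in freq.items() if c >= min_support}
    root = FPNode(None)
    for t in transactions:
        # Keep frequent items only, ordered by descending frequency so
        # that common prefixes share nodes (ties broken by item name).
        items = sorted((i for i in set(t) if i in freq),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, freq


def count_nodes(node):
    """Number of nodes below `node`, a rough proxy for memory use."""
    return sum(1 + count_nodes(child) for child in node.children.values())


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
root, freq = build_fp_tree(transactions, min_support=2)
node_total = count_nodes(root)
```

Longer transactions with little prefix overlap add one node per item, which is consistent with the abstract's observation that memory use depends on transaction length more than on the number of transactions.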
