• Title/Summary/Keyword: 순차정보 데이터 스트림

Search Result 18, Processing Time 0.033 seconds

Sequence Pattern Mining Using Meaning-based Transaction Structure for USN system (USN 환경에서 의미 기반 트랜잭션 구조를 이용한 순차 패턴 탐사 기법)

  • Choi, Pilsun;Kang, Donghyun;Kim, Hwan;Kim, Daein;Hwang, Buhyun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2012.04a
    • /
    • pp.1105-1108
    • /
    • 2012
  • 순차 패턴 탐사 기법은 순서를 갖는 패턴들의 집합 중에 빈발하게 발생하는 패턴을 찾아내는 기법이다. USN 환경에서 발생하는 스트림 데이터는 시간 속성을 갖는 이벤트들의 집합으로 표현할 수 있으며 순차 패턴 탐사 기법을 이용하여 유용한 정보를 탐사할 수 있다. 그러나 스트림 데이터 환경에서는 데이터가 무한하고 연속적으로 발생하기 때문에 모든 데이터를 저장하여 패턴을 탐사하는 기법을 적용하는 데는 문제가 있다. 이 논문에서는 향상된 데이터 처리방식을 사용하여 순차패턴을 탐사하는 스트림 데이터 마이닝 기법에 대하여 제안한다. 제안하는 기법은 의미 단위의 가변적 윈도우를 사용하여 스트림 데이터로부터 트랜잭션을 생성하고 이 트랜잭션들의 집합을 해시와 슬라이딩 윈도우를 사용하여 스트림 데이터의 순차 패턴을 탐사한다. 이를 이용한 제안 기법은 실시간 시스템에 적합하게 데이터 저장 공간 사용의 효율성을 높이고 신속하게 유용한 패턴을 탐사할 수 있다.

Mining Frequent Sequential Patterns over Sequence Data Streams with a Gap-Constraint (순차 데이터 스트림에서 발생 간격 제한 조건을 활용한 빈발 순차 패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.9
    • /
    • pp.35-46
    • /
    • 2010
  • Sequential pattern mining is one of the essential data mining tasks, and it is widely used to analyze data generated in various application fields such as web-based applications, E-commerce, bioinformatics, and USN environments. Recently data generated in the application fields has been taking the form of continuous data streams rather than finite stored data sets. Considering the changes in the form of data, many researches have been actively performed to efficiently find sequential patterns over data streams. However, conventional researches focus on reducing processing time and memory usage in mining sequential patterns over a target data stream, so that a research on mining more interesting and useful sequential patterns that efficiently reflect the characteristics of the data stream has been attracting no attention. This paper proposes a mining method of sequential patterns over data streams with a gap constraint, which can help to find more interesting sequential patterns over the data streams. First, meanings of the gap for a sequential pattern and gap-constrained sequential patterns are defined, and subsequently a mining method for finding gap-constrained sequential patterns over a data stream is proposed.

Mining Interesting Sequential Pattern with a Time-interval Constraint for Efficient Analyzing a Web-Click Stream (웹 클릭 스트림의 효율적 분석을 위한 시간 간격 제한을 활용한 관심 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.16 no.2
    • /
    • pp.19-29
    • /
    • 2011
  • Due to the development of web technologies and the increasing use of smart devices such as smart phone, in recent various web services are widely used in many application fields. In this environment, the topic of supporting personalized and intelligent web services have been actively researched, and an analysis technique on a web-click stream generated from web usage logs is one of the essential techniques related to the topic. In this paper, for efficient analyzing a web-click stream of sequences, a sequential pattern mining technique is proposed, which satisfies the basic requirements for data stream processing and finds a refined mining result. For this purpose, a concept of interesting sequential patterns with a time-interval constraint is defined, which uses not on1y the order of items in a sequential pattern but also their generation times. In addition, A mining method to find the interesting sequential patterns efficiently over a data stream such as a web-click stream is proposed. The proposed method can be effectively used to various computing application fields such as E-commerce, bio-informatics, and USN environments, which generate data as a form of data streams.

RSP-DS: Real Time Sequential Patterns Analysis in Data Streams (RSP-DS: 데이터 스트림에서의 실시간 순차 패턴 분석)

  • Shin Jae-Jyn;Kim Ho-Seok;Kim Kyoung-Bae;Bae Hae-Young
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.9
    • /
    • pp.1118-1130
    • /
    • 2006
  • Existed pattern analysis algorithms in data streams environment have researched performance improvement and effective memory usage. But when new data streams come, existed pattern analysis algorithms have to analyze patterns again and have to generate pattern tree again. This approach needs many calculations in real situation that needs real time pattern analysis. This paper proposes a method that continuously analyzes patterns of incoming data streams in real time. This method analyzes patterns fast, and thereafter obtains real time patterns by updating previously analyzed patterns. The incoming data streams are divided into several sequences based on time based window. Informations of the sequences are inputted into a hash table. When the number of the sequences are over predefined bound, patterns are analyzed from the hash table. The patterns form a pattern tree, and later created new patterns update the pattern tree. In this way, real time patterns are always maintained in the pattern tree. During pattern analysis, suffixes of both new pattern and existed pattern in the tree can be same. Then a pointer is created from the new pattern to the existed pattern. This method reduce calculation time during duplicated pattern analysis. And old patterns in the tree are deleted easily by FIFO method. The advantage of our algorithm is proved by performance comparison with existed method, MILE, in a condition that pattern is changed continuously. And we look around performance variation by changing several variable in the algorithm.

  • PDF

A Sequential Pattern Mining based on Dynamic Weight in Data Stream (스트림 데이터에서 동적 가중치를 이용한 순차 패턴 탐사 기법)

  • Choi, Pilsun;Kim, Hwan;Kim, Daein;Hwang, Buhyun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.137-144
    • /
    • 2013
  • A sequential pattern mining is finding out frequent patterns from the data set in time order. In this field, a dynamic weighted sequential pattern mining is applied to a computing environment that changes depending on the time and it can be utilized in a variety of environments applying changes of dynamic weight. In this paper, we propose a new sequence data mining method to explore the stream data by applying the dynamic weight. This method reduces the candidate patterns that must be navigated by using the dynamic weight according to the relative time sequence, and it can find out frequent sequence patterns quickly as the data input and output using a hash structure. Using this method reduces the memory usage and processing time more than applying the existing methods. We show the importance of dynamic weighted mining through the comparison of different weighting sequential pattern mining techniques.

Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach (발생 간격 기반 가중치 부여 기법을 활용한 데이터 스트림에서 가중치 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.55-75
    • /
    • 2010
  • Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks widely used in various application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. In general sequential pattern mining, only the generation order of data element in a sequence is considered, so that it can easily find simple sequential patterns, but has a limit to find more interesting sequential patterns being widely used in real world applications. One of the essential research topics to compensate the limit is a topic of weighted sequential pattern mining. In weighted sequential pattern mining, not only the generation order of data element but also its weight is considered to get more interesting sequential patterns. In recent, data has been increasingly taking the form of continuous data streams rather than finite stored data sets in various application fields, the database research community has begun focusing its attention on processing over data streams. The data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once to analyze the data stream, and the memory usage for data stream analysis should be restricted finitely although new data elements are continuously generated in a data stream. Moreover, newly generated data elements should be processed as fast as possible to produce the up-to-date analysis result of a data stream, so that it can be instantly utilized upon request. To satisfy these requirements, data stream processing sacrifices the correctness of its analysis result by allowing some error. Considering the changes in the form of data generated in real world application fields, many researches have been actively performed to find various kinds of knowledge embedded in data streams. They mainly focus on efficient mining of frequent itemsets and sequential patterns over data streams, which have been proven to be useful in conventional data mining for a finite data set. In addition, mining algorithms have also been proposed to efficiently reflect the changes of data streams over time into their mining results. However, they have been targeting on finding naively interesting patterns such as frequent patterns and simple sequential patterns, which are found intuitively, taking no interest in mining novel interesting patterns that express the characteristics of target data streams better. Therefore, it can be a valuable research topic in the field of mining data streams to define novel interesting patterns and develop a mining method finding the novel patterns, which will be effectively used to analyze recent data streams. This paper proposes a gap-based weighting approach for a sequential pattern and amining method of weighted sequential patterns over sequence data streams via the weighting approach. A gap-based weight of a sequential pattern can be computed from the gaps of data elements in the sequential pattern without any pre-defined weight information. That is, in the approach, the gaps of data elements in each sequential pattern as well as their generation orders are used to get the weight of the sequential pattern, therefore it can help to get more interesting and useful sequential patterns. Recently most of computer application fields generate data as a form of data streams rather than a finite data set. Considering the change of data, the proposed method is mainly focus on sequence data streams.

Predictive Convolutional Networks for Learning Stream Data (스트림 데이터 학습을 위한 예측적 컨볼루션 신경망)

  • Heo, Min-Oh;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.11
    • /
    • pp.614-618
    • /
    • 2016
  • As information on the internet and the data from smart devices are growing, the amount of stream data is also increasing in the real world. The stream data, which is a potentially large data, requires online learnable models and algorithms. In this paper, we propose a novel class of models: predictive convolutional neural networks to be able to perform online learning. These models are designed to deal with longer patterns as the layers become higher due to layering convolutional operations: detection and max-pooling on the time axis. As a preliminary check of the concept, we chose two-month gathered GPS data sequence as an observation sequence. On learning them with the proposed method, we compared the original sequence and the regenerated sequence from the abstract information of the models. The result shows that the models can encode long-range patterns, and can generate a raw observation sequence within a low error.

Dualistic OSIC System for Layered Modulation (계층 변조 기술을 위한 이원적인 순차적 간섭 제거 시스템)

  • Hwang, Deok-Kyu
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.46 no.12
    • /
    • pp.89-94
    • /
    • 2009
  • A transceiver employing layered modulation encodes two data layers with different levels of robustness and selectively decodes one or both of them regarding the reception conditions. Because of the inconsistency in encoding and decoding, conventional ordered successive interference cancellation (OSIC) can not be compatible. In this letter, we derive a dualistic OSIC architecture, which assigns distinct criteria for demodulation and interference cancellation (IC). The proposed system achieves a higher spectral efficiency while guaranteeing reliable communication.

Substream-based out-of-sequence packet scheduling for streaming stored media (저장매체 스트리밍에서 substream에 기초한 비순차 패킷 스케줄링)

  • Choi Su Jeong;Ahn Hee June;Kang Sang Hyuk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.10C
    • /
    • pp.1469-1483
    • /
    • 2004
  • We propose a packet scheduling algorithms for streaming media. We assume that the receiver periodically reports back the channel throughput. From the original video data, the importance level of a video packet is determined by its relative position within its group of pictures, taking into account the motion-texture discrimination and temporal scalability. Thus, we generate a number of nested substreams. Using feedback information from the receiver and statistical characteristics of the video, we model the streaming system as a queueing system, compute the run-time decoding failure probability of a Same in each substream based on effective bandwidth approach, and determine the optimum substream to be sent at that moment in time. Since the optimum substream is updated periodically, the resulting sending order is different from the original playback order. From experiments with real video data, we show that our proposed scheduling scheme outperforms the conventional sequential sending scheme.

H*-tree/H*-cubing-cubing: Improved Data Cube Structure and Cubing Method for OLAP on Data Stream (H*-tree/H*-cubing: 데이터 스트림의 OLAP를 위한 향상된 데이터 큐브 구조 및 큐빙 기법)

  • Chen, Xiangrui;Li, Yan;Lee, Dong-Wook;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD
    • /
    • v.16D no.4
    • /
    • pp.475-486
    • /
    • 2009
  • Data cube plays an important role in multi-dimensional, multi-level data analysis. Meeting on-line analysis requirements of data stream, several cube structures have been proposed for OLAP on data stream, such as stream cube, flowcube, S-cube. Since it is costly to construct data cube and execute ad-hoc OLAP queries, more research works should be done considering efficient data structure, query method and algorithms. Stream cube uses H-cubing to compute selected cuboids and store the computed cells in an H-tree, which form the cuboids along popular-path. However, the H-tree layoutis disorderly and H-cubing method relies too much on popular path.In this paper, first, we propose $H^*$-tree, an improved data structure, which makes the retrieval operation in tree structure more efficient. Second, we propose an improved cubing method, $H^*$-cubing, with respect to computing the cuboids that cannot be retrieved along popular-path when an ad-hoc OLAP query is executed. $H^*$-tree construction and $H^*$-cubing algorithms are given. Performance study turns out that during the construction step, $H^*$-tree outperforms H-tree with a more desirable trade-off between time and memory usage, and $H^*$-cubing is better adapted to ad-hoc OLAP querieswith respect to the factors such as time and memory space.