• Title/Summary/Keyword: Stream Query Processing

The Method for Real-time Complex Event Detection of Unstructured Big data (비정형 빅데이터의 실시간 복합 이벤트 탐지를 위한 기법)

  • Lee, Jun Heui;Baek, Sung Ha;Lee, Soon Jo;Bae, Hae Young
    • Spatial Information Research / v.20 no.5 / pp.99-109 / 2012
  • The growth of social media and the spread of smartphones have sharply increased the amount of data generated through SNS (Social Network Services). This has given rise to the Big Data concept, and many researchers are looking for ways to make the best use of big data. To maximize the value of the big data that companies hold, it must be combined with existing data. Because the physical and logical storage structures of the data sources differ so much, a system that can integrate and manage them is needed. MapReduce was developed to process big data quickly through distributed processing. However, it is difficult to build and store a system for all keywords, and because data must first be stored and then searched, real-time processing is difficult to some extent. Processing complex events without a structure for handling heterogeneous data also incurs extra cost. An existing Complex Event Processing system can be used to address this problem. Such a system takes data from different sources and combines them, which makes complex event processing possible and is especially useful for real-time processing of stream data. Nevertheless, unstructured text-based data from SNS and internet articles is managed as text, so strings must be compared every time a query is processed, which results in poor performance. We therefore aim to make it possible to manage unstructured data and process queries quickly in a complex event processing system. We extend the data complex function to give strings a logical schema, by filtering with a keyword set and converting string keywords to integer type. In addition, by using the Complex Event Processing system to process stream data in memory in real time, we aim to reduce the time spent reading data back for query processing after it has been stored on disk.
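
The keyword-set filtering described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword list, the ID assignment by list position, and the function names `encode_event` and `matches` are all assumptions made for illustration. The point it shows is that once string keywords are mapped to integers at ingestion time, a query can be answered with integer set operations instead of repeated string comparison.

```python
# Hypothetical keyword set; integer IDs are assigned by list position.
KEYWORD_SET = ["earthquake", "flood", "traffic", "outage"]
KEYWORD_IDS = {word: idx for idx, word in enumerate(KEYWORD_SET)}


def encode_event(text):
    """Filter a text event against the keyword set and return sorted integer IDs."""
    tokens = text.lower().split()
    return sorted({KEYWORD_IDS[t] for t in tokens if t in KEYWORD_IDS})


def matches(encoded_event, query_ids):
    """Check that every queried keyword ID occurs in the encoded event."""
    return query_ids.issubset(encoded_event)


event = encode_event("Flood warning after earthquake near the coast")
query = {KEYWORD_IDS["earthquake"], KEYWORD_IDS["flood"]}
print(event, matches(event, query))
```

Tokens that are not in the keyword set are dropped during encoding, so the stored event is both smaller and cheaper to compare than the original text.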

Efficient Data Management in RFID Applications

  • Cho, Yong-Jun;Bok, Kyoung-Soo;Park, Yong-Hun;Park, Hyeong-Soon;Park, Jun-Ho;Kang, Tae-Ho;Kim, Hak-Yong;Yoo, Jae-Soo
    • International Journal of Contents / v.5 no.1 / pp.46-50 / 2009
  • Logistics is attracting attention as one of many RFID applications. RFID technology is being actively applied to improve companies' competitiveness through integrated management of products and information. An RFID system generates a large volume of stream data, which wastes storage and leads to long processing times when the data is stored and queried. Many recent studies have tried to solve these problems in RFID systems. In this paper, we propose an efficient data management scheme for path queries and containment queries, which occur frequently. The proposed scheme accounts for changes in product containment during transport and supports the paths of products whose containment has changed by representing paths across different containments. In addition, compression that exploits the structure of the supply chain reduces the volume of stored data. To demonstrate the advantages of our approach, we compare it with existing schemes. The experimental results show that our scheme outperforms the existing schemes in storage efficiency and query processing time.

A Dual Processing Load Shedding to Improve The Accuracy of Aggregate Queries on Clustering Environment of GeoSensor Data Stream (클러스터 환경에서 GeoSensor 스트림 데이터의 집계질의의 정확도 향상을 위한 이중처리 부하제한 기법)

  • Ji, Min-Sub;Lee, Yeon;Kim, Gyeong-Bae;Bae, Hae-Young
    • Journal of the Korea Society of Computer and Information / v.17 no.1 / pp.31-40 / 2012
  • u-GIS DSMSs have been studied as a way to handle the varied sensor data produced by GeoSensors in ubiquitous environments, and high availability has become increasingly important for them. GeoSensor data can grow explosively, which can lead to memory overflow and data loss. Various load shedding methods have been studied to solve this problem. Traditional methods drop overloaded tuples on a single server according to a particular criterion. With such methods, queries that are sensitive to tuple deletion, such as aggregation, struggle to meet accuracy requirements. This paper proposes a dual processing load shedding method that improves aggregation accuracy in a clustering environment. In this method, two nodes hold replicated stream data for high availability. Taking advantage of the fact that they share the stream data, the two nodes process the stream together. Stream data is synchronized between the nodes one window at a time, and the processed results are then merged. As a result, query accuracy improves without data loss.

High-Performance Loading Method for Historical Spatial Query Processing in Data Stream System (데이터 스트림 시스템에서 과거 공간질의 처리를 위한 고속 로딩 기법)

  • Jae-Wan Shin;Sung-Ha Baek;Dong-Wook Lee;Soong-Sun Shin;Kyung-Bae Kim;Hae-Young Bae
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.397-400 / 2008
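  • Research on hybrid queries, which simultaneously process continuously generated real-time data and historical data stored on disk, is actively under way. Hybrid queries require fast disk I/O to process large volumes of spatial data stored on disk. Index techniques and data reduction techniques have been studied for processing such data. Index techniques for fast retrieval cannot reduce the seek cost and I/O cost for data that is stored scattered across the disk. Data reduction techniques, which cut disk I/O time through sampling, lower the accuracy of the data and are therefore hard to use in hybrid queries that require accuracy. This paper proposes a high-performance loading method for historical spatial query processing that reduces disk I/O time and disk seek time while guaranteeing accuracy. The proposed method divides space into a grid and manages adjacent spatial data together, which reduces disk I/O cost. It also stores spatially adjacent data in physically adjacent locations, which reduces disk seek time. Data stored this way is kept in full without loss, so accuracy can also be guaranteed.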
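
The grid-based idea in this abstract can be sketched in Python as follows. This is only an in-memory illustration under stated assumptions, not the paper's loading method: the cell size `CELL_SIZE` and the names `cell_of`, `build_grid`, and `range_query` are hypothetical, and a dictionary of lists stands in for contiguous disk blocks. It shows how grouping points by grid cell lets a range query touch only the cells that overlap the query box while still returning exact results.

```python
from collections import defaultdict

CELL_SIZE = 10.0  # hypothetical cell width and height in coordinate units


def cell_of(x, y, cell_size=CELL_SIZE):
    """Map a point to its (column, row) grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def build_grid(points, cell_size=CELL_SIZE):
    """Group points by cell so each cell's records could be stored contiguously."""
    grid = defaultdict(list)
    for x, y in points:
        grid[cell_of(x, y, cell_size)].append((x, y))
    return grid


def range_query(grid, xmin, ymin, xmax, ymax, cell_size=CELL_SIZE):
    """Visit only cells overlapping the query box, then filter points exactly."""
    c0, r0 = cell_of(xmin, ymin, cell_size)
    c1, r1 = cell_of(xmax, ymax, cell_size)
    hits = []
    for c in range(c0, c1 + 1):
        for r in range(r0, r1 + 1):
            for x, y in grid.get((c, r), []):
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append((x, y))
    return hits


grid = build_grid([(1, 1), (12, 3), (25, 25), (9.5, 9.5)])
print(range_query(grid, 0, 0, 15, 10))
```

No sampling is involved, so every stored point remains retrievable, which mirrors the accuracy guarantee the abstract emphasizes.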

Grouping Method Based Query Range Density for Efficient Operation Sharing of Spatial Range Query (공간영역질의의 효율적인 연산 공유를 위한 질의영역 밀집도 기반의 그룹화 기법)

  • Lim, Jung-Hyeun;Shin, Soong-Sun;Baek, Sung-Ha;Lee, Dong-Wook;Kim, Kyung-Bae;Bae, Hae-Young
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.348-351 / 2009
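  • u-GIS spatial information technology, a core technology for realizing the ubiquitous society, requires the u-GIS DSMS, a platform that combines a Data Stream Management System with a Geographic Information System. A u-GIS DSMS must handle many spatial range queries that combine sensor data collected from GeoSensors with GIS spatial data. Such spatial range queries tend to be registered densely in particular regions and are likely to have similar predicates. When spatial range queries are concentrated in a particular region, many similar operations are processed repeatedly, which degrades system performance. To address this, range query indexing techniques are being actively studied. However, the existing VCR-Index and CQI-Index techniques partition query regions into cell structures or virtual structures, so resources and operations cannot be shared and query processing slows down significantly, which makes them unsuitable for processing large numbers of spatial range queries. This paper therefore proposes a grouping method based on query range density for efficient operation sharing among spatial range queries. The method uses the density of query regions to group spatial range queries and then builds an index. The data for the indexed regions is organized into a single queue, and the queries' predicates are analyzed so that resource and operation sharing improves processing speed and reduces memory usage compared with existing techniques.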

Using Skylines on Wavelet Synopses for CKNN Queries over Distributed Streams Processing

  • Wang, Ling;Zhou, TieHua;Kim, Kwang-Deuk;Lee, Yang-Koo;Ryu, Keun-Ho
    • Journal of Korea Spatial Information System Society / v.11 no.2 / pp.7-12 / 2009
  • In this paper, we discuss the problem of continuous k-nearest neighbors (CKNN) monitoring over wavelet synopses of distributed streams, also taking into account the sliding window structure of stream-based kNN queries. We extend traditional skyline techniques and propose a new method, called DR-skylines, to process CKNN queries in a bandwidth-efficient way. It processes CKNN queries on synopses to optimize the time and space cost of sliding window computation.

Supporting Sliding Windows of Trigger for Continuous Query Processing System (연속질의 처리 시스템을 위한 트리거의 슬라이딩 윈도우 지원)

  • Lee, Keun-Joo;Jin, S.I.
    • Proceedings of the Korean Information Science Society Conference / 2006.10c / pp.171-176 / 2006
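  • Processing a data stream fundamentally requires support for the sliding windows that queries target and the ability to run continuous queries over them. Conventional relational DBMSs were limited in data stream processing because of performance problems, but with the arrival of high-performance main-memory DBMSs they have gained sufficient query processing capability for frequently arriving streams. This paper presents a new approach to continuous query processing over data streams on a main-memory DBMS. Specifically, relying on the high insert and update performance of a high-performance main-memory DBMS, it presents a way to support sliding windows through triggers; continuous queries over the windows are supported by the application, with stored procedures applied for efficient query processing. A continuous query processing system built on this mechanism can support all three window types defined in CQL.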
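
A trigger-maintained sliding window of the kind this abstract describes can be sketched with Python's built-in `sqlite3` module. SQLite's in-memory mode is used here only as a stand-in for the main-memory DBMS in the paper, and the table name `stream`, the trigger name `slide_window`, the window size `WINDOW_ROWS`, and the helper functions are all assumptions. The sketch covers a row-count window only: an `AFTER INSERT` trigger deletes rows that have fallen out of the window, and the application re-runs an aggregate query over whatever remains.

```python
import sqlite3

WINDOW_ROWS = 3  # hypothetical window size, in rows

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stream (id INTEGER PRIMARY KEY AUTOINCREMENT, value REAL)"
)
# The trigger evicts rows older than the last WINDOW_ROWS inserts.
conn.execute(
    f"""
    CREATE TRIGGER slide_window AFTER INSERT ON stream
    BEGIN
        DELETE FROM stream WHERE id <= NEW.id - {WINDOW_ROWS};
    END
    """
)


def push(value):
    """Append one stream tuple; the trigger keeps the window trimmed."""
    conn.execute("INSERT INTO stream (value) VALUES (?)", (value,))


def window_avg():
    """Continuous query issued by the application over the current window."""
    return conn.execute("SELECT AVG(value) FROM stream").fetchone()[0]


for v in [10.0, 20.0, 30.0, 40.0, 50.0]:
    push(v)

print(window_avg())
```

Keeping the eviction logic in the trigger means the application's query never needs a window predicate; it simply reads the table.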

Design and Implementation of the Spatio-Temporal DSMS for Moving Object Data Streams (이동체 데이타 스트림을 위한 시공간 DSMS의 설계 및 구현)

  • Lee, Ki-Young;Kim, Joung-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.8 no.5 / pp.159-166 / 2008
  • With the rapid development of positioning and wireless communication technologies and the growing use of moving object data, much research and development is under way on real-time locating systems that provide real-time services over moving object data streams. However, the MO (Moving Object) DBMS on which these systems are based manages moving object data streams inefficiently, and existing DSMSs (Data Stream Management Systems) do not handle spatio-temporal data efficiently. In this paper, we therefore design and implement a spatio-temporal DSMS for efficient real-time management of moving object data streams. The spatio-temporal DSMS is built on STREAM (STanford stREam dAta Manager) from Stanford University and supports real-time management of moving object data streams, spatio-temporal query processing, and filtering to reduce the input load. For compatibility, its spatio-temporal operators provide a standard SQL-style interface that extends the "Simple Feature Specification for SQL" standard presented by OGC. Finally, we demonstrate the effectiveness of the system by applying it to real-time monitoring, an area that requires real-time locating of moving object data streams.

Optimizing Skyline Query Processing Algorithms on CUDA Framework (CUDA 프레임워크 상에서 스카이라인 질의처리 알고리즘 최적화)

  • Min, Jun;Han, Hwan-Soo;Lee, Sang-Won
    • Journal of KIISE:Databases / v.37 no.5 / pp.275-284 / 2010
  • GPUs are multi-core stream processors that can process large volumes of data at high speed with high memory bandwidth. Furthermore, GPUs are less expensive than multi-core CPUs. Recently, the use of GPUs for general-purpose computing has become widespread. The CUDA architecture from Nvidia is one effort to help developers use GPUs in their application domains. In this paper, we propose techniques to parallelize a skyline algorithm that uses a simple nested-loop structure. To employ the CUDA programming model, we apply optimization techniques that fit our skyline algorithm to the performance restrictions of the CUDA architecture. According to our experimental results, our optimization techniques improve the original skyline algorithm by 80%.
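
For readers unfamiliar with the baseline, a simple nested-loop skyline computation can be sketched in Python as below. This is a sequential reference sketch, not the paper's CUDA code or its optimizations; the function names and the convention that smaller values are better in every dimension are assumptions. A point belongs to the skyline if no other point dominates it, that is, no other point is at least as good in every dimension and strictly better in at least one.

```python
def dominates(a, b):
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Nested-loop skyline: keep each point that no other point dominates."""
    result = []
    for i, p in enumerate(points):
        dominated = any(dominates(q, p) for j, q in enumerate(points) if j != i)
        if not dominated:
            result.append(p)
    return result


print(skyline([(1, 5), (2, 2), (3, 4), (5, 1), (4, 4)]))
```

Each iteration of the outer loop is independent of the others, which is one reason a nested-loop formulation is a plausible candidate for data-parallel execution.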

Shot boundary Frame Detection and Key Frame Detection for Multimedia Retrieval (멀티미디어 검색을 위한 shot 경계 및 대표 프레임 추출)

  • 강대성;김영호
    • Journal of the Institute of Convergence Signal Processing / v.2 no.1 / pp.38-43 / 2001
  • This paper suggests a new, robust feature for shot detection, derived from the DC image constructed from the DCT DC coefficients in an MPEG video stream, and proposes a characterizing value that reflects the kind of video (movie, drama, news, music video, etc.). Key frames are extracted from the many frames by using the local minima and maxima of the differential of this value. After the original frames (not the DC images) are reconstructed for the key frames, indexing is performed by computing parameters. Key frames similar to the user's query image are retrieved by computing parameters. Experiments show that the proposed methods outperform the conventional method and achieve a high retrieval accuracy rate.
