• Title/Summary/Keyword: Distributed Stream Processing

Spatial Operation Allocation Scheme over Common Query Regions for Distributed Spatial Data Stream Processing (분산 공간 데이터 스트림 처리에서 질의 영역의 겹침을 고려한 공간 연산 배치 기법)

  • Chung, Weon-Il
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.6 / pp.2713-2719 / 2012
  • With the growth of location-based services, distributed data stream processing techniques have been widely studied to provide high scalability and availability. Previous studies balance the load across distributed nodes without considering the geographic characteristics of spatial data streams. As a result, distributing operations for adjacent spatial regions across nodes increases the overall system load. We propose an operation allocation scheme that considers the characteristics of spatial operations in order to process spatial data streams effectively in distributed computing environments. The proposed method takes a share-maximizing approach: it preferentially assigns spatial operations that share common query regions to the same node, thereby keeping adjacent spatial operations on overlapping regions together rather than spreading them across nodes.
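
The share-maximizing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes query regions are axis-aligned rectangles, groups operations whose regions overlap (directly or transitively), and places each group on the currently least-loaded node. The names `overlaps` and `allocate` and the sample regions are hypothetical.

```python
from collections import defaultdict

def overlaps(a, b):
    """True if rectangles a and b, given as (xmin, ymin, xmax, ymax), share area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def allocate(regions, num_nodes):
    """Map each operation id to a node, keeping overlapping regions together."""
    ids = list(regions)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge operations whose query regions overlap.
    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if overlaps(regions[a], regions[b]):
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for i in ids:
        groups[find(i)].append(i)

    # Assumption: load is approximated by the number of operations per node.
    load = [0] * num_nodes
    placement = {}
    for members in sorted(groups.values(), key=len, reverse=True):
        node = load.index(min(load))
        for i in members:
            placement[i] = node
        load[node] += len(members)
    return placement

regions = {
    "q1": (0, 0, 4, 4),
    "q2": (3, 3, 6, 6),      # overlaps q1
    "q3": (10, 10, 12, 12),
    "q4": (11, 11, 14, 14),  # overlaps q3
    "q5": (20, 0, 22, 2),    # isolated
}
placement = allocate(regions, num_nodes=2)
```

Here `q1` and `q2` land on one node and `q3` and `q4` on another, so each shared region is processed in a single place. A real system would weight the load by input rate rather than by operation count.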

DART: Fast and Efficient Distributed Stream Processing Framework for Internet of Things

  • Choi, Jang-Ho;Park, Junyong;Park, Hwin Dol;Min, Ok-gee
    • ETRI Journal / v.39 no.2 / pp.202-212 / 2017
  • With the advent of the Internet-of-Things paradigm, the amount of data production has grown exponentially and the user demand for responsive consumption of data has increased significantly. Herein, we present DART, a fast and lightweight stream processing framework for the IoT environment. Because the DART framework targets a geospatially distributed environment of heterogeneous devices, the framework provides (1) an end-user tool for device registration and application authoring, (2) automatic worker node monitoring and task allocations, and (3) runtime management of user applications with fault tolerance. To maximize performance, the DART framework adopts an actor model in which applications are segmented into microtasks and assigned to an actor following a single responsibility. To prove the feasibility of the proposed framework, we implemented the DART system. We also conducted experiments to show that the system can significantly reduce computing burdens and alleviate network load by utilizing the idle resources of intermediate edge devices.

A Study on the Design and Implementation Method of Computational Object Supporting CM Stream Interface in the Distributed Environment (분산 환경에서 CM 스트림 인터페이스를 지원하는 계산 객체의 설계 및 구현 방안 연구)

  • Song, Byeong-Gwon;Jin, Myeong-Suk;Kim, Geon-Ung
    • The Transactions of the Korea Information Processing Society / v.7 no.6 / pp.1785-1794 / 2000
  • This paper presents a computational object model that supports CM (Continuous Media) stream interfaces, including the QoS (Quality of Service) required by distributed applications, and an implementation method for the proposed stream interface. A stream interface consists of a data channel and a control channel. In this paper, the communication channel supported by CORBA is used as the control channel, and various transport protocols can be used as the data channel of the stream interface. Specifications of the application QoS are also included in the stream interface specification. In the implementation, FIFO queues and timers support the transmission rate, delay, and jitter control mechanisms of the stream interface.

Real-Time IoT Big-data Processing for Stream Reasoning (스트림-리즈닝을 위한 실시간 사물인터넷 빅-데이터 처리)

  • Yun, Chang Ho;Park, Jong Won;Jung, Hae Sun;Lee, Yong Woo
    • Journal of Internet Computing and Services / v.18 no.3 / pp.1-9 / 2017
  • Smart Cities intelligently manage numerous infrastructures, including Smart-City IoT devices, and provide a variety of smart-city applications to citizens. To supply the information these applications need, Smart Cities must intelligently process the large-scale streamed big data that a large number of IoT devices generate continuously. The Smart-City Consortium uses stream reasoning to provide smart services, and this stream reasoning requires real-time processing of big data. However, real-time processing of large-scale streamed big data in Smart Cities has limitations. In this paper, we introduce part of our research on cloud-computing-based real-time distributed parallel processing for stream reasoning over IoT big data in Smart Cities. The Smart-City Consortium previously introduced the smart-city middleware it had developed. In the research for this paper, we made cloud-computing-based real-time distributed parallel processing available on the cloud computing platform of that middleware. This paper introduces a real-time distributed parallel processing method and system for stream reasoning with IoT big data transmitted from various Smart City sensors, and evaluates the performance of the system in which the method is implemented.

Design and Implementation of a Distributed Audio/Video Stream Service Framework based on CORBA (CORBA 기반의 분산 오디오/비디오 스트림 서비스 프레임워크의 설계 및 구현)

  • Kim, Jong-Hyeon;No, Yeong-Uk;Jeong, Gi-Dong
    • The KIPS Transactions:PartA / v.9A no.2 / pp.207-216 / 2002
  • This paper presents the design and implementation of a distributed audio/video stream service framework based on CORBA for efficient processing and control of audio/video streams. We design software components that support the processing, control, and transmission of audio/video streams as distributed objects. To optimize stream transmission performance, we separate the transmission paths of control data and media data. Distributed objects are defined in IDL and implemented in Java. Device-dependent facilities such as media capture, playback, and communication channels are implemented using JMF (Java Media Framework) components. We describe the connection establishment and control procedures for stream communication. For evaluation, we implement a test system and measure its performance. Our experiments show that the test system has somewhat longer connection latency than TCP connection establishment, but improved media transmission time compared to CORBA IIOP. The test system also shows acceptable media transmission quality.

Approximate Top-k Subgraph Matching Scheme Considering Data Reuse in Large Graph Stream Environments (대용량 그래프 스트림 환경에서 데이터 재사용을 고려한 근사 Top-k 서브 그래프 매칭 기법)

  • Choi, Do-Jin;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.20 no.8 / pp.42-53 / 2020
  • With the development of social network services, graph structures have been used to represent relationships among objects in various applications. Recently, demand for subgraph matching in real-time graph streams has increased. An efficient approximate Top-k subgraph matching scheme with low latency is therefore required. In this paper, we propose an approximate Top-k subgraph matching scheme that considers data reuse in graph stream environments. The proposed scheme uses the distributed stream processing platform Storm to handle a large amount of stream data. We also use an existing data reuse scheme to decrease stream processing costs. We propose a distance-based summary indexing technique to generate Top-k subgraph matching results. The summary index is very cheap to maintain because it stores only the distances among vertices that are selected in advance. Finally, we provide k subgraph matching results to users by performing approximate Top-k matching on the summary index. To show the superiority of the proposed scheme, we conduct various performance evaluations on diverse real-world datasets.

Adaptive Upstream Backup Scheme based on Throughput Rate in Distributed Spatial Data Stream System (분산 공간 데이터 스트림 시스템에서 연산 처리율 기반의 적응적 업스트림 백업 기법)

  • Jeong, Weonil
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.10 / pp.5156-5161 / 2013
  • In distributed spatial data stream processing, processed tuples of downstream nodes are replicated to the upstream node in order to increase the utilization of distributed nodes and to recover the whole system in case of failure. However, when the data input rate increases and multiple downstream nodes share the operation results of the upstream node, data stored in the output queues as a backup can be lost, because delays in tuple processing at the upstream node can delay the deletion operation. This paper proposes an adaptive upstream backup scheme based on operation throughput for distributed spatial data stream systems. The method reduces the average load of nodes through efficient spatial operation migration as it processes spatio-temporal data streams, and it minimizes data loss by flexibly changing the backup mode. Experiments show that the proposed approach prevents data loss and reduces CPU utilization by 20% on average through node monitoring.
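
For readers unfamiliar with upstream backup, the sketch below shows only the basic mechanism the abstract builds on, not the adaptive scheme itself. It assumes that an upstream node keeps every emitted tuple in an output queue, that each downstream node acknowledges the highest sequence number it has processed, and that a tuple is deleted only once all downstream nodes sharing the output have acknowledged it. The class and method names are hypothetical.

```python
from collections import deque

class UpstreamBackup:
    """Output queue of an upstream node shared by several downstream nodes."""

    def __init__(self, downstream_ids):
        self.queue = deque()                          # (sequence number, tuple)
        self.acked = {d: -1 for d in downstream_ids}  # highest acked seq per node
        self.next_seq = 0

    def emit(self, item):
        """Store an output tuple as a backup and return its sequence number."""
        seq = self.next_seq
        self.queue.append((seq, item))
        self.next_seq += 1
        return seq

    def ack(self, downstream_id, seq):
        """Record an acknowledgement and delete tuples every node has processed."""
        self.acked[downstream_id] = max(self.acked[downstream_id], seq)
        safe = min(self.acked.values())
        while self.queue and self.queue[0][0] <= safe:
            self.queue.popleft()

    def replay(self, downstream_id):
        """Tuples to resend to a downstream node that is recovering from failure."""
        return [item for seq, item in self.queue if seq > self.acked[downstream_id]]

backup = UpstreamBackup(["d1", "d2"])
for item in ["a", "b", "c"]:
    backup.emit(item)
backup.ack("d1", 2)   # d1 has processed everything
backup.ack("d2", 0)   # d2 has processed only "a"
pending_for_d2 = backup.replay("d2")
```

Because the slowest downstream node determines when tuples can be deleted, the queue grows when acknowledgements lag behind the input rate, which is the situation the abstract describes.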

Load Balancing for Distributed Processing of Real-time Spatial Big Data Stream (실시간 공간 빅데이터 스트림 분산 처리를 위한 부하 균형화 방법)

  • Yoon, Susik;Lee, Jae-Gil
    • Journal of KIISE / v.44 no.11 / pp.1209-1218 / 2017
  • A variety of sensors is widely used these days, and it has become much easier to acquire spatial big data streams from various sources. Since spatial data streams have inherently skewed and dynamically changing distributions, the system must effectively distribute the load among workers. Previous studies to solve this load imbalance problem are not directly applicable to processing spatial data. In this research, we propose Adaptive Spatial Key Grouping (ASKG). The main idea of ASKG is, by utilizing the previous distribution of the data streams, to adaptively suggest a new grouping scheme that evenly distributes the future load among workers. We evaluate the validity of the proposed algorithm in various environments, by conducting an experiment with real datasets while varying the number of workers, input rate, and processing overhead. Compared to two other alternative algorithms, ASKG improves the system performance in terms of load imbalance, throughput, and latency.
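
The general idea of regrouping from the previous distribution can be sketched as follows. This is not the ASKG algorithm, whose details the abstract does not give. The sketch assumes a uniform grid as the spatial key and a greedy heaviest-cell-first assignment to the least-loaded worker. The function names and sample points are hypothetical.

```python
from collections import Counter
import heapq

def cell_of(x, y, cell_size=1.0):
    """Spatial key: the grid cell that contains point (x, y)."""
    return (int(x // cell_size), int(y // cell_size))

def regroup(points, num_workers, cell_size=1.0):
    """Build a cell -> worker map from the previous window's distribution."""
    counts = Counter(cell_of(x, y, cell_size) for x, y in points)
    heap = [(0, w) for w in range(num_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = {}
    # Heaviest cells first, each to the worker with the least load so far.
    for cell, n in counts.most_common():
        load, w = heapq.heappop(heap)
        assignment[cell] = w
        heapq.heappush(heap, (load + n, w))
    return assignment

# Skewed sample window: one hot cell and three lighter ones.
previous_window = ([(0.1, 0.2)] * 6 + [(1.5, 0.5)] * 3 +
                   [(2.5, 2.5)] * 2 + [(3.5, 0.1)] * 1)
assignment = regroup(previous_window, num_workers=2)

loads = Counter()
for x, y in previous_window:
    loads[assignment[cell_of(x, y)]] += 1
```

In this sample the hot cell gets a worker to itself and the three lighter cells share the other, so both workers receive six tuples. The resulting map only balances future load to the extent that the next window resembles the previous one.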

A holistic distributed clustering algorithm based on sensor network (센서 네트워크 기반의 홀리스틱 분산 클러스터링 알고리즘)

  • Chen Ping;Kee-Wook Rim;Nam Ji-Yeun;Lee KyungOh
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.874-877 / 2008
  • Existing data processing systems support only simple queries for sensor networks. It is increasingly important to process the vast data streams in sensor networks and to extract useful knowledge for users. In this paper, we propose a holistic distributed k-means algorithm for sensor networks. To verify the effectiveness of this method, we compare it with a centralized k-means algorithm for processing the data streams in a sensor network. The evaluation experiments show that the proposed algorithm can process vast data streams with less computation time. Because the algorithm clusters the data streams at the distributed nodes, it greatly reduces redundant data communication compared to the centralized algorithm.
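
A generic form of distributed k-means, which may differ from the paper's holistic variant, can be sketched as follows. Each node clusters its own readings and sends only its centroids and cluster sizes, and a coordinator merges them by weighted mean. The sketch assumes 1-D readings and that all nodes start from the same initial centroids so that cluster indices line up. All names and values are hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain Lloyd iterations on 1-D data; returns (centroids, counts)."""
    centroids = list(centroids)
    k = len(centroids)
    counts = [0] * k
    for _ in range(iterations):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            # Assign the reading to its nearest centroid.
            j = min(range(k), key=lambda c: abs(v - centroids[c]))
            sums[j] += v
            counts[j] += 1
        # Empty clusters keep their previous centroid.
        centroids = [sums[j] / counts[j] if counts[j] else centroids[j]
                     for j in range(k)]
    return centroids, counts

def merge(summaries):
    """Combine per-node (centroids, counts) summaries by weighted mean."""
    k = len(summaries[0][0])
    merged = []
    for j in range(k):
        weight = sum(counts[j] for _, counts in summaries)
        total = sum(cents[j] * counts[j] for cents, counts in summaries)
        merged.append(total / weight if weight else summaries[0][0][j])
    return merged

initial = [0.0, 10.0]
node_a = [1.0, 2.0, 9.0, 10.0]
node_b = [0.0, 3.0, 11.0]

# Each node ships k centroids and k counts instead of its raw readings.
summaries = [kmeans_1d(node_a, initial), kmeans_1d(node_b, initial)]
global_centroids = merge(summaries)

# Centralized baseline over all readings, for comparison.
central_centroids, _ = kmeans_1d(node_a + node_b, initial)
```

On this sample the merged centroids match the centralized result, while each node transmits two centroids and two counts rather than every reading. That match is not guaranteed in general.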

An Efficient Cache Mechanism for Improving Response Times in Integrated RFID Middleware (통합 RFID 미들웨어의 응답시간 개선을 위한 효과적인 캐쉬 구조 설계)

  • Kim, Cheong-Ghil;Lee, Jun-Hwan;Park, Kyung-Lang;Kim, Shin-Dug
    • The KIPS Transactions:PartA / v.15A no.1 / pp.17-26 / 2008
  • This paper proposes an efficient caching mechanism for integrated RFID middleware, which integrates wireless sensor networks (WSNs) and RFID (radio frequency identification) systems. The integrated RFID middleware is expected to handle a significant volume of data read from RFID readers, constant stream data input from large numbers of autonomous sensor nodes, and queries from various applications on previously sensed history data held in distributed storage. Consequently, a middleware layer equipped with an efficient caching mechanism is necessary for low request-response latency while processing both data streams from sensor networks and history data from distributed databases. To this end, the proposed caching mechanism includes two optimization methods, built on classical cache implementation policies, that reduce the overhead of data processing in RFID middleware. One is the data stream cache (DSC) and the other is the history data cache (HDC), chosen according to the structure of the data request. We conduct a number of simulation experiments under different parameters, and the results show that the proposed caching mechanism contributes considerably to fast request-response times.