• Title/Summary/Keyword: Multiple Stream Data

Finding Pseudo Periods over Data Streams based on Multiple Hash Functions (다중 해시함수 기반 데이터 스트림에서의 아이템 의사 주기 탐사 기법)

  • Lee, Hak-Joo;Kim, Jae-Wan;Lee, Won-Suk
    • Journal of Information Technology Services / v.16 no.1 / pp.73-82 / 2017
  • Recently, in-memory data stream processing has been actively applied to various subjects such as query processing, OLAP, and data mining (e.g., frequent itemsets, association rules, and clustering). However, finding regular periodic patterns of events in an infinite data stream has received less attention. Most research on finding periods uses autocorrelation functions to detect certain changes in periodic patterns, not the period itself, and it usually finds periodic patterns in time-series databases, not in data streams. Literally, a period is the length of time after which some phenomenon recurs. In real applications, however, a data set evolves with tiny differences as time elapses; a period of this kind is called a pseudo-period. This paper proposes a new scheme called the FPMH (Finding Periods using Multiple Hash functions) algorithm to find a set of such pseudo-periods over a data stream based on multiple hash functions. According to the type of pseudo-period, this paper categorizes FPMH into three variants: FPMH-E, FPMH-PC, and FPMH-PP. To maximize the performance of the algorithm in the data stream environment and to keep the most recent periodic patterns in memory, a decay mechanism is applied to the FPMH algorithms. The FPMH algorithm minimizes memory usage as well as processing time with acceptable accuracy.
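
The abstract names two ingredients, multiple hash functions and a decay mechanism, without describing how FPMH combines them. The following is only a minimal sketch of those two generic ingredients (a count-min-style table with exponential decay), not the FPMH algorithm itself; the class name, parameters, and the use of SHA-256 as the hash family are all assumptions made for illustration.

```python
import hashlib


class DecayedMultiHashCounter:
    """Count-min-style table: `depth` hash rows of `width` counters, with decay.

    Hypothetical illustration only; this is not the FPMH data structure.
    """

    def __init__(self, depth=3, width=64, decay=0.9):
        self.depth = depth
        self.width = width
        self.decay = decay
        self.table = [[0.0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Derive one hash function per row by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def tick(self):
        """Age every counter once per time step so that old arrivals fade out."""
        for row in self.table:
            for j in range(self.width):
                row[j] *= self.decay

    def add(self, item, weight=1.0):
        """Record one arrival of `item` in every hash row."""
        for r in range(self.depth):
            self.table[r][self._index(r, item)] += weight

    def estimate(self, item):
        """Decayed count of `item`.

        Collisions can only inflate a counter, so the minimum over the rows
        is the tightest available estimate.
        """
        return min(self.table[r][self._index(r, item)] for r in range(self.depth))


counter = DecayedMultiHashCounter(depth=3, width=64, decay=0.5)
counter.add("sensor-7")
counter.tick()           # the first arrival now weighs 0.5
counter.add("sensor-7")  # the fresh arrival weighs 1.0
print(counter.estimate("sensor-7"))
```

Memory stays fixed at depth × width counters regardless of how many distinct items appear, which is the usual reason for hashing in a stream setting; recent arrivals dominate the estimate because older ones are repeatedly multiplied by the decay factor.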

Attribute-based Approach for Multiple Continuous Queries over Data Streams (데이터 스트림 상에서 다중 연속 질의 처리를 위한 속성기반 접근 기법)

  • Lee, Hyun-Ho;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.14D no.5 / pp.459-470 / 2007
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Query processing for such a data stream should also be continuous and rapid, which imposes strict time and space constraints. In most DSMSs (Data Stream Management Systems), the selection predicates of continuous queries are grouped or indexed to satisfy these constraints. This paper proposes a new scheme called an ASC (Attribute Selection Construct) that collectively evaluates selection predicates containing the same attribute in multiple continuous queries. An ASC contains valuable information, such as attribute usage status, partially precalculated matching results, and selectivity statistics for its multiple selection predicates. The processing order of the ASCs that correspond to the attributes of a base data stream can significantly influence the overall performance of multiple query evaluation. Consequently, a method of establishing an efficient evaluation order of multiple ASCs is also proposed. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.
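
The idea of evaluating all selection predicates on the same attribute together can be sketched as follows. This is a simplified illustration under stated assumptions, not the ASC structure from the paper: queries are assumed to be conjunctions of one predicate per attribute, the function names are hypothetical, and the evaluation order is supplied by the caller rather than derived from selectivity statistics.

```python
from collections import defaultdict


def build_groups(queries):
    """Group predicates by attribute: {attribute: [(query_id, predicate), ...]}."""
    groups = defaultdict(list)
    for query_id, predicates in queries.items():
        for attribute, predicate in predicates.items():
            groups[attribute].append((query_id, predicate))
    return dict(groups)


def evaluate(record, queries, groups, order):
    """Return the ids of queries whose predicates all hold for one stream record.

    Attributes are visited in `order`; each attribute value is read once and
    tested against every predicate registered for that attribute.
    """
    alive = set(queries)
    for attribute in order:
        if not alive:
            break  # every query has already failed; skip remaining attributes
        value = record[attribute]
        for query_id, predicate in groups.get(attribute, []):
            if query_id in alive and not predicate(value):
                alive.discard(query_id)
    return alive


# Three hypothetical continuous queries over a stream with two attributes.
queries = {
    "q1": {"temp": lambda v: v > 30, "hum": lambda v: v < 50},
    "q2": {"temp": lambda v: v > 20},
    "q3": {"hum": lambda v: v >= 80},
}
groups = build_groups(queries)
print(sorted(evaluate({"temp": 25, "hum": 40}, queries, groups, ["temp", "hum"])))
```

Visiting a highly selective attribute first eliminates most queries early, which is why the evaluation order matters for overall cost.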

Stream Data Processing based on Sliding Window at u-Health System (u-Health 시스템에서 슬라이딩 윈도우 기반 스트림 데이터 처리)

  • Kim, Tae-Yeun;Song, Byoung-Ho;Bae, Sang-Hyun
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.4 no.2 / pp.103-110 / 2011
  • Accurate and efficient management of the digital data measured by sensors is necessary in a u-health system. It is not efficient for a sensor network to process massive input stream data stored in a database all at once. We propose to improve the processing performance of multidimensional stream data continuously arriving from multiple sensors. We process queries based on a sliding window to handle the input stream efficiently, build multiple query plans using the MJoin method, and reduce the stored data using a backpropagation algorithm. As a result, we obtained a database reduction rate of about 18.3% using 14,324 data sets.

Performance Evaluation and Analysis of Multiple Scenarios of Big Data Stream Computing on Storm Platform

  • Sun, Dawei;Yan, Hongbin;Gao, Shang;Zhou, Zhangbing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.7 / pp.2977-2997 / 2018
  • In the big data era, fresh data grows rapidly every day. More than 30,000 gigabytes of data are created every second, and the rate is accelerating. Many organizations rely heavily on real-time streaming, and big data stream computing helps them spot opportunities and risks in real-time big data. Storm, one of the most common online stream computing platforms, has been used for big data stream computing, with response times ranging from milliseconds to sub-seconds. The performance of Storm plays a crucial role in different application scenarios; however, few studies have been conducted to evaluate it. In this paper, we investigate the performance of Storm under different application scenarios. Our experimental results show that the throughput and latency of Storm are greatly affected by the number of instances of each vertex in the task topology and by the amount of available resources in the data center. The fault-tolerance mechanism of Storm works well in most big data stream computing environments. As a result, it is suggested that a dynamic topology, an elastic scheduling framework, and a memory-based fault-tolerance mechanism are necessary for providing high-throughput, low-latency services on the Storm platform.

Continuous Multiple Prediction of Stream Data Based on Hierarchical Temporal Memory Network (계층형 시간적 메모리 네트워크를 기반으로 한 스트림 데이터의 연속 다중 예측)

  • Han, Chang-Yeong;Kim, Sung-Jin;Kang, Hyun-Syug
    • KIPS Transactions on Computer and Communication Systems / v.1 no.1 / pp.11-20 / 2012
  • Stream data is a sequence of values that changes continuously over time. Due to the nature of stream data, its trend changes continuously across various time intervals. Therefore, the prediction of stream data must be carried out simultaneously with respect to multiple intervals, i.e., Continuous Multiple Prediction (CMP). In this paper, we propose a Continuous Integrated Hierarchical Temporal Memory (CIHTM) network for CMP based on the Hierarchical Temporal Memory (HTM) model, which is a neocortex learning algorithm. To develop the CIHTM network, we created three new kinds of modules: Shift Vector Sensor, Spatio-Temporal Classifier, and Multiple Integrator. We also developed learning and inference algorithms for the CIHTM network.

MMJoin: An Optimization Technique for Multiple Continuous MJoins over Data Streams (데이타 스트림 상에서 다중 연속 복수 조인 질의 처리 최적화 기법)

  • Byun, Chang-Woo;Lee, Hun-Zu;Park, Seog
    • Journal of KIISE:Databases / v.35 no.1 / pp.1-16 / 2008
  • Costly join queries are necessary for a Data Stream Management System in a sensor network, where many short pieces of information are generated. Because a data stream is unbounded in size, it is reasonable for each join operator to have a sliding-window constraint to prevent disk I/O. In addition, the join operator should be able to take multiple inputs to produce overall results. The MJoin operator with sliding windows can do so. In this paper, we consider a data stream environment where multiple MJoin operators are registered and propose MMJoin, which deals with the issues of building and processing a globally shared query while considering the characteristics of the MJoin operator with sliding windows. First, we propose a solution for building the globally shared query execution plan. Second, we solve the problems of updating a window size and routing join results. Our study can serve as fundamental research for optimization techniques for multiple continuous joins in the data stream environment.
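
A sliding-window join over several input streams can be sketched as below. This is a minimal illustration of a single symmetric multiway join, not the MMJoin shared-plan technique from the paper; it assumes an equality join on one key, a single time-based window shared by all streams, and arrivals in non-decreasing timestamp order. The class and method names are hypothetical.

```python
from collections import deque
from itertools import product


class WindowedMultiJoin:
    """Symmetric equality join over n streams, each buffered in a time window."""

    def __init__(self, n_streams, window):
        self.window = window
        # One in-memory buffer per stream holding (timestamp, key, payload).
        self.windows = [deque() for _ in range(n_streams)]

    def _expire(self, now):
        # Drop tuples that have slid out of the window (timestamp <= now - window).
        for buffered in self.windows:
            while buffered and buffered[0][0] <= now - self.window:
                buffered.popleft()

    def insert(self, stream, timestamp, key, payload):
        """Add a tuple to one stream and return the join results it completes."""
        self._expire(timestamp)
        candidates = []
        for i, buffered in enumerate(self.windows):
            if i == stream:
                candidates.append([payload])
            else:
                candidates.append([p for (_, k, p) in buffered if k == key])
        self.windows[stream].append((timestamp, key, payload))
        # An empty candidate list for any stream makes the product empty.
        return [tuple(combo) for combo in product(*candidates)]


join = WindowedMultiJoin(n_streams=3, window=10)
join.insert(0, 1, "room-1", "temp=21")
join.insert(1, 2, "room-1", "hum=40")
print(join.insert(2, 3, "room-1", "light=on"))
```

Because every buffer is bounded by the window, the operator never needs to spill to disk, which is the motivation the abstract gives for the sliding-window constraint.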

Dissection Technique for Efficient Join Operation on Semi-Structured Document Stream

  • Seo, Dong-Hyeok;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2007.10a / pp.11-13 / 2007
  • There has been much interest in stream query processing. Various indexing techniques and advanced join techniques have been proposed to process data stream queries efficiently. Previous proposals support rapid responses to data stream queries. However, the volume of data streams is increasing, and data stream query processing needs to be faster than before. In this paper, we propose novel query processing techniques for a large number of incoming document streams. We propose the Dissection Technique for efficient query processing in the data stream environment, focusing on its use in join query processing. Our technique shows efficient performance compared with other proposals in the data stream environment. The proposed technique is applied to a sensor network system and an XML database.

Efficient Processing of Multidimensional Sensor Stream Data in Digital Marine Vessel (디지털 선박 내 다차원 센서 스트림 데이터의 효율적인 처리)

  • Song, Byoung-Ho;Park, Kyung-Woo;Lee, Jin-Seok;Lee, Keong-Hyo;Jung, Min-A;Lee, Sung-Ro
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.5B / pp.794-800 / 2010
  • Accurate and efficient management of the digital data measured by various sensors is necessary in a digital marine vessel. It is not efficient for a sensor network to process massive input stream data stored in a database all at once. In this paper, we propose to improve the processing performance of multidimensional stream data continuously arriving from multiple sensors. We arrange several sensors (temperature, humidity, lighting, voice), process queries based on a sliding window to handle the input stream efficiently, build multiple query plans using the MJoin method, and reduce the stored data using an SVM algorithm. We automatically delete unnecessary data from the database and use the remaining data in a ship diagnosis system. As a result, we obtained a database reduction rate of about 18.3% using 35,912 data sets.

Data Stream Allocation Algorithm for Maximizing Sum Capacity in Multiuser MIMO Systems (다중 사용자 MIMO 시스템에서 전체 채널 용량을 최대화하기 위한 데이터 스트림 할당 기법)

  • Kim, Bong-Seok;Choi, Kwon-Hue
    • Journal of Satellite, Information and Communications / v.6 no.1 / pp.19-27 / 2011
  • In this paper, we propose a data stream allocation algorithm for maximizing the sum capacity of downlink multiuser MIMO (Multiple-Input Multiple-Output) systems with BD (Block Diagonalization). Conventional BD precoding algorithms maximize the capacity by controlling power according to the channel gain of each user. In multiuser MIMO systems, however, the number of data streams for each user can be used as another control parameter that determines the capacity. The proposed algorithm allocates an unequal number of data streams to each user based on that user's channel matrix in order to maximize the sum capacity. Computer simulation shows that the proposed algorithm achieves significantly improved sum capacity.

Data Stream Allocation for Fair Performance in Multiuser MIMO Systems (다중 사용자 MIMO 환경에서 균등한 성능을 보장하는 데이터 스트림 할당 기법)

  • Lim, Dong-Ho;Choi, Kwon-Hue
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.12A / pp.1006-1013 / 2009
  • This paper proposes a data stream allocation technique for fair capacity performance in multiuser multiple-input multiple-output (MIMO) systems using the block diagonalization (BD) algorithm. Conventional studies have focused on maximizing sum capacity. As a result, there is a very large difference in capacity among users, since capacity is distributed unevenly according to each user's channel environment. In addition, a user with a poor channel has very small capacity, since the base station allocates power using the water-filling technique. Also, most studies have limited themselves to obtaining additional gain while using the same number of data streams for all users. In this paper, we propose a technique for maximizing sum capacity under a fair performance constraint by allocating data streams according to each user's channel environment. Computer simulation also shows that the proposed algorithm achieves gains in sum capacity and transmit power over conventional equal allocation.
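
The water-filling technique mentioned in the abstract can be sketched as follows, to show why a user with a poor channel ends up with little or no power. This is the textbook water-filling allocation over parallel channels, not the allocation scheme proposed in the paper; the function name is hypothetical, and the sketch assumes positive channel gains, positive total power, and noise power normalized to 1.

```python
def water_filling(gains, total_power):
    """Split total_power over parallel channels with power gains `gains`.

    Maximizes sum(log2(1 + g_i * p_i)) subject to sum(p_i) = total_power and
    p_i >= 0. Each channel receives max(level - 1/g_i, 0), where `level` is
    the common water level.
    """
    # Rank channels from strongest to weakest; 1/g_i is the "floor height".
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    floors = [1.0 / gains[i] for i in order]
    powers = [0.0] * len(gains)
    # Try the k strongest channels, dropping the weakest until all get power > 0.
    for k in range(len(gains), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            for rank in range(k):
                powers[order[rank]] = level - floors[rank]
            break
    return powers


# Three hypothetical channels: strong, medium, and very weak.
print(water_filling([2.0, 1.0, 0.1], total_power=1.0))
```

In this example the weakest channel lies above the water level and receives no power at all, which illustrates the unfairness the abstract attributes to sum-capacity maximization.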