• Title/Summary/Keyword: Stream Data

MMJoin: An Optimization Technique for Multiple Continuous MJoins over Data Streams (데이타 스트림 상에서 다중 연속 복수 조인 질의 처리 최적화 기법)

  • Byun, Chang-Woo;Lee, Hun-Zu;Park, Seog
    • Journal of KIISE:Databases
    • /
    • v.35 no.1
    • /
    • pp.1-16
    • /
    • 2008
  • Join queries, although costly, are necessary in a Data Stream Management System for sensor networks, where many short pieces of information are generated. Because a data stream is unbounded in size, it is reasonable for each join operator to have a sliding-window constraint to prevent disk I/O. In addition, the join operator should be able to take multiple inputs to produce overall results. The MJoin operator with sliding windows makes this possible. In this paper, we consider a data stream environment in which multiple MJoin operators are registered, and we propose MMJoin, which addresses the issues of building and processing a globally shared query while considering the characteristics of the MJoin operator with sliding windows. First, we propose a solution for building the globally shared query execution plan. Second, we solve the problems of updating the window size and routing join results. Our study can serve as fundamental research toward an optimization technique for multiple continuous joins in the data stream environment.
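
The sliding-window multi-way join described in this abstract can be illustrated with a minimal sketch. It is not the MMJoin method itself: the class name `SlidingWindowMJoin`, the equi-join on a single key, the time-based window, and the assumption that timestamps arrive in non-decreasing order are all choices made here for illustration.

```python
from collections import deque
from itertools import product


class SlidingWindowMJoin:
    """Symmetric multi-way equi-join over time-based sliding windows (illustrative only)."""

    def __init__(self, num_streams, window_size):
        self.window_size = window_size  # window length in time units (assumed)
        # One in-memory window per input stream; entries are (timestamp, key, payload).
        self.windows = [deque() for _ in range(num_streams)]

    def _expire(self, now):
        # Drop tuples that have slid out of the window so state stays bounded.
        for window in self.windows:
            while window and window[0][0] <= now - self.window_size:
                window.popleft()

    def insert(self, stream_id, timestamp, key, payload):
        """Process one arriving tuple and return the join results it produces."""
        self._expire(timestamp)
        matches = []
        for sid, window in enumerate(self.windows):
            if sid == stream_id:
                matches.append([payload])
            else:
                # Probe the other stream's window for tuples with the same key.
                matches.append([p for (_, k, p) in window if k == key])
        # The product is empty if any other window has no matching tuple.
        results = list(product(*matches))
        self.windows[stream_id].append((timestamp, key, payload))
        return results
```

Each arriving tuple probes the windows of all other streams, so a result is emitted only once every input has a matching tuple that is still inside its window.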

DESIGN AND IMPLEMENTATION OF METADATA MODEL FOR SENSOR DATA STREAM

  • Lee, Yang-Koo;Jung, Young-Jin;Ryu, Keun-Ho;Kim, Kwang-Deuk
    • Proceedings of the KSRS Conference
    • /
    • v.2
    • /
    • pp.768-771
    • /
    • 2006
  • In a WSN (Wireless Sensor Network) environment, a large number of small, heterogeneous sensors continuously generate data streams in physical space. These sensor streams are composed of measured data and metadata. The metadata includes various features such as location, sampling time, measurement unit, and sensor type. Until now, wireless sensors have been managed with individual specifications rather than an explicit metadata standard, so it is difficult to collect data from and communicate between heterogeneous sensors. To solve this problem, the OGC (Open Geospatial Consortium) has proposed SensorML (Sensor Model Language), which can manage the metadata of heterogeneous sensors in a single format. In this paper, we introduce a metadata model that uses the SensorML specification to manage various sensors distributed over a wide area. In addition, we implement a metadata management module applied to the sensor data stream management system. We provide functions for generating metadata files and for registering and storing them according to the SensorML definition.

Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach (발생 간격 기반 가중치 부여 기법을 활용한 데이터 스트림에서 가중치 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.55-75
    • /
    • 2010
  • Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks widely used in various application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. General sequential pattern mining considers only the generation order of data elements in a sequence, so it can easily find simple sequential patterns but is limited in finding the more interesting sequential patterns widely used in real-world applications. One essential research topic for overcoming this limitation is weighted sequential pattern mining, in which not only the generation order of data elements but also their weights are considered to obtain more interesting sequential patterns. Recently, as data has increasingly taken the form of continuous data streams rather than finite stored data sets in various application fields, the database research community has begun focusing its attention on processing over data streams. A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once, and the memory usage for data stream analysis should be bounded even though new data elements are continuously generated. Moreover, newly generated data elements should be processed as fast as possible to produce an up-to-date analysis result that can be used instantly upon request. To satisfy these requirements, data stream processing sacrifices the exactness of its analysis result by allowing some error. Considering these changes in the form of data generated in real-world application fields, many studies have been actively conducted to find various kinds of knowledge embedded in data streams.
These studies mainly focus on efficient mining of frequent itemsets and sequential patterns over data streams, which have proven useful in conventional data mining for finite data sets. In addition, mining algorithms have been proposed to efficiently reflect changes in data streams over time in their mining results. However, they have targeted naively interesting patterns, such as frequent patterns and simple sequential patterns, which are found intuitively, and have shown little interest in mining novel interesting patterns that better express the characteristics of the target data streams. Therefore, defining novel interesting patterns and developing a mining method to find them can be a valuable research topic in the field of mining data streams, and such patterns can be used effectively to analyze recent data streams. This paper proposes a gap-based weighting approach for sequential patterns and a mining method for weighted sequential patterns over sequence data streams based on that approach. A gap-based weight of a sequential pattern can be computed from the gaps between data elements in the sequential pattern without any pre-defined weight information. That is, the gaps between data elements in each sequential pattern, as well as their generation order, are used to obtain the weight of the sequential pattern, which helps to find more interesting and useful sequential patterns. Because most computer application fields now generate data as data streams rather than finite data sets, the proposed method mainly focuses on sequence data streams.

Optimizing Multi-way Join Query Over Data Streams (데이타 스트림에서의 다중 조인 질의 최적화 방법)

  • Park, Hong-Kyu;Lee, Won-Suk
    • Journal of KIISE:Databases
    • /
    • v.35 no.6
    • /
    • pp.459-468
    • /
    • 2008
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Many recent research activities for emerging applications need to deal with data streams. Such applications include web click monitoring, sensor data processing, network traffic analysis, telephone records, and multimedia data. In these applications, data processing is performed not on stored data but on newly arriving data using pre-registered queries, and results are returned immediately or periodically. Recently, many studies have focused on handling data streams rather than stored data sets. In particular, much research aims to optimize continuous queries so that they can be performed efficiently. This paper proposes a query optimization algorithm for continuous queries that have multiple join operators (multi-way joins) over data streams. It is called Extended Greedy query optimization and is based on a greedy algorithm. It defines the join cost as the operations required to compute a join and to process its result, and it stores the join cost, together with all information needed to compute it, in the statistics catalog. To overcome the weakness of the greedy algorithm, which can perform poorly, the algorithm selects a set of operators with small costs instead of only the operator with the smallest cost. This set influences the accuracy and execution time of the algorithm and can be controlled adaptively by two user-defined values. Experimental results illustrate the performance of the EGA algorithm in various stream environments.

Evaluation of Hydrological Impacts Caused by Land Use Change (토지이용변화에 따른 수문영향분석)

  • Park, Jin-Yong
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.44 no.5
    • /
    • pp.54-66
    • /
    • 2002
  • A grid-based hydrological model, CELTHYM, capable of estimating base flow and surface runoff using only readily available data, was used to assess hydrologic impacts caused by land use change on Little Eagle Creek (LEC) in Central Indiana. Using time periods for which land use data are available, the model was calibrated with two years of observed stream flow data, 1983-1984, and verified by comparing model predictions with observed stream flow data for 1972-1974 and 1990-1992. Stream flow data were separated into direct runoff and base flow using HYSEP (USGS) to estimate the impacts of urbanization on each hydrologic component. Analysis of the ratio between direct runoff and total runoff from the simulation results, and of the change in this ratio with land use change, shows that the ratio of direct runoff increases proportionally with increasing urban area. The ratio of direct runoff also varies with annual rainfall, with dry-year ratios larger than those for wet years. This suggests that urbanization might be more harmful in dry years than in years of abundant rainfall in terms of water yield and water quality management.

Forecasting of Stream Qualities in Gumho River by Exponential Smoothing at Gumho2 Measurement Point using Monthly Time Series Data

  • Song, Phil-Jun;Lee, Bo-Ra;Kim, Jin-Yong;Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.3
    • /
    • pp.609-617
    • /
    • 2007
  • The goal of this study is to forecast the trend of stream quality and to suggest some policy alternatives for the Gumho River. It used five different monthly time series, such as BOD, COD, T-N and EC, from nine Gumho River measurement points from Jan. 1998 to Dec. 2006. Water pollution is serious at the Gumho2 and Palgeo stream measurement points. The BOD, COD, T-N and EC data are analyzed with the exponential smoothing model, and the trend is forecasted until Dec. 2009.
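
As a rough illustration of the forecasting technique named in this abstract, the sketch below implements simple (single) exponential smoothing. The abstract does not say which exponential smoothing variant or smoothing constant the study used, so the function names, the choice of the simple variant, and the sample values are assumptions, not data from the paper.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the final smoothed level: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialize the level with the first observation
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def forecast(series, alpha, horizon):
    """Simple exponential smoothing gives a flat forecast equal to the last level."""
    level = simple_exponential_smoothing(series, alpha)
    return [level] * horizon


# Hypothetical monthly BOD readings (mg/L), not taken from the study.
monthly_bod = [4.0, 6.0, 5.0]
next_three_months = forecast(monthly_bod, alpha=0.5, horizon=3)
```

A series with a clear trend or seasonality, as monthly water quality data often has, would normally call for a variant with trend or seasonal terms rather than this flat forecast.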

Assessment of Degree of Naturalness of Vegetation on the Riverine Wetland (하천습지의 식생학적 자연도 평가)

  • Chun, Seung-Hoon
    • Journal of Environmental Impact Assessment
    • /
    • v.20 no.1
    • /
    • pp.1-11
    • /
    • 2011
  • This study was carried out to provide the baseline data necessary for vegetation restoration in riverine wetlands within stream corridors. We used the prevalence index for wetland assessment, applying the method of weighted averages with index values based on five hydrophyte indicator statuses defined by the estimated probability of occurrence in wetlands. We selected near-natural and urbanized reaches of the Gap and Yanghwa streams as experimental sites. Although the two sites differ in disturbance and watershed characteristics, the similarity of their vegetation communities, which include three dominant species (Salix koreensis, Phragmites communis, Miscanthus sacchariflorus), was very high. In the Yanghwa stream, however, various kinds of emergent plants occurred distinctly under wet conditions, as a result of differences in hydrological regime, substrate, and other factors. The degree of naturalness of vegetation at the sampled areas indicated that the near-natural area of the Gap stream and the whole area of the Yanghwa stream qualified as riverine wetland, while the urbanized area of the Gap stream has changed into upland condition. In conclusion, an assessment system using the prevalence index can be considered an effective method for evaluating the natural state of riverine wetlands, but it requires further integrated consideration of the physical, hydrological, and biological factors of stream processes, as well as of the differences among the qualitative data on vegetation communities.
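
The weighted-average prevalence index mentioned in this abstract can be sketched as follows. The mapping of the five indicator statuses to index values 1 through 5 (OBL, FACW, FAC, FACU, UPL) follows the convention commonly used for this index; that the study used exactly these values, and the sample cover figures, are assumptions.

```python
# Conventional index values for the five hydrophyte indicator statuses (assumed here):
# obligate wetland (OBL) = 1 ... obligate upland (UPL) = 5.
INDICATOR_INDEX = {"OBL": 1, "FACW": 2, "FAC": 3, "FACU": 4, "UPL": 5}


def prevalence_index(observations):
    """Weighted average of indicator index values.

    observations: iterable of (indicator_status, abundance) pairs, where
    abundance is any non-negative weight such as percent cover.
    """
    observations = list(observations)
    total = sum(abundance for _, abundance in observations)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    weighted = sum(INDICATOR_INDEX[status] * abundance
                   for status, abundance in observations)
    return weighted / total


# Hypothetical plot: 50% cover OBL, 30% FACW, 20% UPL species.
plot = [("OBL", 50), ("FACW", 30), ("UPL", 20)]
pi = prevalence_index(plot)  # (1*50 + 2*30 + 5*20) / 100
```

Under this convention, lower values indicate vegetation more strongly associated with wetland conditions, and values near 5 indicate upland vegetation.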

Modeling Transverse Velocity Profile in Natural Streams (자연하천의 유속 횡분포 모델링)

  • Seo, Il-Won;Baek, Gyeong-O
    • Journal of Korea Water Resources Association
    • /
    • v.32 no.5
    • /
    • pp.593-601
    • /
    • 1999
  • Knowledge of the velocity structure in a stream is essential in the investigation of stream meandering, erosion and sediment transport, and dispersion of pollutants in the stream. In this study, a theoretical velocity profile model was developed that predicts the transverse profile of the longitudinal velocity in a stream using stream hydraulic data. The proposed model was tested with measured velocity data from the Nakdong River. The results show that the numerical model properly simulates the general shape of the measured velocity profiles. The simulated profiles agree well with measurements, especially in terms of skewness and flatness.

An Investigation of Synoptic Condition for Clear-Air Turbulence (CAT) Events Occurred over South Korea (한국에서 발생한 청천난류 사례에서 나타나는 종관규모 대기상태에 대한 연구)

  • Min, Jae-Sik;Chun, Hye-Yeong;Kim, Jung-Hoon
    • Atmosphere
    • /
    • v.21 no.1
    • /
    • pp.69-83
    • /
    • 2011
  • The synoptic conditions of clear-air turbulence (CAT) events that occurred over South Korea are investigated using Regional Data Assimilation and Prediction System (RDAPS) data obtained from the Korea Meteorological Administration (KMA) and pilot reports (PIREPs) collected by the Korea Aviation Meteorological Agency (KAMA) from 1 Dec. 2003 to 30 Nov. 2008. Throughout the years, a strong subtropical jet stream exists over South Korea, and CAT events frequently occur in the upper-level frontal zone and subtropical jet stream regions, where strong vertical wind shears are located. The probability of moderate-or-greater (MOG)-level turbulence is higher in wintertime than in summertime, and the high-probability region is shifted northward across the jet stream in wintertime. We categorize the CAT events into three types according to their generation mechanisms: i) upper-level front and jet stream, ii) anticyclonically sheared and curved flows, and iii) breaking of mountain waves. Among 240 MOG-level CAT events reported during 2003-2008, 103 cases are related to the jet stream, while 73 cases and 25 cases are related to anticyclonic shear flow and breaking of mountain waves, respectively.

Development of a Hybrid Watershed Model STREAM: Test Application of the Model (복합형 유역모델 STREAM의 개발(II): 모델의 시험 적용)

  • Cho, Hong-Lae;Jeong, Euisang;Koo, Bhon Kyoung
    • Journal of Korean Society on Water Environment
    • /
    • v.31 no.5
    • /
    • pp.507-522
    • /
    • 2015
  • In this study, some of the model verification results of STREAM (Spatio-Temporal River-basin Ecohydrology Analysis Model), a newly developed hybrid watershed model, are presented for the runoff processes of nonpoint source pollution. To verify STREAM, the model was applied to a test watershed, and a sensitivity analysis was carried out for selected parameters. STREAM was applied to the Mankyung River Watershed to review the applicability of the model through calibration and validation against stream flow discharge, suspended sediment discharge, and some water quality items (TOC, TN, TP) measured at the watershed outlet. The model setup, simulation, and data I/O modules worked as designed, and both the calibration and validation results showed good agreement between the simulated and measured data sets: NSE over 0.7 and $R^2$ greater than 0.8. The simulation results also include the spatial distribution of runoff processes and the mass balance at the watershed scale. Additionally, the irrigation process of the model was examined in detail at reservoirs and paddy fields.
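
The two goodness-of-fit measures quoted in this abstract can be computed as in the sketch below. NSE is taken to be the Nash-Sutcliffe efficiency and $R^2$ is assumed to be the squared Pearson correlation between observed and simulated values; the abstract does not define either, and the function names and sample numbers are illustrative only.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values must not all be equal")
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Squared Pearson correlation coefficient between the two series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    if var_obs == 0 or var_sim == 0:
        raise ValueError("both series must vary")
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical discharge values: the simulation is biased upward by a constant.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
nse = nash_sutcliffe_efficiency(obs, sim)
r2 = r_squared(obs, sim)
```

The example shows why both measures are usually reported together: a constant bias leaves the correlation perfect while lowering NSE substantially.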