• Title/Summary/Keyword: Distributed Data

Matrix-based Filtering and Load-balancing Algorithm for Efficient Similarity Join Query Processing in Distributed Computing Environment (분산 컴퓨팅 환경에서 효율적인 유사 조인 질의 처리를 위한 행렬 기반 필터링 및 부하 분산 알고리즘)

  • Yang, Hyeon-Sik;Jang, Miyoung;Chang, Jae-Woo
    • The Journal of the Korea Contents Association / v.16 no.7 / pp.667-680 / 2016
  • As distributed computing platforms such as Hadoop MapReduce have matured, conventional query processing techniques designed for a single machine need to run efficiently in distributed computing environments. In particular, similarity join query processing, which retrieves all pairs of highly similar records from two given data sets, has been studied in this setting. However, existing similarity join schemes for distributed computing environments consider only the data transmission cost, so the computing load is skewed across clusters. In this paper, we propose a matrix-based load-balancing algorithm for efficient similarity join query processing in distributed computing environments. To balance the load evenly across clusters, the proposed algorithm uses a matrix to estimate the expected computing cost and generates partitions based on that estimate. It also reduces the computing load by filtering out data that is not needed for query processing in the clusters. Our performance evaluation shows that the proposed algorithm outperforms the existing scheme in query processing performance.
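
The idea of estimating cost with a matrix and partitioning on that estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from the paper: the cost of a matrix cell is assumed to be the number of pairwise comparisons (`|R_i| * |S_j|`), the filter `can_match` is a hypothetical predicate supplied by the caller, and cells are assigned with a generic greedy longest-processing-time heuristic.

```python
from itertools import product


def estimate_cell_costs(r_counts, s_counts, can_match):
    """Build the cost matrix as a dict keyed by cell (i, j).

    Assumption: joining partition i of R with partition j of S costs
    r_counts[i] * s_counts[j] comparisons. Cells for which the
    hypothetical predicate can_match(i, j) is False are filtered out,
    because they cannot contribute any similar pair.
    """
    costs = {}
    for i, j in product(range(len(r_counts)), range(len(s_counts))):
        if can_match(i, j):
            costs[(i, j)] = r_counts[i] * s_counts[j]
    return costs


def assign_cells(costs, num_workers):
    """Greedy assignment: the most expensive remaining cell goes to the
    worker that currently has the smallest estimated load."""
    loads = [0] * num_workers
    assignment = {w: [] for w in range(num_workers)}
    # Sort by descending cost; the cell index breaks ties deterministically.
    for cell, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        worker = min(range(num_workers), key=lambda w: loads[w])
        assignment[worker].append(cell)
        loads[worker] += cost
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical partition sizes; only adjacent partitions may match.
    costs = estimate_cell_costs([4, 1, 3], [2, 5, 1], lambda i, j: abs(i - j) <= 1)
    assignment, loads = assign_cells(costs, num_workers=2)
    print(assignment, loads)
```

Balancing on estimated comparison cost rather than on data volume is the point of the sketch: two cells that ship the same amount of data can differ widely in the work they cause.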

Design and Implementation of Distributed Object Framework Supporting Audio/Video Streaming (오디오/비디오 스트리밍을 지원하는 분산 객체 프레임 워크 설계 및 구현)

  • Ban, Deok-Hun;Kim, Dong-Seong;Park, Yeon-Sang;Lee, Heon-Ju
    • Journal of KIISE:Computing Practices and Letters / v.5 no.4 / pp.440-448 / 1999
  • This paper describes the design and implementation of a software framework that supports processing real-time stream data, such as audio and video, in a distributed object-oriented computing environment. DAViS (Distributed Object Framework supporting Audio/Video Streaming), proposed in this paper, abstracts the software components involved in processing audio/video data as distributed objects and separates the path for exchanging control information between those objects from the path for transmitting audio/video data. Using the services DAViS provides, distributed application developers can express audio/video processing at the same level of abstraction that existing distributed programming environments offer. DAViS has a flexible internal structure that makes it easy to integrate components for new types of audio/video data and to accommodate advances in underlying network and computer system technology quickly, with very few modifications.

A study of data harvest in distributed sensor networks (분산 센서 네트워크에서 데이터 수집에 대한 연구)

  • Park, Sangjoon;Lee, Jongchan
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.5 / pp.3421-3425 / 2015
  • In sensor networks, sensor nodes are usually deployed over a single contiguous area, but depending on the network's properties, nodes may be located in several separate areas. Data gathering from nodes distributed over several areas can differ from gathering in a single contiguous area, so such distributed networks may need to be managed differently from contiguous ones. In this paper, we describe data gathering from sensor nodes in distributed sensor areas. Sensor nodes may not be able to connect to the mobile sink immediately, so node operation must be taken into account. A scheme that sends data in real time over an immediate connection to the mobile sink can be implemented, but the properties of the mobile sink must be considered when it connects to distributed areas. We analyze the proposed scheme through simulation. The simulation results show that the overall lifetime under the periodic data gathering method is longer than under the threshold method.

Data Integration for DW Construction

  • Yongmoo Suh;Jung, Chul-Yong
    • The Journal of Information Technology and Database / v.4 no.2 / pp.79-95 / 1998
  • Because useful data is distributed over several systems, accessing and using it is difficult. Recognizing this problem, researchers have proposed two concepts as solutions: the multidatabase and the data warehouse. The former provides a virtual view over the distributed data, and the latter is a materialized view of it. Recently, more attention has been paid to the latter, which consolidates data from distributed databases, collected along a time dimension. The major issues in building a data warehouse are therefore 1) how to define a global schema for the data warehouse, 2) how to capture changes from local databases, and 3) how to represent time-varying values of data items. This paper presents an integrated approach to these issues, borrowing research results from areas such as multidatabases, active databases, and temporal databases.

Reinforcement learning multi-agent using unsupervised learning in a distributed cloud environment

  • Gu, Seo-Yeon;Moon, Seok-Jae;Park, Byung-Joon
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.192-198 / 2022
  • Companies are building and using their own data analysis systems in the distributed cloud according to their business characteristics. However, as businesses and data types become more complex and diverse, the demand for more efficient analytics has increased. In response, we propose an unsupervised learning-based data analysis agent to which reinforcement learning is applied for effective data analysis. The proposed agent consists of a reinforcement learning processing manager module and an unsupervised learning manager module. These two modules configure an agent with k-means clustering on multiple nodes and then perform distributed training on multiple data sets. This enables data analysis in a relatively short time compared with conventional systems that analyze large-scale data in one batch.

A Development of Proactive Application Service Engine Based on the Distributed Object Group Framework (분산객체그룹프레임워크 기반의 프로액티브 응용서비스엔진 개발)

  • Shin, Chang-Sun;Seo, Jong-Seong
    • Journal of Internet Computing and Services / v.11 no.1 / pp.153-165 / 2010
  • In this paper, we propose a Proactive Application Service Engine (PASE) that supports tailor-made distributed application services. PASE is based on the Distributed Object Group Framework (DOGF), which efficiently manages distributed objects from the viewpoint of distributed applications composed over a network. PASE consists of three layers: the physical layer, the middleware layer, and the application layer. Among the supporting services of PASE, the grouping service manages, as a group, the data gathered from hardware devices and the object properties that an application needs at the user's request. The security service controls access to the gathered data and objects according to the user's rights. The data filtering service filters the gathered data before providing it to applications. The statistics service analyzes past data. The diagnostic service diagnoses the present condition using the gathered data. The prediction service predicts future status based on the statistics and diagnostic services. To verify that PASE's services are executable, we applied them to an automatic greenhouse control application in the ubiquitous agriculture field.

Distributed System Architecture Modeling of a Performance Monitoring and Reporting Tool (분산 시스템의 성능 모니터링과 레포팅 툴의 아키텍처 모델링)

  • Kim, Ki;Choi, Eun-Mi
    • Journal of the Korea Society for Simulation / v.12 no.3 / pp.69-81 / 2003
  • To manage a cluster of distributed server systems, several management aspects must be considered: configuration management, fault management, performance management, and user management. System performance monitoring and reporting play an important role in performance and fault management. In this paper, we present the distributed system architecture modeling of a performance monitoring and reporting tool. We introduce the architecture models of four subsystems: node agent, data collection, performance management and reporting, and DB schema. The performance-related information collected from distributed servers is categorized into performance counters, event data for system status changes, service quality, and system configuration data. To analyze this performance information, we use several methods to evaluate data correlation. Using results from a real company site and from a simulation with an artificial workload, we show an example of performance collection and analysis. Because our reporting tool detects system faults and node component failures and analyzes performance through resource usage and service quality, it can provide information for server load balancing in the short term, and identify the causes of system faults and support scale-out and scale-up decisions in the long term.
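
One common way to evaluate correlation between two collected series is the Pearson coefficient. The abstract does not say which measures the authors used, so the sketch below is only an illustration; the function name and the sample series (CPU usage and response time) are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined for a constant series;
        # this sketch reports 0.0 in that case (an assumption).
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical samples: CPU usage (%) and response time (s) per interval.
    cpu_usage = [10, 20, 30, 40]
    response_time = [1.0, 2.0, 3.0, 4.0]
    print(pearson(cpu_usage, response_time))
```

A coefficient near 1 or -1 between a resource counter and a service quality metric suggests that the resource is worth examining when service quality degrades; it does not by itself establish the cause.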

Segmentation and Classification of Lidar data

  • Tseng, Yi-Hsing;Wang, Miao
    • Proceedings of the KSRS Conference / 2003.11a / pp.153-155 / 2003
  • Laser scanning has become a viable technique for collecting large amounts of accurate 3D point data densely distributed on the scanned object surface. The inherent 3D nature of the sub-randomly distributed point cloud provides abundant spatial information. Exploring valuable spatial information from laser-scanned data has become an active research topic, for instance extracting digital elevation models, building models, and vegetation volumes. The sub-randomly distributed point cloud should be segmented and classified before spatial information is extracted. This paper investigates some existing segmentation methods and then proposes an octree-based split-and-merge segmentation method that divides lidar data into clusters belonging to 3D planes. The lidar data can then be classified based on the derived attributes of the extracted 3D planes. Test results on both ground and airborne lidar data show the potential of this method for extracting spatial features from lidar data.

A holistic distributed clustering algorithm based on sensor network (센서 네트워크 기반의 홀리스틱 분산 클러스터링 알고리즘)

  • Chen Ping;Kee-Wook Rim;Nam Ji-Yeun;Lee KyungOh
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.874-877 / 2008
  • Existing data processing systems support only simple queries for sensor networks. It is increasingly important to process the vast data streams in sensor networks and deliver useful knowledge to users. In this paper, we propose a holistic distributed k-means algorithm for sensor networks. To verify the effectiveness of this method, we compare it with a centralized k-means algorithm for processing the data streams in a sensor network. The evaluation experiments show that the proposed algorithm can process vast data streams with less computation time. Because the algorithm clusters the data streams at the distributed nodes, it greatly reduces redundant data communication compared with the centralized processing algorithm.
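
Clustering at the nodes to reduce communication can be illustrated with a generic distributed k-means step. This is a sketch of the general technique, not the holistic algorithm proposed in the paper: each node is assumed to send only per-cluster coordinate sums and counts, and a coordinator merges them into new centroids. All function names are hypothetical.

```python
def local_stats(points, centroids):
    """Run on each node: per-cluster coordinate sums and point counts
    for the node's local points, given the current centroids."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Index of the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    return sums, counts


def merge_stats(all_stats, centroids):
    """Run on the coordinator: combine the node summaries into new centroids."""
    k, dim = len(centroids), len(centroids[0])
    merged = []
    for c in range(k):
        total = sum(counts[c] for _, counts in all_stats)
        if total == 0:
            # No point was assigned to this cluster; keep the old centroid.
            merged.append(list(centroids[c]))
            continue
        merged.append(
            [sum(sums[c][d] for sums, _ in all_stats) / total for d in range(dim)]
        )
    return merged


def distributed_kmeans(node_data, centroids, iterations=10):
    """Iterate: nodes summarize locally, the coordinator merges."""
    for _ in range(iterations):
        stats = [local_stats(points, centroids) for points in node_data]
        centroids = merge_stats(stats, centroids)
    return centroids
```

Each node transmits `k * (dim + 1)` numbers per iteration regardless of how many readings it holds, which is where the communication saving over shipping raw data to a central site comes from.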

A Study on the Web-based Distributed Design Application in the Preliminary Ship Design

  • Park, Chang-Kyu
    • Journal of information and communication convergence engineering / v.8 no.5 / pp.473-478 / 2010
  • Today's engineering design is carried out in a geographically or physically distributed fashion. This places new requirements on computational environments, such as efficient integration and collaboration. With recent advances in the Internet and network environments, many approaches have been proposed, and Web-based distributed design has introduced a new paradigm in the design and manufacturing fields. Web-based technologies reduce product development time and help ensure a competitive product by enabling the real-time exchange of design information, integrating the distributed design environment across departments and companies via the Internet and the Web. Efficient data communication for sharing design information is therefore the foundation for collaborative systems in a distributed environment. Data communication techniques such as CORBA, DCOM, and RMI have been considered in existing research, but they suffer from limited interoperability and firewall problems on the Web. This paper therefore presents a Web-based distributed design application in which distributed design information resources are integrated and exchanged using Web Services built on XML and HTTP, avoiding the interoperability and firewall problems, and demonstrates it through a 330K VLCC case.