• Title/Summary/Keyword: distributed database (분산 데이터베이스)


Uniform Load Distribution Using Sampling-Based Cost Estimation in Parallel Join (병렬 조인에서 샘플링 기반 비용 예측 기법을 이용한 균등 부하 분산)

  • Park, Ung-Gyu
    • The Transactions of the Korea Information Processing Society / v.6 no.6 / pp.1468-1480 / 1999
  • In database systems, join operations are the most complex and time-consuming operations, and they limit system performance. Many parallel join algorithms have been proposed for these systems, but they do not consider data skew such as attribute value skew (AVS) and join product skew (JPS), and their performance suffers in skewed environments. This paper presents a framework for uniform load distribution and an efficient parallel join algorithm that uses the framework to handle AVS and JPS. The algorithm estimates the data distributions of the input and output relations of a join by sampling and evaluates the join cost for the estimated distributions. It then uses histogram equalization to distribute data among nodes so that load is well balanced in the local joining phase. For performance comparison, we present a simulation model of our algorithm and of other join algorithms, together with the results of simulation experiments. The results indicate that our algorithm outperforms the other algorithms in the skewed case.
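The sampling and equalization idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it balances tuple counts rather than estimated join cost, and the names `equal_load_boundaries` and `assign_node` are hypothetical. Split points are taken from a sorted sample so that each node receives a similar share of the sampled keys.

```python
from bisect import bisect_right

def equal_load_boundaries(sample, num_nodes):
    """Return num_nodes - 1 split points taken at equal-count positions of the sample."""
    if num_nodes < 1 or not sample:
        raise ValueError("need at least one node and a non-empty sample")
    ordered = sorted(sample)
    # The i-th boundary is the key found i/num_nodes of the way through the sample.
    return [ordered[(i * len(ordered)) // num_nodes] for i in range(1, num_nodes)]

def assign_node(key, boundaries):
    """Map a join key to a node index by locating it among the split points."""
    return bisect_right(boundaries, key)

# Skewed sample: the value 1 appears four times.
sample = [1, 1, 1, 1, 2, 3, 4, 5]
boundaries = equal_load_boundaries(sample, 2)
loads = [0, 0]
for key in sample:
    loads[assign_node(key, boundaries)] += 1
```

Range splitting of this kind cannot divide a single heavily repeated key across nodes, which is one reason a cost-based scheme such as the one described in the abstract is needed for strong skew.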


Using a Greedy Algorithm for the Improvement of a MapReduce, Theta join, M-Bucket-I Heuristic (그리디 알고리즘을 이용한 맵리듀스 세타조인 M-Bucket-I 휴리스틱의 개선)

  • Kim, Wooyeol;Shim, Kyuseok
    • Journal of KIISE / v.43 no.2 / pp.229-236 / 2016
  • Theta join is one of the essential query types in database systems. As the amount of data to be processed increases, processing theta joins on a single machine becomes impractical, so theta-join algorithms for distributed computing frameworks have been studied widely. One of the state-of-the-art theta-join algorithms uses the M-Bucket-I heuristic, but it is hard to use because the running time of the heuristic, which computes a mapping from records to reducers (i.e., a reducer mapping), is O(n), where n is the size of the input data. In this paper, we propose the MBI-I algorithm, which reduces the running time of the M-Bucket-I heuristic to $O(r_{max} \log n)$ and gives the same result as the M-Bucket-I heuristic. We also conducted several experiments to evaluate our algorithm and confirmed that it can improve the performance of a theta join by 10%.

Challenge-Response Based Secure RFID Authentication Protocol for Distributed Database Environment (분산 데이터베이스 환경에 적합한 Challenge-Response 기반의 안전한 RFID 인증 프로토콜)

  • Rhee Keun-Woo;Oh Dong-Kyu;Kwak Jin;Oh Soo-Hyun;Kim Seung-Joo;Won Dong-Ho
    • The KIPS Transactions:PartC / v.12C no.3 s.99 / pp.309-316 / 2005
  • RFID systems have recently become a key technology for realizing ubiquitous computing environments, but their characteristics can cause various privacy problems, and many protocols have been studied to resolve them. In this paper, we analyze the privacy problems of previous methods and propose a more secure and effective authentication protocol to protect users' privacy. We then demonstrate that the proposed protocol is secure and effective by comparing it with previous methods. The proposed protocol is based on challenge-response using a one-way hash function and random numbers. It is secure against replay attacks, spoofing attacks, and similar attacks. In addition, it is suitable for distributed database environments.
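The general shape of a hash-based challenge-response exchange can be sketched as follows. This is a generic illustration, not the message flow of the proposed protocol: the function names, the use of SHA-256, and the fixed 16-byte field lengths are all assumptions. The reader sends a fresh challenge, the tag answers with a hash over its secret, the challenge, and its own nonce, and the back-end looks for a stored secret that reproduces the response.

```python
import hashlib
import hmac

def tag_response(tag_secret: bytes, challenge: bytes, tag_nonce: bytes) -> bytes:
    """Tag side: hash the shared secret together with both random values."""
    return hashlib.sha256(tag_secret + challenge + tag_nonce).digest()

def identify_tag(known_secrets, challenge, tag_nonce, response):
    """Back-end side: return the ID of the tag whose secret explains the response."""
    for tag_id, secret in known_secrets.items():
        expected = tag_response(secret, challenge, tag_nonce)
        if hmac.compare_digest(expected, response):  # constant-time comparison
            return tag_id
    return None

# Fixed values keep the example reproducible; real challenges must be random.
known = {"tag-A": b"A" * 16, "tag-B": b"B" * 16}
challenge, nonce = b"\x01" * 16, b"\x02" * 16
response = tag_response(known["tag-A"], challenge, nonce)
accepted = identify_tag(known, challenge, nonce, response)
# Replaying the old response against a new challenge is rejected.
replayed = identify_tag(known, b"\x03" * 16, nonce, response)
```

Because the response depends on a challenge chosen anew for each session, a recorded response is useless later, which is the basic defense against replay that the abstract refers to.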

Scalable scheduling techniques for distributed real-time multimedia database systems (분산 실시간 멀티미디어 데이터베이스 시스템을 위한 신축성있는 스케줄링 기법)

  • Kim, Jin-Hwan
    • The KIPS Transactions:PartA / v.9A no.1 / pp.9-18 / 2002
  • In this paper, we propose scalable scheduling techniques based on EDF to efficiently integrate hard real-time tasks and multimedia soft real-time tasks in a distributed real-time multimedia database system. Hard tasks are guaranteed based on worst-case execution times, whereas multimedia soft tasks are served based on mean execution times. The paper describes a server-based scheme for partitioning the CPU bandwidth among the different task classes that coexist in the same system. To handle class overloads, characterized by a varying number of tasks and varying task arrival rates, this scheme shows how to adjust the fraction of the CPU bandwidth assigned to each class. The scheme fixes the maximum time that each hard task can execute in the period of the server, whereas it can dynamically change the bandwidth reserved for each multimedia task. The proposed method is capable of minimizing the mean tardiness of multimedia tasks without jeopardizing the schedulability of the hard tasks. The performance of this scheduling method is compared with that of similar mechanisms through simulation experiments.
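Bandwidth partitioning under EDF (earliest deadline first) rests on a standard result: independent periodic tasks whose deadlines equal their periods are schedulable on one processor when their total utilization does not exceed 1. The sketch below is an illustration under that assumption only, not the paper's scheme. The hypothetical helper `spare_bandwidth` computes the fraction of the CPU left for the soft multimedia class after the hard tasks are reserved at their worst-case execution times.

```python
def spare_bandwidth(hard_tasks):
    """Return the CPU fraction left after reserving hard tasks at worst-case cost.

    hard_tasks is a list of (worst_case_execution_time, period) pairs.
    """
    utilization = sum(wcet / period for wcet, period in hard_tasks)
    if utilization > 1.0:
        # The hard class alone already overloads the processor.
        raise ValueError("hard tasks are not schedulable under EDF")
    return 1.0 - utilization

# Two hard tasks using 1/4 and 1/2 of the CPU leave 1/4 for multimedia tasks.
soft_share = spare_bandwidth([(1, 4), (1, 2)])
```

A scheme like the one in the abstract would then divide this remaining share among multimedia tasks and adjust it as arrival rates change.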

Petri Net Model for Moving Objects Database (이동물체 데이터베이스의 페트리 넷 모형)

  • 임재걸;이계영
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.3 / pp.1-10 / 2004
  • Many papers on moving objects databases (MODs) have been published. Most of them concern improving the efficiency of the updating policy, but none addresses verification of the system's requirements. No matter how efficient the updating policy is, a system designer still has to verify that the MOD satisfies the user's requirements at the beginning of the system life cycle. For example, if a MOD serves n moving objects, the designer must verify that it can update information for n moving objects and provide new information to the moving objects within a specified time limit. For the requirement analysis of MODs, we build a Petri net model of a MOD using Design/CPN and then show how to verify by simulation whether the MOD satisfies the user's requirements. The contribution of this paper is to provide, for the first time, a simulation model for the requirement analysis of MODs. The model is an extension of our previous fuzzy-timing Petri net model. It reflects the distance-based updating policy and a distributed database management system, and it considers system analysis for moving objects. It is built in Design/CPN so that the simulation can be performed automatically. The application of our model is not limited to requirement analysis; it is also useful for studying other MOD design issues, such as the trade-off between update cost and information accuracy and the trade-off between the time interval needed for updating the database and MOD system resources.

A synchronous/asynchronous hybrid parallel method for some eigenvalue problems on distributed systems

  • 박필성
    • Proceedings of the Korean Society of Computational and Applied Mathematics Conference / 2003.09a / pp.11-11 / 2003
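  • Solutions are now being attempted for huge problems that cannot be handled by a single supercomputer. These can be run effectively in a GRID environment, which connects geographically distributed supercomputers, databases, scientific instruments, and display devices over a high-speed network. GRID emerged in the mid-1990s from research on distributed computing for science and engineering, and its range of applications is gradually widening. However, a distributed environment such as GRID differs in many respects from a conventional single parallel system, and earlier techniques cannot be applied to it unchanged. Conventional parallel systems mainly use synchronous algorithms, which require synchronization to obtain the same result as serial computation and for which load balancing is essential. Load balancing is difficult, however, when processors differ in performance, as in a heterogeneous cluster, or in a GRID environment that uses geographically distributed computing resources: idle time inevitably grows, not only because of heterogeneity but also because of message transmission delays over the network. Asynchronous iteration emerged as one way to resolve the computational delay caused by the need for synchronization, and it is still being actively studied. Its aim is to reduce the idle time of fast processors by removing synchronization points from the algorithm as far as possible. That is, in an asynchronous algorithm, each processor continues with its next task without waiting for updated data to arrive from other processors. Compared with a synchronous algorithm, which proceeds to the next step only after updated data have been exchanged simultaneously, it therefore often uses data that have not yet been updated, so its convergence rate relative to the amount of computation may be slower overall. Because each processor computes with almost no idle time, however, the wall clock time is shorter than that of a synchronous algorithm, and results up to 50% faster have been reported. Research to date has nevertheless been limited to solving linear systems that satisfy certain convergence conditions, and only studies on shared memory systems, which are relatively easy to implement, have been reported. In this study, to examine whether asynchronous iteration can be applied to finding the dominant eigenpairs of a matrix, we first experimented with the theoretically simple power method and concluded that a purely asynchronous iteration is unlikely to converge. We therefore propose a hybrid parallel algorithm that adds asynchronous elements to a synchronous algorithm, and we implemented it using MPI (Message Passing Interface) on the Hydra cluster at the University of Suwon. The results show that the drop in overall convergence speed that occurs when a particular node performs markedly worse than the others can be considerably mitigated.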
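For reference, the power method mentioned in this abstract can be sketched in its ordinary serial form. This is not the hybrid parallel algorithm of the paper, and the function names are hypothetical. Each step multiplies the current vector by the matrix and rescales it; in a synchronous parallel version, every such step would be a synchronization point.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def power_method(matrix, iterations=100):
    """Estimate the dominant eigenpair by repeated multiplication and rescaling."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        scale = max(abs(x) for x in w)
        if scale == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        v = [x / scale for x in w]
    # Rayleigh quotient of the final iterate gives the eigenvalue estimate.
    av = mat_vec(matrix, v)
    eigenvalue = sum(a * b for a, b in zip(av, v)) / sum(x * x for x in v)
    return eigenvalue, v

# The symmetric matrix [[2, 1], [1, 2]] has dominant eigenvalue 3.
value, vector = power_method([[2.0, 1.0], [1.0, 2.0]])
```

The method relies on every component of the vector coming from the same iteration, which gives some intuition for why a purely asynchronous variant, mixing stale and fresh components, may fail to converge as the abstract reports.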


A Efficient Cloaking Region Creation Scheme using Hilbert Curves in Distributed Grid Environment (분산 그리드 환경에서 힐버트 커브를 이용한 효율적인 Cloaking 영역 설정 기법)

  • Lee, Ah-Reum;Um, Jung-Ho;Chang, Jae-Woo
    • Journal of Korea Spatial Information System Society / v.11 no.1 / pp.115-126 / 2009
  • Recent developments in wireless communication and mobile positioning technologies have made location-based services (LBSs) popular. However, because users of LBSs send queries to database servers using their exact locations, adversaries can misuse the users' location information. A mechanism for protecting users' privacy is therefore required for the safe use of LBSs by mobile users. In this paper, we propose an efficient cloaking region creation scheme using Hilbert curves in a distributed grid environment to protect users' privacy in LBSs. The proposed scheme generates a minimum cloaking region by analyzing the characteristics of a Hilbert curve and computing the Hilbert curve values of neighboring cells, so that the cloaking region satisfies K-anonymity. In addition, to reduce network communication cost, we use a distributed hash table structure called Chord. Finally, our performance analysis shows that the proposed scheme outperforms the existing grid-based cloaking method.
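The Hilbert curve values of grid cells that this abstract relies on can be computed with the well-known bitwise conversion from cell coordinates to curve position. The sketch below shows only that conversion, not the paper's cloaking scheme. The name `hilbert_index` is hypothetical, and the grid side is assumed to be a power of two.

```python
def hilbert_index(side, x, y):
    """Return the position of cell (x, y) along the Hilbert curve of a side x side grid."""
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        s //= 2
    return d

# Order all cells of a 4 x 4 grid by their Hilbert value.
cells = sorted(
    ((x, y) for x in range(4) for y in range(4)),
    key=lambda c: hilbert_index(4, c[0], c[1]),
)
```

Consecutive positions on the curve are always neighboring cells, so users whose cells have close Hilbert values tend to be spatially close, which is what makes the curve useful for grouping users into a compact region.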


A Study on Distribution Query Conversion Method for Real-time Integrating Retrieval based on TMDR (TMDR 기반의 실시간 통합 검색을 위한 분산질의 변환 기법에 대한 연구)

  • Hwang, Chi-Gon;Shin, Hyo-Young;Jung, Kye-Dong;Choi, Young-Keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.7 / pp.1701-1707 / 2010
  • This study aims to implement a system environment that can integrate and retrieve various types of data in real time by providing semantic interoperability among distributed heterogeneous information systems. Semantic interoperability is made possible by a TMDR (Topicmaps Metadata Registry), a set of ontologies. The TMDR, which is built by combining an MDR (MetaData Registry) with TopicMaps and storing them in a database, can generate distributed queries and provide knowledge efficiently. MDR is a metadata management technique for distributed data management. TopicMaps is an ontology representation technique that takes hierarchy and association into account for accessing knowledge data. We have created a TMDR, a kind of ontology, that fits any system and can detect and resolve semantic conflicts at the data and schema levels. With this system, we propose a query-processing technique for integrating and accessing heterogeneous information sources. Unlike existing retrieval methods, it enables efficient retrieval and reasoning by providing subject-focused associations.

Design and Implementation of an Expert Search System Using Academic Data in Big Data Processing Platforms (빅데이터 처리 플랫폼에서 학술 데이터를 사용한 전문가 검색 시스템 설계 및 구현)

  • Choi, Dojin;Kim, Minsoo;Kim, Daeyun;Lee, Seohee;Han, Jinsu;Seo, Indeok;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.17 no.3 / pp.100-114 / 2017
  • Most researchers set research directions for studying new fields by getting advice from experts or by reading experts' papers. Existing academic data search services provide paper information by field but do not identify experts by field, so users have to determine the experts in a field themselves from the papers they find. In this paper, we design and implement a system that searches for experts by discipline through big data processing of papers published by academic societies. The proposed system uses distributed big data storage systems to store and manage a large volume of papers. We also identify experts and analyze data related to them using distributed big data processing technologies. The processed results are provided through web pages when a user searches for experts. Because the proposed system recommends experts in the corresponding research field, users can get considerable help with research in a particular field.

A Large-Scale Conference Service by Distributed Focus Control Method (분산 포커스 제어 방식에 의한 대규모 컨퍼런스 서비스)

  • Jang, Choonseo
    • The Journal of the Korea Contents Association / v.14 no.7 / pp.10-17 / 2014
  • In a conference service, the focus maintains and manages the conference session. The load on the focus increases with the number of participants, and this is the major factor limiting the scalability of a large-scale conference service. In this paper, a new distributed focus control method is proposed to solve this problem. In this architecture, the load of the focus is distributed over several conference nodes to implement a large-scale conference service. Conference nodes that have the focus function subscribe to the conference server's conference information database, and focuses are allocated dynamically to handle participants' needs according to the total number of conference participants. For this purpose, a new conference control event package for focus load control is suggested. Furthermore, the procedure for exchanging SIP messages between focuses and participants is also suggested. The performance of the proposed system has been evaluated by simulation.