• Title/Summary/Keyword: indexing structures


Fast Hilbert R-tree Bulk-loading Scheme using GPGPU (GPGPU를 이용한 Hilbert R-tree 벌크로딩 고속화 기법)

  • Yang, Sidong;Choi, Wonik
    • Journal of KIISE / v.41 no.10 / pp.792-798 / 2014
  • In spatial databases, the R-tree is one of the most widely used indexing structures, and many variants have been proposed to improve its performance. Among these variants, the Hilbert R-tree is a representative method that uses the Hilbert curve to process large amounts of data without the high-cost split techniques otherwise needed to construct an R-tree. The Hilbert R-tree, however, is hardly applicable to large-scale applications in practice, mainly because of high pre-processing costs and slow bulk-loading. To overcome these limitations, we propose a novel approach that parallelizes Hilbert mapping and thus accelerates bulk-loading of the Hilbert R-tree in GPU memory. The GPU-based Hilbert R-tree improves bulk-loading performance by applying the inversed-cell method and exploiting parallelism when packing the R-tree structure. Our experimental results show that the proposed scheme is up to 45 times faster than traditional CPU-based bulk-loading schemes.
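
The Hilbert-ordering idea behind this kind of bulk-loading can be sketched on the CPU in a few lines of Python. This is only an illustration under stated assumptions, not the GPU scheme or the inversed-cell method from the abstract: the names `hilbert_index` and `pack_leaves`, the grid order, and the leaf capacity are all hypothetical, and points are assumed to be integer grid cells.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve covering a
    2**order by 2**order grid (standard bit-by-bit conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def pack_leaves(points, order=4, capacity=2):
    """Sort points by Hilbert index, then pack consecutive runs into leaves.
    Each leaf is (mbr, points), with mbr = (min_x, min_y, max_x, max_y)."""
    ordered = sorted(points, key=lambda p: hilbert_index(order, p[0], p[1]))
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]
        xs = [p[0] for p in group]
        ys = [p[1] for p in group]
        leaves.append(((min(xs), min(ys), max(xs), max(ys)), group))
    return leaves
```

Because the sort key of each point is independent of all the others, the mapping step is the natural candidate for parallel execution; upper tree levels could be built by applying the same packing to the leaf rectangles.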

A Swapping Red-black Tree for Wear-leveling of Non-volatile Memory (비휘발성 메모리의 마모도 평준화를 위한 레드블랙 트리)

  • Jeong, Minseong;Lee, Eunji
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.139-144 / 2019
  • In recent decades, Non-volatile Memory (NVM) technologies have drawn considerable attention in both industry and academia because of their high density and short latency, which is comparable to that of DRAM. However, NVM devices have a write endurance problem, so current data structures that were built around DRAM-specific features, including unlimited program cycles, are inadequate for NVM and significantly reduce device lifetime. In this paper, we revisit the red-black tree, which is extensively adopted for data indexing across a wide range of applications, and adapt it to better fit NVM. Specifically, we observe that the conventional red-black tree wears out specific memory locations because of the rebalancing operation it uses to ensure fast access time over the whole dataset. This rebalancing operation frequently updates long-lived nodes, which leads to skewed wear across the NVM cells. To resolve this problem, we present a new swapping wear-leveling red-black tree that periodically moves data from a worn-out node into a young node. A performance study with real-world traces demonstrates that the proposed red-black tree reduces the standard deviation of the write count across nodes by up to 12.5%.

A Design and Implementation of Dynamic Hybrid P2P System with Hierarchical Group Management and Maintenance of Reliability (계층적 그룹관리와 신뢰성을 위한 동적인 변형 P2P 시스템 설계 및 구현)

  • Lee, Seok-Hee;Cho, Sang;Kim, Sung-Yeol
    • The KIPS Transactions:PartD / v.11D no.4 / pp.975-982 / 2004
  • In current P2P systems, pure P2P and hybrid P2P structures are commonly used. Gnutella and Ktella are forms of pure P2P, and forms of hybrid P2P are innumerable. These structures include file searching models, which provide group management for file sharing, searching, and indexing. The general file sharing model is good at maintaining connectivity but weak in group management. This study therefore introduces a hierarchical structure into file sharing models by means of a routing technique and a backup system. The system is designed so that users can maintain group efficiency and connection reliability in a large-scale network.

Design and Performance Evaluation of an Indexing Method for Partial String Searches (문자열 부분검색을 위한 색인기법의 설계 및 성능평가)

  • Gang, Seung-Heon;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society / v.6 no.6 / pp.1458-1467 / 1999
  • Existing index structures such as extendible hashing and the B+-tree do not fully support partial string searches. The inverted file method and the signature file method, which are used in web retrieval engines, also have problems: the former does not provide partial string searches, and the latter suffers from serious retrieval performance degradation. In this paper, we propose an efficient index method that supports partial string searches and achieves good retrieval performance. The proposed method is based on the inverted file structure. To support partial string searches, it constructs the index file from patterns obtained by dividing terms into units of two syllables. We analyze the characteristics of the proposed method through simulation experiments over a wide range of parameter values. We also derive analytic performance evaluation models of the existing inverted file method, the signature file method, and the proposed index method in terms of retrieval time and storage overhead. A performance comparison based on these analytic models shows that the proposed method significantly improves retrieval performance over the existing methods.
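
A minimal Python sketch of the two-syllable pattern idea follows. It rests on several assumptions that the abstract does not confirm: single characters stand in for syllables, patterns are taken as overlapping two-character windows, and a final substring check removes false positives. The names `build_index` and `partial_search` are hypothetical.

```python
from collections import defaultdict


def build_index(terms):
    """Inverted index from every two-character pattern to the terms containing it."""
    index = defaultdict(set)
    for term in terms:
        for i in range(len(term) - 1):
            index[term[i:i + 2]].add(term)
    return index


def partial_search(index, query):
    """Return the indexed terms that contain `query` as a substring."""
    if len(query) < 2:
        raise ValueError("query must contain at least two characters")
    grams = [query[i:i + 2] for i in range(len(query) - 1)]
    # Only terms containing every pattern of the query can match.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Patterns may occur in a term without being contiguous, so verify.
    return sorted(t for t in candidates if query in t)
```

For example, with the terms `database`, `data`, `base`, and `tabase` indexed, a search for `tab` intersects the posting lists of `ta` and `ab` and returns `database` and `tabase`.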


TPKDB-tree : An Index Structure for Efficient Retrieval of Future Positions of Moving Objects (TPKDB 트리 : 이동 객체의 효과적인 미래 위치 검색을 위한 색인구조)

  • Seo Dong Min;Bok Kyoung Soo;Yoo Jae Soo;Lee Byoung Yup
    • Journal of KIISE:Databases / v.31 no.6 / pp.624-640 / 2004
  • With the recent rapid development of location-based techniques, index structures that efficiently manage moving objects have become necessary. In this paper, we propose a new spatio-temporal index structure that supports future position retrieval and minimizes update cost. The proposed index structure combines an auxiliary index structure that directly accesses the current positions of moving objects with the KDB-tree, a space-partitioning access method. Internal nodes in the proposed index structure keep time parameters in order to support future position retrieval and to minimize update cost. Moreover, we propose new update and split methods to maximize space utilization and search performance. We perform various experiments to show that the proposed index structure outperforms the existing index structure.

Effective Streaming of XML Data for Wireless Broadcasting (무선 방송을 위한 효과적인 XML 스트리밍)

  • Park, Jun-Pyo;Park, Chang-Sup;Chung, Yon-Dohn
    • Journal of KIISE:Databases / v.36 no.1 / pp.50-62 / 2009
  • In wireless and mobile environments, data broadcasting is recognized as an effective way to disseminate data because of its benefits for bandwidth efficiency, energy efficiency, and scalability. In this paper, we address the problem of delayed query processing caused by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enable energy-efficient and latency-efficient broadcast of XML data. We first define the DIX node structure to implement a fully distributed index structure, which contains the tag name, attributes, and text content of an element as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the wireless stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing over the stream on mobile clients. Through extensive performance experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms previous methods.

Hilbert Cube for Spatio-Temporal Data Warehouses (시공간 데이타웨어하우스를 위한 힐버트큐브)

  • 최원익;이석호
    • Journal of KIISE:Databases / v.30 no.5 / pp.451-463 / 2003
  • Recently, there have been various research efforts to develop strategies for accelerating OLAP operations on huge amounts of spatio-temporal data. Most of this work is based on multi-tree structures, which consist of a single R-tree variant for the spatial dimension and numerous B-trees for the temporal dimension. The multi-tree based frameworks, however, are hardly applicable to spatio-temporal OLAP in practice, mainly because of high management cost and low query efficiency. To overcome the limitations of such frameworks, we propose a new approach called Hilbert Cube (H-Cube), which employs fractals in order to impose a total order on cells. In addition, the H-Cube takes advantage of the traditional prefix-sum approach to improve query efficiency significantly. The H-Cube partitions an embedding space into a set of cells, which are clustered on disk by Hilbert ordering, and then composes a cube by arranging the grid cells in chronological order. The H-Cube refines cells adaptively to handle regional data skew, whose location may change over time. The H-Cube is thus an adaptive, totally ordered, and prefix-summed cube for spatio-temporal data warehouses. Our approach focuses on indexing dynamic point objects in static spatial dimensions. Through extensive performance studies, we observed that the H-Cube consumed at most 20% of the space required by multi-tree based frameworks and achieved higher query performance than multi-tree structures.
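
The prefix-sum approach mentioned in this abstract can be illustrated with a small Python sketch. It assumes a plain, fixed, row-major grid of per-cell counts rather than the adaptive, Hilbert-ordered cells of the H-Cube, and the names `build_prefix_sums` and `range_sum` are hypothetical.

```python
def build_prefix_sums(grid):
    """P[r][c] holds the sum of grid[0..r-1][0..c-1]; row 0 and column 0 are zero."""
    rows, cols = len(grid), len(grid[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            P[r + 1][c + 1] = grid[r][c] + P[r][c + 1] + P[r + 1][c] - P[r][c]
    return P


def range_sum(P, r1, c1, r2, c2):
    """Sum of the cells in rows r1..r2 and columns c1..c2 (inclusive),
    computed from four lookups by inclusion-exclusion."""
    return P[r2 + 1][c2 + 1] - P[r1][c2 + 1] - P[r2 + 1][c1] + P[r1][c1]
```

Once the table is built, any rectangular aggregate costs a constant number of lookups regardless of the size of the query region, which is why prefix sums suit OLAP-style range queries.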

SOM-Based $R^{*}$-Tree for Similarity Retrieval (자기 조직화 맵 기반 유사 검색 시스템)

  • O, Chang-Yun;Im, Dong-Ju;O, Gun-Seok;Bae, Sang-Hyeon
    • The KIPS Transactions:PartD / v.8D no.5 / pp.507-512 / 2001
  • Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects. The performance of conventional multidimensional data structures, however, tends to deteriorate as the number of dimensions of the feature vectors increases. The $R^{*}$-Tree is the most successful variant of the R-Tree. In this paper, we propose a SOM-based $R^{*}$-Tree as a new indexing method for high-dimensional feature vectors. The SOM-based $R^{*}$-Tree combines the SOM and the $R^{*}$-Tree to achieve search performance that scales better to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The map is called a topological feature map; it preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. We experimentally compare the retrieval time cost of the SOM-based $R^{*}$-Tree with that of a SOM and an $R^{*}$-Tree, using color feature vectors extracted from 40,000 images. The results show that the SOM-based $R^{*}$-Tree outperforms both the SOM and the $R^{*}$-Tree because it reduces the number of nodes needed to build the $R^{*}$-Tree and the retrieval time cost.


Index-based Searching on Timestamped Event Sequences (타임스탬프를 갖는 이벤트 시퀀스의 인덱스 기반 검색)

  • 박상현;원정임;윤지희;김상욱
    • Journal of KIISE:Databases / v.31 no.5 / pp.468-478 / 2004
  • In various application areas of data mining and bioinformatics, it is essential to retrieve the occurrences of interesting patterns from sequence databases effectively. For example, consider a network event management system that records the types and timestamps of events that occur in a specific network component (e.g., a router). A typical query to find the temporal causal relationships among network events is as follows: 'Find all occurrences of CiscoDCDLinkUp that are followed by MLMStatusUP that are subsequently followed by TCPConnectionClose, under the constraint that the interval between the first two events is not larger than 20 seconds, and the interval between the first and third events is not larger than 40 seconds.' This paper proposes an indexing method that answers such queries efficiently. Unlike previous methods, which rely on inefficient sequential scans or on data structures not easily supported by DBMSs, the proposed method uses a multi-dimensional spatial index, which is proven to be efficient in both storage and search, to find the answers quickly without false dismissals. Given a sliding window W, the input to the multi-dimensional spatial index is an n-dimensional vector whose i-th element is the interval between the first event of W and the first occurrence of event type Ei in W. Here, n is the number of event types that can occur in the system of interest. The 'dimensionality curse' problem may arise when n is large. Therefore, we use dimension selection or event type grouping to avoid this problem. The experimental results reveal that the proposed technique can be a few orders of magnitude faster than the sequential scan and ISO-Depth index methods.
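
The window-to-vector mapping described in this abstract can be sketched in Python as follows. Several details are assumptions not stated in the abstract: events are `(timestamp, type)` pairs sorted by timestamp, a window is defined by its first event and a time span `width`, and an event type that does not occur in the window is encoded as `math.inf`. The name `window_vector` is hypothetical.

```python
import math


def window_vector(events, start, width, event_types):
    """Vector for the window that begins at events[start] and spans `width`
    time units: element i is the offset from the window's first event to the
    first occurrence of event_types[i], or math.inf if that type is absent."""
    t0 = events[start][0]
    first_seen = {}
    for ts, etype in events[start:]:
        if ts - t0 > width:
            break  # events are sorted, so the rest fall outside the window
        first_seen.setdefault(etype, ts - t0)  # keep only the first occurrence
    return [first_seen.get(e, math.inf) for e in event_types]
```

With events `[(0, "A"), (5, "B"), (12, "A"), (30, "C"), (50, "B")]` and types `["A", "B", "C"]`, the window starting at the first event with width 40 maps to `[0, 5, 30]`. A query with interval constraints then becomes a range query over such vectors in a spatial index.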

Analysis of Authority Control System in Collecting Repository -from the case of Archival Management System in Korea Democracy Foundation- (수집형 기록관의 전거제어시스템 분석 - 민주화운동기념사업회 사료관리시스템의 사례를 중심으로 -)

  • Lee, Hyun-Jeong
    • The Korean Journal of Archival Studies / no.13 / pp.91-134 / 2006
  • In general, personally collected archives (manuscripts) are in poor physical condition, and information about their context and production history is captured only partially in the manuscripts themselves. A collecting repository therefore needs to control effectively the names of the producers of archives collected in various ways, and to accumulate provenance information, which is the key element in understanding the background of production. Authority control and provenance information management must be organized from the beginning of acquisition, which means collecting the necessary information with the acquisition control process in mind. This study aims to verify the necessity of authority control and of accumulating provenance information in a collecting repository, and to suggest points to consider when building an archival authority system. To this end, it examines the necessity of authority control in archival management and investigates the standards, work process, and accumulation process of archival authority control. In the archival authority control system, provenance information management and authority control are carried out through all steps of archival management, from the lead file to the names of producers at archival registration and the archival description at acquisition. A large amount of information is registered and described at the proper point in time, and finally all of the information, including the authority control that governs headings in authority management, must be organized so that it can be used for the intellectual management of archives and as finding aids. The features of the archival authority system are as follows. First, the authority file types needed for archival authority control of democracy movement records consist of group names, persons, affairs, and terminology (subject names). Second, the basic record structure and description elements of the authority collection of the Korea Democracy Foundation Archives follow paragraph 1 of ISAAR(CPF), with some necessary elements added, while details of the description rules, such as word spacing and the use of periods, follow paragraph 4 of KCR, adapted to the features of the archival management system. The way authority records are entered is based on EAC (Encoded Archival Context). Third, the system lets users reach the sources they want more easily by connecting authority terms systematically: related terms can be linked variously and concretely as broader and narrower terms and as earlier and later terms, expanding the term relations beyond the traditional authority system, which usually expresses only related terms (see also). As a result, the authority control of the archival management system can effectively collect and manage information on the functions and main activities of various groups in addition to its own function of controlling headings, can express the multiple and intermediary relationships between archives and producers or among producers, and can provide an expanded record information service that satisfies users' various requests through an indexing service. Finally, this case of applying the international standard ISAAR(CPF) to authority management can serve as a reference for building archival authority systems in collecting repositories, by reorganizing the description elements into appropriate forms and setting up the authority file types to be managed for each service.