• Title/Summary/Keyword: tree indexing

Design and Performance Evaluation of an Indexing Method for Partial String Searches (문자열 부분검색을 위한 색인기법의 설계 및 성능평가)

  • Gang, Seung-Heon;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society, v.6 no.6, pp.1458-1467, 1999
  • Existing index structures such as extendible hashing and the B+-tree do not fully support partial string searches. The inverted file method and the signature file method used in web retrieval engines also have problems: the former does not provide partial string searches, and the latter suffers from serious retrieval performance degradation. In this paper, we propose an efficient index method that supports partial string searches and achieves good retrieval performance. The proposed index method is based on the inverted file structure. To support partial string searches, it constructs the index file from the patterns obtained by dividing terms into two-syllable units. We analyze the characteristics of the proposed method through simulation experiments using a wide range of parameter values. We also derive analytic performance evaluation models of the existing inverted file method, the signature file method, and the proposed index method in terms of retrieval time and storage overhead. Through a performance comparison based on the analytic models, we show that the proposed method significantly improves retrieval performance over the existing methods.
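
The pattern-based inverted file described in the abstract above can be illustrated with a minimal sketch. It assumes that the "patterns" are overlapping two-unit windows over each term, with one character standing in for one syllable (which holds for precomposed Hangul syllables). The function names `bigrams`, `build_index`, and `partial_search` are hypothetical and do not come from the paper, and queries are assumed to be at least two units long.

```python
from collections import defaultdict


def bigrams(term):
    """Return the overlapping two-unit patterns of a term."""
    if len(term) < 2:
        return [term]
    return [term[i:i + 2] for i in range(len(term) - 1)]


def build_index(docs):
    """Map each two-unit pattern to the (doc_id, term) pairs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            for pattern in bigrams(term):
                index[pattern].add((doc_id, term))
    return index


def partial_search(index, query):
    """Return ids of documents with a term that contains `query` as a substring."""
    patterns = bigrams(query)
    # Intersect the posting sets of all query patterns to get candidates.
    candidates = set(index.get(patterns[0], set()))
    for pattern in patterns[1:]:
        candidates &= index.get(pattern, set())
    # Verify candidates: sharing all patterns does not guarantee adjacency.
    return sorted({doc_id for doc_id, term in candidates if query in term})


docs = {1: "database indexing", 2: "index structure", 3: "hash table"}
index = build_index(docs)
print(partial_search(index, "index"))  # documents whose terms contain "index"
```

The final substring check is a design choice of this sketch: intersecting posting sets only yields candidates, so each candidate term is verified before its document is reported.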

A Data Mining Approach for Selecting Bitmap Join Indices

  • Bellatreche, Ladjel;Missaoui, Rokia;Necir, Hamid;Drias, Habiba
    • Journal of Computing Science and Engineering, v.1 no.2, pp.177-194, 2007
  • Index selection is one of the most important decisions in the physical design of relational data warehouses. Indices significantly reduce the cost of processing complex OLAP queries, but they consume storage and induce maintenance overhead. Two main types of indices are available: mono-attribute indices (e.g., B-tree, bitmap, hash) and multi-attribute indices (join indices, bitmap join indices). Bitmap join indices are well suited to optimizing star join queries, which are characterized by joins between a large fact table and multiple dimension tables and by selections on the dimension tables. They require less storage due to their binary representation. However, selecting these indices is a difficult task due to the exponential number of candidate attributes to be indexed. Most approaches to index selection follow two main steps: (1) pruning the search space (i.e., reducing the number of candidate attributes) and (2) selecting indices using the pruned search space. In this paper, we first propose a data mining driven approach to prune the search space of the bitmap join index selection problem. As opposed to an existing technique that only uses the frequency of attributes in queries as a pruning metric, our technique uses not only frequencies but also other parameters such as the size of the dimension tables involved in the indexing process, the size of each dimension tuple, and the page size on disk. We then define a greedy algorithm to select bitmap join indices that minimize processing cost and satisfy the storage constraint. Finally, in order to evaluate the efficiency of our approach, we compare it with some existing techniques.
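
The abstract above does not give the cost model or the exact greedy rule, so the following is only a generic sketch of greedy selection under a storage budget. It assumes each candidate index comes with an estimated benefit (processing cost saved) and a storage size, and it ranks candidates by benefit per unit of storage. The function name `greedy_select` and the tuple layout are hypothetical.

```python
def greedy_select(candidates, storage_budget):
    """Pick candidate indices greedily by benefit per unit of storage.

    candidates: list of (name, benefit, size) tuples, with size > 0.
    Returns the chosen names in selection order.
    """
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, used = [], 0
    for name, benefit, size in ranked:
        # Skip candidates that bring no benefit or would exceed the budget.
        if benefit <= 0 or used + size > storage_budget:
            continue
        chosen.append(name)
        used += size
    return chosen


# Hypothetical candidates: (index name, estimated benefit, storage size).
print(greedy_select([("a", 10, 5), ("b", 9, 3), ("c", 4, 4)], storage_budget=8))
```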

Effective Streaming of XML Data for Wireless Broadcasting (무선 방송을 위한 효과적인 XML 스트리밍)

  • Park, Jun-Pyo;Park, Chang-Sup;Chung, Yon-Dohn
    • Journal of KIISE:Databases, v.36 no.1, pp.50-62, 2009
  • In wireless and mobile environments, data broadcasting is recognized as an effective means of data dissemination because of its bandwidth efficiency, energy efficiency, and scalability. In this paper, we address the problem of delayed query processing caused by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data, which enable energy- and latency-efficient broadcast of XML data. We first define the DIX node structure to implement a fully distributed index structure; a DIX node contains the tag name, attributes, and text content of an element as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the wireless stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing over the stream in mobile clients. Through extensive performance experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms previous methods.

An Index-Building Method for Boundary Matching that Supports Arbitrary Partial Denoising (임의의 부분 노이즈제거를 지원하는 윤곽선 매칭의 색인 구축 방법)

  • Kim, Bum-Soo
    • Journal of the Korea Institute of Information and Communication Engineering, v.23 no.11, pp.1343-1350, 2019
  • Converting boundary images to time series makes it feasible to perform boundary matching even on a very large image database, which is very important for interactive and fast matching. Recent research has attempted to perform fast matching that considers partial denoising by converting the boundary image into a time series. In this paper, to improve performance, we propose an index-building method that considers all possible denoising parameters for removing arbitrary partial noise. This is a challenging problem since partial-denoising boundary matching must be considered for all possible denoising parameters. We propose an efficient single index-building algorithm that constructs a minimum bounding rectangle (MBR) covering all possible denoising parameters. The results of extensive experiments show that our index-based matching method improves search performance by a factor of 46.6 to 4023.6.

A Spatial Split Method for Processing of Region Monitoring Queries (영역 모니터링 질의 처리를 위한 공간 분할 기법)

  • Chung, Jaewoo;Jung, HaRim;Kim, Ung-Mo
    • Journal of Internet Computing and Services, v.19 no.1, pp.67-76, 2018
  • This paper addresses the problem of efficient processing of region monitoring queries. The centralized methods used for existing region monitoring query processing assume that each mobile object periodically sends location updates to the server and that the server continuously updates the query results. However, a large number of location updates seriously degrades system performance. Recently, some distributed methods have been proposed for region monitoring query processing. In the distributed methods, the server allocates to each object i) a resident domain that is a subspace of the workspace, and ii) a number of nearby query regions. A moving object sends a location update to the server only when it leaves its resident domain or crosses the boundary of a query region. To allocate the resident domain to a moving object along with the nearby query regions, these methods use a query index structure that is constructed by splitting the workspace recursively into equal halves. However, this index structure causes unnecessary division, resulting in deterioration of system performance. In this paper, we propose an adaptive split method to reduce unnecessary splitting. The workspace is split dynamically by considering i) the spatial relationship between the query region and the resultant subspace, and ii) the distribution of the query regions. We propose an enhanced QR-tree with the new splitting method. Through a set of simulations, we verify the efficiency of the proposed split method.

An Enhancing Technique for Scan Performance of a Skip List with MVCC (MVCC 지원 스킵 리스트의 범위 탐색 향상 기법)

  • Kim, Leeju;Lee, Eunji
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.20 no.5, pp.107-112, 2020
  • Recently, web-based services have been producing unstructured data at a rapid pace. NoSQL systems and key-value stores, which process unstructured data as key and value pairs, are widely used in various applications. This paper studies the skip list used for in-memory data management in an LSM-tree based key-value store. The skip list used in the key-value store is an insertion-based skip list that does not allow overwriting and processes all changes only by inserting. This behavior can support Multi-Version Concurrency Control (MVCC), which can simultaneously process multiple read/write requests through snapshot isolation. However, since duplicate keys exist in the skip list, performance degrades significantly due to unnecessary node visits during a list traversal. In particular, serious overhead occurs during a range query or scan operation, which searches a specific range of data collectively. This paper proposes a newly designed Stride SkipList to reduce this overhead. The Stride SkipList additionally maintains an indexing pointer to the last node of the same key to avoid unnecessary node visits. The proposed scheme is implemented using RocksDB's in-memory component, and the performance evaluation shows that the performance of the SCAN operation improves by up to 350 times compared to the existing skip list for various workloads.
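
The stride pointer described in the abstract above can be illustrated with a simplified sketch. It models only the bottom level of an insertion-based list as a sorted singly linked list, where versions of the same key sit next to each other (newest first) and every node keeps a `stride` pointer to the last node with its key. The class and function names are hypothetical, the list is built in one pass instead of by concurrent inserts, and nothing here uses the RocksDB API.

```python
class Node:
    def __init__(self, key, seq, value):
        self.key, self.seq, self.value = key, seq, value
        self.next = None    # next node in sorted order
        self.stride = self  # last node holding the same key


def build_list(entries):
    """Link (key, seq, value) entries sorted by key, newest version first."""
    ordered = sorted(entries, key=lambda e: (e[0], -e[1]))
    nodes = [Node(*e) for e in ordered]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    # Walk backwards so every node learns the last node of its key run.
    for i in range(len(nodes) - 2, -1, -1):
        if nodes[i].key == nodes[i + 1].key:
            nodes[i].stride = nodes[i + 1].stride
    return nodes[0] if nodes else None


def scan(head, lo, hi):
    """Return (newest (key, value) pairs with lo <= key <= hi, nodes visited)."""
    out, visited, node = [], 0, head
    while node is not None and node.key <= hi:
        visited += 1
        if node.key >= lo:
            out.append((node.key, node.value))
        # Jump over the older versions of this key in one step.
        node = node.stride.next
    return out, visited


head = build_list([("a", 1, "a1"), ("a", 2, "a2"), ("a", 3, "a3"),
                   ("b", 1, "b1"), ("c", 1, "c1"), ("c", 2, "c2")])
print(scan(head, "a", "c"))
```

In this sketch the scan touches one node per distinct key rather than one node per version. A real skip list would also use its upper levels to seek to the start of the range, which is omitted here.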

Edge-based spatial descriptor for content-based Image retrieval (내용 기반 영상 검색을 위한 에지 기반의 공간 기술자)

  • Kim, Nac-Woo;Kim, Tae-Yong;Choi, Jong-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP, v.42 no.5 s.305, pp.1-10, 2005
  • Content-based image retrieval systems are being actively investigated owing to their ability to retrieve images based on the actual visual content rather than on manually associated textual descriptions. In this paper, we propose a novel approach to image retrieval based on edge structural features, using an edge correlogram and a color coherence vector. After the color vector angle is applied in the pre-processing stage, an image is divided into two parts (a high-frequency image and a low-frequency image). In the low-frequency image, the global color distribution of smooth pixels is extracted by the color coherence vector, thereby incorporating spatial information into the proposed color descriptor. Meanwhile, in the high-frequency image, the distribution of the gray pairs at an edge is extracted by the edge correlogram. Since the proposed algorithm includes the spatial and edge information between colors, it can robustly reduce the effect of significant changes in appearance and shape in image analysis. The proposed method provides a simple and flexible description of images with complex scenes in terms of the structural features of the image contents. Experimental evidence suggests that our algorithm outperforms recent histogram refinement methods for image indexing and retrieval. To index the multidimensional feature vectors, we use the R*-tree structure.

An Optimal Design Method for the Multidimensional Nested Attribute Indexes (다차원 중포 속성 색인구조의 최적 설계기법)

  • 이종학
    • Journal of Korea Multimedia Society, v.6 no.2, pp.194-207, 2003
  • This paper presents an optimal design methodology for the multidimensional nested attribute index (MD-NAI), which uses a multidimensional index structure for indexing the nested attributes in object databases. The MD-NAI efficiently supports complex queries involving both nested attributes and class hierarchies, which are not supported by a nested attribute index that uses a one-dimensional index structure such as the $B^+$-tree. However, the performance of the MD-NAI degrades severely for some types of user queries. In this paper, to enhance the performance of the MD-NAI, we first determine the optimal shape of the index page regions by using the query information about the nested predicates, and then construct an optimal MD-NAI by applying a region splitting strategy that makes the shape of the page regions of the MD-NAI as close as possible to the predetermined optimal one. For performance evaluation, we perform extensive experiments with the MD-NAI using various types of nested predicates and object distributions. The results indicate that the proposed method builds an optimal MD-NAI regardless of the query types and object distributions. When the interval ratio of a three-dimensional query region is 1:16:236, the performance of the proposed method is enhanced by as much as 5.5 times over that of the conventional method employing the cyclic splitting strategy.

The Recognition of Occluded 2-D Objects Using the String Matching and Hash Retrieval Algorithm (스트링 매칭과 해시 검색을 이용한 겹쳐진 이차원 물체의 인식)

  • Kim, Kwan-Dong;Lee, Ji-Yong;Lee, Byeong-Gon;Ahn, Jae-Hyeong
    • The Transactions of the Korea Information Processing Society, v.5 no.7, pp.1923-1932, 1998
  • This paper deals with a 2-D object recognition algorithm. We present an algorithm that reduces the computation time of model retrieval by using a hashing technique instead of the binary-tree method. We treat an object boundary as a string of structural units and use an attributed string matching algorithm to compute a similarity measure between two strings. From the privileged strings, we select the privileged string with minimal eccentricity; this privileged string is treated as the reference string. We then construct a hash table using the distance between a privileged string and the reference string as the key value. Once the database of all model strings is built, recognition proceeds by segmenting the scene into a polygonal approximation. The distance between the privileged string extracted from the scene and the reference string is used for model hypothesis retrieval from the table. Computer simulation shows that the proposed method can recognize objects by computing the distance only 2-3 times, while the previous method must compute the distance 8-10 times for model retrieval.

A Multi-dimensional Range Query Processing using Space Filling Curves (공간 순서화 곡선을 이용한 다차원 영역 질의 처리)

  • Back, Hyun;Won, Jung-Im;Yoon, Jee-Hee
    • Journal of Korea Spatial Information System Society, v.8 no.2 s.17, pp.13-38, 2006
  • The range query is one of the most important operations on spatial objects; it retrieves all spatial objects that overlap a given query region in multi-dimensional space. DOT (DOuble Transformation) is known as an efficient indexing method; it transforms the MBR of a spatial object into a single numeric value using a space filling curve and stores the value in a $B^+$-tree. The DOT index can be employed as a primary index for spatial objects. However, range query processing based on the DOT index incurs considerable overhead for the spatial transformations needed to obtain the query region in the final space. Also, a detailed range query processing method for 2-dimensional spatial objects has not been studied yet. In this paper, we propose an efficient multi-dimensional range query processing technique based on the DOT index. The proposed technique exploits the regularities in the moving patterns of space filling curves to divide a query region into a set of maximal sub-regions within which the space filling curve traverses without interruption. Such division reduces the number of spatial transformations required to perform the range query and thus improves the performance of range query processing. A visual simulator is developed to show the evaluation method and the performance of our technique.
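
The mapping from an MBR to a single numeric value, as described in the abstract above, can be sketched as follows. The sketch assumes that a 2-D MBR is first treated as a 4-D point (xlo, ylo, xhi, yhi) and that the space filling curve is a Z-order (bit-interleaving) curve; the abstract does not say which curve is used, so that choice is an assumption. The function names `z_order` and `mbr_key` are hypothetical, and coordinates are assumed to be non-negative integers that fit in `bits` bits.

```python
def z_order(coords, bits=8):
    """Interleave the bits of non-negative integer coordinates into one key."""
    key = 0
    for bit in range(bits - 1, -1, -1):  # most significant bit first
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key


def mbr_key(xlo, ylo, xhi, yhi, bits=8):
    """Treat a 2-D MBR as a 4-D point and map it to a single sortable value."""
    return z_order((xlo, ylo, xhi, yhi), bits)


# Keys like these could be stored in any ordered index, such as a B+-tree.
print(mbr_key(2, 3, 5, 7))
```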
