• Title/Summary/Keyword: Storage and Index Structure

Design and Performance Evaluation of an Indexing Method for Partial String Searches (문자열 부분검색을 위한 색인기법의 설계 및 성능평가)

  • Gang, Seung-Heon;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society / v.6 no.6 / pp.1458-1467 / 1999
  • Existing index structures such as extendable hashing and the B+-tree do not fully support partial string searches. The inverted file method and the signature file method used in web retrieval engines also have problems: the former does not provide partial string searches, and the latter suffers from serious retrieval performance degradation. In this paper, we propose an efficient index method that supports partial string searches and achieves good retrieval performance. The proposed index method is based on the inverted file structure. To support partial string searches, it constructs the index file from patterns obtained by dividing terms into two-syllable units. We analyze the characteristics of the proposed method through simulation experiments using a wide range of parameter values. We also derive analytic performance evaluation models of the existing inverted file method, the signature file method, and the proposed index method in terms of retrieval time and storage overhead. We show through a performance comparison based on the analytic models that the proposed method significantly improves retrieval performance over the existing methods.
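
The idea of indexing two-syllable patterns can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that one character stands for one syllable (as in Hangul), that the patterns overlap, and that postings are plain in-memory sets. The names `bigrams`, `build_index`, and `partial_search` are hypothetical.

```python
from collections import defaultdict

def bigrams(term):
    """Split a term into overlapping two-character patterns (assumed scheme)."""
    if len(term) < 2:
        return [term]
    return [term[i:i + 2] for i in range(len(term) - 1)]

def build_index(docs):
    """Map each two-character pattern to a set of (doc_id, term) postings."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            for pattern in bigrams(term):
                index[pattern].add((doc_id, term))
    return index

def partial_search(index, query):
    """Return sorted ids of documents with a term containing `query`.

    Queries shorter than two characters only match one-character terms
    in this simplified sketch.
    """
    postings = [index.get(p, set()) for p in bigrams(query)]
    candidates = set.intersection(*postings)
    # Patterns can co-occur in a term without being adjacent, so each
    # candidate term is verified against the full query string.
    return sorted({doc_id for doc_id, term in candidates if query in term})

docs = {1: "database indexing", 2: "index structure", 3: "hashing"}
index = build_index(docs)
print(partial_search(index, "dex"))
```

An ordinary inverted file keyed on whole terms could not answer the query `dex`, because no term equals it; here the lookup intersects the postings of `de` and `ex` and then verifies the candidates.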

Concurrency Control and Recovery Methods for Multi-Dimensional Index Structures (다차원 색인구조를 위한 동시성제어 기법 및 회복기법)

  • Song, Seok-Il;Yoo, Jae-Soo
    • The KIPS Transactions:PartD / v.10D no.2 / pp.195-210 / 2003
  • In this paper, we propose an enhanced concurrency control algorithm that maximizes the concurrency of multi-dimensional index structures. The factors that degrade the concurrency of multi-dimensional index structures are node splits and minimum bounding region (MBR) updates. The proposed algorithm introduces the partial lock coupling (PLC) technique to avoid lock coupling during MBR updates. We also propose a new MBR update method that allows searchers to access nodes on which MBR updates are in progress. To reduce the performance degradation caused by node splits, the proposed algorithm holds exclusive latches not for the whole split time but only during the physical node split, which occupies a small part of the whole split process. For performance evaluation, we implement the proposed algorithm and one of the existing link technique-based algorithms on MIDAS-3, the storage system of the BADA-4 DBMS. We show through various experiments that the proposed algorithm outperforms the existing algorithm in terms of throughput and response time. We also propose a recovery protocol for the proposed concurrency control algorithm, designed to ensure high concurrency and fast recovery.

XML Repository Model based on the Edge-Labeled Graph (Edge-Labeled Graph를 적용한 XML 저장 모델)

  • 김정희;곽호영
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.5 / pp.993-1001 / 2003
  • An RDB storage model based on the edge-labeled graph is proposed for storing XML instances in relational databases (RDB). The XML instance to be stored is represented by a data graph based on the edge-labeled graph. Data Path Table, Element, Attribute, and Table Index Table values are extracted from it. A database schema is then defined, and the extracted values are stored using the Mapper. To support queries, the repository model provides a translator that converts XQL, which is used as a query language under XPath, into SQL. In addition, it provides a DBtoXML generator that restores the stored XML instance. As a result, the storage relationship between the XML instance and the proposed model structure can be expressed in terms of graph-based paths, and the model shows that arbitrary Element and Attribute information can be searched easily.
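
A rough sketch of how an XML instance might be shredded into path, element, and attribute rows is given here. The abstract does not give the paper's schema, so the row layouts and the function name `shred` are assumptions made for illustration, and the Table Index Table, the Mapper, and the XQL translator are not modeled.

```python
import xml.etree.ElementTree as ET

def shred(xml_text):
    """Return (paths, elements, attributes) rows for one XML instance.

    Assumed row layouts (not taken from the paper):
      paths      : label path -> path_id
      elements   : (node_id, parent_id, path_id, text)
      attributes : (node_id, name, value)
    """
    paths, elements, attributes = {}, [], []

    def visit(node, parent_id, parent_path):
        path = parent_path + "/" + node.tag
        path_id = paths.setdefault(path, len(paths) + 1)
        node_id = len(elements) + 1
        elements.append((node_id, parent_id, path_id, (node.text or "").strip()))
        for name, value in node.attrib.items():
            attributes.append((node_id, name, value))
        for child in node:
            visit(child, node_id, path)

    visit(ET.fromstring(xml_text), None, "")
    return paths, elements, attributes

xml_text = '<book id="b1"><title>XML</title><author>Kim</author><author>Kwak</author></book>'
paths, elements, attributes = shred(xml_text)
print(paths)
print(elements)
print(attributes)
```

Each row list could be loaded into a relational table as it stands; finding every `author` element then reduces to a lookup on the path id of `/book/author`.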

Indexing and Storage Schemes for Keyword-based Query Processing over Semantic Web Data (시맨틱 웹 데이터의 키워드 질의 처리를 위한 인덱싱 및 저장 기법)

  • Kim, Youn-Hee;Shin, Hye-Yeon;Lim, Hae-Chull;Chong, Kyun-Rak
    • Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.93-102 / 2007
  • On the Semantic Web, metadata and ontology can be used with inference to retrieve related information more accurately and simply. RDF and RDF Schema are general languages for representing metadata and ontology. The enormous number of keywords on the Semantic Web is very important for building practical Semantic Web applications, because most users prefer to search with keywords. In this paper, we treat a resource as the unit of query results. We classify queries with keyword conditions into three patterns and propose indexing techniques for keyword search that consider both metadata and ontology. Our index maintains not only resources that contain keywords directly but also resources that contain keywords indirectly through conceptual relationships between resources. Thus, if a user wants to search for resources that contain a certain keyword, all such resources are retrieved using our keyword index. We also propose a table structure for storing RDF Schema information that is labeled using some simple methods.

Compressing Method of NetCDF Files Based on Sparse Matrix (희소행렬 기반 NetCDF 파일의 압축 방법)

  • Choi, Gyuyeun;Heo, Daeyoung;Hwang, Suntae
    • KIISE Transactions on Computing Practices / v.20 no.11 / pp.610-614 / 2014
  • Like many types of scientific data, the results of volcanic ash diffusion simulations take the form of clustered sparse matrices stored in the netCDF format. Because these data sets are large, they incur high storage and transmission costs. In this paper, we suggest a new method that reduces the size of volcanic ash diffusion simulation data by converting the multi-dimensional index to a single dimension and keeping only the starting point and length of each run of consecutive zeros. The method compresses almost as well as ZIP compression but does not destroy the netCDF structure. It is expected to allow storage space to be used efficiently by reducing both the data size and the network transmission time.
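
The flatten-and-record-zero-runs step can be sketched in a few lines. This is only an illustration on nested Python lists: it assumes row-major flattening and does not read or write real netCDF files, and the function names `flatten`, `compress`, and `decompress` are hypothetical.

```python
def flatten(grid):
    """Flatten a nested list of any depth in row-major order (assumed order)."""
    if not isinstance(grid, list):
        return [grid]
    flat = []
    for item in grid:
        flat.extend(flatten(item))
    return flat

def compress(flat):
    """Return (values, zero_runs): nonzero values in order, plus
    (start, length) pairs describing each run of consecutive zeros."""
    values, zero_runs = [], []
    i = 0
    while i < len(flat):
        if flat[i] == 0:
            start = i
            while i < len(flat) and flat[i] == 0:
                i += 1
            zero_runs.append((start, i - start))
        else:
            values.append(flat[i])
            i += 1
    return values, zero_runs

def decompress(values, zero_runs, size):
    """Rebuild the flat array from the nonzero values and the zero runs."""
    is_zero = [False] * size
    for start, length in zero_runs:
        for j in range(start, start + length):
            is_zero[j] = True
    remaining = iter(values)
    return [0 if zero else next(remaining) for zero in is_zero]

grid = [[0, 0, 0, 5], [7, 0, 0, 0], [0, 0, 0, 0]]
flat = flatten(grid)
values, zero_runs = compress(flat)
print(values, zero_runs)
```

Flattening first lets a zero run continue across row boundaries, which is why clustered sparse data needs only a handful of `(start, length)` pairs.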

Linear Path Query Processing using Backward Label Path on XML Documents (역방향 레이블 경로를 이용한 XML 문서의 선형 경로 질의 처리)

  • Park, Chung-Hee;Koo, Heung-Seo;Lee, Sang-Joon
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.6 / pp.766-772 / 2007
  • As XML is widely used, much research on XML storage and query processing has been done. However, previous work on path query processing has mainly focused on storage and retrieval methods for a single large XML document or for XML documents that share the same DTD. That research did not efficiently process partial match queries on differently structured document sets. To resolve the problem, we propose a new index structure using relational tables. The method constructs a B+-tree index over backward label paths, instead of the forward label paths used in previous research, to store path information, which allows the label paths that match a partial match query to be found efficiently during query processing.
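
One way to see why backward label paths help is that a partial match query such as `//chapter/title` constrains the end of a path, so on reversed paths it becomes a prefix search that an ordered index can answer with a range scan. The sketch below assumes this reading and substitutes a sorted list with `bisect` for the B+-tree; the function names are hypothetical.

```python
import bisect

def backward(path):
    """Reverse a label path: '/book/chapter/title' -> 'title/chapter/book/'."""
    labels = [label for label in path.split("/") if label]
    return "/".join(reversed(labels)) + "/"

def build_index(label_paths):
    """Sorted list of backward label paths (stand-in for a B+-tree)."""
    return sorted(backward(p) for p in set(label_paths))

def match_suffix(index, query):
    """Return forward label paths ending with the labels of `query`,
    where `query` looks like '//chapter/title'."""
    prefix = backward(query)
    matches = []
    # All keys sharing the prefix are contiguous in sorted order.
    for key in index[bisect.bisect_left(index, prefix):]:
        if not key.startswith(prefix):
            break
        labels = key.strip("/").split("/")
        matches.append("/" + "/".join(reversed(labels)))
    return matches

label_paths = [
    "/book/chapter/title",
    "/article/section/title",
    "/book/title",
    "/book/chapter/subtitle",
    "/book/author",
]
index = build_index(label_paths)
print(match_suffix(index, "//chapter/title"))
print(match_suffix(index, "//title"))
```

The trailing `/` on every key keeps the prefix test aligned with label boundaries, so `//title` does not match a path ending in `subtitle`.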

Representing and retrieving the Structured Information of XML Documents (XML 문서에 포함된 구조 정보의 표현과 검색)

  • Jo, Yun-Gi;Jo, Jeong-Gil;Lee, Byeong-Ryeol;Gu, Yeon-Seol
    • The KIPS Transactions:PartD / v.8D no.4 / pp.361-366 / 2001
  • As the number of websites grows, the total amount of accessible information is greater than ever. To store and retrieve this vast amount of information effectively, much research has been done using XML (Extensible Markup Language). In this paper, we propose an effective representation method and retrieval mechanism for the structured retrieval of XML documents: (1) the fixed-size LETID (Leveled Element Type ID), which contains information about an element's parent node, sibling nodes, and identical sibling nodes, as well as the hierarchical information of the current node, and (2) a content index, a structure index, an attribute index model, and an information retrieval algorithm for structured information retrieval. With our methods, we can effectively represent the structured information of XML documents and can directly access specific elements with simple operations to process various queries.

An RDBMS-based Inverted Index Technique for Path Queries Processing on XML Documents with Different Structures (상이한 구조의 XML문서들에서 경로 질의 처리를 위한 RDBMS기반 역 인덱스 기법)

  • 민경섭;김형주
    • Journal of KIISE:Databases / v.30 no.4 / pp.420-428 / 2003
  • XML is a data-oriented language for representing all types of documents, including web documents. With the advent of XML-based document generation tools, the growth of proprietary XML documents produced with those tools, and the accelerating translation of legacy data into XML documents, a large number of differently structured XML documents has accumulated. It is therefore increasingly important to retrieve the right documents from such a document set. However, previous work on XML has mainly focused on storage and retrieval methods for a single large XML document or for XML documents that share the same DTD, and the research that did support structural differences did not process path queries on the document set efficiently. To resolve the problem, we propose a new inverted index mechanism using an RDBMS and show that it outperforms the previous work. In particular, since it shows higher efficiency for indirect containment relationships, we argue that the index structure is well suited to differently structured XML document sets.

Data Model, Query Language, and Indexing Scheme for Structured Video Documents (구조화된 비디오 문서의 데이터 모델 및 질의어와 색인 기법)

  • 류은숙;이규철
    • Journal of Korea Multimedia Society / v.1 no.1 / pp.1-17 / 1998
  • Video information is an important component of multimedia systems such as digital libraries, the World-Wide Web (WWW), and Video-On-Demand (VOD) service systems. Video information inherently has a hierarchical document structure, so it is called a "structured video document" in this paper. This paper proposes a data model, a query language, and an indexing scheme for structured video documents in order to store, retrieve, and share video documents efficiently. To represent structured video documents, the object-oriented data modeling technique is used, since the hierarchical structure information can be modeled as complex objects. We also define object types for the structure information. Our query language supports not only content-based retrieval but also structure-based retrieval, that is, queries based on the structure of video documents, as well as spatial/temporal relations for video documents. To perform structure queries efficiently and to reduce the storage overhead of indices, an optimized inverted index structure is proposed.

An Efficient Concurrency Control Algorithm for Multi-dimensional Index Structures (다차원 색인구조를 위한 효율적인 동시성 제어기법)

  • 김영호;송석일;유재수
    • Journal of KIISE:Databases / v.30 no.1 / pp.80-94 / 2003
  • In this paper, we propose an enhanced concurrency control algorithm that efficiently minimizes query delay. The factors that delay search operations and degrade the concurrency of multi-dimensional index structures are node splits and MBR updates. In our algorithm, to reduce the query delay caused by split operations, we optimize the exclusive latching time on a split node: the algorithm holds exclusive latches not for the whole split time but only during the physical node split, which occupies a small part of the whole split time. To avoid the query delay caused by MBR updates, we introduce the partial lock coupling (PLC) technique. The PLC technique increases concurrency by using lock coupling only for MBR shrinking operations, which are less frequent than MBR expansion operations. For performance evaluation, we implement the proposed algorithm and one of the existing link technique-based algorithms on MIDAS-III, the storage system of the BADA-III DBMS. We show through various experiments that the proposed algorithm outperforms the existing algorithm in terms of throughput and response time.