• Title/Summary/Keyword: Array Query Processing


An Index Structure for Efficient X-Path Processing on S-XML Data (S-XML 데이터의 효율적인 X-Path 처리를 위한 색인 구조)

  • Zhang, Gi;Jang, Yong-Il;Park, Soon-Young;Oh, Young-Hwan;Bae, Hae-Young
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.51-54 / 2005
  • This paper proposes an index structure for processing X-Path queries on S-XML data. Many earlier index structures for X-Path processing are tree-based. Because a general tree index is queried top-down, unnecessary node traversals cause heavy access and degrade query processing performance. The proposed index structure must also support both X-Path query types, single-path queries and branching queries. The method combines a path summary with node indexing. First, it hashes the hierarchy elements that appear as tags in S-XML. Second, it creates array blocks, called path summary arrays, in each hash node to store path information. X-Path processing finds the tag element through hashing and then checks the array blocks in that node to determine the paths in the query result. With this structure, the index supports both single-path and branching path queries and improves X-Path processing performance.

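
The lookup described in the abstract above can be sketched in Python. This is a minimal illustration under assumptions, not the authors' implementation: the document is modeled as nested `(tag, children)` tuples, a dictionary keyed by tag name stands in for the hash, and each entry's mapping of root-to-node paths stands in for the path summary array. The names `build_index` and `single_path_query` are hypothetical, and only single-path queries are covered.

```python
from collections import defaultdict


def build_index(doc):
    """Return tag -> {root-to-node path: [node ids]} for a (tag, children) tree.

    Node ids are assigned in document (preorder) order.
    """
    index = defaultdict(lambda: defaultdict(list))
    next_id = 0
    stack = [(doc, ())]
    while stack:
        (tag, children), prefix = stack.pop()
        path = prefix + (tag,)
        index[tag][path].append(next_id)  # one "path summary" entry per distinct path
        next_id += 1
        # Push children in reverse so they are visited left to right.
        for child in reversed(children):
            stack.append((child, path))
    return index


def single_path_query(index, steps):
    """Return ids of nodes whose root path ends with the given tag sequence."""
    steps = tuple(steps)
    hits = []
    # Hash lookup on the last tag, then scan only that tag's path entries.
    for path, ids in index.get(steps[-1], {}).items():
        if path[-len(steps):] == steps:
            hits.extend(ids)
    return sorted(hits)


# Hypothetical sample document, not taken from the paper.
doc = ("map", [
    ("layer", [("road", [("name", [])]), ("river", [("name", [])])]),
    ("layer", [("road", [])]),
])
index = build_index(doc)
print(single_path_query(index, ["road", "name"]))
print(single_path_query(index, ["layer", "road"]))
```

The point of the design is that a query never walks the tree from the root: it jumps to the entry for its last tag and inspects only the paths stored there.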

Suffix Array Based Path Query Processing Scheme for Semantic Web Data (시맨틱 웹 데이터에서 접미사 배열 기반의 경로 질의 처리 기법)

  • Kim, Sung-Wan
    • Journal of the Korea Society of Computer and Information / v.17 no.10 / pp.107-116 / 2012
  • The application of semantic technologies, which aim to let computers understand and automatically process the meaning of interlinked data on the Web, is spreading. In the Semantic Web, it is important not only to access the data itself but also to understand and access the associations between data, that is, the meaning that links them. W3C recommended RDF (Resource Description Framework) as a standard format for representing both Semantic Web data and their associations, and also proposed several RDF query languages to support query processing for RDF data. However, further research is still required on query language definitions that consider semantic associations and on query processing techniques. In this paper, using the suffix array-based indexing scheme previously introduced for RDF query processing, we propose a query processing approach for the ${\rho}$-path query, a representative type of semantic association. To evaluate the performance of the proposed approach, we implemented two other query processing approaches and measured the average query processing times. The experiments show that the proposed approach performed 1.8 to 2.5 times and 3.8 to 11 times better, respectively, than the other two.

Processing of ρ-intersect Operation on RDF Data Using Suffix Array (RDF 데이터에서 접미사 배열을 이용한 ρ-intersect 연산의 처리)

  • Kim, Sung-Wan;Kim, Youn-Hee
    • Journal of the Korea Society of Computer and Information / v.16 no.7 / pp.95-103 / 2011
  • The practical use of Semantic Web technology, which aims to provide more intelligent and automated information retrieval services over the Web, is gradually becoming a reality. RDF is widely used as one of the standard formats for representing and managing the voluminous data on the Web. Efficient query processing on RDF data is therefore an ongoing research topic. Retrieving the resources that have a specific association with a given resource is the typical query type, and several studies have addressed it. However, most previous studies have not fully considered the discovery of complex relationships among resources, such as returning the association between resources as the query result. This paper introduces indexing and query processing for the ${\rho}$-intersect operation, which is one of the semantic association retrieval types. It includes an indexing scheme using a suffix array and optimized processing approaches for the ${\rho}$-intersect operation. The experimental evaluation shows that the proposed approach is 3 to 7 times faster in average execution time than the previous approach.

An Efficient Method for Finding Similar Regions in a 2-Dimensional Array Data (2차원 배열 데이터에서 유사 구역의 효율적인 탐색 기법)

  • Choe, YeonJeong;Lee, Ki Yong
    • KIPS Transactions on Software and Data Engineering / v.6 no.4 / pp.185-192 / 2017
  • In various fields of science, 2-dimensional array data is being actively generated by measurements and simulations. Although various query processing techniques for array data are being studied, the problem of finding similar regions, whose sizes are not known in advance, in a 2-dimensional array has not yet been addressed. Therefore, in this paper, we propose an efficient method for finding regions with similar element values, whose sizes are larger than a user-specified value, in a given 2-dimensional array. For each pair of elements in the array, the proposed method expands the two corresponding regions, whose initial size is 1, to the right and downward in stages, keeping the shapes of the two regions the same. If the difference between the element values in the two regions becomes larger than a user-specified value, the method stops the expansion. Consequently, the proposed method finds similar regions efficiently by accessing only those parts that are likely to be similar regions. Through theoretical analysis and various experiments, we show that the proposed method can find similar regions very efficiently.
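
The staged expansion described in the abstract above can be sketched for a single pair of starting cells. This is a simplified illustration under assumptions, not the authors' algorithm: "difference" is taken to mean the absolute difference between elements at the same offset in the two regions, each stage tries one extra column and then one extra row, and overlapping regions are not excluded. The function name `expand_pair` and the sample grid are hypothetical.

```python
def expand_pair(grid, a, b, eps):
    """Grow two equally shaped regions anchored at cells a and b.

    Returns (height, width) of the regions when expansion stops,
    or (0, 0) if the two anchor cells already differ by more than eps.
    """
    rows, cols = len(grid), len(grid[0])
    (ar, ac), (br, bc) = a, b
    if abs(grid[ar][ac] - grid[br][bc]) > eps:
        return (0, 0)
    h = w = 1
    grew = True
    while grew:
        grew = False
        # Stage 1: try to add one column on the right of both regions.
        if max(ac, bc) + w < cols and all(
            abs(grid[ar + i][ac + w] - grid[br + i][bc + w]) <= eps
            for i in range(h)
        ):
            w += 1
            grew = True
        # Stage 2: try to add one row at the bottom of both regions.
        if max(ar, br) + h < rows and all(
            abs(grid[ar + h][ac + j] - grid[br + h][bc + j]) <= eps
            for j in range(w)
        ):
            h += 1
            grew = True
    return (h, w)


# Hypothetical sample data.
grid = [
    [1, 2, 9, 1, 2],
    [3, 4, 9, 3, 5],
    [7, 7, 7, 0, 0],
]
print(expand_pair(grid, (0, 0), (0, 3), eps=1))
```

A full search would call this for every pair of cells and keep the pairs whose final size exceeds the user-specified minimum. Stopping as soon as a new row or column breaks the threshold is what limits access to parts of the array that are likely to be similar.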

A Study of Path-based Retrieval for JSON Data Using Suffix Arrays (접미사 배열을 이용한 JSON 데이터의 경로 기반 검색에 대한 연구)

  • Kim, Sung Wan
    • Journal of Creative Information Culture / v.7 no.3 / pp.157-165 / 2021
  • As application services built on the Web and IoT spread and the need to manage large amounts of data grows accordingly, efficient data representation and exchange schemes and data query processing are becoming more important. JSON, characterized by its simplicity, is used in various fields as a format for data exchange and data storage in place of XML, a standard data representation and exchange language on the Web. It is therefore important to develop indexing and query processing techniques for effectively accessing and searching large amounts of data expressed in JSON. In this paper, we model hierarchically structured JSON data as a tree and propose indexing and query processing based on the path concept. In particular, we design an index structure using a suffix array, which is widely used in text search, and introduce methods for processing simple and complex path-based queries on JSON data.
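
A suffix array over label paths, as the abstract above outlines, can be sketched as follows. This is an illustration under assumptions rather than the paper's index design: every node is represented by its sequence of object keys from the root, array items share their parent's path, the suffix array is built by plain sorting, and the names `label_paths`, `build_suffix_array`, and `find` are hypothetical.

```python
from bisect import bisect_left


def label_paths(value, prefix=()):
    """Yield the key path of every object member in a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = prefix + (key,)
            yield path
            yield from label_paths(child, path)
    elif isinstance(value, list):
        for child in value:
            # Assumption: array items do not add a path step of their own.
            yield from label_paths(child, prefix)


def build_suffix_array(data):
    """Return (distinct paths, sorted list of (suffix, path id, offset))."""
    paths = sorted(set(label_paths(data)))
    suffixes = sorted(
        (p[i:], pid, i) for pid, p in enumerate(paths) for i in range(len(p))
    )
    return paths, suffixes


def find(paths, suffixes, fragment):
    """Return every indexed path that contains the given key sequence."""
    fragment = tuple(fragment)
    # Binary search for the first suffix that is >= fragment.
    pos = bisect_left(suffixes, (fragment,))
    hits = set()
    # All suffixes that start with fragment are adjacent in sorted order.
    while pos < len(suffixes) and suffixes[pos][0][:len(fragment)] == fragment:
        hits.add(paths[suffixes[pos][1]])
        pos += 1
    return sorted(hits)


# Hypothetical sample data.
data = {
    "store": {
        "book": [{"title": "A", "price": 1}, {"title": "B"}],
        "bike": {"price": 2},
    }
}
paths, suffixes = build_suffix_array(data)
print(find(paths, suffixes, ["price"]))
print(find(paths, suffixes, ["book", "title"]))
```

Because every suffix of every path is indexed, a query can start at any depth of the hierarchy, which is what makes partial path expressions answerable with one binary search.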

An Algorithm for Computing Range-Groupby Queries (영역-그룹화 질의 계산 알고리즘)

  • Lee, Yeong-Gu;Mun, Yang-Se;Hwang, Gyu-Yeong
    • Journal of KIISE:Databases / v.29 no.4 / pp.247-261 / 2002
  • Aggregation is an important operation that affects the performance of OLAP systems. In this paper we define a new class of aggregation queries, called range-groupby queries, and present a method for processing them. A range-groupby query is defined as a query that, for an arbitrarily specified region of an n-dimensional cube, computes aggregations for each combination of values of the grouping attributes. Range-groupby queries are used very frequently in analyzing information in MOLAP since they allow us to summarize various trends in an arbitrarily specified subregion of the domain space. In MOLAP applications, a method of maintaining precomputed aggregation results, called the prefix-sum array, is widely used to improve query processing performance. For range-groupby queries, however, maintaining precomputed aggregation results for each combination of the grouping attributes incurs enormous storage overhead. Here, we propose a fast algorithm that can compute range-groupby queries with minimal storage overhead. Our algorithm maintains only one prefix-sum array and still effectively processes range-groupby queries for all possible combinations of the grouping attributes. Compared with the method that maintains a prefix-sum array for each combination of the grouping attributes in an n-dimensional cube, our algorithm reduces the space overhead by (equation omitted), while accessing a similar number of cells.
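
The prefix-sum array that the abstract above names as the common MOLAP precomputation can be illustrated in two dimensions. The sketch shows only the standard prefix-sum technique for range sums, not the paper's range-groupby algorithm; the function names and the sample cube are hypothetical.

```python
def build_prefix_sum(cube):
    """Return P where P[r][c] is the sum of cube[0..r-1][0..c-1]."""
    rows, cols = len(cube), len(cube[0])
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            # Inclusion-exclusion over the three already computed neighbors.
            P[r + 1][c + 1] = cube[r][c] + P[r][c + 1] + P[r + 1][c] - P[r][c]
    return P


def range_sum(P, r1, c1, r2, c2):
    """Sum of the cells in rows r1..r2 and columns c1..c2 (inclusive)."""
    return P[r2 + 1][c2 + 1] - P[r1][c2 + 1] - P[r2 + 1][c1] + P[r1][c1]


# Hypothetical 3 x 3 cube of measure values.
cube = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
P = build_prefix_sum(cube)
print(range_sum(P, 1, 1, 2, 2))  # 5 + 6 + 8 + 9
```

Any rectangular range sum then costs four lookups regardless of the size of the range. Keeping one such array per combination of grouping attributes is the storage cost the abstract refers to.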

A Memory Efficient Anti-Collision Protocol to Identify Memoryless RFID Tags

  • Jung, Haejae
    • Journal of Information Processing Systems / v.11 no.1 / pp.95-103 / 2015
  • This paper presents a memory-efficient, tree-based anti-collision protocol for identifying memoryless RFID (Radio Frequency Identification) tags that may be attached to products. The proposed deterministic scheme uses two bit arrays instead of a stack or queue and requires only ${\Theta}(n)$ space, which is better than earlier schemes that use at least $O(n^2)$ space, where n is the length of a tag ID in bits. Also, the size n of each bit array is independent of the number of tags to identify. Our simulation results show that the bit array scheme consumes much less memory than the earlier schemes that use a queue or stack.

Minimizing the MOLAP/ROLAP Divide: You Can Have Your Performance and Scale It Too

  • Eavis, Todd;Taleb, Ahmad
    • Journal of Computing Science and Engineering / v.7 no.1 / pp.1-20 / 2013
  • Over the past generation, data warehousing and online analytical processing (OLAP) applications have become the cornerstone of contemporary decision support environments. Typically, OLAP servers are implemented either on top of proprietary array-based storage engines (MOLAP) or as extensions to conventional relational DBMSs (ROLAP). While MOLAP systems do indeed provide impressive performance on common analytics queries, they tend to have limited scalability. Conversely, ROLAP's table-oriented model scales quite nicely, but offers mediocre performance at best relative to the MOLAP systems. In this paper, we describe a storage and indexing framework that aims to provide both MOLAP-like performance and ROLAP-like scalability by essentially combining some of the best features from both. Based upon a combination of R-trees and bitmap indexes, the storage engine has been integrated with a robust OLAP query engine prototype that is able to fully exploit the efficiency of the proposed storage model. Specifically, it utilizes an OLAP algebra coupled with a domain-specific query optimizer to map user queries directly to the storage and indexing framework. Experimental results demonstrate that not only does the design improve upon more naive approaches, but that it does indeed offer the potential to optimize both query performance and scalability.

Processing of ${\rho}$-intersect Operation for Semantic Association Discovery (시맨틱 연관성 검색을 위한 ${\rho}$-intersect 연산의 처리)

  • Kim, Sung-Wan
    • Proceedings of the Korean Society of Computer Information Conference / 2011.01a / pp.285-288 / 2011
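  • Several RDF query languages have been proposed for processing queries on RDF data, which represents metadata on the Semantic Web, but they do not sufficiently support the discovery of complex relationships among resources. This paper introduces a method for processing the ${\rho}$-intersect operation, which is one type of semantic association retrieval. To this end, it uses suffix array-based indexing together with an optimization method that takes the characteristics of the ${\rho}$-intersect operation into account. The proposed processing technique makes it possible to support semantic association queries as well as typical RDF query types.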


Efficient Data Scheduling considering number of Spatial query of Client in Wireless Broadcast Environments (무선방송환경에서 클라이언트의 공간질의 수를 고려한 효율적인 데이터 스케줄링)

  • Song, Doohee;Park, Kwangjin
    • Journal of Internet Computing and Services / v.15 no.2 / pp.33-39 / 2014
  • In a wireless broadcast environment, spatial data is transferred from server to client as follows: the server arranges the data that clients want and transmits it as a one-dimensional array in each broadcast cycle. The client listens to the data transmitted by the server and returns only the resulting value to the server. Recently, the number of users of location-based services has been increasing along with the number of objects, and data volumes are becoming large. A large data volume in a wireless broadcast environment may increase the client's query time. Therefore, we propose Client based Data Scheduling (CDS) for efficient data scheduling in wireless broadcast environments. CDS divides the map into grids and calculates a total for each grid by considering the number of objects and the data size within it. It then schedules data by applying a hot-cold method that considers the total data size of the objects in each grid and the number of clients. CDS is shown to reduce the average query processing time for clients compared with the existing method.