• Title/Summary/Keyword: Query efficiency

An Efficient ROLAP Cube Generation Scheme (효율적인 ROLAP 큐브 생성 방법)

  • Kim, Myung;Song, Ji-Sook
    • Journal of KIISE:Databases
    • /
    • v.29 no.2
    • /
    • pp.99-109
    • /
    • 2002
  • ROLAP (Relational Online Analytical Processing) is a process and methodology for multidimensional data analysis that is essential for extracting desired data and deriving value-added information from an enterprise data warehouse. To speed up query processing, most ROLAP systems pre-compute summary tables. This process is called 'cube generation', and it mostly involves intensive table-sorting stages. (1) showed that it is much faster to generate ROLAP summary tables indirectly using a MOLAP (multidimensional OLAP) cube generation algorithm. In this paper, we present such an indirect ROLAP cube generation algorithm that is fast and scalable. High memory utilization is achieved by slicing the input fact table along one or more dimensions before generating summary tables. High speed is achieved by producing summary tables from their smallest parents. We show the efficiency of our algorithm through experiments.
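The "smallest parent" idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm (the slicing of the fact table and the MOLAP-style generation are omitted); the names `roll_up` and `build_cube`, the dictionary representation of summary tables, and the use of a summed measure are all assumptions made for illustration.

```python
from itertools import combinations


def roll_up(parent_dims, parent_rows, child_dims):
    """Aggregate a parent summary table down to a subset of its dimensions."""
    idx = [parent_dims.index(d) for d in child_dims]
    child = {}
    for key, total in parent_rows.items():
        child_key = tuple(key[i] for i in idx)
        child[child_key] = child.get(child_key, 0) + total
    return child


def build_cube(dims, facts):
    """Compute every group-by of `dims`, each from its smallest parent.

    dims:  sequence of dimension names (hypothetical schema)
    facts: list of (key_tuple, measure) rows, key_tuple ordered like dims
    """
    dims = tuple(dims)
    base = {}
    for key, measure in facts:
        base[key] = base.get(key, 0) + measure
    cube = {dims: base}
    # Walk the lattice from wide group-bys to narrow ones.
    for size in range(len(dims) - 1, -1, -1):
        for child_dims in combinations(dims, size):
            # Candidate parents have exactly one more dimension.
            parents = [p for p in cube
                       if len(p) == size + 1 and set(child_dims) <= set(p)]
            # Pick the parent with the fewest rows to minimize work.
            smallest = min(parents, key=lambda p: len(cube[p]))
            cube[child_dims] = roll_up(smallest, cube[smallest], child_dims)
    return cube


facts = [(("a", "x"), 1), (("a", "y"), 2), (("b", "x"), 3)]
cube = build_cube(("d1", "d2"), facts)
```

Choosing the parent with the fewest rows means each narrower summary table is derived by scanning as little data as possible, which is the intuition behind the speed claim.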

A Design of Parallel Processing System for Management of Moving Objects (이동체 관리를 위한 다중 처리 시스템의 설계)

  • 김진덕;강구안;육정수;박연식
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2004.05b
    • /
    • pp.345-349
    • /
    • 2004
  • To index moving objects (vehicles, mobile phones, PDAs, etc.) accurately in a mobile database, continuous updates of their locations are both inevitable and time-consuming. Studies of purely spatial indices have focused on efficient retrieval. In moving object databases, however, acquiring and managing the terminal locations of moving objects is more important than the efficiency of query processing. A parallel processing system therefore needs to be adopted for moving object databases, which should maintain each object's current location as precisely as possible. This paper proposes an architecture for spatially indexing mobile objects using multiple processors. More precisely, we propose a new method of splitting buckets that uses the properties of moving objects to minimize the number of database updates. We also propose an acquisition method for gathering the location information of moving objects and passing the bucket extent information, in order to reduce the number of messages passed between processors.

Lost and Found Registration and Inquiry Management System for User-dependent Interface using Automatic Image Classification and Ranking System based on Deep Learning (딥 러닝 기반 이미지 자동 분류 및 랭킹 시스템을 이용한 사용자 편의 중심의 유실물 등록 및 조회 관리 시스템)

  • Jeong, Hamin;Yoo, Hyunsoo;You, Taewoo;Kim, Yunuk;Ahn, Yonghak
    • Convergence Security Journal
    • /
    • v.18 no.4
    • /
    • pp.19-25
    • /
    • 2018
  • In this paper, we propose a user-centered integrated lost-property management system built on a weight-based ranking system and a hierarchical image classification system based on deep learning. The proposed system consists of a hierarchical image classification module that automatically classifies images through deep learning, and a ranking module that lists the lost-property records registered in the system in order of weight to make querying more convenient. During registration, information such as category, brand, and related tags is recognized automatically from a single photograph, minimizing the effort required of users. The ranking system increases the efficiency of searching for lost items by showing the items that users visit frequently at the top. Experimental results show that the proposed system allows users to use the system easily and conveniently.

Study of Improvement of Search Range Compression Method of VP-tree for Video Indexes (영상 색인용 VP-tree의 검색 범위 압축법의 개선에 관한 연구)

  • Park, Gil-Yang;Lee, Samuel Sang-Kon;Hwang, Jea-Jeong
    • Journal of Korea Multimedia Society
    • /
    • v.15 no.2
    • /
    • pp.215-225
    • /
    • 2012
  • In multimedia databases, multidimensional space-based indexing has been used to increase search efficiency. However, this approach lacks generality because it uses Euclidean distance as the distance measure. In contrast, metric space-based indexing, which requires only that the metric axioms hold, is widely applicable because distance measures other than Euclidean distance can be used. This paper proposes an improvement to the VP-tree, one of the metric space indexing methods. A VP-tree search descends from the root node through the nodes that fit the search range, computes the distance to each object linked to the leaf node it finally reaches, and checks whether that object falls within the search range. Because search speed decreases as the number of distance calculations at the leaf node increases, this paper focuses on the triangle inequality-based range compression method in a leaf node and proposes an improvement that uses the latest interface on the query object as the base point of the triangle inequality. This improvement narrows the search range and reduces the number of distance calculations. In a system performance test using 10,000 video data items, the new method reduced search time for similar videos by 5-12% compared with a conventional method.
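The general triangle-inequality filter that metric indexes such as the VP-tree rely on at a leaf can be sketched as follows. This shows only the generic filter, not the improved base-point selection proposed in the paper; the function name `leaf_range_search`, the leaf layout (objects stored with a precomputed distance to a pivot), and the use of Euclidean distance are assumptions for illustration.

```python
import math


def leaf_range_search(query, radius, pivot, leaf, dist=math.dist):
    """Range search inside one leaf using a triangle-inequality filter.

    leaf: list of (obj, d_pivot_obj) pairs, where d_pivot_obj is the
          precomputed distance from the pivot (base point) to obj.
    Returns the matching objects and the number of distance computations.
    """
    d_qp = dist(query, pivot)  # one distance to the base point
    hits, computed = [], 1
    for obj, d_po in leaf:
        # |d(q,p) - d(p,o)| is a lower bound on d(q,o) in any metric space,
        # so objects whose bound exceeds the radius cannot be results.
        if abs(d_qp - d_po) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            hits.append(obj)
    return hits, computed


pivot = (0.0, 0.0)
objects = [(1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
leaf = [(o, math.dist(pivot, o)) for o in objects]
hits, computed = leaf_range_search((1.5, 0.0), 1.0, pivot, leaf)
```

The closer the base point is to the query, the tighter the lower bound becomes and the more objects are skipped, which is why the choice of base point affects the number of distance calculations.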

Efficient and Privacy-Preserving Near-Duplicate Detection in Cloud Computing (클라우드 환경에서 검색 효율성 개선과 프라이버시를 보장하는 유사 중복 검출 기법)

  • Hahn, Changhee;Shin, Hyung June;Hur, Junbeom
    • Journal of KIISE
    • /
    • v.44 no.10
    • /
    • pp.1112-1123
    • /
    • 2017
  • As content providers further offload content-centric services to the cloud, data retrieval over the cloud typically results in many redundant items because near-duplicate content is prevalent on the Internet. Simply fetching all data from the cloud severely degrades efficiency in terms of resource utilization and bandwidth, and data can be encrypted by multiple content providers under different keys to preserve privacy. Thus, locating near-duplicate data in a privacy-preserving way is highly dependent on the ability to deduplicate redundant search results and return the best matches without decrypting data. To this end, we propose an efficient near-duplicate detection scheme for encrypted data in the cloud. Our scheme has the following benefits. First, a single query is enough to locate near-duplicate data even if they are encrypted under different keys of multiple content providers. Second, storage, computation, and communication costs are reduced compared to existing schemes, while achieving the same level of search accuracy. Third, scalability is significantly improved as a result of a novel and efficient two-round detection to locate near-duplicate candidates over large quantities of data in the cloud. An experimental analysis with real-world data demonstrates the applicability of the proposed scheme to a practical cloud system. Last, the proposed scheme is on average 70.6% faster than an existing scheme.

SSQUSAR : A Large-Scale Qualitative Spatial Reasoner Using Apache Spark SQL (SSQUSAR : Apache Spark SQL을 이용한 대용량 정성 공간 추론기)

  • Kim, Jonghoon;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.2
    • /
    • pp.103-116
    • /
    • 2017
  • In this paper, we present the design and implementation of a large-scale qualitative spatial reasoner that uses Apache Spark SQL to efficiently derive new qualitative spatial knowledge representing both topological and directional relationships between two arbitrary spatial objects. Apache Spark SQL is well known as a distributed parallel programming environment that provides both efficient join operations and query processing functions over a variety of data in Hadoop cluster computer systems. In our spatial reasoner, the overall reasoning process is divided into six jobs: knowledge encoding, inverse reasoning, equal reasoning, transitive reasoning, relation refining, and knowledge decoding. The execution order of the reasoning jobs is determined in consideration of both logical causal relationships and computational efficiency. The knowledge encoding job reduces the size of the knowledge base to reason over by transforming the input knowledge from XML/RDF form into a more compact form. Repetition of the transitive reasoning job and the relation refining job usually consumes most of the computation time and storage of the overall reasoning process. To improve these jobs, our reasoner finds the minimal disjunctive relations for qualitative spatial reasoning and, based on them, not only reduces the composition table used for the transitive reasoning job but also optimizes the relation refining job. In experiments using a large-scale benchmark spatial knowledge base, the proposed reasoner showed high performance and scalability.

Field Mapping based on Virtual Office for Real time GIS in Field Survey for Natural Environment (자연환경조사에서 실시간 GIS구현을 위한 가상사무실 기반의 필드멥핑)

  • 엄정섭;김희두
    • Spatial Information Research
    • /
    • v.9 no.1
    • /
    • pp.51-72
    • /
    • 2001
  • It is frequently pointed out that the conventional field survey of the natural environment has many limitations in terms of positional accuracy, real-time GIS data acquisition, and economic efficiency. The aim of this research was to develop an on-site real-time mapping technique that enables the surveyor to input data in the field. The idea is based on recent trends in telecommunication and information technology, such as GPS, wireless network computing, and mobile computing. A virtual office approach has been adopted, in which a portable computer is linked to a GPS and field workers record and analyse data on the computer at the site. This field mapping system has been shown to be much less susceptible to positional error than the conventional approach. The graphical user interface, in particular, was ideally suited to combining positional information with attribute data that changes at every survey point. This interface allows users to interactively display and query GIS layers reproduced from past survey results. The GIS database stored in the virtual office will support a highly reliable survey, since it could play a crucial role in identifying temporal and spatial changes that have occurred at the site. Integrated utilization of field data among the related agencies is expected to increase considerably, since the virtual office survey would be a powerful tool for ensuring geometric fidelity in the GIS database creation process. This paper also discusses the limitations and future direction of the present prototype research.

Performance Comparison of Spatial Split Algorithms for Spatial Data Analysis on Spark (Spark 기반 공간 분석에서 공간 분할의 성능 비교)

  • Yang, Pyoung Woo;Yoo, Ki Hyun;Nam, Kwang Woo
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.25 no.1
    • /
    • pp.29-36
    • /
    • 2017
  • In this paper, we implement a spatial big data analysis prototype based on Spark, an in-memory system, and compare the performance of spatial split algorithms on it. In cluster computing environments, big data is divided into blocks of a certain size in order to balance the computing load. Existing research showed that, for Hadoop-based spatial big data systems, spatial splitting is more effective than general sequential splitting. A Hadoop-based spatial data system stores raw data as is in spatially divided blocks. In the proposed Spark-based spatial analysis system, by contrast, spatial data is converted into an in-memory data structure and stored in a spatial block for search efficiency. Therefore, in this paper, we propose an in-memory spatial big data prototype and a spatial split block storage method. We also compare the performance of existing spatial split algorithms in the proposed prototype and present an appropriate spatial split strategy for Spark-based big data systems. In the experiment, we compared the query execution times of the spatial split algorithms and confirmed that the BSP algorithm shows the best performance.
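As a rough illustration of spatial splitting, the sketch below divides 2-D points into blocks by recursive median splits. It assumes that "BSP" here stands for binary space partitioning, which the abstract does not spell out, and it is a generic partitioner rather than the implementation evaluated in the paper; the function name `bsp_split` and the `capacity` parameter are hypothetical.

```python
def bsp_split(points, capacity):
    """Recursively split 2-D points into blocks of at most `capacity` points.

    Each step cuts the current block at the median of its wider axis, so
    nearby points tend to land in the same block. `capacity` must be >= 1.
    """
    if len(points) <= capacity:
        return [points]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Split along the axis with the larger spread.
    axis = 0 if max(xs) - min(xs) >= max(ys) - min(ys) else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return bsp_split(ordered[:mid], capacity) + bsp_split(ordered[mid:], capacity)


blocks = bsp_split([(0, 0), (11, 0), (1, 0), (10, 0)], capacity=2)
```

Because each block covers a compact region, a spatial query only needs to visit the blocks whose extent overlaps the query window, unlike a sequential split where nearby points may be scattered across blocks.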

Long-term Location Data Management for Distributed Moving Object Databases (분산 이동 객체 데이타베이스를 위한 과거 위치 정보 관리)

  • Lee, Ho;Lee, Joon-Woo;Park, Seung-Yong;Lee, Chung-Woo;Hwang, Jae-Il;Nah, Yun-Mook
    • Journal of Korea Spatial Information System Society
    • /
    • v.8 no.2 s.17
    • /
    • pp.91-107
    • /
    • 2006
  • To handle extreme situations in which a very large volume of positional information must be managed for at least millions of moving objects, a cluster-based scalable distributed computing system architecture called GALIS was proposed; it consists of multiple data processors, each dedicated to keeping records relevant to a different geographical zone and a different time zone. In this paper, we propose a valid time management scheme and a time-zone shifting scheme, which are essential in realizing the long-term location data subsystem of GALIS but were missing from our previous prototype. We explain how to manage the valid time of moving objects to avoid ambiguity in location information. We also describe a time-zone shifting algorithm with three variations: Real Time-Time Zone Shifting, Batch-Time Zone Shifting, and Table Partitioned Batch-Time Zone Shifting. Through experiments on query processing time and CPU utilization, we show the efficiency of the proposed time-zone shifting schemes.

A Feature-Based Word Spotting for Content-Based Retrieval of Machine-Printed English Document Images (내용기반의 인쇄체 영문 문서 영상 검색을 위한 특징 기반 단어 검색)

  • Jeong, Gyu-Sik;Gwon, Hui-Ung
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.10
    • /
    • pp.1204-1218
    • /
    • 1999
  • Most existing digital libraries for document image retrieval provide only a limited retrieval service because their indexes are built from document titles and/or the content of document abstracts. This paper proposes a word spotting system for full English document image retrieval based on word image shape features. To improve both the efficiency and the precision of retrieval, we develop the system by 1) using a combination of the holistic features used in existing word spotting systems, 2) performing image matching by comparing the order of features in a word in addition to the number of features and their positions, and 3) adopting a two-stage retrieval strategy that obtains retrieval results by image feature matching and then partially applies OCR (Optical Character Recognition) to the results for filtering. The proposed system operates as follows. Given a document image, its structure is analyzed and it is segmented into a set of word regions. Word shape features are then extracted and stored. Given a user's text query, the corresponding word image is generated and its features are extracted. These reference features are compared with the stored features to find similar words. The proposed system is implemented on an IBM-PC in a web environment, and experiments are performed with English document images. Experimental results show the effectiveness of the proposed methods.