• Title/Summary/Keyword: Join query

Cost-based Optimization of Extended Boolean Queries (확장 불리언 질의에 대한 비용 기반 최적화)

  • 박병권
    • Journal of the Korean Society for Information Management / v.18 no.3 / pp.29-40 / 2001
  • In this paper, we propose a query optimization algorithm that selects the optimal method for processing an extended boolean query on inverted files. An extended boolean query can be processed in many ways, depending on the order in which the keywords contained in the query are processed. In this sense, the problem of optimizing an extended boolean query is essentially that of optimizing the keyword sequence in the query. We show that the problem is basically analogous to that of finding the optimal join order in database query optimization, and apply ideas from that area to solve it. We establish a cost model for processing an extended boolean query and develop an algorithm that finds the optimal keyword-processing sequence based on the concept of keyword rank, which uses the keyword selectivity and the access costs of the inverted file. We prove that the method selected by the optimization algorithm is indeed optimal, and show through experiments that it outperforms the other methods. We believe that the suggested optimization algorithm will contribute to a significant enhancement of information retrieval performance.

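
The abstract above does not give the paper's cost model or its definition of keyword rank. As a rough illustration only, the sketch below borrows the classic rank function from join-order optimization, rank = (selectivity - 1) / cost, and assumes a simple cost model in which each keyword's access cost is scaled by the selectivities of the keywords processed before it. The keyword names, selectivities, and costs are made up.

```python
from itertools import permutations

def sequence_cost(keywords):
    """Cost of processing keywords in the given order (assumed model).

    Each keyword is (name, selectivity, access_cost). The access cost of a
    keyword is scaled by the product of the selectivities that precede it.
    """
    total, surviving = 0.0, 1.0
    for _, selectivity, access_cost in keywords:
        total += surviving * access_cost
        surviving *= selectivity
    return total

def order_by_rank(keywords):
    """Sort by rank = (selectivity - 1) / access_cost, ascending (assumed rank)."""
    return sorted(keywords, key=lambda k: (k[1] - 1.0) / k[2])

# Hypothetical keywords: (name, selectivity, access cost of the inverted list).
keywords = [("database", 0.30, 40.0), ("join", 0.05, 25.0), ("query", 0.50, 10.0)]

best = order_by_rank(keywords)
brute_force = min(sequence_cost(p) for p in permutations(keywords))
print([name for name, _, _ in best], sequence_cost(best), brute_force)
```

Under this assumed model, the rank-sorted order matches the cheapest order found by trying every permutation, which is the property the abstract claims for the paper's own algorithm.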

An Efficient Spatial Join Method Using DOT Index (DOT 색인을 이용한 효율적인 공간 조인 기법)

  • Back, Hyun;Yoon, Jee-Hee;Won, Jung-Im;Park, Sang-Hyun
    • Journal of KIISE:Databases / v.34 no.5 / pp.420-436 / 2007
  • The choice of an effective indexing method is crucial to guarantee the performance of the spatial join operator, which is heavily used in geographical information systems. The $R^*$-tree based method is renowned as one of the most representative indexing methods. In this paper, we propose an efficient spatial join technique based on the DOT (Double Transformation) index, and compare it with the spatial join technique based on the $R^*$-tree index. The DOT index transforms the MBR of a spatial object into a single numeric value using a space-filling curve, and builds a $B^+$-tree from the set of numeric values obtained in this way. The DOT index can be employed as a primary index for spatial objects. The proposed spatial join technique exploits the regularities in the moving patterns of space-filling curves to divide a query region into a set of maximal sub-regions within which the space-filling curve runs without interruption. Such division reduces the number of spatial transformations required to perform the spatial join and thus improves the performance of join processing. Experiments with data sets of various distributions and sizes show that the proposed join technique is up to three times faster than the spatial join method based on the $R^*$-tree index.
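
To make the MBR-to-number step concrete, here is a minimal sketch that treats a 2D MBR as a 4D point and maps it to one integer by bit interleaving (a Z-order curve). The choice of Z-order, the 8-bit coordinates, and the names `interleave_bits` and `dot_key` are assumptions for illustration; the abstract does not say which space-filling curve the paper uses. A sorted list stands in for the $B^+$-tree.

```python
def interleave_bits(coords, bits=8):
    """Z-order (Morton) key: interleave the bits of each integer coordinate."""
    key = 0
    for bit in range(bits - 1, -1, -1):  # most significant bit first
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key

def dot_key(mbr, bits=8):
    """Map an MBR (x_lo, y_lo, x_hi, y_hi) to a single numeric value."""
    x_lo, y_lo, x_hi, y_hi = mbr
    return interleave_bits((x_lo, y_lo, x_hi, y_hi), bits)

# Hypothetical objects with integer MBR coordinates in [0, 255].
mbrs = {"a": (1, 1, 2, 3), "b": (200, 10, 220, 40), "c": (1, 2, 2, 3)}

# Sorting by key mimics the leaf order of a B+-tree built on the keys.
index = sorted((dot_key(m), name) for name, m in mbrs.items())
print(index)
```

Nearby MBRs such as `a` and `c` tend to receive nearby keys, which is what lets a one-dimensional index serve spatial data.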

Selectivity Estimation for Spatio-Temporal Overlap Join (시공간 겹침 조인 연산을 위한 선택도 추정 기법)

  • Lee, Myoung-Sul;Lee, Jong-Yun
    • Journal of KIISE:Databases / v.35 no.1 / pp.54-66 / 2008
  • A spatio-temporal join is an expensive operation that is commonly used in spatio-temporal database systems. To generate an efficient query plan for queries involving spatio-temporal join operations, it is crucial to estimate the selectivity of the join operations accurately. Given two datasets $S_1$ and $S_2$ of discrete data and a timestamp $t_q$, a spatio-temporal join retrieves all pairs of objects that intersect each other at $t_q$. The selectivity of the join operation equals the number of retrieved pairs divided by the cardinality of the Cartesian product $S_1{\times}S_2$. In this paper, we propose a spatio-temporal histogram that estimates the selectivity of a spatio-temporal join by extending the existing geometric histogram. Using a wide spectrum of both uniform and skewed datasets, we show that the proposed method, called Spatio-Temporal Histogram, can accurately estimate the selectivity of a spatio-temporal join. Our contributions can be summarized as follows. First, this is the first attempt at selectivity estimation of spatio-temporal joins for discrete data. Second, we propose an efficient maintenance method that reconstructs histograms by compressing spatial statistical information during the lifespan of discrete data.
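
The selectivity definition above can be checked by brute force on a small example. The sketch below computes the exact selectivity for two hypothetical sets of rectangles, each taken as the object's extent at the query timestamp $t_q$. It is the quantity the histogram is meant to estimate, not the paper's estimator; the data and function names are illustrative assumptions.

```python
def intersects(a, b):
    """True if rectangles (x1, y1, x2, y2) overlap or touch."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

def join_selectivity(s1, s2):
    """Intersecting pairs divided by the size of the Cartesian product."""
    pairs = sum(1 for a in s1 for b in s2 if intersects(a, b))
    return pairs / (len(s1) * len(s2))

# Hypothetical object extents at timestamp t_q.
s1 = [(0, 0, 2, 2), (5, 5, 6, 6)]
s2 = [(1, 1, 3, 3), (10, 10, 11, 11), (5, 6, 7, 8)]

selectivity = join_selectivity(s1, s2)
print(selectivity)  # 2 intersecting pairs out of 6
```

The exact computation compares every pair, which is why an estimator based on summary statistics is attractive for query planning.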

OLAP-based Big Table Generation for Efficient Analysis of Large-sized IoT Data (대용량 IoT 데이터의 빠른 분석을 위한 OLAP 기반의 빅테이블 생성 방안)

  • Lee, Dohoon;Jo, Chanyoung;On, Byung-Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.2-5 / 2021
  • With the recent development of Internet of Things (IoT) technology, various terminals are being connected to the Internet. As a result, the amount of IoT data is also increasing, and we propose an index key that makes it possible to analyze large-scale IoT data efficiently. Existing index keys contain only time and space information, so data stored in index tables and instance tables has to be queried using repeated lookups or join operations. The proposed index key embeds the IoT data itself, creating OLAP-based big tables that minimize the number of repetitions or joins.

An Efficient Index Structure for Bottom-Up Query Processing of XML Documents (XML 문서의 상향식 질의처리를 지원하는 효율적인 색인구조)

  • Seo Dong-Min;Kim Eun-Jae;Seong Dong-Ook;Yoo Jae-Soo;Cho Ki-Hyung
    • Journal of Internet Computing and Services / v.7 no.4 / pp.101-113 / 2006
  • Path queries are used to query XML documents, and several index structures have been studied for processing them efficiently. Recently, index schemes that use a suffix tree together with a structural join method have been proposed. ViST is the most representative of these methods. ViST processes a query using a suffix tree and uses a B+-tree to reduce the search time over the documents. However, its search performance degrades significantly when processing path queries, because it treats elements that are not in an ancestor-descendant relationship in the document as descendants. In this paper, we propose an efficient index structure that solves this problem of ViST, together with a query processing method suited to the index structure. Various experiments show that the proposed index structure outperforms the existing index structure in terms of query processing time.

An Efficient Unicast using ODMRP in Ad Hoc Networks (Ad-hoc망에서 ODMRP을 사용한 효율적인 유니캐스트 라우팅 프로토콜)

  • Back, Kyung-Ho;Park, Jae-Woo;Lee, Kyoon-Ha
    • Proceedings of the Korea Information Processing Society Conference / 2003.05b / pp.1145-1148 / 2003
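  • In this paper, we propose an efficient unicast routing protocol for ODMRP (On-Demand Multicast Routing Protocol), a multicast routing protocol for ad hoc networks. ODMRP delivers data by electing the nodes on the paths from the sources to the receivers of a multicast group as FG (Forwarding Group) nodes and having them flood the packets that belong to that multicast group. In ODMRP, when a node has to send data in unicast (end-to-end) mode, it must go through a periodic flooding process to find a path, which incurs overhead. To solve this problem, we propose a scheme in which the path found by the source in unicast mode is stored in a DR routing table and consulted when data is sent, reducing the traffic caused by unnecessary control packets (JOIN QUERY, JOIN REPLY) at the receiver. We also show through simulation that the proposed scheme improves on the existing ODMRP in both data transmission time and path discovery time.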

Distributed database replicator without locking base relations

  • Lee, Wookey;Kang, Sukho;Park, Jooseok
    • Proceedings of the Korean Operations and Management Science Society Conference / 1996.10a / pp.93-95 / 1996
  • A replication server is considered to be one of the most effective tools for coping with the problems that complex data replication may cause in distributed database systems. In a distributed environment, locking a table is inevitable, and it is the main factor that constrains the system in practice. This paper presents an Asynchronous Replicator Scheme (ARS) that uses the system log, in the form of files called differential files, to refresh distributed data files defined by complicated queries, and thereby prevents the (normally huge) base tables from being locked. We take join operations as the complicated queries, not only because the join operation covers almost all the operations, but also because it is one of the most time-consuming and data-intensive operations in query processing.

An enhanced unicast of ODMRP scheme for Ad-hoc Networks (Ad-hoc 망에서 유니캐스트 성능 향상을 위한 개선된 ODMRP)

  • 백경호;박재우;이제원;이균하
    • Proceedings of the IEEK Conference / 2003.07a / pp.157-160 / 2003
  • ODMRP is a protocol that supports multicast and unicast in ad hoc networks. When a node must transmit data by unicast in ODMRP, it has to go through a periodic flooding process to find a path, which incurs overhead. Our scheme stores the path found in unicast mode in a table and, when the node sends data, refers to the DR FG table, thereby reducing the traffic caused by the control packets (JOIN QUERY, JOIN REPLY) of a receiver node, whereas in ODMRP the source and destination nodes periodically flood control packets to look for a path. We show through simulation that our scheme considerably improves the path discovery time compared with existing ODMRP methods.

A Load Balancing Method using Partition Tuning for Pipelined Multi-way Hash Join (다중 해시 조인의 파이프라인 처리에서 분할 조율을 통한 부하 균형 유지 방법)

  • Mun, Jin-Gyu;Jin, Seong-Il;Jo, Seong-Hyeon
    • Journal of KIISE:Databases / v.29 no.3 / pp.180-192 / 2002
  • We investigate the effect of data skew in join attributes on the performance of a pipelined multi-way hash join method, and propose two new hash join methods for the shared-nothing multiprocessor environment. The first method allocates buckets statically in round-robin fashion, and the second allocates buckets dynamically using a frequency distribution. With hash-based joins, multiple joins can be pipelined so that the early results of a join are sent to the next join, before the whole join is completed, without being stored on disk. The shared-nothing multiprocessor architecture is known to scale better in supporting very large databases. However, this hardware structure is very sensitive to data skew. Unless the pipelined execution of multiple hash joins includes some dynamic load balancing mechanism, the skew effect can severely degrade system performance. In this paper, we derive an execution model of the pipeline segment and a cost model, and develop a simulator for the study. Our simulation with a wide range of parameters shows that join selectivities and relation sizes degrade system performance more as the degree of data skew grows. The proposed method, which uses a large number of buckets and a tuning technique, offers substantial robustness against a wide range of skew conditions.
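
The two allocation strategies can be contrasted with a small load calculation. The sketch below assigns hash buckets to processors either round-robin or by a greedy rule that gives the next-largest bucket to the least-loaded processor. The greedy rule is an assumed stand-in for the frequency-based allocation; the abstract does not describe the paper's actual tuning procedure. The bucket sizes are made up to mimic skew.

```python
import heapq

def round_robin(bucket_sizes, n_procs):
    """Static allocation: bucket i goes to processor i mod n_procs."""
    loads = [0] * n_procs
    for i, size in enumerate(bucket_sizes):
        loads[i % n_procs] += size
    return loads

def frequency_based(bucket_sizes, n_procs):
    """Greedy allocation (assumed): largest bucket to the least-loaded processor."""
    heap = [(0, p) for p in range(n_procs)]  # (current load, processor id)
    for size in sorted(bucket_sizes, reverse=True):
        load, p = heapq.heappop(heap)
        heapq.heappush(heap, (load + size, p))
    return [load for load, _ in sorted(heap, key=lambda e: e[1])]

# Hypothetical tuple counts per bucket, skewed toward buckets 0, 3, and 6.
bucket_sizes = [900, 50, 40, 800, 30, 20, 700, 10]

rr_loads = round_robin(bucket_sizes, 3)
fb_loads = frequency_based(bucket_sizes, 3)
print(rr_loads, fb_loads)
```

With this skewed input, round-robin places all three heavy buckets on the same processor, while the frequency-aware allocation spreads them out; using many small buckets gives such a scheme more room to even out the load.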

Design and Implementation of a Query Processor for Document Management Systems (문서관리시스템을 위한 질의처리기 설계 및 구현)

  • U, Jong-Won;Yun, Seung-Hyeon;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society / v.6 no.6 / pp.1419-1432 / 1999
  • A Document Management System (DMS) is a system that retrieves and manages library information efficiently. Since a DMS manages the information using only one table, it does not need to provide the join and view operations that are costly in a traditional DBMS. In addition, a DMS requires new operations because of its characteristics, and these operations have not been supported in existing DBMSs. In this paper, we define a data language that represents the structure definition and processing of data in the DMS. In particular, we define the Ranking and Proximity operations needed in document retrieval. We also design and implement a query processor that processes queries written in the data language. When the existing query processors of relational DBMSs are used as the query processor of a DMS, they degrade the performance of the whole system. The proposed query processor not only overcomes this problem but also supports the new operations needed in a DMS.
