• Title/Summary/Keyword: tree indexing


Encoding of XML Elements for Mining Association Rules

  • Hu Gongzhu;Liu Yan;Huang Qiong
    • The Journal of Information Systems
    • /
    • v.14 no.3
    • /
    • pp.37-47
    • /
    • 2005
  • Association rule mining finds associations among data items that appear together in transactions or business activities. To date, algorithms for association rule mining, as well as for other data mining tasks, have mostly been applied to relational databases. As XML is adopted as the universal format for data storage and exchange, mining associations from XML data is becoming an area of attention for researchers and developers. The challenge is that the semi-structured data format of XML is not directly suitable for traditional data mining algorithms and tools. In this paper we present a method for encoding XML tree nodes. The method is used to store the XML data in a Value Table and a Transaction Table that can be easily accessed via indexing. The hierarchical relationship in the original XML tree structure is embedded in the encoding. We applied this method to association rule mining of XML data that may have missing data.
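
The abstract does not spell out the encoding itself, so the following is only a minimal sketch of the general idea, assuming a Dewey-style path code (for example `1.2.1`) as a stand-in for the paper's scheme. The function names, the table layout, and the rule that each child of the root is one transaction are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_tables(xml_text):
    """Encode every element with a Dewey-style code and fill two lookup tables.

    Assumption: each child of the root element is one transaction, and the
    leaf elements below it are that transaction's items.
    """
    root = ET.fromstring(xml_text)
    value_table = {}        # code -> (tag, text)
    transaction_table = {}  # transaction code -> list of item codes

    def visit(node, code):
        value_table[code] = (node.tag, (node.text or "").strip())
        parts = code.split(".")
        if len(parts) == 2:
            # A direct child of the root starts a new transaction.
            transaction_table[code] = []
        elif len(parts) > 2 and len(node) == 0:
            # A leaf below a transaction is recorded as one of its items.
            transaction_table[".".join(parts[:2])].append(code)
        for position, child in enumerate(node, start=1):
            visit(child, f"{code}.{position}")

    visit(root, "1")
    return value_table, transaction_table


def is_ancestor(code_a, code_b):
    """True if the node coded code_a is a proper ancestor of the node coded code_b."""
    return code_b.startswith(code_a + ".")


doc = (
    "<orders>"
    "<order><item>milk</item><item>bread</item></order>"
    "<order><item>milk</item></order>"
    "</orders>"
)
values, transactions = build_tables(doc)
```

Because the code of a node contains the codes of all its ancestors, the hierarchy can be recovered from the tables alone, and a transaction with a missing element simply has fewer item codes.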


An Efficient Indexing Structure for Multidimensional Categorical Range Aggregation Query

  • Yang, Jian;Zhao, Chongchong;Li, Chao;Xing, Chunxiao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.597-618
    • /
    • 2019
  • Categorical range aggregation, which is conceptually equivalent to running a range aggregation query separately on multiple datasets, returns the query result on each dataset. The challenge is that when the number of datasets reaches hundreds or thousands, the query takes a lot of computation time and I/O. Previous work handled a range restriction on a single dimension only, whereas in practice more applications need statistics over multiple range restrictions. We propose the MCRI-Tree, an index structure designed to solve multi-dimensional categorical range aggregation (CRA) queries, which can utilize main memory to maximize the efficiency of CRA queries. Specifically, the MCRI-Tree answers any query in $O(nk^{n-1})$ I/Os (where n is the number of dimensions, and k denotes the maximum number of pages covered in one dimension among all the n dimensions during a query). The practical efficiency of our technique is demonstrated with extensive experiments.
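
To make the query semantics concrete, here is a minimal sketch of the naive baseline the abstract describes: scanning every record and aggregating per category. It is not the MCRI-Tree. The function name, the record layout `(category, point)`, and the choice of COUNT as the aggregate are assumptions for illustration.

```python
from collections import defaultdict


def categorical_range_count(records, low, high):
    """Count, per category, the points that fall inside an axis-aligned range.

    records: iterable of (category, point), where point is a tuple of numbers.
    low, high: inclusive lower and upper bounds, one value per dimension.
    Categories with no matching point are omitted from the result.
    """
    counts = defaultdict(int)
    for category, point in records:
        if all(lo <= x <= hi for lo, x, hi in zip(low, point, high)):
            counts[category] += 1
    return dict(counts)


records = [
    ("a", (1, 5)),
    ("a", (3, 9)),
    ("b", (2, 6)),
    ("b", (8, 8)),
]
result = categorical_range_count(records, low=(0, 4), high=(3, 7))
```

This scan touches every record of every category, which is the cost an index for CRA queries is meant to avoid.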

A Digital Forensic Analysis for Directory in Windows File System (Windows 파일시스템의 디렉토리에 대한 디지털 포렌식 분석)

  • Cho, Gyusang
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.11 no.2
    • /
    • pp.73-90
    • /
    • 2015
  • When file commands are applied to files in a directory, the timestamps in the MFT entries of both the directory and the files change. Based on an understanding of these changes, this work provides a digital forensic analysis of the directory timestamp changes caused by executing file commands. NTFS uses a B-tree indexing structure to store a huge number of files efficiently and to look them up quickly, so the index tree of the directory index changes when commands operate on files. From a digital forensic point of view, we try to understand the behavior of the B-tree indexes and look for traces of files to collect information. However, it is not easy to analyze the directory index entries when file commands are executed, and digital forensic research on NTFS directories and B-tree indexing is comparatively rare. Focusing on this fact, in this paper we analyze directory timestamp changes after executing file commands, including creation, copy, and deletion, and present a method for finding forensic evidence of the deletion of a directory containing files. Using example cases of the file copy and file deletion commands, we analyze the timestamp changes of the directory and show how evidence of the deletion of a directory containing files can be found.

Bit-Vector-Based Space Partitioning Indexing Scheme for Improving Node Utilization and Information Retrieval (노드 이용률과 검색 속도 개선을 위한 비트 벡터 기반 공간 분할 색인 기법)

  • Yeo, Myung-Ho;Seong, Dong-Ook;Yoo, Jae-Soo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.7
    • /
    • pp.799-803
    • /
    • 2010
  • The KDB-tree is a traditional indexing scheme for retrieving multidimensional data. Research on the KDB-tree family frequently identifies low storage utilization and insufficient retrieval performance as its two bottlenecks. These bottlenecks arise from the many unnecessary splits caused by data insertion order and data skew. In this paper, we propose a novel index structure, called the $KDB_{CS}^+$-tree, to process skewed data efficiently and improve retrieval performance. The $KDB_{CS}^+$-tree increases the fan-out by using bit-vectors to represent splitting information and by eliminating pointers. It also improves storage utilization by representing entries as a hierarchical structure in each internal node.

Design and Implementation of the dynamic hashing structure for indexing the current positions of moving objects (이동체의 현재 위치 색인을 위한 동적 해슁 구조의 설계 및 구현)

  • 전봉기
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.6
    • /
    • pp.1266-1272
    • /
    • 2004
  • Location-Based Services (LBS) give rise to location-dependent queries whose results depend on the positions of moving objects. Because the positions of moving objects change continuously, moving-object indexes must perform update operations frequently to keep the position information current. Existing spatial indexes (Grid File, R-Tree, KDB-tree, etc.) were proposed as index structures for searching static data effectively. They are not suitable as index techniques for moving-object databases, in which position data changes continuously. In this paper, I propose a dynamic hashing index with low insertion and deletion costs. The dynamic hashing structure applies dynamic hashing techniques to a spatial index, combining a hash and a tree. The results of my extensive experiments show that the dynamic hashing index outperforms the $R^*$-tree and the fixed grid.
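
The abstract gives no details of the dynamic hashing structure, so the sketch below shows only the simpler fixed-grid baseline it is compared against, to illustrate why hashing positions into cells keeps updates cheap. The class name `GridIndex`, its methods, and the cell size are hypothetical.

```python
class GridIndex:
    """Fixed-grid hash index over the current 2D positions of moving objects.

    Only the cell of each object is stored, not its exact coordinates.
    """

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = {}  # (cx, cy) -> set of object ids in that cell
        self.where = {}  # object id -> (cx, cy) it currently occupies

    def _cell(self, x, y):
        # Hash a position to the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def update(self, oid, x, y):
        """Record a new position; return True only if the object changed cell."""
        new = self._cell(x, y)
        old = self.where.get(oid)
        if old == new:
            return False  # movement inside one cell needs no index change
        if old is not None:
            bucket = self.cells[old]
            bucket.discard(oid)
            if not bucket:
                del self.cells[old]
        self.cells.setdefault(new, set()).add(oid)
        self.where[oid] = new
        return True

    def query_cell(self, x, y):
        """Return the ids of all objects in the cell containing (x, y)."""
        return set(self.cells.get(self._cell(x, y), ()))


index = GridIndex(cell_size=10)
first = index.update("car", 3, 4)     # inserted into cell (0, 0)
second = index.update("car", 7, 9)    # still cell (0, 0)
third = index.update("car", 12, 4)    # moved to cell (1, 0)
```

An update costs a constant number of dictionary operations, and small movements within a cell cost nothing. A fixed cell size copes poorly with skewed data, which is the kind of limitation a dynamic scheme aims to address.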

GC-Tree: A Hierarchical Index Structure for Image Databases (GC-트리 : 이미지 데이타베이스를 위한 계층 색인 구조)

  • 차광호
    • Journal of KIISE:Databases
    • /
    • v.31 no.1
    • /
    • pp.13-22
    • /
    • 2004
  • With the proliferation of multimedia data, there is an increasing need to support the indexing and retrieval of high-dimensional image data. Despite many efforts, the performance of existing multidimensional indexing methods is not satisfactory in high dimensions. Dimensionality reduction and approximate solution methods were therefore tried to deal with the so-called dimensionality curse, but these methods are inevitably accompanied by a loss of precision in query results. More recently, vector approximation-based methods such as the VA-file and the LPC-file were developed to preserve the precision of query results. However, the performance of vector approximation-based methods depends largely on the size of the approximation file, and they lose the advantage of multidimensional indexing methods, which prune much of the search space. In this paper, we propose a new index structure called the GC-tree for efficient similarity search in image databases. The GC-tree is based on a special subspace partitioning strategy that is optimized for clustered high-dimensional images. It adaptively partitions the data space based on a density function and dynamically constructs an index structure. The resulting index structure adapts well to the strongly clustered distribution of high-dimensional images.

2D-THI: Two-Dimensional Type Hierarchy Index for XML Databases (2D-THI: XML 데이테베이스를 위한 이차원 타입상속 계층색인)

  • Lee Jong-Hak
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.3
    • /
    • pp.265-278
    • /
    • 2006
  • This paper presents a two-dimensional type inheritance hierarchy index (2D-THI) for XML databases. XML Schema is one of the schema models for XML documents that supports type inheritance. Conventional indexing techniques for XML databases cannot support XML queries on type inheritance hierarchies. We construct a two-dimensional index structure using multidimensional file organizations to support type inheritance hierarchies in XML queries. This indexing technique deals with the problem of clustering index entries in a two-dimensional domain space that consists of a key element domain and a type identifier domain, based on the user query pattern. The index enhances query performance by adjusting the degree of clustering between the two domains. For performance evaluation, we compared the proposed 2D-THI, through a cost model, with conventional class hierarchy indexing techniques in object-oriented databases such as the CH-index and the CG-tree. The performance evaluation verified that the proposed two-dimensional type inheritance indexing technique can efficiently support query processing in XML databases according to the query types.


Development of the Spatial Indexing Method for the Effective Visualization of BIM data based on GIS (GIS 기반 BIM 데이터의 효과적 가시화를 위한 공간인덱싱 기법 개발)

  • Kim, Ji-Eun;Kang, Tae-Wook;Hong, Chang-Hee
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.8
    • /
    • pp.5333-5341
    • /
    • 2014
  • Recently, with the increasing interest in facility management based on indoor spatial information, various studies have attempted facility management through conversion between BIM and GIS. Visualizing large-scale geometry data is one of the major issues for the maintenance system. Therefore, this study designed a spatial indexing algorithm through an IFC schema-based scenario for the effective visualization of BIM data based on GIS. Part of the algorithm was implemented using an OcTree structure, and the developed output was tested with IFC sample data. Ultimately, we propose a spatial indexing method for the effective visualization of BIM data based on GIS.
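
The abstract mentions an OcTree but gives no implementation details. The following is a generic point-octree sketch, not the study's algorithm, showing how a box query (for example, the currently visible region) can skip whole subtrees. The class name, the `capacity` and `min_half` parameters, and the assumption that all points lie inside the root cube are illustrative.

```python
class Octree:
    """Point octree over a cube given by its center and half edge length."""

    def __init__(self, center, half, capacity=4, min_half=1e-6):
        self.center, self.half = center, half
        self.capacity, self.min_half = capacity, min_half
        self.points = []      # points stored here while this node is a leaf
        self.children = None  # list of 8 sub-cubes after a split

    def _octant(self, p):
        # One bit per axis: set when the point lies on the upper side.
        return sum(int(p[i] >= self.center[i]) << i for i in range(3))

    def _split(self):
        h = self.half / 2
        self.children = [
            Octree(tuple(c + (h if i >> axis & 1 else -h)
                         for axis, c in enumerate(self.center)),
                   h, self.capacity, self.min_half)
            for i in range(8)
        ]
        for p in self.points:
            self.children[self._octant(p)].insert(p)
        self.points = []

    def insert(self, p):
        if self.children is not None:
            self.children[self._octant(p)].insert(p)
            return
        self.points.append(p)
        # min_half stops endless splitting when many points coincide.
        if len(self.points) > self.capacity and self.half > self.min_half:
            self._split()

    def query(self, lo, hi):
        """Return all stored points inside the inclusive box [lo, hi]."""
        if any(c + self.half < l or c - self.half > h
               for c, l, h in zip(self.center, lo, hi)):
            return []  # this cube does not overlap the box: prune the subtree
        if self.children is None:
            return [p for p in self.points
                    if all(l <= x <= h for l, x, h in zip(lo, p, hi))]
        found = []
        for child in self.children:
            found.extend(child.query(lo, hi))
        return found


tree = Octree(center=(0, 0, 0), half=8, capacity=2)
for point in [(1, 1, 1), (-3, 2, 5), (4, -4, 4), (7, 7, 7), (-6, -6, -6)]:
    tree.insert(point)
visible = sorted(tree.query((0, 0, 0), (8, 8, 8)))
```

In a visualization setting the stored items would be references to building elements rather than bare points, but the pruning step in `query` is the part that reduces the geometry that has to be drawn.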

A RFID Tag Indexing Scheme Using Spatial Index (공간색인을 이용한 RFID 태그관리 기법)

  • Joo, Heon-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.7
    • /
    • pp.89-95
    • /
    • 2009
  • This paper proposes a tag indexing scheme for RFID tags using a spatial index. Tags are used for inventory management, and a tag's location is determined by the position of the readers. A reader recognizes the tags attached to products, so the positions of the products can be traced. In this paper, we propose the hTag-tree (Hybrid Tag index), which manages RFID tags attached to products. The hTag-tree is a new index that is based on tag attributes for fast searching and manages RFID tags using reader locations. The index provides rapid access to tags for insertion, deletion, and updating in a dynamic environment, and it can minimize the number of node accesses in tag searches compared with previous techniques. In addition, extending the MER in the existing tag index helps prevent the performance degradation that can be caused by accessing parent nodes. The experiments compare the proposed index with an existing tag index, the Fixed Interval R-tree, and an existing spatial index, the R-tree. The results show that search time is significantly shortened through the hTag-tree's node access during data search, and that the proposed index improves the ability to manage a large number of RFID tags effectively.

Indexing Techniques for Nested Attributes of OODB Using a Multidimensional Index Structure (다차원 파일구조를 이용한 객체지향 데이터베이스의 중포속성 색인기법)

  • Lee, Jong-Hak
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.8
    • /
    • pp.2298-2309
    • /
    • 2000
  • This paper proposes the multidimensional nested attribute indexing technique (MD-NAI) in object-oriented databases using a multidimensional index structure. Since most conventional indexing techniques for object-oriented databases use a one-dimensional index structure such as the B-tree, they often do not handle complex queries involving both nested attributes and class hierarchies. We extend a tunable two-dimensional class hierarchy indexing technique (2D-CHI) for nested attributes. The 2D-CHI is an indexing scheme that deals with the problem of clustering objects in a two-dimensional domain space that consists of a key attribute domain and a class identifier domain for a simple attribute in a class hierarchy. In our extended scheme, we construct indexes using multidimensional file organizations that include one class identifier domain per class hierarchy on a path expression that defines the indexed nested attribute. This scheme efficiently supports queries that involve search conditions on the nested attribute represented by an extended path expression. An extended path expression is one in which a class hierarchy can be substituted by an individual class or a subclass hierarchy in the class hierarchy.
