• Title/Summary/Keyword: Indexing searching


k-Bitmap Clustering Method for XML Data based on Relational DBMS (관계형 DBMS 기반의 XML 데이터를 위한 k-비트맵 클러스터링 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD
    • /
    • v.16D no.6
    • /
    • pp.845-850
    • /
    • 2009
  • Use of XML data has increased with the growth of the Web 2.0 environment. XML's advantages are recognized through its use as the base technology of RSS and ATOM for transferring information from blogs and news feeds. Bitmap clustering is a method that keeps an index in main memory on top of a relational DBMS, and it performed better than other XML indexing methods in evaluation. However, the existing method generates too many clusters, which degrades search quality. This paper proposes a k-Bitmap clustering method that generates a user-defined number k of clusters to solve this problem. The proposed method also keeps an additional inverted index for searching terms that are excluded from the representative bits of the k-Bitmap. Our evaluation shows that users can control the number of clusters. The method also has high recall in single-term search, and by keeping two indices it guarantees that the search result includes all documents related to the query.
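
The two-index idea in this abstract can be illustrated with a rough sketch. It is not the authors' algorithm: the seed-based cluster assignment, the `threshold` rule for choosing representative terms, and the names `build_indices` and `search` are all assumptions made for illustration, and Python sets stand in for bitmaps.

```python
from collections import defaultdict


def build_indices(docs, k, threshold=0.5):
    """Group docs (doc_id -> set of terms) into at most k clusters.

    Returns (clusters, excluded), where each cluster is a pair
    (representative_terms, member_ids) and `excluded` is an inverted
    index for terms left out of a cluster's representative set.
    """
    ids = list(docs)
    seeds = ids[:k]                      # assumption: the first k documents seed the clusters
    members = {seed: [seed] for seed in seeds}
    for doc_id in ids[k:]:
        # Assign each remaining document to the seed sharing the most terms.
        best = max(seeds, key=lambda s: len(docs[s] & docs[doc_id]))
        members[best].append(doc_id)

    clusters = []
    excluded = defaultdict(set)
    for group in members.values():
        counts = defaultdict(int)
        for doc_id in group:
            for term in docs[doc_id]:
                counts[term] += 1
        # Assumption: a term is representative if enough members contain it.
        representative = {t for t, n in counts.items() if n >= threshold * len(group)}
        clusters.append((representative, group))
        for doc_id in group:
            for term in docs[doc_id] - representative:
                excluded[term].add(doc_id)   # second index covers the leftover terms
    return clusters, excluded


def search(term, docs, clusters, excluded):
    """Single-term search that consults both indices."""
    hits = set(excluded.get(term, ()))
    for representative, group in clusters:
        if term in representative:
            hits.update(d for d in group if term in docs[d])
    return sorted(hits)


docs = {
    "d1": {"xml", "rss"},
    "d2": {"xml", "atom"},
    "d3": {"rfid", "tag"},
    "d4": {"rfid", "xml"},
}
clusters, excluded = build_indices(docs, k=2)
print(len(clusters), search("xml", docs, clusters, excluded), search("tag", docs, clusters, excluded))
```

In this sketch, every document that contains a term is reachable either through its cluster's representative set or through the inverted index, which is one way to read the abstract's claim that keeping two indices returns all related documents.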

A RFID Tag Indexing Scheme Using Spatial Index (공간색인을 이용한 RFID 태그관리 기법)

  • Joo, Heon-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.7
    • /
    • pp.89-95
    • /
    • 2009
  • This paper proposes a tag indexing scheme for RFID tags using a spatial index. Tags are used for inventory management, and a tag's location is determined by the position of the readers. A reader recognizes a tag attached to a product, so the product's position can be traced. We propose the hTag-tree (Hybrid Tag index), which manages RFID tags attached to products. The hTag-tree is a new index based on tag attributes that supports fast searching and manages RFID tags using reader locations. It provides rapid access to tags for insertion, deletion, and updating in a dynamic environment, and it can reduce the number of node accesses in tag searches compared with previous techniques. In addition, extending the MER in the index helps prevent the performance degradation that can be caused by accessing parent nodes. The experiments compare the proposed index with the Fixed Interval R-tree, an existing tag index, and with the R-tree, an existing spatial index. The results show that the hTag-tree significantly shortens search time by reducing node accesses during data search, which indicates that the proposed index improves the effective management of a large number of RFID tags.

Delay Operation Techniques for Efficient MR-Tree on Nand Flash Memory (낸드 플래시 메모리 상에서 효율적인 MR-트리 동작을 위한 지연 연산 기법)

  • Lee, Hyun-Seung;Song, Ha-Yoon;Kim, Kyung-Chang
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.8
    • /
    • pp.758-762
    • /
    • 2008
  • Embedded systems usually use flash memory, which offers non-volatility, low access time, low power consumption, and other desirable characteristics. For multimedia database systems, the R-tree is an index tree well suited to multimedia access. The MR-tree, an improved version of the R-tree, has shown better performance than the R-tree in search, insert, and delete operations. Flash memory uses sectors and blocks as the units of read, write, and delete operations. In particular, deletion is performed in units of 512-byte blocks and takes a very long time, and read and write operations performed in units of blocks are known to match the caching nature of the MR-tree. Our research optimizes MR-tree operations to work in units of flash memory blocks. This adjustment leads to better indexing performance in database accesses. With the MR-tree on 512 B block units, we achieved fast index search times owing to the low height of the MR-tree, as well as faster index update times owing to the best fit with flash memory blocks. The MR-tree with optimized operations therefore shows good characteristics as a database index scheme for any system with flash memory.

Efficient Query Indexing for Short Interval Query (짧은 구간을 갖는 범위 질의의 효율적인 질의 색인 기법)

  • Kim, Jae-In;Song, Myung-Jin;Han, Dae-Young;Kim, Dae-In;Hwang, Bu-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.16D no.4
    • /
    • pp.507-516
    • /
    • 2009
  • In a stream data processing system, interval queries are generally registered in the system in advance. As data arrives continuously, a query index is used to find the matching queries quickly for real-time processing. A main-memory-based query index with a small storage cost and a fast search time is therefore needed. In this paper, we propose an LVC-based (Limited Virtual Construct-based) query indexing method that uses hashing to meet both needs. In the LVC-based query index, we divide the range of a stream into limited virtual constructs (LVCs). We map each interval query to its corresponding LVCs and store the query ID on each of them. We compared the method with the CEI-based query indexing method through simulation experiments. When the range of input stream values is broad and there are many short interval queries, the LVC-based indexing method shows improved storage cost and search time.
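
The mapping from interval queries to range segments described in this abstract might look roughly like the following sketch. It assumes fixed-width segments and uses a Python dict as the hash table; the class name `LVCQueryIndex`, the `width` parameter, and the final filtering step are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class LVCQueryIndex:
    """Hash-based index from fixed-width value segments to interval-query IDs."""

    def __init__(self, width):
        self.width = width                 # assumed fixed size of one virtual construct
        self.buckets = defaultdict(set)    # segment number -> IDs of queries overlapping it
        self.queries = {}                  # query ID -> (low, high), closed interval

    def register(self, query_id, low, high):
        """Store the query ID on every segment that the interval touches."""
        self.queries[query_id] = (low, high)
        first = int(low // self.width)
        last = int(high // self.width)
        for segment in range(first, last + 1):
            self.buckets[segment].add(query_id)

    def match(self, value):
        """Return the IDs of all registered queries whose interval contains value."""
        segment = int(value // self.width)
        candidates = self.buckets.get(segment, ())
        # A segment can hold queries that only partly cover it, so filter exactly.
        return sorted(q for q in candidates
                      if self.queries[q][0] <= value <= self.queries[q][1])


index = LVCQueryIndex(width=10)
index.register("q1", 3, 7)
index.register("q2", 8, 12)
index.register("q3", 25, 27)
print(index.match(9), index.match(26), index.match(20))
```

With short intervals, each query touches only one or two segments, so the stored ID lists stay small and a lookup costs one hash probe plus a short filter. That is consistent with the setting in which the abstract reports gains, though the paper's actual structure may differ.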

An Index Structure for Substructure Searching In Chemical Databases (화학 데이타베이스에서 부분구조 검색을 위한 인덱스 구조)

  • Lee Hwangu;Cha Jaehyuk
    • Journal of KIISE:Databases
    • /
    • v.31 no.6
    • /
    • pp.641-649
    • /
    • 2004
  • The relationship between chemical structures and biological activities is actively researched in the field of medicinal chemistry. In structure-based drug design, medicinal chemists search for existing drugs whose chemical structure is similar to that of a target drug in order to develop a new drug. An automatic system is therefore needed that selects drug files containing a set of chemical moieties matching a user-defined query moiety. Substructure searching is the process of identifying a set of chemical moieties that match a specific query moiety. Testing for substructure searching was developed in the late 1950s. In graph-theoretical terms, this problem corresponds to determining which graphs in a set are subgraph isomorphic to a specified query moiety. Testing for subgraph isomorphism has been proved, in the general case, to be an NP-complete problem. Computational approaches have been developed to overcome this difficulty. In the 1990s, a US patent was granted on an atom-centered indexing scheme used by the RS3 system; its virtue is that the generated indexes can be searched by direct text comparison. This system is in commercial use (http://www.acelrys.com/rs3). We identify the RS3 system's drawback and present a new indexing scheme. The RS3 system treats substructure searching as substring matching by expressing chemical structures as predefined strings. However, it has insufficient recall and precision because it cannot index structures uniquely when they have the same atoms and the same bonds. To resolve this problem, we build a minimum-cost spanning tree for each center atom and describe a structure with paths per level. Expressing a 2D chemical structure as a single 1D string has limits, so we break the 2D chemical structure into 1D structure fragments. In this paper, we present a new index technique that remarkably improves recall and precision.

A Two-level Indexing Method in Flash Memory Environment (플래시 메모리 환경을 위한 이단계 인덱싱 방법)

  • Kim, Jong-Dae;Chang, Ji-Woong;Hwang, Kyu-Jeong;Kim, Sang-Wook
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.7
    • /
    • pp.713-717
    • /
    • 2008
  • As the capacity of flash memory increases rapidly, efficient indexing methods have become crucial for fast searching of large volumes of data stored in flash memory. Flash memory has unique characteristics: the write operation is much more costly than the read operation, and in-place updating is not allowed. In this paper, we propose a novel index structure that significantly reduces the number of write operations and thus supports efficient searches, insertions, and deletions. We verify the superiority of our method through extensive experiments.

An Improvement Video Search Method for VP-Tree by using a Trigonometric Inequality

  • Lee, Samuel Sangkon;Shishibori, Masami;Han, Chia Y.
    • Journal of Information Processing Systems
    • /
    • v.9 no.2
    • /
    • pp.315-332
    • /
    • 2013
  • This paper presents an approach for improving the use of the VP-tree in video indexing and searching. A vantage-point tree, or VP-tree, is one of the metric space-based indexing methods used in multimedia database searches and data retrieval. Instead of relying on the Euclidean distance as a measure of the search space, the proposed approach uses the trigonometric inequality to compress the search range, which in turn improves search performance. A test using 10,000 video files shows that this method reduced the search time by 5-12% compared with the existing method that uses the AESA algorithm.
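
For readers unfamiliar with the VP-tree, the following is a minimal sketch of a textbook VP-tree with range search. It uses the standard triangle-inequality pruning over Euclidean distance, not the trigonometric-inequality refinement the paper proposes. The names `VPNode`, `build`, and `range_search`, and the choice of the first point as vantage point, are illustrative assumptions.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point        # vantage point stored at this node
        self.radius = radius      # split distance around the vantage point
        self.inside = inside      # subtree of points with distance <= radius
        self.outside = outside    # subtree of points with distance > radius


def build(points, dist=euclidean):
    """Build a VP-tree, taking the first point as vantage point (a simplification)."""
    if not points:
        return None
    vantage, rest = points[0], points[1:]
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    distances = sorted(dist(vantage, p) for p in rest)
    radius = distances[len(distances) // 2]          # median distance as split radius
    inside = [p for p in rest if dist(vantage, p) <= radius]
    outside = [p for p in rest if dist(vantage, p) > radius]
    return VPNode(vantage, radius, build(inside, dist), build(outside, dist))


def range_search(node, query, r, dist=euclidean, found=None):
    """Collect every stored point within distance r of query."""
    if found is None:
        found = []
    if node is None:
        return found
    d = dist(query, node.point)
    if d <= r:
        found.append(node.point)
    # Inside points lie within radius of the vantage point, so they are at
    # least d - radius from the query; skip them when that exceeds r.
    if d - r <= node.radius:
        range_search(node.inside, query, r, dist, found)
    # Outside points lie farther than radius from the vantage point, so they
    # are more than radius - d from the query; skip them when that is >= r.
    if d + r > node.radius:
        range_search(node.outside, query, r, dist, found)
    return found


points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
tree = build(points)
print(sorted(range_search(tree, (0.2, 0.1), 1.0)))
```

The two `if` tests before the recursive calls are where the search range is narrowed. Any tighter bound, such as the one the paper derives, would replace those tests while leaving the rest of the traversal unchanged.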

A Digital Library Prototype - Digital Repository and Diverse Collections (디지털도서관 프로토타입의 구축 -디지털 리포지토리와 컬렉션을 중심으로)

  • 최원태
    • Proceedings of the Korea Database Society Conference
    • /
    • 1998.09a
    • /
    • pp.383-394
    • /
    • 1998
  • This article is an overview of the digital library project, indicating what roles Korea's diverse digital collections may play. Our digital library prototype has a simple architecture consisting of digital repositories, filters, indexing and searching, and clients. Digital repositories include various types of materials and databases. The role of filters is to recognize the format of a document collection and mark the structural components of each of its documents. We are using a database management system (ORACLE and ConText) that supports user-defined functions and access methods, which allows us to easily incorporate new object analysis, structuring, and indexing technology into a repository.


An Efficient Frequent Melody Indexing Method to Improve Performance of Query-By-Humming System (허밍 질의 처리 시스템의 성능 향상을 위한 효율적인 빈번 멜로디 인덱싱 방법)

  • You, Jin-Hee;Park, Sang-Hyun
    • Journal of KIISE:Databases
    • /
    • v.34 no.4
    • /
    • pp.283-303
    • /
    • 2007
  • Recently, finding efficient ways to store and retrieve enormous amounts of music data has become an important issue in multimedia databases. The most common MIR (Music Information Retrieval) method is a text-based approach that uses text information to search for the desired music. However, if users do not remember keywords about the music, such systems cannot give them correct answers. Moreover, since these systems are implemented only for exact matching between the query and music data, they cannot mine any information about similar music data and are therefore inappropriate for similarity matching. To solve this problem, we propose an Efficient Query-By-Humming System (EQBHS) with a content-based indexing method that efficiently stores and retrieves music when a user queries with inaccurate humming. To accelerate query processing in EQBHS, we design indices for significant melodies, which are 1) frequent melodies that occur many times in a single piece of music, on the assumption that users hum what they can easily remember, and 2) melodies partitioned by rests. In addition, we propose an error-tolerant method for mapping notes to characters to make searching efficient, and a frequent melody extraction algorithm. We verified the assumption about frequent melodies through a questionnaire and compared the performance of the proposed EQBHS with an N-gram approach in various experiments on a range of music data.
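
The abstract does not specify how notes are mapped to characters or how frequent melodies are extracted. As an illustration only, the sketch below substitutes a Parsons-code-style contour encoding (U for up, D for down, S for same) with a pitch tolerance, and counts repeated fixed-length substrings within one piece. The function names, the `tolerance` parameter, and the use of MIDI-style pitch numbers are assumptions.

```python
from collections import Counter


def contour(pitches, tolerance=1):
    """Encode a pitch sequence as U/D/S characters.

    Pitch changes of at most `tolerance` semitones count as 'S' (same),
    so small humming errors do not change the encoded string.
    """
    chars = []
    for previous, current in zip(pitches, pitches[1:]):
        step = current - previous
        if abs(step) <= tolerance:
            chars.append("S")
        elif step > 0:
            chars.append("U")
        else:
            chars.append("D")
    return "".join(chars)


def frequent_melodies(encoded, length, min_count=2):
    """Return substrings of the given length occurring at least min_count times."""
    grams = (encoded[i:i + length] for i in range(len(encoded) - length + 1))
    counts = Counter(grams)
    return {gram: n for gram, n in counts.items() if n >= min_count}


song = [60, 62, 64, 60, 62, 64, 60, 61]      # hypothetical piece with a repeated motif
encoded_song = contour(song)
index = frequent_melodies(encoded_song, length=3)

hummed = [60, 63, 65, 60]                    # off-key rendition of the same motif
query = contour(hummed)
print(encoded_song, index, query, query in index)
```

Because the encoding keeps only the direction of each pitch change, the off-key query still maps to the same string as the motif in the piece. This is the kind of error tolerance the abstract refers to, although the paper's own mapping may be finer-grained.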

A Design and Development of Big Data Indexing and Search System using Lucene (루씬을 이용한 빅데이터 인덱싱 및 검색시스템의 설계 및 구현)

  • Kim, DongMin;Choi, JinWoo;Woo, ChongWoo
    • Journal of Internet Computing and Services
    • /
    • v.15 no.6
    • /
    • pp.107-115
    • /
    • 2014
  • Recently, increased use of the internet has generated large and diverse types of data, driven by the growth of social media, the expanding convergence among industries, and the use of various smart devices. It is difficult to manage and analyze this data with previous data processing techniques because its volume is huge and its form varies and evolves rapidly. In other words, a new approach is needed to solve such problems. Many approaches to this issue are being studied, and we describe an effective design and implementation of an indexing engine for a big data platform. Our goal is to build a system that can effectively manage huge data sets that exceed the range of previous data processing and that can reduce data analysis time. We used large SNMP log data in an experiment and tried to reduce data analysis time through fast indexing and searching. We also expect that our approach can help in analyzing user data through visualization of the analyzed data.