• Title/Summary/Keyword: Indexing scheme

A Column-Aware Index Management Using Flash Memory for Read-Intensive Databases

  • Byun, Si-Woo;Jang, Seok-Woo
    • Journal of Information Processing Systems
    • /
    • v.11 no.3
    • /
    • pp.389-405
    • /
    • 2015
  • Most traditional database systems use a record-oriented model in which the attributes of a record are placed contiguously on a hard disk to achieve high write performance. However, for read-mostly data warehouse systems, the column-oriented database has become a suitable model because of its superior read performance. Today, flash memory is widely recognized as the preferred storage medium for high-speed database systems. In this paper, we introduce a column-oriented database model based on flash memory and then propose a new column-aware flash indexing scheme for high-speed column-oriented data warehouse systems. Our index management scheme, which uses an enhanced $B^+$-Tree, achieves superior search performance by indexing an embedded segment and packing unused space in internal and leaf nodes. Based on the performance results of two test databases, we conclude that the column-aware flash index management outperforms the traditional scheme in terms of mixed-operation throughput and response time.

Automated Essay Grading: An Application For Historical Malay Text

  • Syed Mustapha, S.M.F.D;Idris, N.
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2001.01a
    • /
    • pp.237-245
    • /
    • 2001
  • Automated essay grading has been proposed for over thirty years, but only recently have practical implementations been constructed and tested. This paper investigates the role of the nearest-neighbour algorithm within information retrieval as a way of grading essays automatically, in a system called the Automated Essay Grading System. It is intended to offer teachers individualized assistance in grading a student's essay. The system involves several processes: indexing, structuring of the model answer, and grade processing. The indexing process comprises document indexing and query processing, which are mainly used for representing the documents and the query. Structuring the model answer amounts to preparing the marking scheme, and grade processing is the process of assessing the essay. To test their effectiveness, the developed algorithms are tested against History text in Malay. The results show that information retrieval and the nearest-neighbour algorithm are a practical combination that offers acceptable performance for grading essays.
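The nearest-neighbour grading step described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the bag-of-words vectors, the cosine similarity measure, and the sample answers and grades are all assumptions chosen only to show the idea of assigning an essay the grade of its most similar model answer.

```python
import math
from collections import Counter

def vectorize(text):
    """Index a text as a bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def grade(essay, model_answers):
    """Return the grade of the model answer nearest to the essay."""
    query = vectorize(essay)
    _, best_grade = max(model_answers,
                        key=lambda m: cosine(query, vectorize(m[0])))
    return best_grade

# Hypothetical marking scheme: (model answer text, grade) pairs.
scheme = [
    ("the port grew through trade with india and china", "A"),
    ("the port was a small fishing village", "C"),
]
result = grade("trade with china and india made the port grow", scheme)
```

A real grader would at least need stemming and term weighting; the sketch keeps raw term counts for brevity.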

Indexing Methods of Splitting XML Documents (XML 문서의 분할 인덱스 기법)

  • Kim, Jong-Myung;Jin, Min
    • Journal of Korea Multimedia Society
    • /
    • v.6 no.3
    • /
    • pp.397-408
    • /
    • 2003
  • Existing indexing mechanisms for XML data that use a numbering scheme have the drawback of rebuilding the entire index structure when insertions, deletions, or updates occur on the data. We propose a new indexing mechanism based on split blocks to cope with this problem. The XML data are split into blocks such that there exists at most one relationship between any two blocks, and a numbering scheme is applied to each block. This mechanism reduces the overhead of rebuilding index structures when insertions, deletions, or updates occur on the data. We also propose two algorithms, the Parent-Child Block Merge Algorithm and the Ancestor-Descendent Algorithm, which retrieve the relationship between two entities in the XML hierarchy using this indexing mechanism. In addition, we propose a mechanism in which the identifier of a block carries information about its parent block to expedite retrieval of the ancestor-descendent relationship, together with versions of the two algorithms that use this mechanism.
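A rough sketch of how per-block numbering and parent-aware block identifiers could support an ancestor-descendant test follows. The abstract does not give the actual encoding, so everything here is an assumption: block identifiers are tuples whose prefix names the parent block, nodes carry (start, end) intervals local to their block, and a hypothetical `attach` table records the node of the parent block under which each child block hangs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    block: tuple  # assumed block identifier; block[:-1] names the parent block
    start: int    # interval numbering, local to the block
    end: int

def is_ancestor(a, d, attach):
    """True if node a is an ancestor of node d.

    attach maps a block id to the node (in the parent block) under which
    that block's root is attached. This table is an illustrative assumption.
    """
    # d can only lie below a if a's block is d's block or one of its ancestors.
    if d.block[:len(a.block)] != a.block:
        return False
    crossed = False
    while d.block != a.block:
        d = attach[d.block]  # climb one block towards the root
        crossed = True
    contains = a.start < d.start and d.end < a.end
    # After crossing a block boundary, the attach node itself is an ancestor.
    return contains or (crossed and a == d)

# Root block (0,) with three nodes; child block (0, 1) attached under c.
r = Node((0,), 1, 10)
c = Node((0,), 2, 5)
e = Node((0,), 6, 9)
x = Node((0, 1), 1, 4)
y = Node((0, 1), 2, 3)
attach = {(0, 1): c}
```

Because numbers are local to a block, inserting a node into block `(0, 1)` would renumber only that block, which is the overhead reduction the abstract claims.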

An Efficient Adaptive Bitmap-based Selective Tuning Scheme for Spatial Queries in Broadcast Environments

  • Song, Doo-Hee;Park, Kwang-Jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.10
    • /
    • pp.1862-1878
    • /
    • 2011
  • With the advances in wireless communication technology and the advent of smartphones, research on location-based services (LBSs) is being actively carried out. In particular, several spatial index methods have been proposed to provide efficient LBSs. However, finding an optimal indexing method that balances query performance and index size remains a challenge in wireless environments that have limited channel bandwidths and device resources (computational power, memory, and battery power). Thus, mechanisms that make existing spatial indexing techniques more efficient and more applicable in resource-limited environments should be studied. Bitmap-based Spatial Indexing (BSI) has been designed to support LBSs, especially in wireless broadcast environments. However, the access latency in BSI is extremely large because of the large size of the bitmap, which may increase the search time. In this paper, we introduce a Selective Bitmap-based Spatial Indexing (SBSI) technique. Then, we propose an Adaptive Bitmap-based Spatial Indexing (ABSI) to improve the tuning time in the proposed SBSI scheme. The ABSI is applied to the distribution of geographical objects in a grid by using the Hilbert curve (HC). With the information in the ABSI, grid cells that contain no objects (i.e., 0-bit information in the spatial bitmap index) are not tuned during a search. This leads to an improvement in the tuning time on the client side. We have carried out a performance evaluation and demonstrated that our SBSI and ABSI techniques outperform the existing bitmap-based DSI (B DSI) technique.
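The idea of a Hilbert-ordered spatial bitmap whose 0 bits let a client skip empty grid cells can be sketched as below. This is not the SBSI or ABSI structure itself: the coordinate-to-curve mapping is the standard Hilbert curve conversion, while `build_bitmap` and `cells_to_tune` are hypothetical helpers introduced only for illustration.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) on a Hilbert curve over a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def build_bitmap(order, objects):
    """One bit per grid cell in Hilbert order: 1 if the cell holds an object."""
    bitmap = [0] * (1 << (2 * order))
    for x, y in objects:
        bitmap[hilbert_index(order, x, y)] = 1
    return bitmap

def cells_to_tune(bitmap, wanted):
    """Of the cells a query covers, keep only those whose bit is set."""
    return [d for d in wanted if bitmap[d]]

# Hypothetical 2 x 2 grid with objects in two cells.
bitmap = build_bitmap(1, [(0, 0), (1, 1)])
tuned = cells_to_tune(bitmap, range(4))
```

Here the client would listen for only two of the four cells, which is the tuning-time saving the abstract describes for cells with 0-bit information.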

Multi-Path Index Scheme for the Efficient Retrieval of XML Data (XML 데이타의 효과적인 검색을 이한 다중 경로 인덱스)

  • Song, Ha-Joo;Kim, Hyoung-Joo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.7 no.1
    • /
    • pp.12-23
    • /
    • 2001
  • Extended path expressions are used to denote multiple paths concisely by using the '$\ast$' character. They are convenient for expressing OQL queries that retrieve XML data stored in OODBs. In this paper, we propose a multi-path index scheme as a new index scheme to efficiently process queries with extended path expressions. Our proposed index scheme allocates a unique path identifier to every possible single path in an extended path expression and provides the functionality of both single-path indexing and multiple-path indexing through the composition of index key and path identifier, while using only a single index structure. The proposed index scheme provides better performance than single-path index schemes and is practical, since it can be implemented with little modification of the leaf records of a B+-tree index.
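The composition of index key and path identifier in a single structure can be sketched as follows. A sorted Python list stands in for the B+-tree, and the class name, method names, and sample paths are assumptions made for illustration, not the paper's leaf record layout.

```python
import bisect

class MultiPathIndex:
    """Single ordered structure holding (key, path_id, object_id) entries."""

    def __init__(self):
        self._entries = []  # kept sorted; a stand-in for B+-tree leaf records

    def insert(self, key, path_id, oid):
        bisect.insort(self._entries, (key, path_id, oid))

    def lookup(self, key, path_ids=None):
        """Object ids stored under key.

        path_ids=None behaves like a multiple-path lookup (every path the
        extended expression matches); a set of ids restricts it to single paths.
        """
        # (key,) sorts before every (key, path_id, oid) entry with that key.
        start = bisect.bisect_left(self._entries, (key,))
        found = []
        for k, pid, oid in self._entries[start:]:
            if k != key:
                break
            if path_ids is None or pid in path_ids:
                found.append(oid)
        return found

# Hypothetical path identifiers: 1 = book.author.name, 2 = book.editor.name
index = MultiPathIndex()
index.insert("Kim", 1, "o1")
index.insert("Kim", 2, "o2")
index.insert("Lee", 1, "o3")
```

With these entries, `index.lookup("Kim")` serves a query over `book.*.name`, while `index.lookup("Kim", {2})` serves the single path through `editor`.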

The GR-tree: An Energy-Efficient Distributed Spatial Indexing Scheme in Wireless Sensor Networks (GR-tree: 무선 센서 네트워크에서 에너지 효율적인 분산 공간색인기법)

  • Kim, Min-Soo;Jang, In-Sung
    • Spatial Information Research
    • /
    • v.19 no.5
    • /
    • pp.63-74
    • /
    • 2011
  • Recently, there has been much interest in spatial queries that acquire sensor readings energy-efficiently from sensor nodes inside specified geographical areas of interest. The centralized approach, which performs the spatial query at a server after acquiring all sensor readings, is simple but incurs a high wireless transmission cost because it accesses all sensor nodes. To remove this cost, various in-network spatial indexing schemes have been proposed. They focus on reducing the transmission cost by performing distributed spatial filtering on sensor nodes. However, these in-network spatial indexing schemes cannot optimize both the spatial filtering and the wireless routing among sensor nodes, because they were developed by simply applying existing spatial indexing schemes to the in-network environment. Therefore, we propose a new distributed spatial indexing scheme, the GR-tree. The GR-tree, which forms an MBR-based tree structure, can reduce the wireless transmission cost by optimizing both the spatial filtering and the wireless routing. Finally, we compare our approach with the existing spatial indexing scheme through extensive experiments and clarify its distinguishing features.
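In-network spatial filtering over an MBR-based tree can be illustrated with a small sketch. It does not reproduce the GR-tree's construction or routing; the `SensorNode` class, the node layout, and the use of the number of visited nodes as a proxy for transmission cost are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class SensorNode:
    name: str
    pos: tuple  # (x, y) location of the sensor
    children: list = field(default_factory=list)

    def mbr(self):
        """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of the subtree.

        Recomputed on each call for brevity; a real node would cache it.
        """
        boxes = [(*self.pos, *self.pos)] + [c.mbr() for c in self.children]
        return (min(b[0] for b in boxes), min(b[1] for b in boxes),
                max(b[2] for b in boxes), max(b[3] for b in boxes))

def intersects(a, b):
    """True if rectangles a and b overlap (boundaries included)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def query(node, region, visited):
    """Forward the query only into subtrees whose MBR overlaps the region."""
    visited.append(node.name)  # each visit stands for one radio transmission
    x, y = node.pos
    hits = [node.name] if (region[0] <= x <= region[2]
                           and region[1] <= y <= region[3]) else []
    for child in node.children:
        if intersects(child.mbr(), region):
            hits += query(child, region, visited)
    return hits

# Hypothetical deployment: two branches far apart.
root = SensorNode("root", (5, 5), [
    SensorNode("a", (1, 1), [SensorNode("a1", (2, 2))]),
    SensorNode("b", (8, 8), [SensorNode("b1", (9, 9))]),
])
visited = []
hits = query(root, (0, 0, 3, 3), visited)
```

The query touches three of the five nodes, whereas the centralized approach described in the abstract would collect readings from all five.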

A Study on Organizing the Web Using Facet Analysis (패싯 분석을 이용한 웹 자원의 조직)

  • Yoo, Yeong-Jun
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.15 no.1
    • /
    • pp.23-41
    • /
    • 2004
  • In indexing and organizing Web resources, there have been two basic methods: automatic indexing by extracting key words, and library classification schemes or subject directories of search engines. However, both methods have failed to satisfy users' information needs, due to the lack of standard criteria and the irrationality of their structural systems. In this paper I have examined the limits of the structures of library classification schemes and the problems related to the nature of Web resources, such as specificity and exhaustivity. I have also attempted to explain the logic of organizing Web resources by facet analysis, along with its strengths and limitations. In so doing, I have proposed three specific methods of using facet analysis: first, an indexing system based on facet analysis; second, the alternative transformation of an enumerative classification scheme into a faceted classification scheme; and finally, a facet model for the subject directory of a domestic search engine. After examining the three methods, my study concludes that a controlled vocabulary built by facet analysis can be employed as a useful method of organizing Web resources.

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems
    • /
    • v.1 no.1
    • /
    • pp.46-50
    • /
    • 2006
  • With the advancement of experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search biological databases for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method that retrieves similar sequences quickly and with accuracy comparable to existing tools. This method regards a protein sequence as a text written in a language of 20 amino acid codes and adopts fixed-length N-gram tokens as its indexing scheme for sequence strings. After such tokens are indexed for all the sequences in the database, sequences can be searched with information retrieval algorithms. Using this new method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homologies, predicted subcellular locations of the query sequence, and major keywords extracted from annotations of similar sequences. We show experimentally that the N-gram indexing approach reduces retrieval time significantly and that it is as accurate as the popular search tool BLAST.
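The fixed-length N-gram indexing described in this abstract can be sketched as an inverted index over overlapping tokens. This is an illustration rather than the ProSeS implementation: the token length of 3, the ranking by number of shared distinct tokens, and the toy sequences are all assumptions.

```python
from collections import defaultdict

def ngrams(seq, n=3):
    """Overlapping fixed-length tokens of a sequence string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def build_index(sequences, n=3):
    """Inverted index: n-gram token -> set of ids of sequences containing it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for token in ngrams(seq, n):
            index[token].add(seq_id)
    return index

def search(index, query, n=3):
    """Rank sequence ids by how many distinct query tokens they share."""
    scores = defaultdict(int)
    for token in set(ngrams(query, n)):
        for seq_id in index.get(token, ()):
            scores[seq_id] += 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy database of amino acid strings.
db = {"p1": "MKTAYIAKQR", "p2": "MKTAYLLKQR", "p3": "GGGSSGGG"}
index = build_index(db)
hits = search(index, "KTAYIA")
```

Only sequences sharing at least one token with the query are ever scored, which is where the retrieval-time saving over scanning every sequence comes from.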

Efficient Query Retrieval from Social Data in Neo4j using LIndex

  • Mathew, Anita Brigit
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.5
    • /
    • pp.2211-2232
    • /
    • 2018
  • Unstructured and semi-structured big data in social networks poses new challenges for query retrieval. This requirement needs to be met by introducing measures that improve retrieval time, such as indexing. Because of the huge volume of stored data, there arises a need for efficient index algorithms to support query processing. However, conventional algorithms fail to index the huge amount of frequently obtained information in real time and fall short of providing a scalable indexing service. In this paper, a new LIndex algorithm, a heuristic based on Lucene, is built on the Neo4jHA architecture that holds the social network big data. LIndex is a flexible and simplified adaptive indexing scheme that uses decomposed shortest paths around term neighbors as its basic indexing unit. This new index proves to be effective in pruning the query space of the graph database Neo4j and scalable in index construction and deployment. In LIndex, a graph query is processed and optimized beyond the traditional time-based approach of Lucene by means of a more efficient path-based method. This algorithm significantly reduces query fetch time without compromising the quality of the results. Experiments are conducted to confirm the efficiency of the proposed query retrieval in the Neo4j graph NoSQL database.

A Novel Blind Watermarking Scheme Using Block Indexing (블록 인덱싱을 이용한 블라인드 워터마킹 기법)

  • Kang Hyun-Ho;Shin Sang-Uk;Han Seung-Wu
    • The Journal of the Korea Contents Association
    • /
    • v.5 no.6
    • /
    • pp.331-342
    • /
    • 2005
  • In this paper, we propose an efficient watermarking algorithm using block indexing. The proposed algorithm is a novel blind watermarking scheme that uses an indexed watermark value based on the spread spectrum method. After the original image is divided into sub-blocks, the watermark is assigned to the index value of each block. The watermark is mapped to the index values of the blocks, the mapped blocks are transformed by the DCT, and the PN sequence is then embedded in the middle-frequency band. Consequently, the watermark is expressed by the index values of the sub-blocks. The watermark is extracted from the correlation between the PN sequence and the watermarked image. Experimental results demonstrate that the watermarked image has good quality in terms of imperceptibility and is robust against various attacks.
