• Title/Summary/Keyword: Text Index

An Embedded Text Index System for Mass Flash Memory (대용량 플래시 메모리를 위한 임베디드 텍스트 인덱스 시스템)

  • Yun, Sang-Hun;Cho, Haeng-Rae
    • Journal of the Korea Society of Computer and Information / v.14 no.6 / pp.1-10 / 2009
  • Flash memory offers nonvolatility, low power consumption, light weight, and high endurance. These advantages make flash memory suitable as storage for mobile computing devices such as a PMP (Portable Multimedia Player). A portable device with mass flash memory can store various multimedia data such as video, audio, and images. Typical index systems for mobile computers are inefficient at searching text such as lyrics or titles. In this paper, we propose a new text index system named EMTEX (Embedded Text Index). EMTEX has the following salient features. First, it uses a compression algorithm designed for embedded systems. Second, when an insert or delete operation is executed on the base table, EMTEX updates the text index immediately. Third, EMTEX takes the characteristics of flash memory into account in the design of the insert, delete, and rebuild operations on the text index. Finally, EMTEX runs as a layer above the DBMS and is therefore independent of the underlying DBMS. We evaluate the performance of EMTEX. The experimental results show that EMTEX can outperform conventional index systems such as Oracle Text and FT3.

A Study on the Index Model for Secondary Legal Information Databases (법률정보시스템의 색인에 관한 연구 -특히 2차 법률정보를 중심으로-)

  • 노정란
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.8 no.1 / pp.117-134 / 1997
  • This study shows that, because of the characteristics of legal information, the quoted legal text functions as an index that represents the contents of the text, so automatic indexing in secondary legal full-text databases is possible without the assistance of experts. When a law is established, amended, or repealed, the index terms can be changed by revising the legal text quoted in the secondary legal full-text databases. Even when the full text of retrospective documents is not input, automatic indexing is still possible, and expert knowledge and integrated databases can be established and put into practice for retrospective documents. This study indicates that it is necessary to build into the system the characteristic knowledge that information experts recognize, that is, the experimental and inherent knowledge that only human beings can have, rather than to approach the information system in a linguistic, statistical, or structuralist way, and that doing so can yield a more essential and intelligent information system.

A Study on the DDC Index (DDC 색인에 대한 연구)

  • Nam, Tae-Woo
    • Journal of Korean Library and Information Science Society / v.41 no.3 / pp.155-183 / 2010
  • A book index is a locator system that ordinarily connects a set of terms from the text of a book to the pages where they occur. The DDC Relative Index differs somewhat in both of these respects. Its terms refer to classification notations and their corresponding category statements as found in the schedule text rather than to page numbers. The index is the final component of a classification scheme and is of primary importance to any classification scheme. Therefore, the purpose of this study is to analyze the DDC Relative Index.

A Study on the DB-IR Integration: Per-Document Basis Online Index Maintenance

  • Jin, Du-Seok;Jung, Hoe-Kyung
    • Journal of Information and Communication Convergence Engineering / v.7 no.3 / pp.275-280 / 2009
  • While database (DB) and information retrieval (IR) technologies have been developed independently, there are emerging requirements that data management and efficient text retrieval be supported simultaneously in information systems such as health care, customer support, XML data management, and digital libraries. The great divide between DB and IR has led to different approaches to index maintenance for newly arriving documents. DB has extended its SQL layer to cope with text fields because it lacks a built-in mechanism for building an IR-like index, whereas IR usually treats a block of new documents as the logical unit of index maintenance because it has no concept of integrity constraints. In DB-IR integration, however, a transaction that adds or updates a document should include maintenance of the posting lists associated with that document. Although DB-IR integration has begun to emerge in the research field, it will remain a difficult and rewarding area for some time. One of the primary reasons is the lack of efficient online transactional index maintenance. In this paper, the performance of several strategies for per-document transactional index maintenance (direct index update, pulsing auxiliary index, and posting segmentation index) is evaluated. The results show that the pulsing auxiliary strategy and the posting segmentation indexing scheme can be competitive candidates for text field indexing in DB-IR integration.

A Study on the Extraction and Utilization of Index from Bibliographic MARC Database (서지마크 데이터베이스로부터의 색인어 추출과 색인어의 검색 활용에 관한 연구 - 경북대학교 도서관 학술정보시스템 사례를 중심으로 -)

  • Park Mi-Sung
    • Journal of Korean Library and Information Science Society / v.36 no.2 / pp.327-348 / 2005
  • The purpose of this study is to emphasize the importance of index definition and to lay the groundwork for an optimal index in a bibliographic retrieval system. To this end, the study examined index extraction theory, covering index tag definition and index normalization, for a bibliographic MARC database and analyzed the retrieval utilization rate of the extracted indexes. In the experiment, the 29,219,853 indexes generated from 2,200,488 bibliographic records were divided into text-type and code-type indexes, and their utilization rates were analyzed by comparing index types and index terms against web logs. According to the results, text-type indexes such as title, author, publication, and subject showed high utilization rates, while code-type indexes showed low utilization rates. The study therefore suggests removing unused indexes from the index definition to optimize the index.

An Efficient Block Index Scheme with Segmentation for Spatio-Textual Similarity Join

  • Xiang, Yiming;Zhuang, Yi;Jiang, Nan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.7 / pp.3578-3593 / 2017
  • Given two collections of objects that carry both spatial and textual information in the form of tags, a Spatio-Textual-based object Similarity JOIN (ST-SJOIN) retrieves the pairs of objects that are textually similar and spatially close. In this paper, we have proposed a block index-based approach called BIST-JOIN to facilitate the efficient ST-SJOIN processing. In this approach, a dual-feature distance plane (DFDP) is first partitioned into some blocks based on four segmentation schemes, and the ST-SJOIN is then transformed into searching the object pairs falling in some affected blocks in the DFDP. Extensive experiments on real and synthetic datasets demonstrate that our proposed join method outperforms the state-of-the-art solutions.

A Symmetric Key Cryptography Algorithm by Using 3-Dimensional Matrix of Magic Squares

  • Lee, Sangho;Kim, Shiho;Jung, Kwangho
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.768-770 / 2013
  • We propose a symmetric-key cryptography algorithm that encodes and decodes text data of limited length using a 3-dimensional matrix of magic squares. To encode a plaintext message, the input text is translated into the index of the corresponding number stored in the key matrix. Then a Caesar shift with a predefined constant value is applied to complete the encryption. In the decoding process, the Caesar shift is reversed first, and the generated key matrix is then used with 2D magic squares to replace the index numbers in the ciphertext and restore the original text.
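
    The encode and decode steps described in this abstract can be sketched roughly as follows. This is only an illustration under assumptions that are not taken from the paper: the 27-symbol alphabet, the shift constant, and a key matrix built by stacking three offset copies of the 3x3 Lo Shu magic square are all hypothetical choices, and the authors' actual key construction is not given in the abstract.

```python
import string

# Assumed for illustration only: 26 lowercase letters plus space (27 symbols).
ALPHABET = string.ascii_lowercase + " "
SHIFT = 5  # assumed pre-defined Caesar constant

# The classic 3x3 Lo Shu magic square (rows, columns, diagonals sum to 15).
LO_SHU = [[2, 7, 6], [9, 5, 1], [4, 3, 8]]


def build_key_matrix():
    """Hypothetical 3x3x3 key: each layer is Lo Shu plus an offset of 9 per
    layer, so every layer is still a magic square and 1..27 each appear once."""
    return [[[v + 9 * layer for v in row] for row in LO_SHU] for layer in range(3)]


def flatten(key):
    """Flatten the 3-D key matrix into a list in layer, row, column order."""
    return [v for layer in key for row in layer for v in row]


def encrypt(plaintext, key, shift=SHIFT):
    """Replace each character by the index of its number in the key matrix,
    then apply a Caesar shift to that index."""
    flat = flatten(key)
    cipher = []
    for ch in plaintext:
        number = ALPHABET.index(ch) + 1   # character -> number 1..27
        position = flat.index(number)     # number -> index in the key matrix
        cipher.append((position + shift) % len(flat))
    return cipher


def decrypt(cipher, key, shift=SHIFT):
    """Undo the Caesar shift first, then look the number up in the key matrix."""
    flat = flatten(key)
    return "".join(ALPHABET[flat[(c - shift) % len(flat)] - 1] for c in cipher)


if __name__ == "__main__":
    key = build_key_matrix()
    secret = encrypt("text index", key)
    print(secret)
    print(decrypt(secret, key))
```

    Both sides must share the key matrix and the shift constant, which is what makes the scheme symmetric. A plain index substitution plus a fixed shift is easy to break, so this sketch shows only the data flow, not a secure design.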

The Extraction of Effective Index Database from Voice Database and Information Retrieval (음성 데이터베이스로부터의 효율적인 색인데이터베이스 구축과 정보검색)

  • Park Mi-Sung
    • Journal of Korean Library and Information Science Society / v.35 no.3 / pp.271-291 / 2004
  • Information service providers such as digital libraries are being asked to offer services over atypical multimedia databases such as image, voice, and VOD/AOD. This study examines proposed components for voice processing: a word-phrase generator, a syllable recoverer, a morphological analyzer, and a corrector. The proposed voice processing technique transforms the voice database into a text database and then extracts an index database from the text database. In addition, the study proposes an information retrieval model for use with the extracted index database: voice full-text information retrieval.

A Study on Classifications of Useful Customer Reviews by Applying Text Mining Approach (텍스트 마이닝을 활용한 고객 리뷰의 유용성 지수 개선에 관한 연구)

  • Lee, Hong Joo
    • Journal of Information Technology Services / v.14 no.4 / pp.159-169 / 2015
  • Customer reviews are an important source for purchase decisions in online stores. Online stores have tried to present useful reviews to customers on product pages. To assess the usefulness of customer reviews before enough users have voted on them, previous studies have drawn on diverse aspects of reviews, and many have used style and semantic information. This study tests diverse algorithms and datasets to identify a suitable classification method and threshold for classifying useful reviews. Most prior research used a ratio-type helpfulness index, as Amazon.com does. However, TripAdvisor.com and Yelp.com use another type of usefulness index, a count-type helpfulness index, for which no suitable threshold for classifying useful reviews has yet been established. This study used reviews of restaurants from Yelp.com and their usefulness votes to construct diverse datasets and applied text mining approaches to classify useful reviews. Random Forest, SVM, and GLMNET showed higher accuracy than the other approaches.

- For the Development of Inquiring, integrated Science Curricular Materials - The Comparison and Analysis of Inquiry Activity between "The FAST Program" and "The Secondary Science Books" (탐구적 통합 과학 교재 개발을 위한, "FAST program"과 "중등 과학 교과서"의 탐구 활동 비교 분석)

  • Son, Yeon-A;Lee, Hack-Dong
    • Journal of The Korean Association For Science Education / v.14 no.1 / pp.45-57 / 1994
  • The purpose of this study is to verify whether the FAST program qualifies as inquiry science curricular material, through a comparison and analysis of inquiry activities between the FAST program and our secondary science textbooks. The results of this study are as follows: 1. FAST has 226 inquiry activity tasks, more than twice as many as our textbooks. 2. At level one, FAST includes Synthesizing Results and Evaluation, Hypothesizing and Designing an Experiment, but these are not found in our textbooks. 3. At level two, our textbooks were analyzed as No Discussion 72.2% and Demonstrating or Verifying the Content of the Text 82%, whereas FAST has Discussion Guided 81.8% and contains no task of Demonstrating or Verifying the Content of the Text. 4. At level three, our textbooks show a typical type I with an Inquiry Index of 15-25 (Middle), whereas FAST shows type IV, except for Manipulating Apparatus and Observation, with an Inquiry Index over 35 (Very High). Therefore, the FAST program proves to be desirable inquiry science curricular material. In future work, the authors intend to build on these results as follows: 1. Verification of the FAST program as integrated science curricular material. 2. Development of inquiring, integrated science curricular materials based on the results of the preceding study.
