• Title/Summary/Keyword: keyword retrieval

A Keyword Matching for the Retrieval of Low-Quality Hangul Document Images

  • Na, In-Seop;Park, Sang-Cheol;Kim, Soo-Hyung
    • Journal of the Korean Society for Library and Information Science / v.47 no.1 / pp.39-55 / 2013
  • Keyword retrieval is difficult for low-quality Korean document images because adjacent characters are often connected. In addition, images created from various fonts are likely to be distorted during acquisition. In this paper, we propose and test a keyword retrieval system that uses a support vector machine (SVM) to judge the similarity between two word images. We demonstrate that the proposed keyword retrieval method is more effective than the accumulated Optical Character Recognition (OCR)-based searching method. Moreover, the SVM performs better than a Bayesian decision method or an artificial neural network for determining the similarity of two images.

A Study on Keyword Extraction and Expansion for Web Text Retrieval (웹 문서 검색을 위한 검색어 추출과 확장에 관한 연구)

  • Yoon, Sung-Hee
    • Journal of the Korea Computer Industry Society / v.5 no.9 / pp.1111-1118 / 2004
  • A natural language query is the best user interface for users of web text retrieval systems. This paper proposes a retrieval system that expands keywords from the syntactically analyzed structure of the user's natural language query, based on natural language processing techniques. By combining or splitting compound nouns through syntactic tree traversal, and by expanding variant-form or shortened-form keywords into multiple keywords, the system improved its precision and correctness.

Web Information Retrieval based on Natural Language Query Analysis and Keyword Expansion (자연어 질의 분석과 검색어 확장에 기반한 웹 정보 검색)

  • 윤성희;장혜진
    • Journal of the Korean Society for Information Management / v.21 no.2 / pp.235-248 / 2004
  • For users of information retrieval systems, a natural language query is a more natural interface than keywords and Boolean expressions. This paper proposes a retrieval technique that expands keywords from the syntactically analyzed structure of the natural language query given as user input. By combining or splitting compound nouns through syntactic tree traversal of the query, and by expanding variant-form or shortened-form keywords into multiple keywords, the technique can improve the precision and correctness of the retrieval system.

SPARQL Query Automatic Transformation Method based on Keyword History Ontology for Semantic Information Retrieval

  • Jo, Dae Woong;Kim, Myung Ho
    • Journal of the Korea Society of Computer and Information / v.22 no.2 / pp.97-104 / 2017
  • In semantic information retrieval, we first need to build a domain ontology and then convert the users' search keywords into a standard query language such as SPARQL. In this paper, we propose a method that automatically converts users' search keywords into SPARQL queries. Furthermore, our method can ensure effective performance in a specific domain such as law. When there are multiple keywords, the method constructs a keyword history ontology by associating each keyword with a series of information. The constructed keyword history ontology is then converted into a SPARQL query, yielding the query statement deemed most appropriate for the user's intended keywords. Our study builds on existing legal ontology constructions, supplementing and reconstructing their schema for use in experiments. In addition, we design and implement a semantic search tool for the legal domain and conduct experiments with it. The proposed method makes keyword-based semantic information retrieval possible in the legal domain, and it can be applied to other domains as well.
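
The keyword-to-SPARQL step described in this abstract can be illustrated with a minimal sketch. It does not reproduce the paper's keyword history ontology, which the abstract does not specify; it only assumes that each keyword must appear in a resource's `rdfs:label`. The function names `escape_literal` and `keywords_to_sparql` are hypothetical.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def keywords_to_sparql(keywords, limit=10):
    """Build a SELECT query whose results must match every keyword.

    Assumption: keywords are matched against rdfs:label, case-insensitively.
    A real system would map keywords to ontology classes and properties.
    """
    lines = [
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT DISTINCT ?resource WHERE {",
        "  ?resource rdfs:label ?label .",
    ]
    for kw in keywords:
        literal = escape_literal(kw.lower())
        # One FILTER per keyword, so all keywords must match the label.
        lines.append(f'  FILTER(CONTAINS(LCASE(STR(?label)), "{literal}"))')
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


query = keywords_to_sparql(["contract", "damages"])
print(query)
```

Requiring every keyword in a single label is a deliberately narrow choice; an ontology-driven converter would instead bind each keyword to its own class or property.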

Syntactic Analysis and Keyword Expansion for Performance Enhancement of Information Retrieval System (정보 검색 시스템의 성능 향상을 위한 구문 분석과 검색어 확장)

  • 윤성희
    • Journal of the Korea Academia-Industrial cooperation Society / v.5 no.4 / pp.303-308 / 2004
  • A natural language query is the best user interface for users of information retrieval systems. This paper proposes a retrieval system that expands keywords from the syntactically analyzed structure of the user's natural language query, based on natural language processing techniques. By combining or splitting compound nouns through syntactic tree traversal, and by expanding variant-form or shortened-form keywords into multiple keywords, system performance improved by up to 11.3% in precision and 4.7% in correctness.

A Study on Natural Language Keyword Indexing for Web-based Information Retrieval (웹기반 정보검색을 위한 자연어 키워드 색인에 관한 연구)

  • 윤성희
    • Journal of the Korea Computer Industry Society / v.4 no.12 / pp.1103-1111 / 2003
  • An information retrieval system whose index matches single keywords is simple and popular. However, single-keyword matching makes it very hard to represent the exact meaning of documents, and the set of retrieved documents is very large, so it cannot satisfy users of the system. This paper proposes a phrase-based indexing system, where a phrase is a larger syntactic unit than a single keyword. Because Web documents include many syntactic errors, a high-quality natural language parser cannot be expected on the Web. Partial trees from fully bottom-up parsing, even without a full tree, are still useful for extracting phrases, and they are much more discriminative than single keywords for indexing. This helps the information retrieval system improve its efficiency and reduce processing overhead.
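
The difference between single-keyword and phrase-based indexing can be sketched as follows. The paper extracts phrases from partial parse trees; as a stand-in assumption, this sketch treats every pair of adjacent words as a phrase. The helper `build_index` and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs, n):
    """Map every run of n adjacent words to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            index[" ".join(words[i:i + n])].add(doc_id)
    return index


docs = {
    1: "information retrieval system for web documents",
    2: "retrieval of lost information from a backup system",
    3: "web information retrieval with phrase index",
}

keyword_index = build_index(docs, 1)  # single-keyword index
phrase_index = build_index(docs, 2)   # two-word phrases (stand-in for parsed phrases)

# Single keywords match any document containing both words, in any order.
keyword_hits = keyword_index["information"] & keyword_index["retrieval"]
# The phrase index only matches documents where the words form the phrase.
phrase_hits = phrase_index["information retrieval"]

print(sorted(keyword_hits), sorted(phrase_hits))
```

In this toy collection the phrase lookup excludes document 2, which contains both words but not the phrase, illustrating why phrases are more discriminative index terms.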

Phrase-based Indexing for Korean Information Retrieval System (한국어 정보검색 시스템을 위한 구 단위 색인)

  • 윤성희
    • Journal of the Korea Academia-Industrial cooperation Society / v.5 no.1 / pp.44-48 / 2004
  • This paper proposes a phrase-based indexing system, where a phrase is a larger syntactic unit than a single keyword. Early information retrieval systems whose indexes match single keywords are simple and popular. However, single-keyword matching makes it very hard to represent the exact meaning of documents, and the set of retrieved documents is very large, so it cannot satisfy users of the system. Because Web documents include many syntactic errors, a high-quality natural language parser cannot be expected on the Web. Partial trees from fully bottom-up parsing, even without a full tree, are still useful for extracting phrases, and they are much more discriminative than single keywords for indexing. This also helps the information retrieval system improve its efficiency and reduce processing overhead.

A Study on the Multiple Keyword Retrieval Method under the Object-Oriented Multimedia Database Model (객체 지향 멀티미디어 데이터베이스 모델하에서의 다중 키워드 검색 기법에 관한 연구)

  • 석상기;김경창;김기용
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.8 / pp.1176-1189 / 1993
  • This paper presents a multiple keyword retrieval method under the object-oriented multimedia database model. Multiple keyword registration and retrieval algorithms are developed to reduce the partial matching problem in multimedia data retrieval. For this purpose, proper storage structures for the lookup tables are designed. In addition, to maintain constant retrieval time, media data files are organized in a B+ tree structure.

Automatic In-Text Keyword Tagging based on Information Retrieval

  • Kim, Jin-Suk;Jin, Du-Seok;Kim, Kwang-Young;Choe, Ho-Seop
    • Journal of Information Processing Systems / v.5 no.3 / pp.159-166 / 2009
  • As Wikipedia shows, tagging or cross-linking through major keywords in a document collection improves not only the readability of documents but also responsive and adaptive navigation among related documents. In recent years, the Semantic Web has increased the importance of social tagging as a key feature of Web 2.0, and the Tag Cloud has emerged publicly as its crucial phenotype. In this paper we provide an efficient method of automated in-text keyword tagging based on a large-scale controlled term collection or keyword dictionary. The computational complexity of O(mN) with a pattern matching algorithm can be reduced to O(m log N) with an Information Retrieval technique, where m is the length of the target document and N is the total number of candidate terms to be tagged. The results show that automatic in-text tagging with keywords filtered by Information Retrieval is about 6 to 40 times faster than the fastest pattern matching algorithm.
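
The O(mN) versus O(m log N) contrast in this abstract can be illustrated with a small sketch. The abstract does not specify the Information Retrieval technique used, so binary search over a sorted term list stands in for it here as an assumption. The function names and sample data are hypothetical.

```python
from bisect import bisect_left


def tag_naive(tokens, terms):
    """Scan the whole document once per candidate term: roughly O(m * N)."""
    hits = []
    for term in terms:
        words = term.split()
        for i in range(len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                hits.append((i, term))
    return sorted(hits)


def tag_indexed(tokens, terms, max_len=3):
    """Binary-search a sorted term list for each short n-gram: roughly O(m log N).

    Assumption: no dictionary term is longer than max_len words.
    """
    sorted_terms = sorted(terms)
    hits = []
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            if i + n > len(tokens):
                break
            candidate = " ".join(tokens[i:i + n])
            pos = bisect_left(sorted_terms, candidate)
            if pos < len(sorted_terms) and sorted_terms[pos] == candidate:
                hits.append((i, candidate))
    return sorted(hits)


tokens = "the semantic web uses tag cloud and semantic web".split()
terms = ["semantic web", "tag cloud", "ontology"]

naive_hits = tag_naive(tokens, terms)
indexed_hits = tag_indexed(tokens, terms)
print(indexed_hits)
```

Both functions return the same (token position, term) pairs; only the number of comparisons differs as the dictionary grows.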

A Study on the Performance Evaluation of Semantic Retrieval Engines (시맨틱검색엔진의 성능평가에 관한 연구)

  • Noh, Young-Hee
    • Journal of the Korean BIBLIA Society for library and Information Science / v.22 no.2 / pp.141-160 / 2011
  • This study proposes a knowledge base and search engine for libraries that hold large-scale data. For this purpose, three knowledge base components (a triple ontology, a concept-based knowledge base, and an inverted file) were constructed, and three search engines (the Jena search engine for rule-based reasoning, a concept-based search engine, and the keyword-based Lucene retrieval engine) were implemented to measure their performance. As a result, the concept-based retrieval engine showed the best performance, followed by the ontology-based Jena retrieval engine, and then the ordinary keyword search engine.