• Title/Summary/Keyword: full-text retrieval system


Future and Directions for Research in Full Text Databases (본문 데이타베이스 연구에 관한 고찰과 그 전망)

  • Ro Jung Soon
    • Journal of the Korean Society for Library and Information Science / v.17 / pp.49-83 / 1989
  • A full-text retrieval system is a natural-language document retrieval system in which the full text of every document in a collection is stored on a computer, so that every word in every sentence of every document can be located by the machine. Such IR systems are rapidly becoming available online for legal materials, newspapers, journals, and reference books, and research interest in the field has grown. This paper reviews research on full-text databases and retrieval systems, speculates on directions for research in the field, considers questions that need answering, and describes the variables affecting online full-text retrieval and the various roles those variables play in a research study. Two obvious research questions in full-text retrieval have been how well full-text retrieval performs and how to improve the retrieval performance of full-text databases. Research to improve retrieval performance has incorporated ranking or weighting algorithms based on word occurrences, combined menu-driven and query-driven systems, and improvements in computer architecture and database record structure. The recent increase in the number of full-text databases of various sizes, forms, and subject matters, together with recent developments in computer architecture, artificial intelligence, and videodisc technology, promises new directions for research and scholarly growth. Studies of the interrelationships among the elements of the full-text retrieval situation, and of the relationship between each element and retrieval performance, may give a professional view of the theory and practice of full-text retrieval.

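
The abstract above describes storing full text so that every word occurrence can be located, and ranking by word occurrences. A minimal sketch of that idea, assuming a positional inverted index and a plain occurrence-count score; the function names, tokenizer, and sample documents are hypothetical and are not taken from the paper:

```python
from collections import Counter, defaultdict
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to {doc_id: [positions]} so every occurrence can be located."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index

def rank(index, query):
    """Score documents by the total number of query-word occurrences."""
    scores = Counter()
    for word in tokenize(query):
        # .get avoids creating empty entries for words that are not indexed.
        for doc_id, positions in index.get(word, {}).items():
            scores[doc_id] += len(positions)
    return scores.most_common()

# Hypothetical two-document collection.
docs = {
    "d1": "Full text retrieval stores the full text of every document.",
    "d2": "Controlled vocabulary indexing assigns terms to a document.",
}
index = build_index(docs)
results = rank(index, "full text")
```

Here `results` ranks only `d1`, because `d2` contains neither query word. A real system would likely normalize for document length and term rarity, which this sketch leaves out.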

Variations in relevance assessments and evaluation of the performance of full-text retrieval system (상이한 적합성 판정과 전문검색시스템의 평가에 관한 연구)

  • 문성빈
    • Journal of the Korean Society for Information Management / v.14 no.2 / pp.123-141 / 1997
  • This study examined the extent to which variations in relevance assessments affect the evaluation of the performance of a full-text retrieval system. Four sets of relevance judgments, obtained by examining the full text of documents, were used to test retrieval effectiveness. There was no noticeable difference in retrieval performance among the four sets, which implies that differing definitions of relevance have no effect on the evaluation of a full-text retrieval system's performance. Further retrieval experiments on this topic that incorporate relevance feedback, a sophisticated retrieval technique that uses relevance information, are suggested.

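
To make the comparison above concrete, the following sketch scores one retrieved list against two different relevance judgment sets. The document IDs and assessor sets are invented for illustration and are not the study's data; precision and recall are used here as an assumed effectiveness measure.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved list against one judgment set."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical system output and two hypothetical assessors' judgments.
retrieved = ["d1", "d2", "d3", "d4"]
judgment_sets = {
    "assessor_a": {"d1", "d2", "d5"},
    "assessor_b": {"d1", "d3", "d5"},
}
scores = {
    name: precision_recall(retrieved, relevant)
    for name, relevant in judgment_sets.items()
}
```

In this toy case the two assessors disagree on individual documents, yet both judgment sets give the same precision and recall, which is the kind of outcome the abstract reports.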

On the Characteristics and Information Retrieval Performance of Full-Text Databases (전문데이터베이스의 특성과 정보검색성능)

  • Cho Myung-Hi
    • Journal of the Korean Society for Library and Information Science / v.17 / pp.339-366 / 1989
  • The appearance of full text online is the most encouraging phenomenon in the development of databases. Today's full-text databases are a by-product of the electronic publication of printed materials. There are now also some moves toward electronic production of documents in Korea, although they are not yet strong. The present study examines the characteristics of, and effective retrieval methods for, the full-text databases now commercially available through various vendors. The outline of this paper is as follows. First, the background and present state of existing full-text database services, national and worldwide, are examined. Second, free-text searching of full-text databases is compared with controlled-vocabulary searching. The factors influencing free-text retrieval performance, the searching thesaurus, and hybrid or compromise systems, which use a limited controlled vocabulary in conjunction with natural language to provide the enrichment needed for practical operation of the system, are examined. Third, user demands are identified through an analysis of preceding studies on various types of full-text databases. Fourth, the application of CD-ROM full-text databases in libraries and information centers is examined as a prospective resource for them. Finally, some problems and prospects of full-text databases are presented.


A Hangul Document Image Retrieval System Using Rank-based Recognition (웨이브렛 특징과 순위 기반 인식을 이용한 한글 문서 영상 검색 시스템)

  • Lee Duk-Ryong;Kim Woo-Youn;Oh Il-Seok
    • The Journal of the Korea Contents Association / v.5 no.2 / pp.229-242 / 2005
  • We constructed a full-text retrieval system for scanned Hangul document images. The system consists of three components: preprocessing, recognition, and retrieval. The retrieval algorithm uses recognition results up to rank k. The algorithm is not only insensitive to recognition errors but also offers user-controllable recall and precision. For an objective performance evaluation, we used scanned images of the Journal of Korea Information Science Society provided by KISTI. The system was shown to be practical through the evaluation of recognition and retrieval rates.

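
The abstract does not spell out the retrieval algorithm, so the following is only one possible reading of "recognition results up to rank k": the recognizer keeps a ranked candidate list for each character image, and a query matches wherever every query character appears among the top k candidates at consecutive positions. The function names and the candidate lists are hypothetical.

```python
def matches_at(candidates, query, start, k):
    """True if each query character is among the top-k candidates at its position."""
    return all(
        ch in candidates[start + i][:k]
        for i, ch in enumerate(query)
    )

def search(candidates, query, k):
    """Return start positions where the query matches within the top-k candidates."""
    last_start = len(candidates) - len(query)
    return [s for s in range(last_start + 1) if matches_at(candidates, query, s, k)]

# Hypothetical recognizer output: ranked candidate characters per image position.
candidates = [
    ["검", "겸", "걸"],
    ["색", "섹", "샥"],
    ["시", "사", "서"],
    ["수", "스", "소"],  # the true character "스" is ranked second here
    ["템", "탬", "텀"],
]

strict_hits = search(candidates, "시스템", k=1)   # top-1 only misses the word
lenient_hits = search(candidates, "시스템", k=2)  # top-2 tolerates the error
```

Under this reading, raising `k` recovers words that contain recognition errors (higher recall) at the cost of more false matches (lower precision), which would be consistent with the user-controllable trade-off the abstract mentions.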

A Study on the Utility of Relevance/Non-relevance Information in Homogeneous Documents (유사문헌집단에서 적합/부적합정보의 유용성에 관한 연구)

  • Moon, Sung-Been
    • Journal of the Korean Society for Information Management / v.32 no.3 / pp.277-293 / 2015
  • This study examined the relative retrieval effectiveness of two systems (Title/Abstract and Full-text) after relevance feedback, using four different sets of relevance judgments. Four relevance levels (not relevant, marginally relevant, relevant, highly relevant) were also used, each determined by referees assigning a relevance score to documents. The study also investigated how much average precision improved after relevance feedback when "marginally relevant" documents were included in the relevant class, for both the Title/Abstract system and the Full-text retrieval system. The Title/Abstract system benefited from relevance feedback that included the marginally relevant documents: it consistently showed a higher percentage of improvement when they were included in the relevant class. The opposite held for the Full-text retrieval system, which implies that the marginally relevant documents in the relevant class introduced noise there.
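
The effect of moving "marginally relevant" documents in or out of the relevant class can be illustrated with average precision computed at two grade thresholds. This is a sketch with made-up grades and a made-up ranking; the numeric grade scale (0 to 3) is an assumption, and the relevance feedback step itself is not modeled.

```python
def average_precision(ranking, grades, min_grade):
    """Average precision, treating documents with grade >= min_grade as relevant."""
    relevant = {doc for doc, grade in grades.items() if grade >= min_grade}
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

# Hypothetical grades: 0 not relevant, 1 marginally, 2 relevant, 3 highly relevant.
grades = {"d1": 3, "d2": 1, "d3": 0, "d4": 2}
ranking = ["d1", "d3", "d2", "d4"]

strict = average_precision(ranking, grades, min_grade=2)   # marginal excluded
lenient = average_precision(ranking, grades, min_grade=1)  # marginal included
```

Comparing `strict` and `lenient` before and after a feedback round would give the kind of percentage improvement the study reports for each system.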

Implementation of Information Retrieval System for Full-Text (전문에 대한 검색시스템의 구현)

  • 김대규;정희택;강영만;한순희;조혁현
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2000.10a / pp.337-340 / 2000
  • With the spread of information retrieval systems on the Internet, demand for exact and specific information has also grown. To provide exact information, there has been a general demand for searching not only the keywords of abridged text but also the full text. This study proposes a scheme for full-text search. It compares the existing information search scheme with full-text search using interMedia text, and suggests search methods for full text.


A Study on the Implementation and Performance Evaluation of Full-text Information Retrieval System based on Scientific Paper's Content Structure (학술논문의 내용구조에 의한 전문검색시스템 구현과 성능평가에 관한 연구)

  • 이두영;이병기
    • Journal of the Korean Society for Information Management / v.15 no.3 / pp.73-93 / 1998
  • Conventional full-text information retrieval systems have been shown to yield high recall but low precision. One disadvantage of a full-text IR system is that it is not designed to reflect the user's information need, because such systems have been designed around the physical and logical structure of documents without considering their content. The purpose of this study is to develop a more effective full-text IR system by resolving these disadvantages of conventional systems. The study developed a new method of designing a full-text IR system that uses Content Structure Markup Language (CSML) rather than conventional SGML.


Application of the 2-Poisson Model to Full-Text Information Retrieval System (2-포아송 모형의 전문검색시스템 응용에 관한 연구)

  • 문성빈
    • Journal of the Korean Society for Information Management / v.16 no.3 / pp.49-63 / 1999
  • The purpose of this study is to investigate whether query terms are distributed according to the 2-Poisson model in documents represented by abstract/title or by full text. Retrieval experiments using the binary independence model and the 2-Poisson independence model, both based on probability theory, were conducted to see whether the 2-Poisson distribution of query terms influences retrieval effectiveness, particularly in a full-text information retrieval system.

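
For reference, the two models named above are usually written as follows. The notation is an assumption, since the abstract defines no symbols: $tf$ is a term's frequency within a document, $\pi$ is the proportion of documents that are "elite" for the term, $\lambda_1 > \lambda_2$ are the Poisson means for elite and non-elite documents, and $p$ and $q$ are the probabilities that the term occurs in a relevant and a non-relevant document.

```latex
% 2-Poisson model: within-document term frequency as a mixture of two Poissons
P(tf = k) \;=\; \pi \, \frac{e^{-\lambda_1} \lambda_1^{k}}{k!}
          \;+\; (1 - \pi) \, \frac{e^{-\lambda_2} \lambda_2^{k}}{k!},
\qquad \lambda_1 > \lambda_2

% Binary independence model: weight of a query term that is present in a document
w \;=\; \log \frac{p \, (1 - q)}{q \, (1 - p)}
```

The binary independence weight uses only whether a term is present, whereas the 2-Poisson model uses how often it occurs, which is why the distribution of query terms matters more for long full-text documents than for short abstracts.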

Consideration of a Robust Search Methodology that could be used in Full-Text Information Retrieval Systems (퍼지 논리를 이용한 사용자 중심적인 Full-Text 검색방법에 관한 연구)

  • Lee, Won-Bu
    • Asia Pacific Journal of Information Systems / v.1 no.1 / pp.87-101 / 1991
  • The primary purpose of this study was to investigate a robust search methodology that could be used in full-text information retrieval systems. A robust search methodology is one that can be used easily by a variety of users (particularly naive users) and gives them comparable search performance regardless of their expertise or interests. To develop a possibly robust search methodology, a fully functional prototype of a fuzzy knowledge-based information retrieval system was developed. An experiment using this prototype was also designed to investigate the performance of the search methodology over a small exploratory sample of user queries. To probe the relationship between search performance and query organization using fuzzy inference logic, the search performance of a shallow query structure was analyzed. The following noteworthy findings were obtained: 1) a hierarchical (tree-type) query structure might be a better query organization than a linear query structure; 2) compared with a complex tree query structure, a simple tree query structure with at most three levels might provide better search performance; 3) a fuzzy search methodology that employs a proper cut-off value might provide more efficient search performance than the Boolean search methodology. Although the findings could not be statistically verified because the experiments were done with a single replication, they provide valuable information for developing a possibly robust search methodology in full-text information retrieval.

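
The tree-structured fuzzy query and cut-off value discussed above can be sketched as follows. The abstract does not say which fuzzy operators the prototype used; this sketch assumes the standard min/max operators for AND/OR, and the query, membership degrees, and function names are all hypothetical.

```python
def evaluate(node, memberships):
    """Evaluate a query tree against one document's term membership degrees.

    A node is either a term string or a tuple ("and" | "or", [child nodes]).
    """
    if isinstance(node, str):
        return memberships.get(node, 0.0)
    operator, children = node
    values = [evaluate(child, memberships) for child in children]
    # Assumed operators: fuzzy AND is the minimum, fuzzy OR is the maximum.
    return min(values) if operator == "and" else max(values)

def fuzzy_search(query, docs, cutoff):
    """Return (doc_id, score) pairs with score >= cutoff, best first."""
    scored = ((doc_id, evaluate(query, m)) for doc_id, m in docs.items())
    return sorted((p for p in scored if p[1] >= cutoff), key=lambda p: -p[1])

# A shallow, two-level tree query: retrieval AND (fuzzy OR boolean).
query = ("and", ["retrieval", ("or", ["fuzzy", "boolean"])])

# Hypothetical degrees to which each document is "about" each term.
docs = {
    "d1": {"retrieval": 0.9, "fuzzy": 0.7},
    "d2": {"retrieval": 0.4, "boolean": 0.8},
    "d3": {"fuzzy": 0.9},
}

high_cutoff = fuzzy_search(query, docs, cutoff=0.5)
low_cutoff = fuzzy_search(query, docs, cutoff=0.3)
```

A strict Boolean search would treat every membership as 0 or 1, so the cut-off is what lets a fuzzy search trade result-set size against match quality: here the higher cut-off returns only `d1`, while the lower one also admits `d2`.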

A Study on the Feasibility of Full-Text Information Retrieval System Based on Document Content Structure (문헌의 내용단위구조에 의한 전문검색시스템의 타당성 고찰)

  • Lee Byeong-Ki
    • Journal of the Korean Society for Library and Information Science / v.32 no.1 / pp.129-154 / 1998
  • Online full-text databases are increasing in number, but conventional full-text information retrieval systems have been shown to yield high recall but low precision. One disadvantage of a full-text IR system is that it is not designed to reflect the user's information need, because such systems have been designed around the physical and logical structure of documents without considering their content. This study therefore examined the feasibility of using document content structure in a full-text IR system to resolve these disadvantages of conventional systems. A total of 180 journal articles were analyzed to find a common structure of document content, and a general model of the structure of journal articles was developed. The results show that there is a relationship among the user's cognitive schema structure, the user's information need, and the content structure of documents. It is concluded that full-text IR systems need to be designed around document content structure in order to meet users' information needs more effectively.
