• Title/Summary/Keyword: relevant information retrieval


Ontology Supported Information Systems: A Review

  • Padmavathi, T.;Krishnamurthy, M.
    • Journal of Information Science Theory and Practice / v.2 no.4 / pp.61-76 / 2014
  • The exponential growth of information on the web far exceeds the capacity of present-day information retrieval systems and search engines, making information integration on the web difficult. To overcome this, semantic web technologies were proposed by the World Wide Web Consortium (W3C) to achieve a higher degree of automation and precision in information retrieval systems. The semantic web, with its promise to deliver machine understanding to the traditional web, has attracted a significant amount of research from academia as well as from industry. The semantic web is an extension of the current web in which data can be shared and reused across the internet. RDF and ontology are two essential components of the semantic web architecture, which support a common framework for data storage and representation of data semantics, respectively. Because ontologies are the backbone of semantic web applications, it is relevant to study various approaches to their application, usage, and integration into web services. In this article, an effort has been made to review the research work being undertaken in the area of design and development of ontology supported information systems. This paper also briefly explains the emerging semantic web technologies and standards.

Retrieval Effectiveness of Subject Descriptor and Citation Searching in the Water Resources Literature (수자원문헌의 주제탐색과 인용탐색의 검색효율 비교 연구)

  • Lee Myeong-Hee
    • Journal of the Korean Society for Library and Information Science / v.26 / pp.213-233 / 1994
  • This study measured whether subject descriptor searching and citation searching retrieve different documents for conceptual queries and methodological queries in natural science, engineering, and social science. The retrieval effectiveness of the two search methods was measured using the following criteria: total number of documents retrieved, total number of relevant documents, overlapping and unique documents, and precision ratio. The search subject was water resources and the databases used were Selected Water Resources Abstracts (SWRA) and SCISEARCH. Data were collected for 21 doctoral students working on their dissertations in the three fields of water resources. Principal findings included: 1) subject searching and citation searching each retrieved a substantially equal number of documents; 2) the total number of relevant documents for conceptual queries was larger than that for methodological queries, while there was a large variation among the three fields; 3) the average overlap was quite small, while citation searching yielded more unique documents than subject searching; 4) for conceptual queries, citation searching yielded a higher precision ratio than subject searching, while subject searching obtained a slightly higher precision ratio than citation searching for methodological queries; and 5) citation searching was effective for both specific queries and broad queries if seed articles were well chosen, while subject searching only worked well for broad queries. It was further found that: 1) citation searching is not a subsidiary but a substantial retrieval method in water resources; 2) SWRA is effective for engineering queries and SCISEARCH is appropriate for natural science queries, while neither SWRA nor SCISEARCH works well for social science queries; and 3) the characteristics of queries affect retrieval results more than the characteristics of documents or the coverage of databases.
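The evaluation criteria named in this abstract (precision ratio, overlapping documents, unique documents) can be illustrated with a minimal Python sketch. The function name `compare_searches` and the document identifiers are hypothetical and are not taken from the study; the sketch only shows how such measures are commonly computed from two result sets.

```python
def compare_searches(subject_hits, citation_hits, relevant):
    """Compare two result sets by precision, overlap, and unique documents."""
    subject_hits = set(subject_hits)
    citation_hits = set(citation_hits)
    relevant = set(relevant)

    def precision(hits):
        # Precision ratio: relevant documents retrieved / all documents retrieved.
        return len(hits & relevant) / len(hits) if hits else 0.0

    return {
        "subject_precision": precision(subject_hits),
        "citation_precision": precision(citation_hits),
        "overlap": subject_hits & citation_hits,
        "unique_to_subject": subject_hits - citation_hits,
        "unique_to_citation": citation_hits - subject_hits,
    }


# Made-up document identifiers, for illustration only.
result = compare_searches(
    subject_hits=["d1", "d2", "d3", "d4"],
    citation_hits=["d3", "d5", "d6", "d7"],
    relevant=["d1", "d3", "d5", "d6"],
)
print(result["subject_precision"], result["citation_precision"])
```

In this toy data the two methods share only one document, which mirrors the kind of small overlap the abstract reports, though the numbers themselves carry no empirical meaning.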


Word Embeddings-Based Pseudo Relevance Feedback Using Deep Averaging Networks for Arabic Document Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah;Atwan, Jaffar
    • Journal of Information Science Theory and Practice / v.9 no.2 / pp.1-17 / 2021
  • Pseudo relevance feedback (PRF) is a powerful query expansion (QE) technique that prepares queries using the top k pseudo-relevant documents and choosing expansion elements. Traditional PRF frameworks have robustly handled vocabulary mismatch between user queries and pertinent documents; nevertheless, expansion elements are chosen without regard to their similarity to the original query's elements. Word embedding (WE) schemes comprise techniques of significant interest for QE, which falls within the information retrieval domain. Deep averaging networks (DANs) define a framework relying on average word presence passed through multiple linear layers. The complete query can be represented using the average vector of the query terms. This vector may be employed for determining expansion elements pertinent to the entire query. In this study, we suggest a DAN-based technique that augments PRF frameworks by integrating WE similarities to facilitate Arabic information retrieval. The technique is based on the principle that the top pseudo-relevant document set is assessed to determine the candidate element distribution and to select expansion terms appropriately, considering their similarity to the average vector representing the initial query elements. The Word2Vec model is selected for executing the experiments on a standard Arabic TREC 2001/2002 set. The majority of the evaluations indicate that the PRF implementation in the present study offers a significant performance improvement compared to that of the baseline PRF frameworks.
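The core idea described in this abstract, choosing expansion terms from pseudo-relevant documents by their similarity to the average vector of the query terms, can be sketched as follows. This is a simplified illustration, not the authors' implementation: it omits the linear layers of a deep averaging network, and the function names and the toy two-dimensional embeddings are assumptions made for the example.

```python
import math


def average_vector(terms, embeddings):
    """Average the embedding vectors of the terms that have an embedding."""
    vecs = [embeddings[t] for t in terms if t in embeddings]
    if not vecs:
        raise ValueError("no query term has an embedding")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_expansion_terms(query_terms, top_docs, embeddings, n_terms=2):
    """Pick the candidates from the top documents closest to the averaged query."""
    query_vec = average_vector(query_terms, embeddings)
    # Candidates: terms in the pseudo-relevant documents that are not in the query.
    candidates = {
        t for doc in top_docs for t in doc
        if t in embeddings and t not in query_terms
    }
    # Sort by descending similarity; break ties alphabetically for determinism.
    ranked = sorted(candidates, key=lambda t: (-cosine(embeddings[t], query_vec), t))
    return ranked[:n_terms]


# Toy 2-dimensional embeddings (hypothetical values, not from Word2Vec).
toy_embeddings = {
    "river": [1.0, 0.0],
    "water": [0.9, 0.1],
    "flood": [0.8, 0.3],
    "bank": [0.1, 1.0],
    "money": [0.0, 1.0],
}
top_docs = [["river", "flood", "bank"], ["water", "money", "flood"]]
expansion = select_expansion_terms(["river", "water"], top_docs, toy_embeddings)
print(expansion)
```

Scoring candidates against the whole-query average, rather than against each query term separately, is what lets a term like "flood" outrank "bank" here even though both co-occur with "river" in the top documents.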

A Study on the Design of a Topic Map-based Retrieval System for the Academic Administration Records of Universities (대학 학사행정 기록물의 토픽맵 기반 검색시스템 설계에 관한 연구)

  • Shin, Jiyu;Jung, Youngmi
    • Journal of Korean Society of Archives and Records Management / v.16 no.1 / pp.175-193 / 2016
  • The topic map was designed as an efficient information retrieval method optimized for classification, organization, and navigation through the use of a semantic link network above information resources. Accordingly, this study aims to design a topic map-based retrieval system for university archives that supports the retrieval of relevant information. For this study, electronic records of D University relating to academic administration within a two-year period were collected, and topic map editing was carried out with Ontopia Omnigator. Topics were classified according to a functional analysis of academic administration. In the end, the number of topics was finalized as 626, with six general types: academic work, staff, college register, student, university, and others. Associations were likewise divided into six types, formed in consideration of the relationships among topics. In addition, there are seven occurrence types: register class, register number, register date, receiver, title, creator, and identifier. It is expected that the associative nature of the topic map-based retrieval system designed in this study will make navigation of large record collections easy and allow incidental discovery of knowledge.

Retrieval methodology for similar NPP LCO cases based on domain specific NLP

  • No Kyu Seong;Jae Hee Lee;Jong Beom Lee;Poong Hyun Seong
    • Nuclear Engineering and Technology / v.55 no.2 / pp.421-431 / 2023
  • Nuclear power plants (NPPs) have technical specifications (Tech Specs) to ensure that the equipment and key operating parameters necessary for the safe operation of the power plant are maintained within limiting conditions for operation (LCO) determined by a safety analysis. The LCO of the Tech Specs, which identify the lowest functional capability of equipment required for the safe operation of a facility, must be complied with for the safe operation of an NPP. Previous studies have sought to aid compliance with LCO using rule-based expert systems; however, expert systems are clearly limited in their ability to implement rules for the many situations related to LCO. Therefore, in this study, we present a retrieval methodology for similar LCO cases to help determine whether LCO is met or not met. To reflect NPP features in natural language processing, a domain dictionary was built, and the optimal term frequency-inverse document frequency variant was selected. The retrieval performance was improved by adding a Boolean retrieval model based on terms related to the LCO in addition to the vector space model. The developed domain dictionary and retrieval methodology are expected to be exceedingly useful in determining whether LCO is met.
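The combination described in this abstract, a vector space model with TF-IDF weighting plus a Boolean filter on LCO-related terms, can be sketched in Python as below. The abstract does not state which TF-IDF variant was selected or how the Boolean model was combined, so this sketch assumes plain term frequency times log(N/df) and an AND filter over required terms; the function names and the tokenized toy cases are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (raw tf * log(N/df)) for tokenized documents."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, required_terms):
    """Rank documents by cosine similarity, keeping only Boolean-filter matches."""
    vectors, idf = tfidf_vectors(docs)
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    results = []
    for i, (doc, vec) in enumerate(zip(docs, vectors)):
        # Boolean AND filter (assumed): every required term must occur in the case.
        if not set(required_terms) <= set(doc):
            continue
        results.append((i, cosine(q_vec, vec)))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Toy tokenized cases (hypothetical, not from the paper's data).
cases = [
    ["diesel", "generator", "inoperable"],
    ["pump", "inoperable", "train"],
    ["diesel", "generator", "test", "passed"],
]
ranked = retrieve(["diesel", "generator", "inoperable"], cases, ["inoperable"])
print(ranked)
```

The third case shares two query terms but is dropped by the Boolean filter because it lacks the required term, which is the kind of precision gain a filter can add on top of similarity ranking.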

Obscene Material Searching Method in WWW (WWW상에서 음란물 검색기법)

  • 노경택;김경우;이기영;김규호
    • Journal of the Korea Society of Computer and Information / v.4 no.2 / pp.1-7 / 1999
  • The World Wide Web (WWW) is a protocol that shifts information exchange on existing networks from text-centered documents to multimedia data. Because data are stored in the form of hypertext, even a beginner can search for and access the data they want. The ease of searching for and accessing multimedia data on the WWW has played an important role in making obscene material widespread and multimedia-based, and its commercialization has caused social problems; researchers have therefore actively studied ways to effectively block sites that provide obscene material. This paper presents and implements a method for blocking sites that contain obscene material by searching for them effectively. The proposed model is based on a link-based information retrieval method and, in a comparison, retrieved relevant documents more efficiently than the probabilistic model, which is known to generate the most accurate results. Average recall and precision improved by 12% and 8%, respectively. In particular, the retrieval of relevant documents that include non-text data and have few links improved markedly.


Human factors guidelines for designing anchors in the moving pictures on multimedia systems (멀티미디어 시스템의 동영상 노드를 위한 앵커의 인간공학적 설계지침)

  • Han, Sung-H.;Kim, Mi-Jeong;Kwahk, Ji-Young
    • Journal of Korean Institute of Industrial Engineers / v.22 no.2 / pp.265-276 / 1996
  • Multimedia systems present information through various media, for example, video, sound, music, animation, and movies, in addition to the text that has long been used for conveying information. Among many multimedia applications, multimedia information retrieval systems, commercialized in the form of multimedia encyclopedia CD-ROMs, have benefited from various media for their ability to present information in an efficient and complete way. Using several media, on the other hand, may confuse end users, and a poorly designed user interface often exacerbates the situation. In this study, multimedia systems were studied from the standpoint of usability. A conceptual framework for the user interface of the multimedia system was newly defined. Based upon this framework, 100 initial variables for the user interface design of general multimedia systems were suggested through a literature survey and expert opinions. Among various application areas, multimedia information retrieval systems were chosen for investigation, and 36 variables particularly relevant to their user interface were selected. According to the sequential research strategy, the variables considered most important were finally selected through a screening stage. Some of the selected variables were verified through a human factors experiment as the first step of the sequential research. Based upon the results of the experiment, guidelines for user interface design were provided. In future studies, the remaining variables will be investigated and the study will be expanded to other application areas.


Relevance Feedback using Region-of-interest in Retrieval of Satellite Images (위성영상 검색에서 사용자 관심영역을 이용한 적합성 피드백)

  • Kim, Sung-Jin;Chung, Chin-Wan;Lee, Seok-Lyong;Kim, Deok-Hwan
    • Journal of KIISE:Databases / v.36 no.6 / pp.434-445 / 2009
  • Content-based image retrieval (CBIR) is a retrieval technique that uses the contents of images. However, in contrast to text data, multimedia data are ambiguous, and there is a large difference between the system's low-level representation and the human's high-level concepts. Points that are near each other in the vector space are therefore not always similar from the user's point of view. We call this the semantic-gap problem. Because of this problem, the performance of image retrieval is poor. To address it, relevance feedback (RF), which uses the user's feedback information, is employed. However, existing RF does not consider the user's region-of-interest (ROI), and therefore irrelevant regions are used in computing new query points. Because the system does not know the user's ROI, RF proceeds at the image level. In this paper, we propose a new ROI RF method that guides the user to select ROIs from relevant images for the retrieval of complex satellite images, which improves the accuracy of image retrieval by computing more accurate query points. We also propose a pruning technique that improves the accuracy of image retrieval by using the images not selected by the user. Experiments show the efficiency of the proposed ROI RF and the pruning technique.
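The notion of computing a new query point from user-selected regions rather than whole images can be illustrated with a small Python sketch. The abstract does not specify how query points are computed, so this example assumes a simple centroid of the ROI feature vectors and a Euclidean ranking, a common relevance-feedback baseline that is not necessarily the authors' method. All names and feature values are hypothetical.

```python
def new_query_point(roi_features):
    """Centroid of the feature vectors of the user-selected regions (assumed rule)."""
    if not roi_features:
        raise ValueError("at least one region of interest is required")
    dim = len(roi_features[0])
    return [sum(v[i] for v in roi_features) / len(roi_features) for i in range(dim)]


def rank_images(query_point, image_features):
    """Order image names by Euclidean distance to the query point, nearest first."""
    def distance(vec):
        return sum((a - b) ** 2 for a, b in zip(query_point, vec)) ** 0.5

    return sorted(image_features, key=lambda name: distance(image_features[name]))


# Hypothetical 2-dimensional features of two regions the user marked as relevant.
selected_rois = [[1.0, 3.0], [3.0, 5.0]]
query_point = new_query_point(selected_rois)

# Hypothetical features of candidate images in the collection.
collection = {"a": [2.0, 4.5], "b": [9.0, 0.0], "c": [2.0, 4.0]}
ranking = rank_images(query_point, collection)
print(query_point, ranking)
```

Because only the marked regions contribute to the centroid, features from irrelevant parts of a relevant image do not pull the query point away from what the user cares about.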

A Study of Designing the Intelligent Information Retrieval System by Automatic Classification Algorithm (자동분류 알고리즘을 이용한 지능형 정보검색시스템 구축에 관한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.39 no.4 / pp.283-304 / 2008
  • This study aims to develop an intelligent retrieval system that, through a learning function, automatically presents category terms for the initial query (association terms connected with the knowledge structure of the relevant terminology), and that automatically changes the search form and runs it with those association terms. To this end, this theoretical study of an intelligent automatic indexing system extracts experts' index terms through learning and clustering algorithms for automatic classification, text mining (categorization), and document category representation. It also demonstrates good performance in terms of cost, time, recall ratio, and precision ratio.


Question Analysis and Expansion based on Semantics (의미 기반의 질의 분석 및 확장)

  • Shin, Seung-Eun;Park, Hee-Guen;Seo, Young-Hoon
    • The Journal of the Korea Contents Association / v.7 no.7 / pp.50-59 / 2007
  • This paper describes semantics-based question analysis and expansion for efficient information retrieval. The results of information retrieval systems include many non-relevant documents because the index cannot naturally reflect the contents of documents and because the queries used in information retrieval systems cannot represent enough of the information in the user's question. To solve this problem, we analyze the user's question semantically, determine the answer type, and extract semantic features. We then expand the user's question using these features and the syntactic structures that are used to represent the answer. Our similarity measure ranks documents that include the expanded queries in high positions. In particular, we found that efficient document retrieval is possible through semantics-based question analysis and expansion of natural language questions, which are comparatively short but fully express the information demand of users.