• Title/Summary/Keyword: Measures of Retrieval Effectiveness

A Study on measuring techniques of retrieval effectiveness (검색효율 측정척도에 관한 연구)

  • Yoon Koo Ho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.16
    • /
    • pp.177-205
    • /
    • 1989
  • Retrieval effectiveness is the principal criterion for measuring the performance of an information retrieval system. This paper deals with the characteristics of the 'relevance' of information and with various techniques for measuring retrieval effectiveness. The outline of the study is as follows: 1) Relevance decisions for evaluation should be divided into user-oriented and system-oriented decisions. 2) The recall-precision measure seems to be user-oriented, and the recall-fallout measure system-oriented. 3) Unfortunately, many composite measures cannot be justified in any rational manner. 4) The Swets model has demonstrated that it yields, in general, a straight line instead of a curve of varying curvature, and it has emphasized the fundamentally probabilistic nature of information retrieval. 5) The Cooper model seems to be a good substitute for precision and a useful measure for systems that rank documents. 6) The Rocchio model was proposed for the evaluation of retrieval systems that rank documents and was designed to be independent of cut-off. 7) The Cawkell model suggested that Shannon's equation for entropy can be applied to the measurement of retrieval effectiveness.
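
For reference, Shannon's entropy equation mentioned in point 7 has the standard form below for a discrete distribution $p_1, \dots, p_n$. The abstract does not say how the Cawkell model assigns these probabilities to retrieval outcomes, so no such mapping is assumed here; the symbols $H$, $p_i$, and $n$ are generic notation rather than the paper's own.

```latex
H = -\sum_{i=1}^{n} p_i \log_2 p_i , \qquad \sum_{i=1}^{n} p_i = 1
```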

A Study on the Effectiveness of Information Retrieval (정보검색효율에 관한 연구)

  • Yoon Koo-ho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.8
    • /
    • pp.73-101
    • /
    • 1981
  • Retrieval effectiveness is the principal criterion for measuring the performance of an information retrieval system. The effectiveness of a retrieval system depends primarily on the extent to which it can retrieve wanted documents without retrieving unwanted ones. Ultimately, then, effectiveness is a function of the relevant and nonrelevant documents retrieved. Consequently, the 'relevance' of information to the user's request has become one of the most fundamental concepts in the theory of information retrieval. Although there is at present no consensus on how this notion should be defined, relevance has been widely used as a meaningful quantity and an adequate criterion for measures of retrieval effectiveness. Among the various parameters based on the 'two-by-two' table (or contingency table), recall and precision were the major considerations in this paper, because it is assumed that recall and precision are sufficient for the measurement of effectiveness. Accordingly, the different concepts of 'relevance' and 'pertinence' of documents to user requests and their proper usage were investigated, even though the two terms have unfortunately been used rather loosely in the literature. In addition, a number of variables affecting the recall and precision values were discussed. Some conclusions derived from this study are as follows: Any notion of retrieval effectiveness is based on 'relevance', which itself is extremely difficult to define. Recall and precision are valuable concepts in the study of any information retrieval system. They are, however, not the only criteria by which a system may be judged. The recall-precision curve represents the average performance of a given system, and this may vary quite considerably in particular situations. Therefore, it is possible to some extent to vary the indexing policy, the indexing language, or the search methodology to improve the performance of the system in terms of recall and precision. The 'inverse relationship' between average recall and precision could be accepted as the 'fundamental law of retrieval', and it should certainly be used as an aid to evaluation. Finally, there is a limit to the performance (in terms of effectiveness) achievable by an information retrieval system. That is: "Perfect retrieval is impossible."
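
The measures derived from the two-by-two table can be illustrated with a minimal sketch. It uses the standard textbook definitions of recall, precision, and fallout; the class name `Contingency`, its field names, and the example counts are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """Counts from the two-by-two (relevance x retrieval) table. Names are illustrative."""
    relevant_retrieved: int      # relevant documents the system returned
    nonrelevant_retrieved: int   # nonrelevant documents the system returned
    relevant_missed: int         # relevant documents the system did not return
    nonrelevant_rejected: int    # nonrelevant documents the system did not return


def recall(t: Contingency) -> float:
    """Share of all relevant documents that were retrieved."""
    total_relevant = t.relevant_retrieved + t.relevant_missed
    return t.relevant_retrieved / total_relevant if total_relevant else 0.0


def precision(t: Contingency) -> float:
    """Share of all retrieved documents that are relevant."""
    total_retrieved = t.relevant_retrieved + t.nonrelevant_retrieved
    return t.relevant_retrieved / total_retrieved if total_retrieved else 0.0


def fallout(t: Contingency) -> float:
    """Share of all nonrelevant documents that were retrieved."""
    total_nonrelevant = t.nonrelevant_retrieved + t.nonrelevant_rejected
    return t.nonrelevant_retrieved / total_nonrelevant if total_nonrelevant else 0.0


# Hypothetical collection of 1,000 documents, 40 of them relevant.
table = Contingency(relevant_retrieved=30, nonrelevant_retrieved=20,
                    relevant_missed=10, nonrelevant_rejected=940)
print(recall(table), precision(table), fallout(table))
```

Retrieving more documents can only raise or hold recall, while it usually lowers precision, which is the inverse relationship the abstract refers to.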

Developing and Evaluating an Ontology-based Legal Retrieval System (온톨로지 기반 법률 검색시스템의 구축 및 평가에 관한 연구)

  • Chang, In-Ho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.45 no.2
    • /
    • pp.345-366
    • /
    • 2011
  • The law affects our daily lives and hence constitutes a crucial information resource. However, electronic access to legal information through keyword-based retrieval systems appears to provide users with limited satisfaction. There are many factors behind this inadequacy. First, the discrepancies between formal legal terms and their counterparts in common language are quite large. Second, the situation is further confounded by frequent abbreviations in legal terms. Third, even though there is a constant deluge of legal information, users' needs have evolved to demand more Q&A-type searches. All of these factors make the existing retrieval systems inefficient and ineffective. This article suggests an ontology-based system as a means of dealing with such difficulties. To that end, an experimental legal retrieval system, built on a newly constructed law ontology, was tested against an existing keyword-based legal retrieval system, yielding data on their relative retrieval effectiveness and user satisfaction.

A New Class of Similarity Measures for Fuzzy Sets

  • Omran Saleh;Hassaballah M.
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.6 no.2
    • /
    • pp.100-104
    • /
    • 2006
  • Fuzzy techniques can be applied in many domains of computer vision. The definition of an adequate measure of similarity between fuzzy sets is of great importance in the fields of image processing, image retrieval, and pattern recognition. This paper proposes a new class of similarity measures. The properties, sensitivity, and effectiveness of the proposed measures are investigated and tested on real data. Experimental results show that these similarity measures provide a useful way of measuring the similarity between fuzzy sets.

A Novel Measure for Retrieval Efficiency of Image Database Retrieval System (영상 데이터베이스 검색 시스템의 검색효율 평가를 위한 새로운 평가척도)

  • 서창덕;김회율
    • Journal of Broadcast Engineering
    • /
    • v.5 no.1
    • /
    • pp.68-81
    • /
    • 2000
  • This paper proposes a single metric to measure and evaluate the retrieval effectiveness of an image database retrieval system that requires an ordered ranking. A good ranking system must satisfy four conditions. First, the number of relevant images among those retrieved should be as large as possible. Second, the number of irrelevant images should be as small as possible. Third, the average rank of the relevant images should be high. Last, the relevant images should be clustered close together. The conventional evaluation measures reflect only some of these conditions, and their results are coarse or inaccurate. The proposed NDS, however, resolves all of those problems. To demonstrate the efficiency of the NDS, we generate ${}_nC_r$ patterns (${}_{10}C_5 = 252$, ${}_{20}C_9 = 167{,}960$) and use them to evaluate it and compare it with other measures. The patterns were generated automatically by a recursive function call, on the assumption that the 'r' relevant images are retrieved within the range of 'n'.
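
The pattern generation described above can be sketched as a recursive enumeration of every way to place r relevant images among n ranks. This is only an illustration under stated assumptions: the function names `patterns` and `count_patterns` and the 0/1 encoding (1 = relevant) are hypothetical, and the authors' actual routine and the NDS formula itself are not given in the abstract.

```python
from math import comb
from typing import Iterator, List


def patterns(n: int, r: int) -> Iterator[List[int]]:
    """Yield every length-n 0/1 list with exactly r ones (1 marks a relevant image)."""
    if r < 0 or r > n:
        return                      # no valid pattern exists
    if r == 0:
        yield [0] * n               # only irrelevant images remain
        return
    if r == n:
        yield [1] * n               # every remaining rank is relevant
        return
    # Case 1: the first rank holds a relevant image.
    for rest in patterns(n - 1, r - 1):
        yield [1] + rest
    # Case 2: the first rank holds an irrelevant image.
    for rest in patterns(n - 1, r):
        yield [0] + rest


def count_patterns(n: int, r: int) -> int:
    """Count the generated patterns by exhausting the generator."""
    return sum(1 for _ in patterns(n, r))


print(count_patterns(10, 5))   # matches the 252 patterns quoted in the abstract
print(comb(20, 9))             # closed-form count for n = 20, r = 9
```

Each generated pattern could then be scored by any ranking measure, so that measures can be compared over the full set of possible rankings.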

A Comparative Study of WWW Search Engine Performance (WWW 탐색도구의 색인 및 탐색 기능 평가에 관한 연구)

  • Chung Young-Mee;Kim Seong-Eun
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.31 no.1
    • /
    • pp.153-184
    • /
    • 1997
  • The importance of WWW search services is increasing as Internet information resources explode. Nine current search services were evaluated, first by descriptively comparing their features for indexing, searching, and ranking of search results. Second, a couple of search queries were used to evaluate the search performance of those services in terms of retrieval effectiveness, the degree of overlap in the sites retrieved, and the degree of similarity between services. In this experiment, Alta Vista, HotBot, and Open Text Index showed better retrieval effectiveness. The level of similarity among the nine search services was extremely low.

The Development of an Automatic Indexing System based on a Thesaurus (시소러스를 기반으로 하는 자동색인 시스템에 관한 연구)

  • 임형묵;정상철
    • Korean Journal of Cognitive Science
    • /
    • v.4 no.1
    • /
    • pp.213-242
    • /
    • 1993
  • During the past decades, several automatic indexing systems have been developed, such as single-term indexing, phrase indexing, and thesaurus-based indexing systems. Among these, single-term indexing has been regarded as superior to the others despite its simplicity in extracting meaningful terms. Thesaurus-based indexing, on the other hand, has been regarded as producing a low retrieval rate, mainly because thesauri usually do not have enough index terms, so that much of the text data fails to be indexed when it does not match any index term in the thesaurus. This paper develops a thesaurus-based indexing system, THINS, that yields a higher retrieval rate than other systems by analyzing text data syntactically and matching it partially with index terms in the thesaurus. First, the system analyzes the input text syntactically using the machine translation system MATES/EK and extracts noun phrases. After deleting stop words from the noun phrases and stemming the remaining words, it tries to index them with similar index terms in the thesaurus as far as possible. We conduct an experiment with the CACM data set that compares the retrieval effectiveness of THINS with that of a single-term-based system under HYKIS, a thesaurus-based information retrieval system. THINS yields about 10 percent higher precision than the single-term-based system, while showing 8 to 9 percent lower recall. This is a considerable improvement over previous thesaurus-based systems, which yield 25 or 30 percent lower precision than the single-term-based one. We also argue that the relatively lower recall arises because CRCS, the thesaurus included in the CACM data set, is very incomplete, having only slightly more than one thousand terms; THINS is therefore expected to produce a much higher rate when used with a currently available large thesaurus.

A Similarity Ranking Algorithm for Image Databases (이미지 데이터베이스 유사도 순위 매김 알고리즘)

  • Cha, Guang-Ho
    • Journal of KIISE:Databases
    • /
    • v.36 no.5
    • /
    • pp.366-373
    • /
    • 2009
  • In this paper, we propose a similarity search algorithm for image databases. One of the central problems in content-based image retrieval (CBIR) is the semantic gap between the low-level features computed automatically from images and the human interpretation of image content. Many search algorithms used in CBIR have relied on the Minkowski metric (or $L_p$-norm) to measure similarity between image pairs. However, those functions cannot adequately capture the characteristics of the human visual system or the nonlinear relationships in contextual information. Our new search algorithm tackles this problem by employing new similarity measures and ranking strategies that reflect the nonlinearity of human perception and contextual information. Our search algorithm yields superior experimental results on a real handwritten digit image database, which demonstrates its effectiveness.
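
For context, the Minkowski metric that the abstract names as the conventional baseline can be sketched as follows. This shows only the baseline $L_p$ distance, not the paper's proposed similarity measure, which the abstract does not specify; the function name `minkowski_distance` and the sample vectors are illustrative assumptions.

```python
from typing import Sequence


def minkowski_distance(x: Sequence[float], y: Sequence[float], p: float = 2.0) -> float:
    """L_p distance between two equal-length feature vectors (p >= 1)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1 for the result to be a metric")
    # Sum the p-th powers of the coordinate differences, then take the p-th root.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# p = 1 gives the city-block distance, p = 2 the Euclidean distance.
print(minkowski_distance([0.0, 0.0], [3.0, 4.0], p=1))
print(minkowski_distance([0.0, 0.0], [3.0, 4.0], p=2))
```

Because every coordinate difference contributes independently, a distance of this form treats feature dimensions in isolation, which is the limitation the abstract points to.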

Effectiveness of Worksite Intervention on Stress Management: An Analytic Literature Review

  • Park Kyoung-Ok
    • Korean Journal of Health Education and Promotion
    • /
    • v.21 no.4
    • /
    • pp.15-33
    • /
    • 2004
  • Given the growing significance of psychological well-being at the worksite, the purpose of this analysis was to review the empirical studies on worksite stress management and to identify, through meta-analysis, the overall effect of worksite health promotion programs on stress management. Literature retrieval was first conducted online in the MEDLINE, EBSCOhost Academic Search Premier, and PsycINFO databases, covering public health, psychology, sociology, and human resource management. All studies written in English and published in peer-reviewed journals between 1990 and 2002 were included. The key words used in literature retrieval were 'worksite,' 'intervention,' 'program,' 'work stress,' 'strain,' 'burnout,' 'management,' 'prevention,' 'education,' and 'health promotion.' A total of 18 worksite intervention studies with 48 effect sizes were analyzed, and the results were as follows. Approximately 60% of the studies had a quasi-experimental design and were conducted in manufacturing companies and the public sector. General psychological strain and burnout were the most frequently used measures of psychological stress. Lecture-and-discussion interventions and participatory problem-solving interventions were employed more often than other types. The average effect (r: Pearson's simple correlation coefficient) weighted by sampling error was -0.14 (-0.32 to 0.05). In the conventional category of effects, this is a small effect, ranging from -0.59 to 0.05. The binomial effect size showed that success rates increased from 43% without intervention to 57% after an intervention. Sampling error explained 47.14% of the observed variance, and the effects on stress management were heterogeneous. In a regression analysis of suspected moderating factors affecting the worksite interventions, research design was the only significant moderating factor. The studies with a quasi-experimental design had greater effects than the studies with an experimental design.
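
The reported success rates are consistent with the standard binomial effect size display (BESD), which converts a correlation r into two success rates, 0.5 - |r|/2 and 0.5 + |r|/2. The sketch below applies that standard formula to the reported r = -0.14. It is an assumption that the authors computed their figures this way, and the function name is hypothetical.

```python
from typing import Tuple


def binomial_effect_size_display(r: float) -> Tuple[float, float]:
    """Return the (lower, higher) success rates implied by correlation r under the BESD."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    half = abs(r) / 2.0
    return 0.5 - half, 0.5 + half


low, high = binomial_effect_size_display(-0.14)
# Rounded to whole percentages, these are the 43% and 57% quoted in the abstract.
print(f"{low:.0%} -> {high:.0%}")
```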

Query by Visual Example: A Comparative Study of the Efficacy of Image Query Paradigms in Supporting Visual Information Retrieval (시각 예제에 의한 질의: 시각정보 검색지원을 위한 이미지 질의 패러다임의 유용성 비교 연구)

  • Venters, Colin C.
    • Journal of Information Management
    • /
    • v.42 no.3
    • /
    • pp.71-94
    • /
    • 2011
  • Query by visual example is the principal query paradigm for expressing queries in a content-based image retrieval environment. Query by image and query by sketch have long been purported to be viable methods of query formulation, yet there is little empirical evidence to support their efficacy in facilitating query formulation. The ability of searchers to express their information problem to an information retrieval system is fundamental to the retrieval process. The aim of this research was to investigate, through a usability experiment, how well the query by image and query by sketch methods support a range of information problems, in order to help close the gap in knowledge regarding the relationship between searchers' information problems and the query methods required to support efficient and effective visual query formulation. The results of the experiment suggest that query by image is a viable approach to visual query formulation. In contrast, the results strongly suggest that there is a significant mismatch between searchers' information problems and the expressive power of the query by sketch paradigm in supporting visual query formulation. The results of the usability experiment, which focused on efficiency (time), effectiveness (errors), and user satisfaction, show that there was a significant difference, p<0.001, between the two query methods on all three measures: time (Z=-3.597, p<0.001), errors (Z=-3.317, p<0.001), and satisfaction (Z=-10.223, p<0.001). The results also show that there was a significant difference in participants' perceived usefulness of the query tools (Z=-4.672, p<0.001).