• Title/Summary/Keyword: 확률검색모형 (probabilistic retrieval model)

A Comparative Study on Effectiveness of Boole logic retrieval, Fuzzy retrieval and Probabilistic retrieval (불논리검색, 퍼지검색, 확률검색의 효율 비교연구)

  • 이젬마;사공철
    • Proceedings of the Korean Society for Information Management Conference / 1994.12a / pp.15-18 / 1994
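  • This study compares the effectiveness of fuzzy retrieval and probabilistic retrieval, the most powerful retrieval models for overcoming the shortcomings of Boolean retrieval, with that of Boolean retrieval itself. The KT Test Set, a Korean-language test collection in the field of information science, was used as experimental data. A fuzzy thesaurus was generated from the index terms and their within-document frequencies, and queries were expanded with the thesaurus's NT (narrower term) and BT (broader term) relations. The three retrieval methods were then run on each query, and retrieval effectiveness was measured by average recall and average precision. In recall, the methods ranked in the order probabilistic retrieval, Boolean retrieval, fuzzy retrieval; in precision, the order was fuzzy retrieval, probabilistic retrieval, Boolean retrieval.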

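The evaluation measures named in the abstract above can be illustrated with a minimal sketch. It assumes the standard set-based definitions (recall = relevant retrieved / relevant, precision = relevant retrieved / retrieved) and a simple mean over queries. The function names and the toy document IDs are hypothetical and are not taken from the paper or the KT Test Set.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query using set-based definitions."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def average_recall_precision(runs):
    """Average recall and precision over a list of (retrieved, relevant) pairs."""
    scores = [recall_precision(ret, rel) for ret, rel in runs]
    n = len(scores)
    return (sum(r for r, _ in scores) / n, sum(p for _, p in scores) / n)


# Hypothetical toy data: two queries with made-up document IDs.
toy_runs = [
    (["d1", "d2", "d3", "d4"], ["d1", "d3", "d9"]),  # recall 2/3, precision 2/4
    (["d5", "d6"], ["d5", "d6", "d7", "d8"]),        # recall 2/4, precision 2/2
]
avg_recall, avg_precision = average_recall_precision(toy_runs)
```

Averaging per-query scores (rather than pooling all documents) is an assumption here; the abstract does not say which averaging scheme was used.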

Application of the 2-Poisson Model to Full-Text Information Retrieval System (2-포아송 모형의 전문검색시스템 응용에 관한 연구)

  • 문성빈
    • Journal of the Korean Society for Information Management / v.16 no.3 / pp.49-63 / 1999
  • The purpose of this study is to investigate whether the terms in queries are distributed according to the 2-Poisson model in the documents represented by abstract/title or full-text. In this study, retrieval experiments using Binary independence and 2-Poisson independence model, which are based on the probabilistic theory, were conducted to see if the 2-Poisson distribution of the query terms has an influence on the retrieval effectiveness, particularly of full-text information retrieval system.

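As a rough illustration of the 2-Poisson model mentioned in the abstract above: the within-document frequency of a term is modeled as a mixture of two Poisson distributions, one for documents that are "about" the term (often called elite) and one for the rest. The sketch below assumes that standard formulation. The parameter names (`pi`, `lam_elite`, `lam_other`) and the numeric values are hypothetical and do not come from the paper.

```python
import math


def poisson_pmf(k, lam):
    """Probability of k occurrences under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def two_poisson_pmf(k, pi, lam_elite, lam_other):
    """Mixture probability that a term occurs k times in a document."""
    return pi * poisson_pmf(k, lam_elite) + (1 - pi) * poisson_pmf(k, lam_other)


def elite_posterior(k, pi, lam_elite, lam_other):
    """Bayes' rule: probability that a document with k occurrences is elite."""
    elite = pi * poisson_pmf(k, lam_elite)
    return elite / two_poisson_pmf(k, pi, lam_elite, lam_other)


# Hypothetical parameters, chosen only for illustration.
PI, LAM_ELITE, LAM_OTHER = 0.2, 4.0, 0.3

# The mixture is a proper distribution: its probabilities sum to (almost) 1.
total = sum(two_poisson_pmf(k, PI, LAM_ELITE, LAM_OTHER) for k in range(60))
```

With these made-up parameters, a document containing the term five times is far more likely to be elite than one containing it zero times, which is the intuition behind using term frequency as evidence of relevance.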

A probabilistic information retrieval model by document ranking using term dependencies (용어간 종속성을 이용한 문서 순위 매기기에 의한 확률적 정보 검색)

  • You, Hyun-Jo;Lee, Jung-Jin
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.763-782 / 2019
  • This paper proposes a probabilistic document ranking model incorporating term dependencies. Document ranking is a fundamental information retrieval task. The task is to sort documents in a collection according to their relevance to the user query (Qin et al., Information Retrieval Journal, 13, 346-374, 2010). A probabilistic model is a model for computing the conditional probability of the relevance of each document given a query. Most of the widely used models assume term independence because it is challenging to compute the joint probabilities of multiple terms. However, words in natural language texts are highly correlated. In this paper, we assume a multinomial distribution model to calculate the relevance probability of a document by considering the dependency structure of words, and propose an information retrieval model that ranks documents by estimating this probability with the maximum entropy method. The results of ranking simulation experiments in various multinomial situations show better retrieval results than a model that assumes the independence of words. The results of document ranking experiments using the real-world LETOR OHSUMED dataset also show better retrieval results.

Estimation of performance for random binary search trees (확률적 이진 검색 트리 성능 추정)

  • 김숙영
    • Journal of the Korea Computer Industry Society / v.2 no.2 / pp.203-210 / 2001
  • To estimate relational models and test the theoretical hypotheses of binary tree search algorithms, we built binary search trees from random permutations of n distinct numbers, where n (the number of nodes) ranged from three to seven. Probabilities of building binary search trees with each possible height and balance factor were estimated. Regression models with the number of nodes, height, and average number of comparisons as variables were estimated, and the theoretical O(lg(n)) behavior was accepted experimentally by a lack-of-fit test procedure. An analysis of variance model was applied to compare the average number of comparisons across three groups defined by the height and balance factor of the trees, in order to test theoretical hypotheses about binary search tree performance statistically.

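The experimental setup described in the abstract above can be sketched as follows. This is a minimal illustration, not the author's procedure: instead of sampling random permutations it enumerates all n! insertion orders, which is feasible for n from three to seven and gives exact probabilities. Height is counted in edges, which is an assumption, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def bst_height(keys):
    """Height (in edges) of the BST built by inserting keys in the given order."""
    root = None  # each node is a list: [key, left_child, right_child]
    height = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            continue
        node, depth = root, 0
        while True:
            depth += 1
            side = 1 if key < node[0] else 2  # 1 = left child, 2 = right child
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
        height = max(height, depth)
    return height


def height_distribution(n):
    """Exact probability of each height over all n! equally likely insertion orders."""
    counts = Counter(bst_height(p) for p in permutations(range(1, n + 1)))
    total = sum(counts.values())
    return {h: Fraction(c, total) for h, c in sorted(counts.items())}


for n in range(3, 8):
    print(n, height_distribution(n))
```

For n = 3, two of the six insertion orders (those starting with the middle key) give a tree of height 1, and the other four give a chain-like tree of height 2.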

A Study of Probabilistic Information Retrieval System Using Logical Pattern (논리적 패턴을 이용한 확률화 정보검색 시스템의 연구)

  • 이윤오;이정진
    • The Korean Journal of Applied Statistics / v.13 no.1 / pp.1-10 / 2000
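  • In the information society, efficient information retrieval is very important for all kinds of decision making. For a given information retrieval problem, previously retrieved items can be turned into a knowledge base by adding assessments of their relevance to the database. This study analyzes logical patterns in this knowledge base to build a probabilistic information retrieval model that judges the relevance of new information, and reports experiments with the model.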

Enhancing performance of full-text retrieval systems using relevance feedback (적합성피이드백을 이용한 전문검색시스템의 검색효율성 증진을 위한 연구)

  • 문성빈
    • Journal of the Korean Society for Information Management / v.10 no.2 / pp.43-67 / 1993
  • The primary purpose of the study is to improve the low precision often found in full-text retrieval systems. In order to enhance the low precision of full-text retrieval while retaining its high recall, relevance feedback mechanisms based on probabilistic retrieval models (binary independence and two-Poisson independence models) were employed. This paper investigates the effect of relevance feedback on the performance of full-text retrieval systems.

Quantitative approach to analyze searching efficiencies varying degrees of imbalance in a binary search tree (수량적 접근 방법에 의한 이진 검색 트리 불균형도에 따른 검색 성능 비교 분석)

  • 김숙영
    • Journal of the Korea Computer Industry Society / v.3 no.2 / pp.235-242 / 2002
  • To minimize the restructuring cost of a tree, experiments were conducted to collect quantitative information on search efficiency at varying degrees of imbalance in a binary search tree. The degree of imbalance was measured by a balance factor, the absolute value of the height difference between the left and right subtrees. As the imbalance worsened, the average number of comparisons increased (p<0.01), and a search efficiency of O(n) described the results better than O(log n). However, there was no significant difference in search efficiency between height-balanced trees and trees in which one subtree had a height 3 less than the other (p>0.05). The findings can therefore be applied to maintaining the search efficiency of software that uses a binary search tree.

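The two quantities compared in the abstract above, the balance factor and the average number of comparisons, can be computed as in the following minimal sketch. It assumes the balance factor is taken at the root, that subtree height is counted in nodes (an empty subtree has height 0), and that a successful search costs one comparison per node visited. These conventions and all names are assumptions, not details from the paper.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build(keys):
    """Build an unbalanced BST by inserting keys in the given order."""
    root = None
    for key in keys:
        if root is None:
            root = Node(key)
            continue
        node = root
        while True:
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(key))
                break
            node = child
    return root


def height(node):
    """Height counted in nodes; an empty subtree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(root):
    """Absolute height difference between the root's left and right subtrees."""
    return abs(height(root.left) - height(root.right))


def average_comparisons(root):
    """Mean number of nodes visited when searching for each stored key."""
    total = count = 0
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        total += depth
        count += 1
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return total / count


balanced = build([4, 2, 6, 1, 3, 5, 7])  # perfectly balanced, 7 nodes
skewed = build([1, 2, 3, 4, 5, 6, 7])    # degenerate right-leaning chain
```

On these two seven-node examples the balanced tree has balance factor 0 and the chain has balance factor 6; the chain also needs more comparisons on average, in line with the O(n) versus O(log n) contrast the abstract reports.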

A Study on the Design of Portal Site (전문 포털사이트 구축에 관한 연구)

  • 곽승진
    • Proceedings of the Korean Society for Information Management Conference / 1999.08a / pp.113-116 / 1999
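  • Finding needed information on the Internet is not easy, not only for novices with little Internet experience but also for experienced users. Internet information resources are growing rapidly, yet the probability that an Internet search engine finds the desired information is actually decreasing, because the resources are not systematically organized. This study designs a model that specializes the portal site, the gateway to the Internet (that is, the first site accessed after launching a web browser), so that the portal can ultimately become the final destination for information use. The model aims to improve users' convenience in using information by systematically organizing not only the scholarly information held by libraries but also Internet information resources, and by providing value-added services.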

An Evaluation of an Information Sharing Workflow Using Data Provenance Semantics (데이터 생성의미를 활용한 정보공유구조의 효과성 비교 연구)

  • Lee, Choon Yeul
    • Journal of Digital Convergence / v.11 no.6 / pp.175-185 / 2013
  • For effective information sharing, data provenance semantics need to be managed effectively. Based on a scheme to represent data provenance semantics, we propose a model to calculate information sharing costs. Information sharing costs are derived from the probabilities of type I and type II errors that occur in organizational information sharing, the costs related to these errors, and the information sharing distances between organizational units, which are determined by information sharing workflows. We apply the model to various types of information sharing workflows, including departmental information systems, hierarchical information systems, a hub, and a stand-alone system. The calculated information sharing costs show that the hub with data standardization performs best in information sharing; without standardization, however, its information sharing cost deteriorates to that of a departmental information system. Any information sharing workflow is better than a stand-alone system. The results show that the model is useful for analyzing the effectiveness of information sharing workflows and their characteristics.

A Study on Design Requirements for Smart Parking Services Considering User's Stated Preferences (사용자 잠재선호특성을 고려한 스마트 주차서비스 설계요건 연구)

  • Jang, Jeong-Ah;Lee, Hyun-Mi;Lee, Won-Woo;Kim, Hyeon-Mi;Kim, Tae-Hyung
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.16 no.6 / pp.1279-1286 / 2021
  • This study identifies user needs for a smart parking service that supports parking lot search and advance reservation, and develops user preference choice models related to fees (reservation fee, penalty fee). Two user preference models in the form of logit models were constructed from a response questionnaire on smart parking service. The first is a smart parking lot choice model, which identifies situations in which the probability of choosing a smart parking lot is higher than that of a general parking lot, in terms of the relationship between usage fee and cost. The second is a parking ticket reservation discount choice model, in which the probability of choosing the smart parking service was analyzed through the relationship between the reservation amount and the penalty. The results can be used as design requirements for sophisticated and varied smart parking services that reflect users' preferences.
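
To make the logit form mentioned in the abstract above concrete, here is a minimal sketch of a binary logit choice probability between a smart and a general parking lot. The utility specification (a constant plus a fee term), the coefficient values, and the fee amounts are all hypothetical; the paper's estimated models and variables are not reproduced here.

```python
import math


def utility(fee, constant=0.0, beta_fee=-0.001):
    """Hypothetical linear utility: a constant plus a negative fee effect."""
    return constant + beta_fee * fee


def choice_probability(utility_smart, utility_general):
    """Binary logit probability of choosing the smart parking lot."""
    return 1.0 / (1.0 + math.exp(-(utility_smart - utility_general)))


# Hypothetical scenario: the smart lot has an extra constant (for example,
# the value of a guaranteed reserved space) but may charge a higher fee.
general = utility(fee=2000)
for smart_fee in (2000, 3000, 4000):
    p = choice_probability(utility(smart_fee, constant=1.0), general)
    print(smart_fee, round(p, 3))
```

In this made-up setup the smart lot is preferred while its fee premium stays below the value of the constant, and the choice probability falls as the fee rises.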