• Title/Abstract/Keywords: information retrieval model

625 search results (processing time: 0.028 s)

e-Catalogue Image Retrieval Using Vectorial Combination of Color Edge

  • 황의선;박상근;전준철
    • 정보처리학회논문지B / Vol. 9B, No. 5 / pp. 579-586 / 2002
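  • Among content-based image retrieval methods that use image edge information, the edge descriptor proposed in MPEG-7 (Moving Picture Experts Group) is the representative approach; the edge information it uses is an edge histogram computed from image intensity. This paper presents a new color edge extraction method and a content-based image retrieval method that uses the resulting color edge histogram as its feature. The method is then applied to the retrieval of e-catalogue product images used in Internet shopping malls. For performance evaluation, it was compared with the edge-histogram-based retrieval method of MPEG-7, and the experiments showed that the proposed method retrieves better. Color edges are extracted through a vectorial combination of the R, G, and B channel components of a color image and a vector-norm characterization of the edge map. Content-based image retrieval is then performed with a color edge histogram built from the edge directions of the final edge model.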
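The abstract above gives no formulas, so the following is only a minimal sketch of the general idea of a color edge direction histogram, not the authors' method. It assumes Sobel derivatives per channel, an edge strength equal to the L2 norm over all channel derivatives, and an edge direction taken from the channel with the strongest gradient; the function names `sobel` and `color_edge_histogram`, the four direction bins, and the threshold are all illustrative choices.

```python
import math

def sobel(channel, x, y):
    """Return Sobel derivatives (gx, gy) of a 2-D list at interior pixel (x, y)."""
    gx = (channel[y - 1][x + 1] + 2 * channel[y][x + 1] + channel[y + 1][x + 1]
          - channel[y - 1][x - 1] - 2 * channel[y][x - 1] - channel[y + 1][x - 1])
    gy = (channel[y + 1][x - 1] + 2 * channel[y + 1][x] + channel[y + 1][x + 1]
          - channel[y - 1][x - 1] - 2 * channel[y - 1][x] - channel[y - 1][x + 1])
    return gx, gy

def color_edge_histogram(image, bins=4, threshold=1.0):
    """image: rows of (r, g, b) tuples. Returns a normalized edge-direction histogram."""
    height, width = len(image), len(image[0])
    # Split the image into three single-channel 2-D lists.
    channels = [[[px[c] for px in row] for row in image] for c in range(3)]
    hist = [0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            grads = [sobel(ch, x, y) for ch in channels]
            # Assumed combination: L2 norm over the derivatives of all channels.
            magnitude = math.sqrt(sum(gx * gx + gy * gy for gx, gy in grads))
            if magnitude < threshold:
                continue  # not an edge pixel
            # Assumed direction: gradient of the channel with the strongest response.
            gx, gy = max(grads, key=lambda g: g[0] * g[0] + g[1] * g[1])
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            hist[int(angle / math.pi * bins + 0.5) % bins] += 1  # nearest bin center
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * bins

# A black-to-red step across columns: every edge pixel has a horizontal gradient.
step = [[(0, 0, 0)] * 2 + [(255, 0, 0)] * 3 for _ in range(5)]
print(color_edge_histogram(step))
```

Two images could then be compared by any histogram distance, for example the sum of absolute bin differences.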

Topic Level Disambiguation for Weak Queries

  • Zhang, Hui;Yang, Kiduk;Jacob, Elin
    • Journal of Information Science Theory and Practice / Vol. 1, No. 3 / pp. 33-46 / 2013
  • Despite limited success, today's information retrieval (IR) systems are not intelligent or reliable. IR systems return poor search results when users formulate their information needs into incomplete or ambiguous queries (i.e., weak queries). Therefore, one of the main challenges in modern IR research is to provide consistent results across all queries by improving the performance on weak queries. However, existing IR approaches such as query expansion are not overly effective because they make little effort to analyze and exploit the meanings of the queries. Furthermore, word sense disambiguation approaches, which rely on textual context, are ineffective against weak queries that are typically short. Motivated by the demand for a robust IR system that can consistently provide highly accurate results, the proposed study implemented a novel topic detection that leveraged both the language model and structural knowledge of Wikipedia and systematically evaluated the effect of query disambiguation and topic-based retrieval approaches on TREC collections. The results not only confirm the effectiveness of the proposed topic detection and topic-based retrieval approaches but also demonstrate that query disambiguation does not improve IR as expected.

Semantic Image Annotation and Retrieval in Mobile Environments

  • 노현덕;서광원;임동혁
    • 한국멀티미디어학회논문지 / Vol. 19, No. 8 / pp. 1498-1504 / 2016
  • Progress in mobile computing technology is producing a large amount of multimedia content such as images. We therefore need an image retrieval system that finds semantically relevant images. In this paper, we propose semantic image annotation and retrieval in mobile environments. Previous mobile annotation approaches cannot fully express the semantics of an image because of the limitations of their current form (i.e., keyword tagging). Our approach allows mobile devices to annotate images automatically using context-aware information such as temporal and spatial data. In addition, because we annotate images using the RDF (Resource Description Framework) model, we can query them with SPARQL for semantic image retrieval. Our system, implemented in an Android environment, shows that it can represent the semantics of images more fully and retrieve images semantically, compared with other image annotation systems.
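To illustrate why RDF-style annotation supports richer queries than keyword tags, here is a small sketch that stores annotations as (subject, predicate, object) triples and joins triple patterns the way a SPARQL basic graph pattern does. It is a pure-Python stand-in, not a SPARQL engine, and the predicates `takenAt` and `takenIn`, the image ids, and the helper names are hypothetical rather than the paper's vocabulary.

```python
def match_pattern(triples, pattern, binding):
    """Yield copies of `binding` extended so that `pattern` matches a triple.

    Pattern terms starting with '?' are variables; other terms must match exactly.
    """
    for triple in triples:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new

def query(triples, patterns):
    """Join several triple patterns; returns a list of variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b for old in bindings
                    for b in match_pattern(triples, pattern, old)]
    return bindings

# Hypothetical context annotations (spatial and temporal) for three images.
triples = [
    ("img1", "takenAt", "Seoul"),
    ("img1", "takenIn", "2016-05"),
    ("img2", "takenAt", "Busan"),
    ("img2", "takenIn", "2016-05"),
    ("img3", "takenAt", "Seoul"),
    ("img3", "takenIn", "2015-12"),
]

# "Images taken in Seoul in May 2016": two patterns sharing the variable ?img.
results = query(triples, [("?img", "takenAt", "Seoul"),
                          ("?img", "takenIn", "2016-05")])
print(results)
```

A real system would hand the equivalent query to an RDF store; the join on the shared variable is the part that keyword tagging cannot express.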

A Study on Information Retrieval Effectiveness by Cited References

  • 이란주
    • 한국문헌정보학회지 / Vol. 27 / pp. 265-289 / 1994
  • Databases publicly available for online searching permit both citation and subject searching; however, subject searching has dominated the online search environment. Despite the power of citation searching, it may be underutilized. This study explored the relationship between the number of cited references used in a citation search and information retrieval effectiveness, a relatively unstudied phenomenon. Three articles in the library and information science literature were chosen to represent sample questions. Cited reference searches were conducted for each article and each of its references. All searches were conducted in Social Scisearch and Scisearch on DIALOG. Relevance judgments on the retrieved citations were obtained from the authors of the original articles. This research focused on analyzing, in terms of information retrieval effectiveness, the overlap among postings sets retrieved by various combinations of cited references. The findings from the three case studies clearly showed that the more cited references used for the citation search, the better the performance, in terms of retrieving more relevant documents, up to a point of diminishing returns. In addition, the overall level of overlap among relevant document sets was generally found to be low. Therefore, if only some of the cited references among many candidates are used for a citation search, a significant proportion of relevant documents may be missed. The analysis of the characteristics of cited references provided ways to predict which cited references would be useful for improving information retrieval. The findings of this comprehensive exploratory study are of interest for both theoretical and practical reasons. They contribute to the development of a theoretical model for the effective use of the citation search. This model might also be implemented in operational online systems. In addition, the findings potentially will help online searchers improve their search strategies using the citation search so that they can better achieve their information retrieval goals: the retrieval of items relevant to a given question and the suppression of nonrelevant items.
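The kind of analysis described above, recall as a function of how many cited references are searched and the overlap between the relevant documents each reference retrieves, can be sketched as follows. The postings sets and relevance judgments are made-up illustrative data, and the function names are hypothetical; the study's own measures may differ.

```python
from itertools import combinations

def recall_by_reference_count(postings, relevant):
    """For k = 1..n cited references, average recall over all k-reference combinations."""
    refs = sorted(postings)
    averages = {}
    for k in range(1, len(refs) + 1):
        recalls = []
        for combo in combinations(refs, k):
            # A citation search on several references retrieves the union of their postings.
            retrieved = set().union(*(postings[r] for r in combo))
            recalls.append(len(retrieved & relevant) / len(relevant))
        averages[k] = sum(recalls) / len(recalls)
    return averages

def relevant_overlap(postings, relevant, ref_a, ref_b):
    """Jaccard overlap between the relevant documents retrieved by two cited references."""
    a = postings[ref_a] & relevant
    b = postings[ref_b] & relevant
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical data: cited reference -> citing documents retrieved by searching on it.
postings = {
    "ref1": {"d1", "d2", "d3"},
    "ref2": {"d3", "d4"},
    "ref3": {"d5", "d6", "d9"},
}
relevant = {"d1", "d3", "d4", "d5"}  # hypothetical relevance judgments

averages = recall_by_reference_count(postings, relevant)
print(averages)
print(relevant_overlap(postings, relevant, "ref1", "ref2"))
```

With low overlap between references, as in this toy data, each added reference contributes relevant documents the others miss, which is the pattern the abstract reports.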

A Study on eDocument Management Using Professional Terminologies

  • 김명옥
    • 한국전자거래학회지 / Vol. 7, No. 2 / pp. 21-38 / 2002
  • Document retrieval (DR) has long been a serious issue in the field of Office Information Management. Our daily work now depends heavily on information collected from the Internet, and DR methods on the Web have become an important issue that many researchers study more than any other topic. The main purpose of this study is to develop a model for managing business documents by integrating three major methodologies used in the fields of electronic libraries and information retrieval: Metadata, Thesaurus, and Index/Reversed Index. In addition, we add a new concept, the eDocument, which consists of metadata about unit documents and/or the unit documents themselves. The eDocument is introduced as a way to utilize existing document sources. The core concepts and structures of the model are introduced, and the architecture of the eDocument management system is proposed. Test (simulation) results of the model and directions for future studies are also presented.
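As a rough illustration of how the three ingredients named above can work together, the sketch below indexes document metadata in an inverted index and expands query terms through a thesaurus. It is a toy under stated assumptions, not the paper's eDocument model: the metadata fields, the sample documents, the thesaurus entries, and the function names are all hypothetical.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each lowercase term to the set of document ids whose metadata contains it."""
    index = defaultdict(set)
    for doc_id, metadata in documents.items():
        for value in metadata.values():
            for term in value.lower().split():
                index[term].add(doc_id)
    return index

def search(index, thesaurus, query):
    """AND over query terms; each term is OR-expanded with its thesaurus synonyms."""
    result = None
    for term in query.lower().split():
        expanded = {term} | set(thesaurus.get(term, ()))
        hits = set().union(*(index.get(t, set()) for t in expanded))
        result = hits if result is None else result & hits
    return result or set()

# Hypothetical business documents described only by metadata.
documents = {
    "doc1": {"title": "Purchase contract template", "author": "Kim"},
    "doc2": {"title": "Supplier agreement 2002", "author": "Lee"},
    "doc3": {"title": "Travel expense report", "author": "Kim"},
}
# Hypothetical thesaurus of professional terms.
thesaurus = {"contract": ["agreement"], "agreement": ["contract"]}

index = build_inverted_index(documents)
print(search(index, thesaurus, "contract"))      # matches via the synonym as well
print(search(index, thesaurus, "contract kim"))
```

The thesaurus lets a query for "contract" reach a document titled "agreement", which plain term matching would miss.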

Learning Similarity with Probabilistic Latent Semantic Analysis for Image Retrieval

  • Li, Xiong;Lv, Qi;Huang, Wenting
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 4 / pp. 1424-1440 / 2015
  • Searching for the intended images among a large number of candidates is a challenging problem. Content-based image retrieval (CBIR) is the most promising way to tackle this problem, where the most important topic is measuring the similarity of images so as to cover variation in shape, color, pose, illumination, etc. While previous works made significant progress, their ability to adapt to a dataset has not been fully explored. In this paper, we propose a similarity learning method based on a probabilistic generative model, namely probabilistic latent semantic analysis (PLSA). It first derives a Fisher kernel, a function over the parameters and variables, based on PLSA. Then, the parameters are determined by simultaneously maximizing the log-likelihood function of PLSA and the retrieval performance over the training dataset. The main advantages of this work are twofold: (1) deriving a similarity measure based on PLSA, which fully exploits the data distribution and Bayes inference; (2) learning model parameters by maximizing the fit of the model to the data and the retrieval performance simultaneously. The proposed method (PLSA-FK) is empirically evaluated on three datasets, and the results exhibit promising performance.

A Study on Information Retrieval Systems Integration Using Common Object Request Broker Architecture

  • 최한석;김상미;남태우;손덕주
    • 정보관리학회지 / Vol. 13, No. 2 / pp. 223-242 / 1996
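  • This paper proposes a CORBA-based integration model for information retrieval systems (DDIR/ORB) that gives users consistent queries and search results through a single user interface, regardless of the heterogeneity of systems and DBMSs or the use of different retrieval systems. Between the client that issues a query and the application server that executes the search, the proposed DDIR/ORB guarantees access transparency to the middleware base and CD-ROM text databases, provides free exchange and conversion of retrieval results, and guarantees the reuse of existing information retrieval systems. Using OMG IDL in the design and implementation of the DDIR/ORB system reduced interface complexity and minimized the implementation cost of the components.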

The Extraction of Effective Index Database from Voice Database and Information Retrieval

  • 박미성
    • 한국도서관정보학회지 / Vol. 35, No. 3 / pp. 271-291 / 2004
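  • Information providers such as digital libraries face demand for services over unstructured multimedia data such as images, speech, and video. This study therefore proposes a word-phrase (eojeol) generator, a syllable restorer, a morphological analyzer, and a corrector for speech processing. Using the proposed speech processing techniques, the voice database was converted into a text database, and an index database was then extracted from the text database. An information retrieval model is proposed to show that the extracted index database can be used for content-based retrieval of text and speech.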

Content-based Image Retrieval using an Improved Chain Code and Hidden Markov Model

  • 조완현;이승희;박순영;박종현
    • 대한전자공학회: Conference Proceedings / Proceedings of the 13th Joint Conference on Signal Processing, 대한전자공학회, 2000 / pp. 375-378 / 2000
  • In this paper, we propose a novel content-based image retrieval system using both a Hidden Markov Model (HMM) and an improved chain code. A Gaussian Mixture Model (GMM) is applied to statistically model the color information of the image, and the Deterministic Annealing EM (DAEM) algorithm is employed to estimate the parameters of the GMM. This result is used to segment the given image. We use an improved chain code, which is invariant to rotation, translation, and scale, to extract the shape feature vectors for each image in the database. These are stored in the database together with each HMM, whose parameters (A, B, $\pi$) are estimated by the Baum-Welch algorithm. For the feature vector obtained in the same way from the query image, the occurrence probability under each image's HMM is computed using the forward algorithm. We use these probabilities for image retrieval and present the images with the highest similarity.
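The scoring step described above, evaluating a query sequence under each stored HMM with the forward algorithm and ranking by probability, can be sketched as follows. This is a minimal discrete-observation version under stated assumptions: the query's chain-code features are assumed to be already quantized into integer symbols, the model parameters are made-up numbers rather than Baum-Welch estimates, and `rank_images` is a hypothetical helper. No scaling is applied, so long sequences would underflow in practice.

```python
def forward_probability(observations, A, B, pi):
    """P(observations | HMM) computed by the forward algorithm.

    A[i][j]: transition probability from state i to j.
    B[i][k]: probability that state i emits symbol k.
    pi[i]:   initial probability of state i.
    """
    n = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for symbol in observations[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
                 for j in range(n)]
    # Termination: sum over final states.
    return sum(alpha)

def rank_images(query_observations, models):
    """Sort image ids by the likelihood each image's HMM assigns to the query sequence."""
    scores = {image_id: forward_probability(query_observations, *params)
              for image_id, params in models.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative two-state, two-symbol model (made-up parameters).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
pi = [0.6, 0.4]
print(forward_probability([0, 1], A, B, pi))

# Hypothetical database: image id -> (A, B, pi) of its HMM.
models = {
    "a": (A, [[0.9, 0.1], [0.8, 0.2]], pi),  # mostly emits symbol 0
    "b": (A, [[0.1, 0.9], [0.2, 0.8]], pi),  # mostly emits symbol 1
}
print(rank_images([0, 0, 0], models))
```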

Design of Data Processing Model for Efficient Retrieval and Management of Science and Technology Information

  • 이석형;강남규;김한기;윤희준;한성근;윤화묵
    • 한국콘텐츠학회: Conference Proceedings / Proceedings of the 2005 Fall Conference, 한국콘텐츠학회 / pp. 442-445 / 2005
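  • Most approaches to building and serving a database couple a DBMS with an information retrieval system: a commercial DBMS manages the information, and an information retrieval system handles user searches. However, this approach has the inconvenience of operating both the DBMS and the information retrieval system, and the steps for managing and processing the data are duplicated. This paper therefore presents a model for DB construction, information retrieval, and management based on the KRISTAL-2002 information retrieval and management system, which is needed to manage science and technology information data and to serve it to users together with the full text.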
