• Title/Summary/Keyword: Information Retrieval Engine


Improving Performance of Web Search Engine using Query Word Senses and User Feedback (질의어 의미정보와 사용자 피드백을 이용한 웹 검색엔진의 성능향상)

  • Yoon, Sung-Hee
    • Journal of the Korea Academia-Industrial cooperation Society / v.8 no.2 / pp.280-285 / 2007
  • This paper proposes a technique that uses word senses and user feedback to improve web information retrieval performance over retrieval based on ambiguous user queries and index terms. Word sense disambiguation is an important step because it improves performance by eliminating irrelevant pages from the results. We build a word sense knowledge base from the semantic categories of the nouns used as index terms and categorize web pages accordingly. User feedback, which determines the intended query sense, together with users' information-seeking behavior on web pages, can further improve the performance of the retrieval system.


Design and Algorithm Implementation of a Distributed Information Retrieval System using Sequential Transferring Method(STM) (순차적 전달방식(STM)을 이용한 분산정보검색시스템의 설계 및 알고리즘 구현)

  • Yoon, Hee-Byung;Kim, Yong-Han;Kim, Hwa-Soo
    • The KIPS Transactions:PartB / v.11B no.5 / pp.603-610 / 2004
  • A distributed information retrieval system that is centrally controlled by a mediator or meta search engine suffers from heavy traffic congestion and from increased cost, because central control requires complicated algorithms and additional hardware. To solve this problem, retrieval systems are needed that function independently and can cooperate without depending on one another. In this paper, we review related work on distributed information retrieval, then design a framework and implement an algorithm for a distributed information retrieval system that uses the sequential transferring method (STM) and comprises multiple information retrieval systems free of central control. First, we present a web partition policy that logically divides and manages the web, and we illustrate how a query is processed sequentially as it passes through the numbered information retrieval systems. We then present a three-layer framework structure and the functions and modules of each layer. Finally, for an effective implementation of the STM algorithm, we analyze the module structure, describe it in pseudocode, and demonstrate the sequential query transfer process between servers to show that the proposed STM algorithm works smoothly.
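
The idea of passing a query from one numbered retrieval system to the next, with no central mediator, can be illustrated with a minimal sketch. This is not the paper's STM algorithm, whose pseudocode is not reproduced in the abstract. The class `RetrievalServer`, the helper `link_in_sequence`, the exact-term matching rule, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RetrievalServer:
    """One independent retrieval system that owns a logical partition of the web (hypothetical)."""
    number: int
    documents: Dict[str, str]                     # url -> text, this partition only
    next_server: Optional["RetrievalServer"] = None

    def search_local(self, query: str) -> List[str]:
        # Assumed matching rule: every query term must appear as a word in the text.
        terms = query.lower().split()
        return sorted(
            url for url, text in self.documents.items()
            if all(term in text.lower().split() for term in terms)
        )

    def handle(self, query: str, collected: Optional[List[str]] = None) -> List[str]:
        # Add local hits, then hand the query and partial results to the next server.
        results = list(collected or [])
        results.extend(self.search_local(query))
        if self.next_server is None:
            return results                        # last server in the sequence answers
        return self.next_server.handle(query, results)


def link_in_sequence(servers: List[RetrievalServer]) -> RetrievalServer:
    """Connect servers in numbered order and return the entry point."""
    for current, following in zip(servers, servers[1:]):
        current.next_server = following
    return servers[0]


servers = [
    RetrievalServer(1, {"a.example/1": "distributed information retrieval",
                        "a.example/2": "image segmentation"}),
    RetrievalServer(2, {"b.example/1": "information retrieval engine"}),
    RetrievalServer(3, {"c.example/1": "travel information"}),
]
entry = link_in_sequence(servers)
print(entry.handle("information retrieval"))
```

Each server only knows its successor, so no component has to hold a global view of the partitions. The cost is that latency grows with the number of servers a query passes through.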

The Study on Information Retrieval Methods for Elementary School Students (초등학생 정보검색 방법에 대한 연구)

  • Jang, SeJi;Chun, Seok-Ju
    • Proceedings of the Korean Society of Computer Information Conference / 2014.01a / pp.227-230 / 2014
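  • With the advent of the Internet, people today live amid an abundance of knowledge and information. This abundance means that anyone can explore, search, analyze, and generate the information they want anytime and anywhere. In the knowledge-information society, the skill most commonly used in elementary school classrooms is information retrieval. Information retrieval looks easy, but finding correct and accurate information in the flood of available information is difficult for elementary school students. This study therefore discusses web-based information retrieval methods that elementary school students can use efficiently according to their learning topic.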


Implement on Search Machine using Open Source Framework (오픈 소스 프레임워크를 활용한 검색엔진 구현)

  • Song, Hyun-Ok;Kim, A-Yong;Jung, Hoe-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.3 / pp.552-557 / 2015
  • With the development of IT and the increased use of smart devices, a large amount of data is produced and consumed on the Internet. As a result, the importance of information retrieval technology is growing, but the technology is difficult to approach because it requires a great deal of background knowledge. With the emergence of Lucene, however, a search engine can be implemented even without extensive background knowledge of search technology. This paper proposes implementing a search engine using frameworks built on Lucene. The proposed search engine uses Hadoop, Nutch, Solr, and ZooKeeper to provide a server environment that supports distributed processing, distributed storage, and high availability.

Design & Evaluation of an Intelligent Model for Extracting the Web User' Preference (웹 사용자의 선호도 추출을 위한 지능모델 설계 및 평가)

  • Kim, Kwang-Nam;Yoon, Hee-Byung;Kim, Hwa-Soo
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.4 / pp.443-450 / 2005
  • In this paper, we propose an intelligent model for extracting web users' preferences and present the results of its evaluation. To this end, we analyze the shortcomings of current information retrieval engines and reflect preference weights in the learner. Because it does not depend on the frequency of each word but instead intelligently learns patterns of user behavior, the mechanism provides an appropriate set of results for the user's queries. We then propose the concept of a preference trend and its considerations, and present an algorithm for extracting preferences with examples. We also design an intelligent model for extracting behavior patterns and propose an HTML index and an intelligent learning process for deciding preferences. Finally, we validate the proposed model by comparing the document ranking results estimated after the preference is applied.

Intelligent Retrieval System for finding important travel information (중요 여행 정보를 찾기 위한 지능 검색 시스템)

  • Yun, Un-Il;Shin, Hyeon-Il;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.14 no.11 / pp.113-121 / 2009
  • Interest in leisure activities has recently grown with the spread of the five-day work week. In addition, as Internet and mobile infrastructure has become widespread, users can obtain specific information through a search engine. However, as the amount of shared information increases rapidly, it is difficult for users to find the accurate information they really want. For example, users can retrieve the travel information they need, but they must also wade through a huge number of travel advertisements. In this paper, we design and implement a retrieval system that uses a travel information collecting agent. The agent regularly visits the travel-related category pages of portal sites and the travel-article pages of major media outlets to collect travel information, and stores the gathered information in a database. Users can then search the travel information conveniently without having to view advertisements.

Retrieval algorithm for Web Document using XML DOM (XML DOM을 이용한 웹문서 검색 알고리즘)

  • 김노환;정충교
    • Journal of the Korea Computer Industry Society / v.2 no.6 / pp.775-782 / 2001
  • Until recently, Web retrieval engines have presented documents to users according to the number and frequency of query keywords in each document, on the assumption that the more keywords a document contains, the more relevant it is. This search method poses no problem for ordinary documents, such as HTML web data, that carry no structural information. However, web data expressed in XML contains structural information and can also be modeled in graphical form. For XML, therefore, this method causes considerable trouble because it depends only on keyword frequency. We consider that this problem can be resolved with an SQL-like form of query. Such a query lets us retrieve exactly the data we want quickly and clearly by taking full advantage of the structure of XML, overcoming the shortcomings of frequency-based engines. In this paper, we aim to design a model of an information retrieval system for XML data using the XML DOM and consider the related algorithm.
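
The contrast between frequency-based matching and a structure-aware query over the DOM can be sketched as follows. This is only an illustration using Python's standard `xml.dom.minidom`, not the model proposed in the paper. The sample document and the helper names `keyword_frequency` and `select` are hypothetical, and `select` mimics an SQL-like "SELECT element WHERE child CONTAINS value" only loosely.

```python
from xml.dom.minidom import parseString

XML = """<library>
  <book><title>XML Retrieval</title><author>Kim</author></book>
  <book><title>Web Search</title><author>XML Lee</author></book>
</library>"""


def all_text(node):
    """Concatenate every text node under `node`, ignoring structure."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return " ".join(all_text(child) for child in node.childNodes)


def text_of(element):
    """Text directly inside one element."""
    return "".join(c.data for c in element.childNodes
                   if c.nodeType == c.TEXT_NODE).strip()


def keyword_frequency(doc, keyword):
    # Frequency-based view: counts the keyword anywhere in the document.
    return all_text(doc).lower().split().count(keyword.lower())


def select(doc, element, where_child, contains):
    # Structure-aware view: keep `element` nodes whose `where_child` contains the value.
    hits = []
    for el in doc.getElementsByTagName(element):
        for child in el.getElementsByTagName(where_child):
            if contains.lower() in text_of(child).lower():
                hits.append(el)
                break
    return hits


doc = parseString(XML)
print(keyword_frequency(doc, "XML"))
books = select(doc, "book", "title", "XML")
print([text_of(b.getElementsByTagName("title")[0]) for b in books])
```

The keyword count treats the author name "XML Lee" the same as a title match, while the structural query restricts the match to `title` elements and returns only the first book.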


Region-based Content Retrieval Algorithm Using Image Segmentation (영상 분할을 이용한 영역기반 내용 검색 알고리즘)

  • Rhee, Kang-Hyeon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.5 / pp.1-11 / 2007
  • As the availability of image information has increased significantly, so has the need for systems that can manage it. Accordingly, we propose a region-based content retrieval (CBIR) algorithm based on an efficient combination of image segmentation, image texture, a color feature, and the image's shape and position information. As the color feature, we choose an HSI color histogram, which is known to measure the spatial properties of colors well. We use active contours and the complex wavelet transform (CWT) to perform image segmentation and extract image texture. Shape and position information is obtained using Hu invariant moments computed on the luminance component of the HSI model. For efficient similarity computation, the extracted features (color histogram, Hu invariant moments, and complex wavelet transform) are combined, and precision and recall are then measured. In experiments using a database provided by www.freefoto.com, the proposed image retrieval engine achieves 94.8% precision and 82.7% recall and can be applied successfully to image retrieval systems.
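
How several feature distances might be combined into one ranking score, and how precision and recall are computed for the retrieved set, can be sketched as below. The weighted sum of Euclidean distances is an assumption for illustration, since the abstract does not state how the features are combined. The feature vectors, weights, and image names are invented and do not reproduce the paper's data or results.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_distance(query, candidate, weights):
    """Weighted sum of per-feature distances (assumed combination rule)."""
    return sum(w * euclidean(query[name], candidate[name])
               for name, w in weights.items())


def precision_recall(retrieved, relevant):
    """precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical feature vectors: color histogram, shape moments, texture.
weights = {"color": 0.5, "shape": 0.3, "texture": 0.2}
query = {"color": [0.6, 0.4], "shape": [1.0], "texture": [0.2]}
database = {
    "img_a": {"color": [0.6, 0.4], "shape": [1.0], "texture": [0.2]},
    "img_b": {"color": [0.5, 0.5], "shape": [1.1], "texture": [0.3]},
    "img_c": {"color": [0.0, 1.0], "shape": [3.0], "texture": [0.9]},
}

# Rank by ascending combined distance and keep the two closest images.
ranked = sorted(database, key=lambda k: combined_distance(query, database[k], weights))
top_two = ranked[:2]
print(top_two, precision_recall(top_two, relevant={"img_a", "img_c"}))
```

Precision falls when irrelevant images enter the retrieved set, and recall falls when relevant images are left out, which is why both figures are reported together.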

Korean-Chinese Person Name Translation for Cross Language Information Retrieval

  • Wang, Yu-Chun;Lee, Yi-Hsun;Lin, Chu-Cheng;Tsai, Richard Tzong-Han;Hsu, Wen-Lian
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.489-497 / 2007
  • Named entity translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating person names, the most common type of named entity in Korean-Chinese cross-language information retrieval (KCIR). Unlike other languages, Chinese uses characters (ideographs), which makes person name translation difficult because one syllable may map to several Chinese characters. We propose an effective hybrid person name translation method to improve the performance of KCIR. First, we use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. Second, we adopt the Naver people search engine to find the query name's Chinese or English translation. Third, we extract Korean-English transliteration pairs from Google snippets, and then search for the English-Chinese transliteration in the database of Taiwan's Central News Agency or in Google. The performance of KCIR using our method is over five times better than that of a dictionary-based system. The mean average precision is 0.3490 and the average recall is 0.7534. The method can handle the translation of Chinese, Japanese, and Korean person names, as well as non-CJK names, from Korean to Chinese. Hence, it substantially improves the performance of KCIR.
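
Mean average precision, the metric reported in this abstract, can be computed as in the following sketch. It uses the standard definition: average precision is the mean of precision at each rank where a relevant document appears, divided by the number of relevant documents. The function names and the two toy queries are illustrative and unrelated to the paper's data.

```python
def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k that hold a relevant document."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k          # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs):
    """`runs` is a list of (ranked_list, relevant_set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),   # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                     # AP = 1/2
]
print(round(mean_average_precision(runs), 4))
```

Because relevant documents that are never retrieved still count in the denominator, the metric penalizes both poor ranking and missing results.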


XML Document Retrieval Models for Heterogeneous Data Set using Independent Regular paths (독립적인 질의 경로들을 사용하여 이질적인 문서들을 검색하는 XML 문서 검색 모델)

  • 유신재;민경섭;김형주
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.140-152 / 2003
  • An XML document may have an irregular structure, which end users find difficult to comprehend exactly. For such documents, end users have difficulty formulating structured queries, so they write queries with no structure or with only a little structural information. In this context, we propose new retrieval models that use structural information for ranking and compensate for the difference between the structure of the user query and the structure of the document. To ease querying, we assume independence among the query paths that represent structural constraints. Because this assumption reduces the expressive power of the query language, we also propose a model that overcomes this problem. As there were no test collections for XML documents, we built a small test collection from TIPSTER of the RTEC and experimented on it without structured queries. This experiment showed that our models improve average precision by about 67% over the conventional vector space model.