• Title/Summary/Keyword: web test retrieval


A Study on the use of WWW search engines of librarians for the internet information retrieval (사서들의 효율적인 인터넷 정보검색을 위한 WWW 탐색엔진 이용에 관한 연구)

  • 김성희
    • The Journal of Information Technology and Database / v.6 no.1 / pp.27-46 / 1999
  • This study examines how librarians use internet search engines and measures the relationship between internet use frequency and search engine use behavior. The results show that librarians use Web search engines for academic information retrieval and are satisfied with the search results. The major problem librarians reported was that search engines retrieve many non-relevant documents. Hypothesis testing found no significant relationship between internet use frequency and search engine preference. On the other hand, the hypotheses that internet use frequency affects satisfaction with search results, recognition of the importance of search engines, and the perceived need to retrain librarians in internet information retrieval were found to be significant.


An Implementation of Best Match Algorithm for Korean Text Retrieval in the Client/Server Environment (클라이언트 서버 환경에서 한글텍스트 검색을 위한 베스티매치 알고리즘의 구현)

    • Journal of Korean Library and Information Science Society / v.32 no.1 / pp.249-260 / 2001
  • This paper presents the application of a best match search algorithm in a client/server system for natural language access to a Web-based database. For this purpose, procedures for processing Korean word variants and for applying a probabilistic weighting scheme were implemented in the client/server system. Experimental runs were carried out on a Korean test set comprising documents, queries, and relevance judgements. The results demonstrate that best match retrieval with relevance information outperforms retrieval without it.
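
The abstract does not say which probabilistic weighting scheme was implemented. As an illustration only, the sketch below uses the well-known Robertson/Sparck Jones relevance weight, which is one common choice for best match retrieval; the function names `rsj_weight` and `best_match_score` and the sample counts are hypothetical and not taken from the paper.

```python
import math

def rsj_weight(N: int, n: int, R: int = 0, r: int = 0) -> float:
    """Robertson/Sparck Jones term weight (assumed scheme, not confirmed by the paper).

    N: documents in the collection      n: documents containing the term
    R: known relevant documents         r: relevant documents containing the term
    With R = r = 0 (no relevance information) this reduces to an IDF-like weight.
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

def best_match_score(query_terms, doc_terms, weights):
    """Score a document as the sum of the weights of the query terms it contains."""
    return sum(weights[t] for t in query_terms if t in doc_terms)

# Hypothetical counts: 1000 documents, the term occurs in 50 of them.
without_relevance = rsj_weight(N=1000, n=50)
# Same term, but 8 of 10 judged-relevant documents contain it.
with_relevance = rsj_weight(N=1000, n=50, R=10, r=8)
```

In this sketch, a term that is concentrated in the judged-relevant documents receives a higher weight than it would without relevance information, which is the kind of effect the experiment compares.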


Research on Function and Policy for e-Government System using Semantic Technology (전자정부내 의미기반 기술 도입에 따른 기능 및 정책 연구)

  • Go, Gwang-Seop;Jang, Yeong-Cheol;Lee, Chang-Hun
    • 한국디지털정책학회:학술대회논문집 / 2007.06a / pp.79-87 / 2007
  • This paper offers a solution based on semantic document classification to improve e-Government utilization and efficiency for people using their own information retrieval systems and linguistic expressions. In general, semantic document classification classifies documents based on the diverse relationships between keywords in a document without fully describing the hierarchical concepts between keywords. Our approach considers the deeper meaning within the context of the document and markedly enhances information retrieval performance. We propose, test, and evaluate the Concept Weight Document Classification (CoWDC) method, which goes beyond existing keyword and simple thesaurus/ontology methods by fully considering the concept hierarchy of various concepts. To verify the superiority of the semantic retrieval technology through the CoWDC test results and to integrate it efficiently into e-Government, we recognize that creation of a thesaurus, management of the operating system, expansion of the knowledge base, and improvements in search service and accuracy at the national level are needed.


A Study on eDocument Management Using Professional Terminologies (전문용어기반 eDocument 관리 방안에 관한 연구)

  • 김명옥
    • The Journal of Society for e-Business Studies / v.7 no.2 / pp.21-38 / 2002
  • Document retrieval (DR) has long been a serious issue in the field of Office Information Management. Daily work now depends heavily on information collected from the internet, and DR methods for the Web have become an important issue, studied by many researchers more than any other topic. The main purpose of this study is to develop a model for managing business documents by integrating three major methodologies used in the fields of electronic libraries and information retrieval: Metadata, Thesaurus, and Index/Reversed Index. In addition, we add a new concept, the eDocument, which consists of metadata about unit documents and/or the unit documents themselves. The eDocument is introduced as a way to make use of existing document sources. The core concepts and structures of the model are introduced, and the architecture of the eDocument management system is proposed. Test (simulation) results for the model and directions for future studies are also discussed.


Design and Implementation of Educational Contents Sharing and Retrieval System using Mobile Agent (이동 에이전트를 이용한 교육용 컨텐츠 공유 및 검색 시스템의 설계 및 구현)

  • Lee, Chul-Hwan;Han, Sun-Gwan
    • The Journal of Korean Association of Computer Education / v.5 no.4 / pp.71-78 / 2002
  • The mobile agent is attracting attention as a new technique for retrieving and sharing distributed contents in web-based educational systems. Rather than transmitting large amounts of content to the client, retrieval with a mobile agent dispatches the agent to perform the search directly on the server, which makes the search more efficient. This study proposes a model of a retrieval system that shares and searches distributed educational contents on bulletin boards and newsgroups using a mobile agent. To evaluate the efficiency of the proposed system, we ran a comparison test between the search model of the existing system and that of the proposed system. The test results confirmed that the proposed system reduces network traffic. Moreover, we showed that the optimum search time of the mobile agent-based system is shortened.


Modeling User Preference based on Bayesian Networks for Office Event Retrieval (사무실 이벤트 검색을 위한 베이지안 네트워크 기반 사용자 선호도 모델링)

  • Lim, Soo-Jung;Park, Han-Saem;Cho, Sung-Bae
    • Journal of KIISE:Computing Practices and Letters / v.14 no.6 / pp.614-618 / 2008
  • As multimedia data grow rapidly with the development of the Internet, an efficient retrieval technique that focuses on individual users and is based on analysis of such data is required. However, user modeling services provided by recent web sites are limited to text-based page configuration and recommendation retrieval. In this paper, we construct a user preference model with a Bayesian network to apply user modeling to video retrieval, and suggest a method that uses probabilistic reasoning. To do this, context information is defined in a real office environment, and video scripts acquired from installed cameras and manually annotated with the context information are used. Personal information about the user, obtained from user input, is used as evidence for the constructed Bayesian network, and the user's preference is inferred. The probability produced by Bayesian network reasoning is used for retrieval, so that the system returns results suited to each user's preference. The usability test indicates that satisfaction with results selected by the proposed model is higher than with a general retrieval method.

A Study on the Improvement of Retrieval Efficiency Based on the CRFMD (공통기술표현포맷에 기반한 다매체자료의 검색효율 향상에 관한 연구)

  • Park, Il-Jong;Jeong, Ki-Tai
    • Journal of the Korean Society for Information Management / v.23 no.3 s.61 / pp.5-21 / 2006
  • In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems and have progressed quickly with the rapid progress in data processing speeds. This study proposes a common representation format for multimedia documents (CRFMD) composed of both images and text to form a single data structure. It also shows that image classification of a given test set is dramatically improved when text features are encoded together with image features. CRFMD might be applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval.

Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

  • Lee, Hong-Joo;Klein, Mark
    • Asia Pacific Journal of Information Systems / v.18 no.1 / pp.79-96 / 2008
  • One of the roles of Semantic Web services is to execute dynamic intra-organizational services, including the integration and interoperation of business processes. Since different organizations design their processes differently, retrieving similar semantic business processes is necessary to support inter-organizational collaboration. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching to expand the results from an exact matching engine when querying the OWL (Web Ontology Language) MIT Process Handbook. The MIT Process Handbook is an electronic repository of best-practice business processes. The Handbook is intended to help people (1) redesign organizational processes, (2) invent new processes, and (3) share ideas about organizational practices. To use the MIT Process Handbook for process retrieval experiments, we had to export it into an OWL-based format. We model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of the meta-model. Next, we need a sizable number of queries and their corresponding correct answers in the Process Handbook. Many previous studies devised artificial datasets composed of randomly generated numbers without real meaning and used subjective ratings for correct answers and similarity values between processes. To generate a semantics-preserving test data set, we use mutation operators to create 20 variants of each target process that are syntactically different but semantically equivalent. These variants represent the correct answers for the target process. We devise diverse similarity algorithms based on the values of process attributes and the structures of business processes.
We use simple similarity algorithms for text retrieval, such as TF-IDF and Levenshtein edit distance, to devise our approaches, and use a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of process structure, such as part process, goal, and exception. Since we can identify relationships between a semantic process and its subcomponents, this information can be used to calculate similarities between processes. Dice's coefficient and Jaccard similarity measures are used to calculate the proportion of overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms. We measure retrieval performance in terms of precision, recall, and F measure, the harmonic mean of precision and recall. The tree edit distance shows the poorest performance on all measures. TF-IDF and the method combining TF-IDF with Levenshtein edit distance perform better than the other devised methods. These two measures focus on the similarity between the names and descriptions of processes. In addition, we calculate a rank correlation coefficient, Kendall's tau-b, between the number of process mutations and the ranking of similarity values among the mutation sets. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and their derivatives, show greater coefficients than measures based on the values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably better performance in both experiments. For retrieving semantic processes, it therefore seems better to consider diverse aspects of process similarity, such as process structure and the values of process attributes.
We generate semantic process data and a dataset for retrieval experiments from the MIT Process Handbook repository. We suggest imprecise query algorithms that expand the retrieval results of an exact matching engine such as SPARQL, and compare the retrieval performance of the similarity algorithms. As for limitations and future work, we need to perform experiments with datasets from other domains. Also, since diverse measures yield many similarity values, we may find better ways to identify relevant processes by applying these values simultaneously.
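
Several of the measures named in this abstract have standard textbook definitions. The sketch below shows Jaccard similarity, Dice's coefficient, Levenshtein edit distance, and the F measure in their generic forms; it is not the paper's implementation, and the sample sub-process names are hypothetical stand-ins for Process Handbook entries.

```python
def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; taken as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dice(a: set, b: set) -> float:
    """2|A ∩ B| / (|A| + |B|); taken as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def levenshtein(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical sub-process sets for a target process and one mutated variant.
target = {"identify need", "select supplier", "place order"}
variant = {"identify need", "select supplier", "pay supplier"}
structure_jaccard = jaccard(target, variant)   # 2 shared / 4 distinct
structure_dice = dice(target, variant)         # 2 * 2 / (3 + 3)
name_distance = levenshtein("kitten", "sitting")
```

Set-based measures like these compare which parts two processes share, while edit distance compares their names or descriptions character by character, which matches the abstract's distinction between structure-based and attribute-based similarity.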

Evaluation of the documents from the Web-based Question and Answer Service (지식 검색 서비스 개선을 위한 문서의 적합도 및 신뢰도 분석)

  • Park So-Yeon;Lee Joon-Ho;Jeon Ji-Woon
    • Journal of the Korean Society for Library and Information Science / v.40 no.2 / pp.299-314 / 2006
  • This study suggests evaluation criteria for the web-based question-answer databases provided by major Korean search portals. In particular, it suggests evaluation criteria for the relevance of question titles, entire questions, and answers. Evaluation criteria for the quality of answers are also developed. Based on these criteria, documents from Naver Knowledge-in are evaluated. The results of this study can be applied to the development of test collections for question-answer databases. The implications for system designers and web content providers are discussed.

Spatial Index based on Main Memory for Web GIS (Web GIS를 위한 주기억 장치 기반 공간 색인)

  • 김진덕;진교홍
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.10a / pp.191-194 / 2001
  • The availability of inexpensive, large main memories, coupled with the demand for faster response times, is bringing a new perspective to database technology. Web GIS, which is used by an unspecified number of the general public on the internet, needs fast response times and frequent data retrieval for spatial analysis rather than data updates. Therefore, it is appropriate to use main memory as the underlying storage structure for Web GIS data. In this paper, we propose a data representation method based on relative coordinates and the size of the MBR. The method can compress the spatial data widely used in Web GIS into a smaller volume of memory. We also propose a memory-resident spatial index with a simple mechanism for processing point and region queries. The performance test shows that the index is suitable for managing skewed data in terms of the size of the index and the number of MBR intersection check operations.
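
The abstract gives no details of the encoding, so the following is only a rough sketch of how an MBR could be stored as relative coordinates plus its size. It assumes each MBR lies inside a square cell with a known origin and quantizes the offset and extent to 16-bit integers; `encode_mbr`, `decode_mbr`, `contains_point`, the cell layout, and the sample values are all hypothetical and not from the paper.

```python
import math
import struct

BITS = 16
MAX_Q = 2**BITS - 1  # largest value an unsigned 16-bit field can hold

def encode_mbr(mbr, origin, cell_size):
    """Pack (xmin, ymin, xmax, ymax) as 4 x uint16: relative x, relative y, width, height.

    Lower bounds are rounded down and upper bounds rounded up, so the decoded
    rectangle always covers the original one (no false negatives on queries).
    """
    xmin, ymin, xmax, ymax = mbr
    ox, oy = origin
    scale = MAX_Q / cell_size
    qx = math.floor((xmin - ox) * scale)
    qy = math.floor((ymin - oy) * scale)
    qx_max = min(MAX_Q, math.ceil((xmax - ox) * scale))
    qy_max = min(MAX_Q, math.ceil((ymax - oy) * scale))
    return struct.pack("<4H", qx, qy, qx_max - qx, qy_max - qy)

def decode_mbr(data, origin, cell_size):
    """Recover an absolute-coordinate rectangle that covers the original MBR."""
    qx, qy, w, h = struct.unpack("<4H", data)
    ox, oy = origin
    scale = MAX_Q / cell_size
    return (ox + qx / scale, oy + qy / scale,
            ox + (qx + w) / scale, oy + (qy + h) / scale)

def contains_point(mbr, x, y):
    """Point query check against a decoded MBR."""
    xmin, ymin, xmax, ymax = mbr
    return xmin <= x <= xmax and ymin <= y <= ymax

# Hypothetical MBR inside a 100 x 100 cell whose lower-left corner is (1000, 2000).
original = (1000.25, 2000.5, 1010.75, 2005.0)
packed = encode_mbr(original, origin=(1000.0, 2000.0), cell_size=100.0)
restored = decode_mbr(packed, origin=(1000.0, 2000.0), cell_size=100.0)
```

Under these assumptions each MBR takes 8 bytes instead of the 32 bytes needed for four double-precision coordinates, at the cost of a slightly enlarged rectangle after decoding.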
