• Title/Summary/Keyword: Information Retrieval Engine

136 search results

Web Service based Recommendation System using Inference Engine (추론엔진을 활용한 웹서비스 기반 추천 시스템)

  • Kim SungTae;Park SooMin;Yang JungJin
    • Journal of Intelligence and Information Systems
    • /
    • v.10 no.3
    • /
    • pp.59-72
    • /
    • 2004
  • The range of Internet usage has broadened and diversified drastically, from information retrieval and collection to many other functions. In contrast to this growth in Internet use, the efficiency of finding the necessary information has decreased, so the need for information systems that provide customized information has emerged. Our research proposes a Web Service based recommendation system that employs an inference engine to find and recommend the most appropriate products for users. Current Web applications provide useful information, but they still face the problem of bridging different platforms and distributed computing environments, so a standardized and systematic approach is necessary for easier communication and coherent system development across heterogeneous environments. Web Services are programming-language independent and improve interoperability by describing, deploying, and executing modularized applications over the network. The paper focuses on developing a Web Service based recommendation system that can serve as a benchmark for Web Service realization; this is done by integrating an inference engine in which the dynamics of information and user preferences are taken into account.
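
A minimal sketch of the kind of rule-based inference an inference engine performs for product recommendation is given below; the forward-chaining loop, rules, facts, and product names are illustrative assumptions, not the knowledge base described in the paper.

```python
# Minimal forward-chaining sketch of an inference engine driving a product
# recommendation: rules fire on user-preference facts until a fixed point is
# reached. Rules, facts, and product names are invented for illustration.
rules = [
    ({"likes_outdoors", "budget_low"}, "recommend_entry_level_tent"),
    ({"likes_outdoors", "budget_high"}, "recommend_premium_tent"),
    ({"buys_often"}, "budget_high"),
]

def infer(facts):
    facts = set(facts)
    changed = True
    while changed:                      # keep firing rules until nothing new is derived
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return {f for f in facts if f.startswith("recommend_")}

print(infer({"likes_outdoors", "buys_often"}))  # {'recommend_premium_tent'}
```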


A Retrieval Technique of Personal Information in a Web Environment (웹 환경에서의 개인정보 검색기법)

  • Seo, Young-Duk;Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.15 no.4
    • /
    • pp.145-151
    • /
    • 2015
  • Since we use the Internet every day, Internet privacy has become important. We need to find out what kinds of personal information are exposed on the Internet and to eliminate the exposed information. However, it is not efficient to search for personal information using only fragmentary clues in web search engines, because the ranking of the results does not reflect the degree of exposure of the personal information. In this paper, we introduce a personal information retrieval system and propose a process for removing private data from the web easily. We also compare the proposed method with previous methods by evaluating their search performance.

Some Legal Arguments on the Portal Service Providers' Information Retrieval (포털사업자의 검색서비스에 관한 법률문제)

  • Kim, Yun-Myung
    • Journal of Information Management
    • /
    • v.38 no.3
    • /
    • pp.183-209
    • /
    • 2007
  • The Internet portal, exemplified by businesses such as Naver, Empas, and Google that provide information retrieval services, is a representative business model of the Internet environment. Portal sites provide search services that give users the information they want to find, which is a significant social contribution. At the same time, portal sites that provide search services give rise to many problems. Consequently, regulation of information retrieval is strongly asserted despite its public interest; that is, regulation of search business operators has been attempted. Portal operators thus bear social responsibility as online service providers (OSPs). However, it is doubtful whether a portal operator should bear direct responsibility for problems that occur only indirectly on the portal site, since that would amount to a duty for the portal operator to censor the content it transmits. This paper therefore examines, from a legal perspective, the critical social opinions concerning information retrieval and the problems of the Internet.

A Study on the DB-IR Integration: Per-Document Basis Online Index Maintenance

  • Jin, Du-Seok;Jung, Hoe-Kyung
    • Journal of information and communication convergence engineering
    • /
    • v.7 no.3
    • /
    • pp.275-280
    • /
    • 2009
  • While database (DB) and information retrieval (IR) technologies have been developed independently, there are emerging requirements that both data management and efficient text retrieval be supported simultaneously in information systems such as health care, customer support, XML data management, and digital libraries. The great divide between DB and IR has led to different approaches to index maintenance for newly arriving documents. DB systems have extended their SQL layer to cope with text fields, since they lack an intact mechanism for building an IR-like index, whereas IR systems usually treat a block of new documents as the logical unit of index maintenance, since they have no concept of integrity constraints. In DB-IR integration, however, a transaction that adds or updates a document should also include maintenance of the posting lists associated with that document. Although DB-IR integration has begun to emerge as a research field, the issue will remain a difficult and rewarding area for some time, and one of the primary reasons is the lack of efficient online transactional index maintenance. In this paper, the performance of several strategies for per-document transactional index maintenance - direct index update, pulsing auxiliary index, and posting segmentation index - is evaluated. The results show that the pulsing auxiliary strategy and the posting segmentation indexing scheme can be promising candidates for text field indexing in DB-IR integration.
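
One of the strategies named above, the pulsing auxiliary index, can be sketched roughly as follows; the class, merge threshold, and tokenization are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch of a "pulsing auxiliary index": per-document updates go into a
# small in-memory auxiliary index, which is periodically merged ("pulsed") into
# the main index. Names and the merge threshold are illustrative only.
from collections import defaultdict

class PulsingIndex:
    def __init__(self, pulse_threshold=1000):
        self.main = defaultdict(list)   # term -> posting list (doc ids), main index
        self.aux = defaultdict(list)    # small auxiliary index for recent documents
        self.aux_docs = 0
        self.pulse_threshold = pulse_threshold

    def add_document(self, doc_id, text):
        """Transactionally index one document into the auxiliary index."""
        for term in set(text.lower().split()):
            self.aux[term].append(doc_id)
        self.aux_docs += 1
        if self.aux_docs >= self.pulse_threshold:
            self._pulse()

    def _pulse(self):
        """Merge the auxiliary postings into the main index and reset it."""
        for term, postings in self.aux.items():
            self.main[term].extend(postings)
        self.aux.clear()
        self.aux_docs = 0

    def search(self, term):
        """A query must consult both the main and the auxiliary index."""
        term = term.lower()
        return self.main.get(term, []) + self.aux.get(term, [])

idx = PulsingIndex(pulse_threshold=2)
idx.add_document(1, "database and information retrieval")
idx.add_document(2, "online index maintenance")
print(idx.search("index"))   # -> [2]
```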

Design and Realization of Retrieval Engine On Demand Using a Dynamic Robot Agent (동적 로봇에이전트를 이용한 주문형 검색엔진의 설계 및 구현)

  • Kim, Sung;Park, Chol-Woo;Lee, Chung-Seok;Park, Kyoo-Seok
    • The KIPS Transactions: Part D
    • /
    • v.8D no.5
    • /
    • pp.631-636
    • /
    • 2001
  • The technologies relevant to e-business have developed rapidly over a very short period of time and are now expanding into the area of B2B. Keeping pace with this development in e-business, information comparing and analyzing the commodities of many sites is also required. Although price-comparison information across domestic shopping malls is now being offered, it is not efficient: its refresh intervals are long, and the indiscriminate collection of information for the sake of fast refreshing places a heavy load on the shopping malls concerned. In this article, an on-demand retrieval engine is designed and implemented using a dynamic robot agent that adapts to the status of the shopping malls concerned; it offers a customized service and, after the shortest possible collection and analysis time, presents the shopping mall with the lowest price for each commodity, without placing a load on the shopping malls.


Known-Item Retrieval Performance of a PICO-based Medical Question Answering Engine

  • Vong, Wan-Tze;Then, Patrick Hang Hui
    • Asia pacific journal of information systems
    • /
    • v.25 no.4
    • /
    • pp.686-711
    • /
    • 2015
  • The performance of a novel medical question-answering engine called CliniCluster and of existing search engines such as CQA-1.0, Google, and Google Scholar was evaluated using known-item searching. In known-item searching, the known item is a document that has been critically appraised as highly relevant to a therapy question. Results show that, using CliniCluster, known items were retrieved on average at rank 2 ($MRR@10 \approx 0.50$), and most of the known items could be identified from the top-10 document lists. In response to ill-defined questions, the known items were ranked lower by CliniCluster and CQA-1.0, whereas for Google and Google Scholar no significant difference in ranking was found between well- and ill-defined questions. Fewer than 40% of the known items could be identified from the top-10 documents retrieved by CQA-1.0, Google, and Google Scholar. An analysis of the top-ranked documents by strength of evidence revealed that CliniCluster outperformed the other search engines by providing more recent publications with the highest-ranked study designs. In conclusion, the overall results support the use of CliniCluster for answering therapy questions by ranking highly relevant documents in the top positions of the search results.
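
The MRR@10 figure quoted above can be computed as in the following short sketch; the example ranking data are invented for illustration and do not come from the paper.

```python
# Mean reciprocal rank at cutoff 10 (MRR@10): for each query, take the
# reciprocal of the rank of its known item within the top 10 results
# (0 if absent), then average over queries. Example data are made up.
def mrr_at_10(ranked_lists, known_items):
    total = 0.0
    for results, known in zip(ranked_lists, known_items):
        rr = 0.0
        for rank, doc_id in enumerate(results[:10], start=1):
            if doc_id == known:
                rr = 1.0 / rank
                break
        total += rr
    return total / len(ranked_lists)

# Two queries whose known items both appear at rank 2 -> MRR@10 = 0.5
ranked = [["d7", "d3", "d9"], ["d1", "d5", "d2"]]
known = ["d3", "d5"]
print(mrr_at_10(ranked, known))  # 0.5
```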

WebDBs : A User oriented Web Search Engine (WebDBs: 사용자 중심의 웹 검색 엔진)

  • 김홍일;임해철
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.7B
    • /
    • pp.1331-1341
    • /
    • 1999
  • This paper proposes WebDBs (Web Database System), which retrieves information published on the web using a query language similar to SQL. The proposed system automatically extracts the information to be retrieved from HTML documents dispersed across the web, and it can process SQL-based queries over the extracted information. Most of the query processing time of a web database system is spent fetching documents over the network. Therefore, observing that most web retrieval exhibits web locality, previously retrieved information is stored in a cache and reused by similar applications. We propose a cache mechanism adapted to user applications that stores the cached information associated with each retrieved query, and we implement a web search engine based on these concepts.
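
The query-keyed caching idea described above might look roughly like the following sketch; the class names, TTL, and SQL-like query string are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative query-keyed result cache exploiting web locality: rows fetched
# for an SQL-like query are stored under that query string and reused while
# still fresh. Class names, the TTL, and the query text are assumptions.
import time

class QueryCache:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.entries = {}               # query string -> (timestamp, result rows)

    def get(self, query):
        entry = self.entries.get(query)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]             # cache hit: skip the network fetch
        return None

    def put(self, query, rows):
        self.entries[query] = (time.time(), rows)

def run_query(query, cache, fetch_from_web):
    rows = cache.get(query)
    if rows is None:
        rows = fetch_from_web(query)    # expensive: crawl and parse HTML documents
        cache.put(query, rows)
    return rows

cache = QueryCache()
fetch = lambda q: [("Example Title", "http://example.com")]   # stand-in fetcher
print(run_query("SELECT title FROM web WHERE topic = 'retrieval'", cache, fetch))
```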


Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

  • Lee, Hong-Joo;Klein, Mark
    • Asia pacific journal of information systems
    • /
    • v.18 no.1
    • /
    • pp.79-96
    • /
    • 2008
  • One of the roles of Semantic Web services is to execute dynamic intra-organizational services, including the integration and interoperation of business processes. Since different organizations design their processes differently, retrieval of similar semantic business processes is necessary to support inter-organizational collaboration. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching to expand the results of an exact matching engine when querying the OWL (Web Ontology Language) MIT Process Handbook. The MIT Process Handbook is an electronic repository of best-practice business processes, intended to help people (1) redesign organizational processes, (2) invent new processes, and (3) share ideas about organizational practices. In order to use the MIT Process Handbook for process retrieval experiments, we export it into an OWL-based format: we model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of that meta-model. Next, we need a sizable number of queries and their corresponding correct answers over the Process Handbook. Many previous studies devised artificial datasets composed of randomly generated numbers without real meaning and used subjective ratings for the correct answers and for the similarity values between processes. To generate a semantics-preserving test data set, we create 20 variants of each target process that are syntactically different but semantically equivalent, using mutation operators; these variants serve as the correct answers for the target process. We devise diverse similarity algorithms based on the values of process attributes and the structure of business processes. We use simple similarity algorithms for text retrieval, such as TF-IDF and Levenshtein edit distance, and utilize a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of the process structure, such as part processes, goals, and exceptions; since we can identify relationships between a semantic process and its subcomponents, this information can be utilized when calculating similarities between processes. Dice's coefficient and the Jaccard similarity measure are utilized to calculate the overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms, measuring retrieval performance in terms of precision, recall, and the F measure, the harmonic mean of precision and recall. The tree edit distance shows the poorest performance on all measures. TF-IDF and the method incorporating the TF-IDF measure and Levenshtein edit distance, both of which focus on the similarity of process names and descriptions, show better performance than the other devised methods. In addition, we calculate a rank correlation coefficient, Kendall's tau-b, between the number of process mutations and the ranking of similarity values within each mutation set. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and their derivatives, show higher coefficients than measures based on the values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably good performance in both experiments. For retrieving semantic processes, it therefore appears better to consider diverse aspects of process similarity, such as process structure and the values of process attributes. We generate semantic process data and a dataset for the retrieval experiments from the MIT Process Handbook repository, suggest imprecise query algorithms that expand the retrieval results of an exact matching engine such as SPARQL, and compare the retrieval performance of the similarity algorithms. As limitations and future work, experiments with datasets from other domains are needed, and since the diverse measures produce many similarity values, better ways of identifying relevant processes may be found by applying these values simultaneously.
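
Two of the set-overlap measures named above, Dice's coefficient and the Jaccard similarity, are illustrated in the sketch below over the token sets of two short process descriptions; the whitespace tokenization is a simplifying assumption for the example, not the paper's exact feature representation.

```python
# Dice's coefficient and Jaccard similarity over the token sets of two process
# descriptions. Whitespace tokenization is a simplifying assumption.
def dice(a: set, b: set) -> float:
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 1.0

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a or b) else 1.0

p1 = set("sell product via physical store".split())
p2 = set("sell product via online store".split())
print(dice(p1, p2))     # 0.8
print(jaccard(p1, p2))  # ~0.667
```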

Effective Scheme for File Search Engine in Mobile Environments (모바일 환경에서 파일 검색 엔진을 위한 효과적인 방식)

  • Cho, Jong-Keun;Ha, Sang-Eun
    • The Journal of the Korea Contents Association
    • /
    • v.8 no.11
    • /
    • pp.41-48
    • /
    • 2008
  • This study focuses on modeling a file search engine and suggests a modified file search scheme based on weight values computed from file contents, in order to improve performance in terms of search accuracy and matching time. Most file search engines have used string matching algorithms such as KMP (Knuth-Morris-Pratt), which may limit portability and fast search times; moreover, such algorithms do not find exactly the files the user wants. Hence, a file search engine based on weight values over file contents is proposed here to optimize performance for mobile environments. A comparison with previous research shows that the proposed scheme performs better.
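
The KMP baseline mentioned above is a textbook string matching algorithm; a short illustrative implementation follows (not code from the paper).

```python
# Illustrative implementation of the KMP (Knuth-Morris-Pratt) string matching
# baseline: the failure table lets the search skip re-examining characters,
# giving O(n + m) matching time.
def kmp_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    # Build the failure table (longest proper prefix that is also a suffix).
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # Scan the text, falling back through the table on mismatches.
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1

print(kmp_search("mobile file search engine", "search"))  # 12
```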

Design and Implementation of GIS using Servlet on the Internet (인터넷에서 서블릿을 이용한 지리정보시스템의 설계 및 구현)

  • 김병학
    • Proceedings of the Korea Institute of Convergence Signal Processing
    • /
    • 2001.06a
    • /
    • pp.49-52
    • /
    • 2001
  • In this paper, the design and implementation of a geographic information retrieval system for ArcView is described. The system configuration consists of a PC server running the Linux operating system, the Apache web server, and Oracle as the database engine. In addition, JSP (Java Server Pages) and servlets are used to view the database and map images.
