• Title/Summary/Keyword: Web Index

An Unified Spatial Index and Visualization Method for the Trajectory and Grid Queries in Internet of Things

  • Han, Jinju;Na, Chul-Won;Lee, Dahee;Lee, Do-Hoon;On, Byung-Won;Lee, Ryong;Park, Min-Woo;Lee, Sang-Hwan
    • Journal of the Korea Society of Computer and Information / v.24 no.9 / pp.83-95 / 2019
  • Recently, a variety of IoT data has been collected by attaching geosensors to many vehicles on the road. IoT data inherently carries time and space information and comprises various measurements such as temperature, humidity, fine dust, and CO2. Although specific sensor data can be retrieved using time, latitude, and longitude, which are the keys to IoT data, advanced search engines that handle high-level user queries over IoT data are still limited. Searching large amounts of IoT data without an index is also a problem, because sequential scans waste a great deal of time. In this paper, we propose a unified spatial index model that handles both grid and trajectory queries using a cell-based space-filling curve method. We also present a visualization method that helps users grasp the results intuitively. The trajectory query aggregates the traffic of the trajectory cells passed by taxis on the road searched by the user. The grid query finds the cells on the road searched by the user and aggregates the fine dust readings. Based on the generated spatial index, the user interface quickly summarizes the trajectory and grid queries for a specific road and for all roads, and we propose a Web-based prototype system that supports intuitive analysis through road and heat map visualization.
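
A cell-based space-filling curve index of the kind the abstract describes can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which curve is used, so a Z-order (Morton) curve is assumed here, and the cell size, function names, and sample readings are all hypothetical.

```python
from collections import defaultdict

def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key

def cell_key(lat: float, lon: float, cell_deg: float = 0.01) -> int:
    """Map a coordinate to the Z-order key of the grid cell containing it.

    cell_deg is an assumed cell size in degrees, not a value from the paper.
    """
    col = int((lon + 180.0) / cell_deg)
    row = int((lat + 90.0) / cell_deg)
    return interleave_bits(col, row)

def build_index(readings):
    """Group (lat, lon, value) sensor readings by the key of their cell."""
    index = defaultdict(list)
    for lat, lon, value in readings:
        index[cell_key(lat, lon)].append(value)
    return index

def grid_query(index, keys):
    """Average the values stored in the given cells (e.g. fine dust)."""
    values = [v for k in keys for v in index.get(k, [])]
    return sum(values) / len(values) if values else None

def trajectory_query(index, keys):
    """Count readings per cell along a trajectory (a rough proxy for traffic)."""
    return {k: len(index.get(k, [])) for k in keys}

# Hypothetical readings: (latitude, longitude, fine dust value)
readings = [
    (37.5665, 126.9780, 10.0),
    (37.5667, 126.9783, 20.0),
    (37.6000, 127.0500, 50.0),
]
index = build_index(readings)
```

Because both query types resolve to lookups on the same cell keys, one index serves grid and trajectory queries alike, which is the point of a unified model.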

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output order. Current search tools also cannot retrieve, from a gigantic collection, the documents related to a retrieved document. The most important problem for many current search systems is increasing the quality of search, that is, providing related documents and keeping the number of unrelated documents in the results as low as possible. To address this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) of articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. Specifically, references contained in academic articles give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the articles with the cited works.
Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways. Papers can be located independently of language and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). However, CiteSeer cannot index links between articles that researchers have not made, because it indexes only the links that researchers create when they cite other articles. For the same reason, CiteSeer does not scale easily. These problems motivate the design of a more effective search system. This paper presents a method that extracts the subject and predicate of each sentence in a document. Each document is converted into a tabular form in which the extracted predicates are checked against the possible subjects and objects. We build a hierarchical graph of each document from the table and then integrate the graphs of the documents. From the graph of the entire collection, we calculate the area of each document relative to the integrated documents, and we mark the relations among documents by comparing these areas. The paper also proposes a method for the structural integration of documents that retrieves documents from the graph, making it easier for the user to find information. We compared the performance of the proposed approach with the Lucene search engine using its ranking formulas. The resulting F-measure is about 60%, which is about 15% better.
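
The backward and forward navigation that a citation index allows can be illustrated with a small sketch. It is not CiteSeer's implementation; the article identifiers and the `build_citation_index` helper are hypothetical, and the reference lists are assumed to be already extracted.

```python
from collections import defaultdict

def build_citation_index(references):
    """Invert reference lists into a 'cited by' map.

    references maps each article to the list of articles it cites.
    """
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    return cited_by

# Hypothetical articles and their reference lists
references = {
    "A2001": ["B1995", "C1998"],
    "D2005": ["A2001", "C1998"],
    "E2010": ["D2005"],
}
cited_by = build_citation_index(references)

backward = references["D2005"]        # backward in time: what D2005 cites
forward = sorted(cited_by["A2001"])   # forward in time: who cites A2001
```

Only links that appear in some reference list end up in the index, which is the limitation the abstract points out.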

Implementation of a Large-scale Web Query Processing System Using the Multi-level Cache Scheme (계층적 캐시 기법을 이용한 대용량 웹 검색 질의 처리 시스템의 구현)

  • Lim, Sung-Chae
    • Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.669-679 / 2008
  • With the increasing demand for sharing and searching information on the web, web search engines have drawn much attention. Although much research has addressed the technical challenges of building a web search engine, its query processing system is rarely discussed. Because the software architecture and operational schemes of a query processing system are hard to elaborate, we present here the related techniques as implemented in a commercial system. The implemented system is very large in scale: it can process 5 million user queries per day using index files built on about 65 million web pages. For performance, we implement a multi-level cache scheme that saves already returned query results, managed in four levels of cache storage. Using the multi-level cache, we improve system throughput by a factor of 4, thereby reducing server cost by around 70%.
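
The idea of caching already returned query results across several levels can be sketched as below. The abstract does not describe what the four levels are or how entries move between them, so the LRU eviction, the demotion and promotion rules, and the `MultiLevelCache` name are all assumptions made for illustration.

```python
from collections import OrderedDict

class MultiLevelCache:
    """Query-result cache with several LRU levels; level 0 is checked first."""

    def __init__(self, capacities):
        # One ordered dict per level; capacities[i] is the size of level i.
        self.levels = [OrderedDict() for _ in capacities]
        self.capacities = list(capacities)

    def _put(self, level, key, value):
        cache = self.levels[level]
        cache[key] = value
        cache.move_to_end(key)
        if len(cache) > self.capacities[level]:
            # Evict the least recently used entry of this level.
            old_key, old_value = cache.popitem(last=False)
            if level + 1 < len(self.levels):
                # Demote it to the next level instead of dropping it.
                self._put(level + 1, old_key, old_value)

    def get(self, query, compute):
        """Return the cached result for query, calling compute only on a full miss."""
        for cache in self.levels:
            if query in cache:
                result = cache.pop(query)
                self._put(0, query, result)   # promote the hit to level 0
                return result
        result = compute(query)               # missed every level: run the search
        self._put(0, query, result)
        return result
```

A caller would construct it with one capacity per level, for example `MultiLevelCache([1000, 10000, 100000, 1000000])` for four levels, and pass the real search function as `compute`.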

Development of Composite Soil Quality Index Evaluation System based on Web GIS (Web GIS기반의 복합적 토양 질 평가 시스템 개발)

  • Sung, Yunsoo;Yang, Jae E;Kim, Sung Chul;Ryu, Jichul;Jang, Wonseok;Kum, Donghyuk;Lim, Kyoung Jae
    • Journal of Korean Society on Water Environment / v.31 no.6 / pp.693-699 / 2015
  • Torrential rainfall events have been occurring worldwide due to climate change. The resulting accelerated soil erosion has had negative impacts on the water quality and ecosystems of receiving waterbodies. As soil security issues have arisen in various parts of the world, topsoil management has received intensive interest in Korea. In this study, a Web GIS-based system for computing physical, chemical, and biological topsoil quality indices was developed. Five national-scale soil quality maps and the topsoil erosion potential were prepared for evaluating soil quality based on soil erosion potential. The open source Web GIS engine OpenGeo was used as the core engine of the system. With this system, decision makers or related personnel working on soil erosion Best Management Practices (BMPs) would be able to find the most appropriate soil erosion BMPs based on the soil erosion potential and soil quality of the area of interest. The Web GIS system could be used efficiently in decision-making processes because of its easy-to-use interface and the scientific data it provides to decision makers and stakeholders. Currently, various BMP databases are being built for use as a decision support system in topsoil management and topsoil quality.

Personalized and Social Search by Finding User Similarity based on Social Networks (소셜 네트워크 기반 사용자 유사성 발견을 통한 개인화 및 소셜 검색)

  • Park, Gun-Woo;Oh, Jung-Woon;Lee, Sang-Hoon
    • The KIPS Transactions:PartD / v.16D no.5 / pp.683-690 / 2009
  • A social network on the web, which is a network centered on the individual, supports mutual understanding of information through searching user profiles and forming new links. Therefore, if we apply to web search a social network consisting of web users who have similar inherent information, we can improve both the efficiency of web search and users' satisfaction with the search results. In this paper, we first build a social network from web users who are linked directly or indirectly. Next, we calculate the similarity among web users from their inherent information by topic, and then reconstruct the social network based on how similarity varies by topic. Finally, we compare similarity with search patterns. The experiment confirms that users with a high relationship index, that is, users with strong link strength according to personal attributes, have similar search patterns. Applying this finding to search algorithms could improve search efficiency and reliability in personalized and social search.

A Theoretical Study on Indexing Methods using the Metadata for the Automatic Construction of a Thesaurus Browser (시소러스 브라우저 자동구현을 위한 Metadata를 이용한 색인어 처리방안에 대한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.35 no.4 / pp.451-467 / 2004
  • This paper presents a theoretical analysis of automatic indexing, which is vital to constructing a thesaurus browser, of clustering algorithms for building hierarchical relations among terms, and of methods for the automatic construction of a thesaurus browser. Methods for automatically selecting index terms in web documents are studied by surveying methods for analyzing and processing metadata, which plays the bibliographic role in web documents that it plays for traditional paper documents. The study also suggests adding metadata to web documents using an automatic metadata editor, because most web documents do not include metadata.

Efficient Blog Retrieval System by Topic-based Weighting (주제어 가중치 기법에 의한 효율적인 블로그 검색 시스템)

  • Shin, Hyeon-Il;Yun, Un-Il;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.15 no.4 / pp.1-9 / 2010
  • In the new generation of the Web, commonly called "Web 2.0", blogging has made it easier for people to publish information and opinions on the web. Various blog retrieval algorithms have been proposed to search for blogs more effectively. However, keyword-based search and link-analysis blog ranking systems cannot satisfy users' requirements. In this paper, we propose a topic-based weighting blog retrieval system that considers the links between blog posts and search terms to improve the search results. Our system extracts topics from each blog and weights them much higher than other guide words. In a comparison with other systems, the proposed topic-based system shows a better recall rate for search results.
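
Weighting topic terms above ordinary words can be sketched roughly as follows. The abstract does not give the scoring formula or the topic extraction step, so the term-frequency scoring, the `topic_weight` value, and the sample posts are assumptions, and each post's topics are taken as already extracted.

```python
from collections import Counter

def score_post(query_terms, post_words, topic_terms, topic_weight=3.0):
    """Sum query-term frequencies, boosting terms that are topics of the post.

    topic_weight is an assumed boost factor, not a value from the paper.
    """
    counts = Counter(post_words)
    score = 0.0
    for term in query_terms:
        weight = topic_weight if term in topic_terms else 1.0
        score += weight * counts[term]
    return score

# Hypothetical posts: (words in the post, topics extracted from it)
post_about_python = (["python", "index", "python", "web"], {"python"})
post_mentioning_python = (["python", "python", "python", "news"], {"news"})

query = ["python"]
score_a = score_post(query, *post_about_python)
score_b = score_post(query, *post_mentioning_python)
```

Here the post whose topic is the query term outranks a post that mentions the term more often in passing, which plain keyword counting would rank the other way around.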

A Study on Customer Characteristics in B2B Transactions Using Three-dimensional Positioning Map and Web-shape Customer Needs Analysis (B2B 거래에서 3차원 포지셔닝 맵과 웹 모양 고객 니즈 분석을 통한 고객 특성 연구)

  • Park, Chan-Ju;Park, Yunsun;Kim, Chang-Ouk;Joo, Sang-ho;Kim, Sun-il
    • Journal of Korean Institute of Industrial Engineers / v.28 no.3 / pp.274-282 / 2002
  • This paper discusses a multi-dimensional analysis for Customer Relationship Management (CRM). To this end, we propose a decision-making methodology that employs three analysis models. The first model is a three-dimensional positioning map used to derive a strategy that achieves the Process Value Line (PVL). The second model is the web-shape analysis model, which makes individual customers visibly understandable based on their CSI (Customer Satisfactory Index) data. The third model, which supports the web-shape analysis model, is the relative satisfactory analysis model; it compares the satisfaction level after purchasing against that before purchasing. We then perform an overall analysis based on the three models to provide marketing strategies to decision makers.

An Agent System for Supporting Adaptive Web Surfing (적응형 웹 서핑 지원을 위한 에이전트 시스템)

  • Kook, Hyung-Joon
    • The KIPS Transactions:PartB / v.9B no.4 / pp.399-406 / 2002
  • The goal of this research has been to develop an adaptive user agent for web surfing. To achieve this goal, the research has concentrated on three issues: collection of user data, construction and improvement of the user profile, and adaptation by applying the user profile. The main outcome of the research is a prototype system that provides the functional definition and componential design scheme for an adaptive user agent for the web environment. Internally, the system achieves its operational goal through the cooperation of two independent agents: the IIA (Interactive Interface Agent) and the UPA (User Profiling Agent). To provide a user-friendly interface environment, the IIA employs the Keyword Index, which is a list of index terms of a webpage as well as a keyword menu for subsequent queries, and the Suggest Link, which is a hierarchical list of URLs showing the user's past browsing history. The UPA reflects in the User Profile both the static and the dynamic information obtained from the user's browsing behavior. In particular, a user's interests are represented as Interest Vectors, which are updated or created based on the similarity of the vectors, thus dynamically profiling the user's ever-shifting interests.
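
Updating or creating Interest Vectors based on vector similarity might look roughly like the sketch below. The abstract gives no formulas, so cosine similarity, the threshold, the blending rate, and the `update_profile` helper are all assumptions chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def update_profile(profile, page_vector, threshold=0.5, rate=0.2):
    """Blend page_vector into the most similar interest vector, or add a new one.

    threshold and rate are assumed tuning values, not taken from the paper.
    """
    best, best_sim = None, 0.0
    for vec in profile:
        sim = cosine(vec, page_vector)
        if sim > best_sim:
            best, best_sim = vec, sim
    if best is not None and best_sim >= threshold:
        # Similar enough: move the existing interest toward the new page.
        for term in set(best) | set(page_vector):
            best[term] = (1 - rate) * best.get(term, 0.0) + rate * page_vector.get(term, 0.0)
    else:
        # No close match: the page starts a new interest.
        profile.append(dict(page_vector))
    return profile

profile = [{"python": 1.0, "code": 1.0}]
update_profile(profile, {"python": 1.0})   # close to the existing interest: update
update_profile(profile, {"soccer": 1.0})   # unrelated: creates a new interest vector
```

With these assumed values, a page about a familiar subject nudges an existing vector, while a page on an unrelated subject adds a new vector, so the profile tracks interests as they shift.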