• Title/Summary/Keyword: Distributed Indexing

Efficient k-Nearest Neighbor Query Processing Method for a Large Location Data (대용량 위치 데이터에서 효율적인 k-최근접 질의 처리 기법)

  • Choi, Dojin;Lim, Jongtae;Yoo, Seunghun;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.17 no.8 / pp.619-630 / 2017
  • With the growing popularity of smart devices, various location-based services have been provided to users. Recently, location-based social applications that combine social services with location-based services have emerged. Demand for the k-nearest neighbor (k-NN) query, which finds the k closest locations to a user's location, is increasing in location-based social network services. In this paper, we propose an approximate k-NN query processing method that achieves fast response times in environments with a large number of users. The proposed method performs efficient stream processing using big data distributed processing technologies. We also propose a modified grid index method for indexing a large amount of location data. The proposed query processing method first retrieves the relevant cells by considering the user's movement, and from these cells it builds an approximate set of k results. To show the superiority of the proposed method, we conduct various performance evaluations against the existing method.
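The general idea of answering an approximate k-NN query from a grid index can be sketched as follows. This is a minimal illustration, not the authors' method: it uses a plain uniform grid, ignores user movement and stream processing, and the names `GridIndex`, `cell_size`, and `max_ring` are assumptions introduced here. The result is approximate because the scan stops at the first ring of cells that yields at least k candidates, so a slightly closer point in the next ring can be missed.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2-D points (hypothetical helper, not from the paper)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size          # assumed tuning parameter
        self.cells = defaultdict(list)      # (col, row) -> [(obj_id, x, y), ...]

    def _cell(self, x, y):
        # Map a coordinate to the grid cell that contains it.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, obj_id, x, y):
        self.cells[self._cell(x, y)].append((obj_id, x, y))

    def approx_knn(self, x, y, k, max_ring=8):
        """Scan square rings of cells around the query cell until k candidates exist."""
        cx, cy = self._cell(x, y)
        candidates = []
        for ring in range(max_ring + 1):
            for col in range(cx - ring, cx + ring + 1):
                for row in range(cy - ring, cy + ring + 1):
                    # Only cells on the boundary of the current ring are new.
                    if max(abs(col - cx), abs(row - cy)) != ring:
                        continue
                    candidates.extend(self.cells.get((col, row), []))
            if len(candidates) >= k:
                break  # early stop: this is what makes the answer approximate
        candidates.sort(key=lambda p: math.hypot(p[1] - x, p[2] - y))
        return [p[0] for p in candidates[:k]]


index = GridIndex(cell_size=1.0)
for obj_id, px, py in [("a", 0.5, 0.5), ("b", 0.9, 0.9), ("c", 3.5, 3.5), ("d", 1.1, 0.5)]:
    index.insert(obj_id, px, py)
nearest_two = index.approx_knn(0.6, 0.6, k=2)
```

A smaller `cell_size` shrinks the candidate set per ring at the cost of scanning more rings when the data are sparse.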

A Design and Implementation of Dynamic Hybrid P2P System with Hierarchical Group Management and Maintenance of Reliability (계층적 그룹관리와 신뢰성을 위한 동적인 변형 P2P 시스템 설계 및 구현)

  • Lee, Seok-Hee;Cho, Sang;Kim, Sung-Yeol
    • The KIPS Transactions:PartD / v.11D no.4 / pp.975-982 / 2004
  • Pure P2P and hybrid P2P structures are the two forms commonly used in current P2P systems. Gnutella and Ktella are forms of pure P2P, and forms of hybrid P2P are innumerable. File searching models exist within these structures and provide group management for file sharing, searching, and indexing. The general file sharing model is good at maintaining connectivity but is deficient in group management. Therefore, this study introduces a hierarchical structure into file sharing models by means of a routing technique and a backup system. The system was designed so that users can maintain group efficiency and connection reliability in a large-scale network.

A Study on Evaluation for the Han River Water Quality Index (한강의 수질지수 산정에 관한 연구)

  • 서정현
    • Water for future / v.14 no.3 / pp.55-66 / 1981
  • The theory and practice of water quality scoring and indexing are introduced. Monthly water analysis data are available for six stations along the downstream Han River within the boundary of the Special City of Seoul. The data cover the period from 1975 to 1979 inclusive and contain analytical findings on 37 water constituents, including DO, BOD, temperature, and total solids. Six parameters are selected from the 37 items that, in the judgement of the writer, best reflect the water quality of the Han River: dissolved oxygen saturation, pH, fecal coliform, total solids, BOD, and nitrate+ammonia. For each of the six parameters, a subscore function is developed and presented graphically to facilitate transforming a measurement of the parameter into a subscore on a common scale (e.g. 0-100). The score of a sample is calculated as a function of the six subscores using four different approaches: (1) the unweighted arithmetic water quality score, (2) the weighted arithmetic water quality score, (3) the unweighted multiplicative score, and (4) the reduced (total) score. Independently of these calculated scores, an experts' score, obtained by averaging the ratings of water quality experts, is compared with each of the four calculated scores by the least squares method. The experts' score agrees best with the "reduced" score, with a correlation coefficient of 0.956; this method of water quality scoring is therefore adopted to calculate the Han River water quality scores and indices.
Water quality index data are presented for the Guiri, ukdo, Pokwangdong, Noryangjin, Yongdungpo, and Kayang stations for 1975-1979. The overall water quality index of the Han River between the Guiri and Kayang stations is found to be 47.3 in 1976, 48.0 in 1977, 48.5 in 1978, and 54.7 in 1979, indicating a general trend towards water quality improvement in this part of the river, with the index increasing by an average of 1.85 points per year during this period. Finally, the optimum sampling frequencies distributed among the six stations are calculated using an equation that takes into account the coefficients of variation of the water quality scores and indices.
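Three of the four aggregation approaches named in the abstract have conventional forms in the water quality index literature, sketched below. The abstract does not give the formulas, so these are assumptions: the arithmetic scores are taken as plain and weighted means of the subscores, and the unweighted multiplicative score as their geometric mean. The "reduced (total)" score is not defined in the abstract and is left out. All subscores and weights shown are made-up illustration values, not data from the study.

```python
import math


def unweighted_arithmetic(subscores):
    """Plain mean of the subscores (each assumed to lie on a 0-100 scale)."""
    return sum(subscores) / len(subscores)


def weighted_arithmetic(subscores, weights):
    """Weighted mean; weights are normalized so they need not sum to 1."""
    total_weight = sum(weights)
    return sum(w * q for w, q in zip(weights, subscores)) / total_weight


def unweighted_multiplicative(subscores):
    """Geometric mean; a single zero subscore drives the whole score to zero."""
    return math.prod(subscores) ** (1.0 / len(subscores))


# Hypothetical subscores for the six parameters (illustrative only).
sample = [80, 60, 40, 90, 70, 50]
arithmetic_score = unweighted_arithmetic(sample)
multiplicative_score = unweighted_multiplicative(sample)
```

The multiplicative form penalizes one very poor parameter more strongly than the arithmetic forms do, which is the usual reason for considering it alongside them.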

Development of Robust Feature Recognition and Extraction Algorithm for Dried Oak Mushrooms (건표고의 외관특징 인식 및 추출 알고리즘 개발)

  • Lee, C.H.;Hwang, H.
    • Journal of Biosystems Engineering / v.21 no.3 / pp.325-335 / 1996
  • Visual features are crucial for monitoring the growth state, indexing the drying performance, and grading the quality of oak mushrooms. A computer vision system with a neural network information processing technique was used to quantify the quality factors of dried oak mushrooms, which are distributed over the cap and gill sides. In this paper, visual feature extraction algorithms were integrated with neural network processing to deal with the various fuzzy patterns of mushroom shapes and to compensate for the fault sensitivity of the crisp criteria and heuristic rules derived from the image processing results. The proposed algorithm improved the segmentation of the skin features of each side, the identification of cap and gill surfaces, and the identification of stipe states and removal of the stipe. The visual characteristics of dried oak mushrooms were also analyzed, and the primary visual features essential to the quality evaluation were extracted and quantified. In this study, black-and-white (gray-level) images were captured and used for algorithm development.

A Spatial Split Method for Processing of Region Monitoring Queries (영역 모니터링 질의 처리를 위한 공간 분할 기법)

  • Chung, Jaewoo;Jung, HaRim;Kim, Ung-Mo
    • Journal of Internet Computing and Services / v.19 no.1 / pp.67-76 / 2018
  • This paper addresses the problem of efficiently processing region monitoring queries. The centralized methods used in existing region monitoring query processing assume that each mobile object periodically sends location updates to the server and that the server continuously updates the query results. However, a large volume of location updates seriously degrades system performance. Recently, distributed methods have been proposed for region monitoring query processing. In the distributed methods, the server allocates to each object i) a resident domain, which is a subspace of the workspace, and ii) a number of nearby query regions. A moving object sends a location update to the server only when it leaves its resident domain or crosses the boundary of a query region. To allocate the resident domain and the nearby query regions to a moving object, a query index structure is used that is constructed by recursively splitting the workspace into equal halves. However, this index structure causes unnecessary divisions, which degrade system performance. In this paper, we propose an adaptive split method that reduces unnecessary splitting. The workspace is split dynamically by considering i) the spatial relationship between the query regions and the resulting subspaces, and ii) the distribution of the query regions. We propose an enhanced QR-tree with the new splitting method. Through a set of simulations, we verify the efficiency of the proposed split method.
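The client-side update rule described for the distributed methods (report only on leaving the resident domain or crossing a query-region boundary) can be sketched as below. This is a minimal illustration under stated assumptions: regions are axis-aligned rectangles, a boundary crossing is detected as a change in containment between two consecutive positions, and the names `Rect` and `must_report` are hypothetical. It does not model the QR-tree or the adaptive split itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (assumed shape for domains and query regions)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def must_report(prev, curr, resident_domain, query_regions):
    """Return True if the object should send a location update to the server.

    prev and curr are (x, y) positions at two consecutive time steps.
    """
    if not resident_domain.contains(*curr):
        return True  # the object has left its resident domain
    # A boundary is crossed when membership in some query region changes.
    return any(q.contains(*prev) != q.contains(*curr) for q in query_regions)


domain = Rect(0, 0, 10, 10)
queries = [Rect(4, 4, 6, 6)]
moved_inside_domain = must_report((1, 1), (2, 2), domain, queries)
entered_query_region = must_report((3, 3), (5, 5), domain, queries)
```

Under this rule an object that moves freely inside its resident domain without touching a query region sends nothing, which is where the saving over periodic updates comes from.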

Design and Implementation of an Open Object Management System for Spatial Data Mining (공간 데이타 마이닝을 위한 개방형 객체 관리 시스템의 설계 및 구현)

  • Yun, Jae-Kwan;Oh, Byoung-Woo;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.1 no.1 s.1 / pp.5-18 / 1999
  • Recently, the need for automatic knowledge extraction from spatial data stored in spatial databases has increased. Spatial data mining can be defined as the extraction of implicit knowledge, spatial relationships, or other knowledge not explicitly stored in spatial databases. To extract useful knowledge from spatial data, an object management system is needed that can store spatial data efficiently, provide very fast indexing and searching mechanisms, and support a distributed computing environment. In this paper, we design and implement an open object management system for spatial data mining that supports efficient management of spatial, aspatial, and knowledge data. To develop this system, we used Open OODB, a widely used object management system. However, because Open OODB lacks facilities for spatial data mining, we extended it to support spatial data types, dynamic class generation, object-oriented inheritance, spatial indexes, spatial operations, etc. In addition, to further increase interoperability with other spatial database management systems and data mining systems, we adopted international standards such as ODMG 2.0 for data modeling, SDTS (Spatial Data Transfer Standard) for modeling and exchanging spatial data, and the OpenGIS Simple Features Specification for CORBA for connecting clients and servers efficiently.

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. Retrieval effectiveness, however, is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output. In addition, current search tools cannot retrieve, from a gigantic collection, the documents related to a retrieved document. The most important problem for many current search systems is improving the quality of search, that is, providing related documents and reducing the number of unrelated documents in the search results as far as possible. To address this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) for articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. References contained in academic articles give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the article with the cited works.
Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways. Papers can be located independently of language and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). However, CiteSeer cannot index links between articles that researchers do not make, because it indexes only the links that researchers create when they cite other articles. For the same reason, CiteSeer does not scale easily. These problems motivate the design of a more effective search system. This paper presents a method that extracts the subject and predicate of each sentence in a document. A document is converted into a tabular form in which each extracted predicate is checked against its possible subjects and objects. We build a hierarchical graph of a document from the table and then integrate the graphs of the documents. From the graph of the entire document set, the area of each document is calculated relative to the integrated documents, and relations among the documents are marked by comparing the document areas. The paper also proposes a method for the structural integration of documents that retrieves documents from the graph, which makes it easier for the user to find information. We compared the performance of the proposed approaches with the Lucene search engine using its ranking formulas. As a result, the F-measure is about 60%, which is about 15% better.
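The backward and forward navigation that the abstract attributes to a citation index can be sketched with two mappings kept in step. This is a minimal illustration only: it is not CiteSeer's ACI and not the FCA-based method of the paper, and the class name `CitationIndex` and the article identifiers are hypothetical.

```python
from collections import defaultdict


class CitationIndex:
    """Toy citation index supporting navigation in both directions."""

    def __init__(self):
        self.cites = defaultdict(set)     # article -> works it cites (backward in time)
        self.cited_by = defaultdict(set)  # article -> later works citing it (forward in time)

    def add_citation(self, citing, cited):
        # Record one citation link in both directions.
        self.cites[citing].add(cited)
        self.cited_by[cited].add(citing)

    def backward(self, article):
        """Works that the given article cites."""
        return sorted(self.cites.get(article, ()))

    def forward(self, article):
        """Subsequent works that cite the given article."""
        return sorted(self.cited_by.get(article, ()))


citations = CitationIndex()
citations.add_citation("paper_2011", "paper_1998")   # hypothetical identifiers
citations.add_citation("paper_2011", "paper_2004")
citations.add_citation("paper_2015", "paper_2011")
cited_works = citations.backward("paper_2011")
citing_works = citations.forward("paper_2011")
```

The sketch also makes the limitation noted in the abstract concrete: two related articles that never cite each other share no entry in either mapping, so the index alone cannot connect them.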