• Title/Summary/Keyword: data locality


An Improvement in K-NN Graph Construction using re-grouping with Locality Sensitive Hashing on MapReduce (MapReduce 환경에서 재그룹핑을 이용한 Locality Sensitive Hashing 기반의 K-Nearest Neighbor 그래프 생성 알고리즘의 개선)

  • Lee, Inhoe;Oh, Hyesung;Kim, Hyoung-Joo
    • KIISE Transactions on Computing Practices / v.21 no.11 / pp.681-688 / 2015
  • The k-nearest neighbor (k-NN) graph construction is an important operation with many web-related applications, including collaborative filtering, similarity search, and many others in data mining and machine learning. Despite its many elegant properties, the brute-force k-NN graph construction method has a computational complexity of $O(n^2)$, which is prohibitive for large-scale data sets. Thus, the (key, value)-based distributed framework MapReduce is increasingly used together with Locality Sensitive Hashing, which is efficient for high-dimensional, sparse data. Following a two-stage strategy, we use locality sensitive hashing to divide users into small subsets, and then calculate the similarity between pairs within each subset using a brute-force method on MapReduce. The candidate-group generation stage is important, since brute-force calculation is performed on each group in the following step. However, existing methods do not prevent large candidate groups. In this paper, we propose an efficient algorithm for approximate k-NN graph construction that regroups candidate groups. Experimental results show that our approach is more effective than existing methods in terms of graph accuracy and scan rate.
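
The two-stage strategy described in this abstract can be sketched on a single machine as follows. This is only an illustrative sketch, not the authors' algorithm: it assumes random-hyperplane LSH with cosine similarity, the names `signature`, `cosine` and `knn_graph` are hypothetical, and it omits the regrouping step, whose details the abstract does not give.

```python
import random
from collections import defaultdict


def signature(vec, planes):
    """Random-hyperplane LSH: one bit per plane (sign of the dot product)."""
    return tuple(
        int(sum(v * p for v, p in zip(vec, plane)) >= 0) for plane in planes
    )


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_graph(vectors, k=2, n_bits=4, seed=0):
    """Approximate k-NN graph: hash into buckets, then brute force per bucket."""
    rng = random.Random(seed)
    dim = len(next(iter(vectors.values())))
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

    # Stage 1 (the "map" side): group ids by their LSH signature.
    buckets = defaultdict(list)
    for vid, vec in vectors.items():
        buckets[signature(vec, planes)].append(vid)

    # Stage 2 (the "reduce" side): brute-force similarity inside each bucket.
    graph = {vid: [] for vid in vectors}
    for members in buckets.values():
        for u in members:
            scored = [(cosine(vectors[u], vectors[v]), v) for v in members if v != u]
            scored.sort(reverse=True)
            graph[u] = [v for _, v in scored[:k]]
    return graph
```

Because the brute-force cost of stage 2 is quadratic in the bucket size, a single oversized bucket dominates the running time, which is the problem the regrouping step in the paper is meant to address.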

A Multi-dimensional Query Processing Scheme for Stream Data using Range Query Indexing (범위 질의 인덱싱을 이용한 스트림 데이터의 다중 질의처리 기법)

  • Lee, Dong-Un;Rhee, Yun-Seok
    • Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.69-77 / 2009
  • Stream service environments demand real-time query processing for voluminous data that are continuously delivered from a tremendous number of sources. Typical R-tree based query processing technologies cannot handle such situations efficiently, because they require repeated exploration from the tree root on every data event. However, many kinds of stream data, including sensor readings, show high locality, which we exploit to reduce the search space of queries to explore. In this paper, we propose a query processing scheme that exploits the locality of stream data. From simulation results, we conclude that the proposed scheme performs much better than traditional ones in terms of scalability and exploration efficiency.

A Caching Scheme to Support Session Locality in Hierarchical SIP Networks

  • Choi, KwangHee;Kim, Hyunwoo
    • Journal of Korea Society of Industrial Information Systems / v.18 no.1 / pp.1-9 / 2013
  • Most calls to a called user are made by a particular group of calling users. This call pattern is defined as call locality. Internet sessions, including IP telephony calls, show a similar pattern, which we define as session locality. In this paper, we propose a caching scheme to support session locality in hierarchical SIP networks. The proposed scheme can be applied easily by adding only one cache field to a data structure of the SIP mobility agent. The scheme reduces the signaling cost, database access cost and session setup delay incurred in locating a called user. Moreover, it distributes the load on the home registrar to the SIP mobility agents. Our performance evaluation shows that the proposed caching scheme outperforms the hierarchical SIP scheme when the session-to-mobility ratio is high.

A Dynamic Locality Sensitive Hashing Algorithm for Efficient Security Applications

  • Mohammad Y. Khanafseh;Ola M. Surakhi
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.79-88 / 2024
  • The information retrieval domain deals with the retrieval of unstructured data such as text documents. Searching documents is a main component of a modern information retrieval system. Locality Sensitive Hashing (LSH) is one of the most popular methods for searching documents in a high-dimensional space. The main benefit of LSH is its theoretical guarantee of query accuracy in a multi-dimensional space. LSH can be further enhanced by adding a bit to its steps. In this paper, a new Dynamic Locality Sensitive Hashing (DLSH) algorithm is proposed as an improved version of the LSH algorithm. It relies on a hierarchical selection of the LSH parameters (number of bands, number of shingles, and number of permutation lists) based on the similarity achieved by the algorithm, in order to optimize search accuracy and increase its score. The technique was applied to several tampered file structures and its performance was evaluated. In some circumstances, the matching accuracy of DLSH exceeds 95% when the optimal values are selected for the number of bands, the number of shingles, and the number of permutation lists. This result makes the DLSH algorithm suitable for many critical applications that depend on accurate searching, such as forensics technology.
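
The three parameters named in this abstract (bands, shingles, permutation lists) are those of a standard MinHash-plus-banding LSH pipeline, which can be sketched as below. This is a minimal sketch of the baseline technique only, not of DLSH: the abstract does not describe how DLSH selects the parameters, so they are fixed arguments here. All function names are hypothetical, permutations are approximated by random linear hash functions, and both texts are assumed to be at least `shingle_size` characters long.

```python
import random
import zlib

PRIME = 2**61 - 1  # Mersenne prime used as the modulus of the hash family


def shingles(text, size):
    """Set of all character substrings of the given length."""
    return {text[i:i + size] for i in range(len(text) - size + 1)}


def minhash(shingle_set, coeffs):
    """One minimum per (a, b) hash function, standing in for a permutation."""
    ids = [zlib.crc32(s.encode("utf-8")) for s in shingle_set]
    return [min((a * x + b) % PRIME for x in ids) for a, b in coeffs]


def band_keys(sig, n_bands):
    """Split a signature into n_bands equal slices, tagged with the band index."""
    rows = len(sig) // n_bands
    return {(i, tuple(sig[i * rows:(i + 1) * rows])) for i in range(n_bands)}


def is_candidate_pair(text_a, text_b, shingle_size=4, n_perms=32, n_bands=8, seed=0):
    """True if the two texts agree on every row of at least one band."""
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n_perms)]
    sig_a = minhash(shingles(text_a, shingle_size), coeffs)
    sig_b = minhash(shingles(text_b, shingle_size), coeffs)
    return bool(band_keys(sig_a, n_bands) & band_keys(sig_b, n_bands))
```

With a fixed number of hash functions, more bands (and so fewer rows per band) make a pair more likely to become a candidate, while fewer bands make matching stricter. This is the trade-off that any parameter selection scheme has to balance.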

A Study on the Rural Landscape and Locality according to the Community Planning - Focused on the Daewon Ri Sanoe Myeon Boeun Gun Chungbuk - (마을계획에 따른 농촌경관과 지역성 고찰 - 충북 보은군 산외면 대원리를 중심으로 -)

  • Park, Heon-Choon;Kim, Seung-Geun
    • Journal of the Korean Institute of Rural Architecture / v.10 no.4 / pp.65-72 / 2008
  • Recently, rural areas have been the subject of high interest. However, this interest has led to damage to the rural landscape and destruction of locality, because economic logic has ignored the potential of the community. Meanwhile, the importance of restoring the damaged rural landscape has been raised, since a pleasant natural environment and community resources create value. The purpose of this research is therefore to redefine the future value of the rural landscape and to offer basic data for community design and community planning. The results of the study are as follows. First, because landscape is everything humans perceive through the senses, community planning is a very important element in forming the locality of the landscape. Second, a new building has to find its place within the community, so the plan must reflect the community and its locality; if it does, the building will be absorbed naturally into the landscape of the community. Third, when planning a rural community, the past, present and future of the rural area should be thoroughly analyzed in order to set the future direction of the community. Finally, correctly analyzing the community's past is very important when reconfiguring a rural community: if new space is created based on the original characteristics of the community, the landscape of the rural community will be preserved.


2D-MELPP: A two dimensional matrix exponential based extension of locality preserving projections for dimensional reduction

  • Xiong, Zixun;Wan, Minghua;Xue, Rui;Yang, Guowei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.9 / pp.2991-3007 / 2022
  • Two-dimensional locality preserving projections (2D-LPP) is an improved algorithm for 2D images that addresses the small sample size (SSS) problem encountered by locality preserving projections (LPP). It is able to find a low-dimensional manifold mapping that not only preserves local information but also detects the manifold embedded in the original data space. 2D-LPP is simple and elegant. However, comparison experiments between two-dimensional linear discriminant analysis (2D-LDA) and linear discriminant analysis (LDA) have indicated that matrix-based methods do not always perform better, even when training samples are limited. Inspired by this, we surmise that 2D-LPP may share the same limitation as 2D-LDA, and we propose a novel matrix exponential method to enhance the performance of 2D-LPP. 2D-MELPP is equivalent to employing distance diffusion mapping to transform the original images into a new space in which the margins between labels are broadened, which is beneficial for solving classification problems. Nonetheless, the computational time complexity of 2D-MELPP is extremely high. In this paper, we replace some of the matrix multiplications with multiple multiplications to reduce the memory cost and provide an efficient way of solving 2D-MELPP. We test it on public databases (a random 3D data set, ORL, the AR face database and the PolyU Palmprint database) and compare it with other 2D methods such as 2D-LDA and 2D-LPP and with 1D methods such as LPP and exponential locality preserving projections (ELPP), finding that it outperforms the others in recognition accuracy. We also compare different dimensions of the projection vector and record the time cost on the ORL, AR face and PolyU Palmprint databases. The experimental results show that our algorithm performs better on 3 independent public databases.

A Technique for Improving the Performance of Cache Memories

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication / v.13 no.3 / pp.104-108 / 2021
  • To improve performance in IoT and edge computing systems, memory is usually organized as a hierarchy. Access speed decreases with distance from the CPU, in the order of registers, cache memory, main memory, and storage. Energy consumption likewise increases as the distance from the CPU increases. It is therefore important to develop a technique that places frequently used data in the upper levels of the memory hierarchy as much as possible, in order to improve performance and energy consumption. However, such a technique must solve the problem of cache performance degradation caused by the lack of spatial locality that occurs when the data access stride is large. This study proposes a technique that selectively places data with a large access stride in a software-controlled cache. With the proposed technique, spatial locality can be improved by reducing the data access interval, and consequently cache performance can be improved.
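
The effect of access stride on spatial locality can be illustrated with simple arithmetic. The sketch below is not the technique proposed in the paper: it assumes a 64-byte cache line, and the helpers `cache_lines_touched` and `gather_strided` are hypothetical names. It counts how many distinct cache lines a strided access pattern touches, and shows a dense copy of the strided elements, which is the kind of layout a software-controlled cache could hold.

```python
def cache_lines_touched(n_accesses, stride_bytes, line_bytes=64, base=0):
    """Number of distinct cache lines touched by n accesses at a fixed byte stride."""
    return len({(base + i * stride_bytes) // line_bytes for i in range(n_accesses)})


def gather_strided(data, start, stride, count):
    """Copy every stride-th element into a dense buffer (smaller access interval)."""
    return [data[start + i * stride] for i in range(count)]


# 16 accesses to 4-byte elements stored contiguously fit in a single 64-byte line,
# whereas the same 16 accesses at a 256-byte stride each land in a different line.
dense_lines = cache_lines_touched(16, 4)
strided_lines = cache_lines_touched(16, 256)
```

After gathering, the elements are adjacent, so later passes over them access memory at the element size rather than at the original stride.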

The Relationship between Subjective Happiness and Satisfaction of Social Sustainability in Residential Environment (주거환경의 사회적 지속가능성 만족도와 주관적 행복감과의 관계)

  • Shin, Hwa-Kyoung;Jo, In-Sook
    • Journal of the Korean housing association / v.26 no.2 / pp.57-66 / 2015
  • The purpose of the study was to examine the relationship between subjective happiness and satisfaction with social sustainability in the residential environment. The data for the analysis were collected through a questionnaire survey from October 29 to November 10, 2013, and the sample consisted of 338 residents living in Seoul and Gyeonggi-Do province. Social sustainability was composed of locality, communality and organism. Locality was composed of the historical and cultural reflection of regional identity and of regions. Communality was composed of social integration, community programs and facilities. Organism was composed of employment, self-sufficiency, welfare, population, safety and housing. The findings of the study were as follows: 1) The average subjective happiness score was 3.82 points, above neutral. 2) Social sustainability in the residential environment was related to subjective happiness. 3) Regarding social sustainability in the residential environment, the residents were satisfied with locality and organism, but they were not satisfied with communality.

Cost-effective multistage interconnection network for NUMA model system (NUMA(non-uniform memory access) 모델 시스템을 위한 cost-effective한 다단계 상호연결망)

  • 최창훈;김성천
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.5 / pp.19-32 / 1997
  • So far, multiple-path MINs that provide redundant paths in traditional UPP MINs have been realized by adding hardware such as extra stages, duplicated data links, or multiple copies of the MIN. Traditional MINs also do not exploit locality: communication with all processor-memory pairs takes the same amount of time. So far there has been little progress in exploiting locality of reference in MINs. In this paper, we present a MIN with a new topology, the hybrid MIN, which is constructed with 2N-3 SEs, far fewer than traditional MINs use. Although it is constructed with only 2N-3 SEs, the hybrid MIN satisfies full access capability (FAC) and has redundant paths (but provides a single path for 2 memory modules of each processor). Moreover, the hybrid MIN provides a shortcut path between pairs that have frequent data communication (locality of reference). Its performance under varying degrees of localized communication is analyzed.


A Study of Efficient Access Method based upon the Spatial Locality of Multi-Dimensional Data

  • Yoon, Seong-young;Joo, In-hak;Choy, Yoon-chul
    • Proceedings of the Korea Database Society Conference / 1997.10a / pp.472-482 / 1997
  • Multi-dimensional data play a crucial role in various fields, such as computer graphics, geographical information systems, and multimedia applications. The indexing method for multi-dimensional data is a very important factor in overall system performance. This paper proposes a new dynamic access method for spatial objects, called the HL-CIF (Hierarchically Layered Caltech Intermediate Form) tree, which requires a small amount of storage space and facilitates efficient query processing. The HL-CIF tree combines hierarchical management of spatial objects with the CIF tree, in which spatial objects and sub-regions are associated with representative points. The HL-CIF tree adopts the "centroid" of a spatial object as its representative point. By reflecting objects' sizes and positions in its structure, the HL-CIF tree guarantees high spatial locality of the objects grouped in a sub-region, rendering query processing more efficient.
