• Title/Summary/Keyword: k-NN Search

Efficient Multi-Step k-NN Search Methods Using Multidimensional Indexes in Large Databases (대용량 데이터베이스에서 다차원 인덱스를 사용한 효율적인 다단계 k-NN 검색)

  • Lee, Sanghun;Kim, Bum-Soo;Choi, Mi-Jung;Moon, Yang-Sae
    • Journal of KIISE / v.42 no.2 / pp.242-254 / 2015
  • In this paper, we address the problem of improving the performance of multi-step k-NN search using multidimensional indexes. Because lower-dimensional transformations lose information, existing multi-step k-NN search solutions produce a large tolerance (i.e., a large search range) and thus incur a large number of candidates, which are retrieved by a range query. These candidates lead to overwhelming I/O and CPU overheads in the postprocessing step. To overcome this problem, we propose two efficient solutions that improve search performance by reducing the tolerance of the range query and, accordingly, the number of candidates. First, we propose a tolerance reduction-based (approximate) solution that forcibly decreases the tolerance, which is determined by a k-NN query on the index, by the average ratio of high- and low-dimensional distances. Second, we propose a coefficient control-based (exact) solution that uses c·k instead of k in the k-NN query to obtain a tighter tolerance and performs a range query using this tighter tolerance. Experimental results show that the proposed solutions significantly reduce the number of candidates and, accordingly, improve search performance in comparison with the existing multi-step k-NN solution.
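
The baseline multi-step k-NN procedure that this abstract sets out to improve can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the lower-dimensional transformation is assumed to be a projection onto the first `m` coordinates (which never overestimates Euclidean distance), linear scans stand in for the multidimensional index, and the names `reduce_dim` and `multi_step_knn` are hypothetical. Neither of the two proposed solutions (tolerance reduction, coefficient control) is shown.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduce_dim(v, m):
    """Assumed lower-bounding transform: keep only the first m coordinates."""
    return v[:m]


def multi_step_knn(data, query, k, m):
    """Return (indices of the exact k nearest neighbours, number of candidates)."""
    low = [reduce_dim(v, m) for v in data]  # stands in for the low-dimensional index
    q_low = reduce_dim(query, m)

    # Step 1: k-NN query in the low-dimensional space.
    by_low = sorted(range(len(data)), key=lambda i: dist(low[i], q_low))
    seeds = by_low[:k]

    # Step 2: the tolerance is the largest high-dimensional distance among the seeds.
    tolerance = max(dist(data[i], query) for i in seeds)

    # Step 3: range query in the low-dimensional space using that tolerance.
    candidates = [i for i in range(len(data)) if dist(low[i], q_low) <= tolerance]

    # Step 4: postprocessing, ranking the candidates by exact distance.
    candidates.sort(key=lambda i: dist(data[i], query))
    return candidates[:k], len(candidates)
```

The second return value is the candidate count, which is the quantity the abstract aims to reduce: the looser the tolerance from step 2, the more objects step 3 retrieves and step 4 has to check.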

A Density-Based K-Nearest Neighbors Search Method

  • Jang I. S.;Min K.W.;Choi W.S
    • Proceedings of the KSRS Conference / 2004.10a / pp.260-262 / 2004
  • Spatial database systems provide many query types, most of which require frequent disk I/O and much CPU time. A k-NN search finds the k objects closest to a query point, and several k-NN search methods have been proposed to date. Among these, the MINMAX distance method aims to avoid visiting unnecessary nodes by applying a pruning technique. However, this method performs more disk accesses than necessary while pruning unnecessary nodes. In this paper, we propose a new k-NN search algorithm based on object density. With this method, we use the density of the data set to predict the radius expected to contain the k nearest objects, search for objects within this radius, and adjust the radius if the search fails. Experimental results show that this method outperforms the previous MINMAX distance method: it performs fewer disk accesses, by up to 22% and 6% on average.
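
The predict-search-adjust loop described in this abstract might look something like the sketch below. It rests on assumptions that the abstract does not state: points are two-dimensional and roughly uniformly distributed over a region of known `area`, the initial radius is that of a circle expected to contain k objects, and a failed search doubles the search area. The function name `density_knn` is hypothetical, and a linear scan replaces the spatial index.

```python
import math


def density_knn(points, query, k, area):
    """Return (k nearest points to query, final search radius) for 2-D points."""
    if k > len(points):
        raise ValueError("k exceeds the number of points")

    density = len(points) / area  # objects per unit area
    # Radius of a circle expected to contain k objects at this density.
    radius = math.sqrt(k / (math.pi * density))

    while True:
        # Range search within the current radius.
        hits = [p for p in points if math.dist(p, query) <= radius]
        if len(hits) >= k:
            hits.sort(key=lambda p: math.dist(p, query))
            return hits[:k], radius
        # Assumed adjustment rule: double the search area and retry.
        radius *= math.sqrt(2)
```

Once the circle holds at least k objects, the k nearest overall must lie inside it, so the result is exact; the density estimate only affects how many retries and how many extra objects the search touches.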

Speeding Up Neural Network-Based Face Detection Using Swarm Search

  • Sugisaka, Masanori;Fan, Xinjian
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.1334-1337 / 2004
  • This paper presents a novel method to speed up neural network (NN) based face detection systems. NN-based face detection can be viewed as a classification and search problem. The proposed method formulates the search problem as an integer nonlinear optimization problem (INLP) and extends the basic particle swarm optimization (PSO) algorithm to solve it. PSO works with a population of particles, each representing a subwindow in an input image. The subwindows are evaluated by how well they match an NN-based face filter. A face is indicated when the filter response of the best particle is above a given threshold. To achieve better performance, the influence of PSO parameter settings on search performance was investigated. Experiments show that with fine-tuned parameters, the proposed method leads to a speedup of 94 on 320×240 images compared to the traditional exhaustive search method.

Fast k-NN based Malware Analysis in a Massive Malware Environment

  • Hwang, Jun-ho;Kwak, Jin;Lee, Tae-jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.12 / pp.6145-6158 / 2019
  • It is a challenge for the current security industry to respond to the large number of malicious codes distributed indiscriminately, as well as to intelligent APT attacks. As a result, studies using machine learning algorithms are being conducted for proactive prevention rather than post-processing. The k-NN algorithm is widely used because it is intuitive and suitable for handling malicious code as unstructured data. In addition, in the malicious code analysis domain, the k-NN algorithm makes it easy to classify malicious codes based on previously analyzed malicious codes. For example, it is possible to classify malicious code families or analyze malicious code variants through similarity analysis with existing malicious codes. However, the main disadvantage of the k-NN algorithm is that the search time increases as the learning data grows. We propose a fast k-NN algorithm that addresses the computation speed problem while retaining the benefits of the k-NN algorithm. In the test environment, the algorithm needed only 19.71 similarity comparisons on average for 6.25 million malicious codes. Considering the way the algorithm works, the fast k-NN algorithm can also be used to search any data that can be vectorized, as well as malware and SSDEEP. In the future, if a k-NN approach is needed and the central node can be effectively selected for clustering large amounts of data in various environments, it is expected that a sophisticated machine learning based system can be designed.

An Improved Genetic Algorithm for Fast Face Detection Using Neural Network as Classifier

  • Sugisaka, Masanori;Fan, Xinjian
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.1034-1038 / 2005
  • This paper presents a novel method to speed up neural network (NN) based face detection systems. NN-based face detection can be viewed as a classification and search problem. The proposed method formulates the search problem as an integer nonlinear optimization problem (INLP) and develops an improved genetic algorithm (IGA) to solve it. Each individual in the IGA represents a subwindow in an input image. The subwindows are evaluated by how well they match an NN-based face filter. A face is indicated when the filter response of the best individual is above a given threshold. Experimental results show that the proposed method leads to a speedup of 83 on 320×240 images compared to the traditional exhaustive search method.

The Method of Continuous Nearest Neighbor Search on Trajectory of Moving Objects

  • Park, Bo-Yoon;Kim, Sang-Ho;Nam, Kwang-Woo;Ryo, Keun-Ho
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.467-470 / 2003
  • When a user wants to find the objects nearest to his or her position, a nearest neighbor (NN) query is used. GIS applications such as navigation and traffic control systems require NN query processing for moving objects (MOs). MOs have trajectories, since their positions change over time. Therefore, when processing an NN query for MOs, we should be able to find the NN object, which changes continuously over the whole query time, as well as objects moving nearby along the query trajectory. However, no previous work considers trajectory information between objects. We therefore propose a continuous NN query method for the trajectories of MOs, which we call the CTNN (continuous trajectory NN) technique. By considering trajectory information, it can find an NN object that remains valid throughout the whole query time.

SOMk-NN Search Algorithm for Content-Based Retrieval (내용기반 검색을 위한 SOMk-NN탐색 알고리즘)

  • O, Gun-Seok;Kim, Pan-Gu
    • Journal of KIISE:Databases / v.29 no.5 / pp.358-366 / 2002
  • Feature-based similarity retrieval has become an important research issue in image database systems. The features of image data are useful for discriminating between images. In this paper, we propose a high-speed k-nearest neighbor search algorithm based on Self-Organizing Maps (SOM). A SOM maps high-dimensional feature vectors onto a two-dimensional space and generates a topological feature map. The topological feature map preserves the mutual relations (similarities) of the input data in feature space and clusters mutually similar feature vectors in neighboring nodes. Each node of the topological feature map therefore holds a node vector and the similar images closest to that node vector. We implemented a k-NN search for similar-image classification that (1) accesses the topological feature map and (2) applies a pruning strategy for high-speed search. We evaluated the performance of our algorithm using color feature vectors extracted from images, and the experiments produced promising results.

k-NN Join Based on LSH in Big Data Environment

  • Ji, Jiaqi;Chung, Yeongjee
    • Journal of information and communication convergence engineering / v.16 no.2 / pp.99-105 / 2018
  • k-Nearest neighbor join (k-NN Join) is a computationally intensive algorithm that is designed to find k-nearest neighbors from a dataset S for every object in another dataset R. Most related studies on k-NN Join are based on single-computer operations. As the data dimensions and data volume increase, running the k-NN Join algorithm on a single computer cannot generate results quickly. To solve this scalability problem, we introduce the locality-sensitive hashing (LSH) k-NN Join algorithm implemented in Spark, an approach for high-dimensional big data. LSH is used to map similar data onto the same bucket, which can reduce the data search scope. In order to achieve parallel implementation of the algorithm on multiple computers, the Spark framework is used to accelerate the computation of distances between objects in a cluster. Results show that our proposed approach is fast and accurate for high-dimensional and big data.
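
The bucket idea in this abstract (hash similar objects to the same bucket, then compute distances only inside that bucket) can be illustrated with a small single-machine sketch. It makes several assumptions that the abstract does not confirm: the hash family is the p-stable projection scheme for Euclidean distance, h(v) = floor((a·v + b) / w), only one hash table is used, and no Spark parallelism is shown. The names `make_hasher` and `lsh_knn_join` and the parameters `n_hashes` and `w` are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hasher(dim, n_hashes, w, seed=0):
    """Build a signature function from n_hashes random projections (assumed family)."""
    rng = random.Random(seed)
    params = [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
              for _ in range(n_hashes)]

    def signature(v):
        # One bucket coordinate per projection: floor((a . v + b) / w).
        return tuple(
            math.floor((sum(a_i * v_i for a_i, v_i in zip(a, v)) + b) / w)
            for a, b in params
        )

    return signature


def lsh_knn_join(R, S, k, n_hashes=2, w=4.0, seed=0):
    """For each r in R, return up to k indices into S taken from r's bucket only."""
    signature = make_hasher(len(S[0]), n_hashes, w, seed)

    # Map every object of S to its bucket.
    buckets = defaultdict(list)
    for j, s in enumerate(S):
        buckets[signature(s)].append(j)

    result = []
    for r in R:
        # Distances are computed only against objects sharing r's bucket.
        candidates = sorted(buckets.get(signature(r), []),
                            key=lambda j: math.dist(r, S[j]))
        result.append(candidates[:k])
    return result
```

The result is approximate: a true neighbor that lands in a different bucket is missed, and a sparse bucket can yield fewer than k results. Practical systems usually mitigate this with several hash tables or by probing neighboring buckets.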

A Density-based k-Nearest Neighbors Query Method (밀도 기반의 k-최근접 질의 처리)

  • Jang, In-Sung;Han, Eun-Young;Cho, Dae-Soo
    • Journal of the Korean Association of Geographic Information Studies / v.6 no.4 / pp.59-70 / 2003
  • Spatial database systems provide many query types, most of which require frequent disk I/O and much CPU time. A k-NN search finds the k objects closest to a query point, and several k-NN search methods have been proposed to date. Among these, the MINMAX distance method aims to avoid accessing unnecessary nodes by adopting a pruning technique. However, this method performs more disk accesses than necessary while pruning unnecessary nodes. In this paper, we propose a new k-NN search algorithm based on object density. With this method, we use the density of the data set to predict the radius expected to contain the k nearest objects, search for objects within this radius, and adjust the radius if the search fails. Experimental results show that this method outperforms the previous MINMAX distance method: it performs fewer disk accesses, by up to 22% and 7% on average.

Flexible Nearest Neighbor Search for Grouping kNN (그룹핑 k-NN을 위한 유연한 최근접 객체 검색)

  • Song, Doohee;Park, Kwangjin
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.469-470 / 2015
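  • We propose a Flexible Nearest Neighbor (FNN) search method to support Grouping k Nearest Neighbor (GkNN) queries. Unlike the previously proposed kNN, GkNN searches for the k objects with the smallest total travel path after checking all k objects requested by the querier. The Nearest Neighborhood (NNH) proposed in earlier work was also intended to solve this problem. However, the drawback of NNH is that, because k and p are fixed, the distance from q to C increases in a mobile environment. The FNN setting is similar to the NNH setting. We resolved the problem of NNH by selecting, under NNH, the c_i in the set C that is closest to q, and then comparing the total travel path needed to visit all objects contained in c_i from q with the total travel path of FNN.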