• Title/Summary/Keyword: nearest neighbor method (최근접 이웃 방법)


A Hashing Method Using PCA-based Clustering (PCA 기반 군집화를 이용한 해슁 기법)

  • Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.3 no.6 / pp.215-218 / 2014
  • Hashing-based methods for approximate nearest neighbor (ANN) search map data points to k-bit binary codes and search for nearest neighbors in the resulting binary embedding space. In this paper, we present a hashing method that uses a PCA-based clustering method, Principal Direction Divisive Partitioning (PDDP). PDDP is a clustering method that repeatedly partitions the cluster with the largest variance into two clusters along its first principal direction. The proposed hashing method uses this first principal direction as the projection direction for binary coding. Experimental results demonstrate that the proposed method is competitive with other hashing methods.
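The idea in this abstract can be sketched roughly as follows. This is only one possible reading, not the authors' implementation: each PDDP split contributes one bit, given by the sign of the projection onto the split cluster's first principal direction. The function names, the power-iteration routine, and the use of total scatter as the "variance" criterion are all assumptions made for illustration.

```python
import math

def mean(points):
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def project(p, mu, v):
    """Signed coordinate of p along direction v, measured from mu."""
    return sum((x - m) * d for x, m, d in zip(p, mu, v))

def scatter(points):
    """Total squared deviation from the mean (assumed variance criterion)."""
    mu = mean(points)
    return sum((x - m) ** 2 for p in points for x, m in zip(p, mu))

def first_principal_direction(points, iters=200):
    """Power iteration on the scatter matrix of the centered points."""
    mu = mean(points)
    dim = len(mu)
    v = [1.0 / (j + 1) for j in range(dim)]  # fixed start, for determinism
    for _ in range(iters):
        proj = [project(p, mu, v) for p in points]
        w = [sum(c * (p[j] - mu[j]) for c, p in zip(proj, points))
             for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate cluster: keep the current direction
            break
        v = [x / norm for x in w]
    return mu, v

def pddp_hash_functions(points, k):
    """Run k PDDP splits; each split yields one (mean, direction) hash function."""
    clusters = [list(points)]
    hashers = []
    for _ in range(k):
        clusters.sort(key=scatter)
        target = clusters.pop()  # cluster with the largest scatter
        mu, v = first_principal_direction(target)
        hashers.append((mu, v))
        left = [p for p in target if project(p, mu, v) <= 0]
        right = [p for p in target if project(p, mu, v) > 0]
        clusters.extend(c for c in (left, right) if c)
    return hashers

def encode(p, hashers):
    """k-bit binary code: one sign bit per stored principal direction."""
    return [1 if project(p, mu, v) > 0 else 0 for mu, v in hashers]
```

Points whose codes have a small Hamming distance are then treated as candidate neighbors; how the original paper combines the bits is not stated in the abstract.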

Suspicious Process Detection Based on Nearest Neighbors (최근접 이웃 방법에 기반한 비정상 프로세스의 검출)

  • Dongho Jeong;Sangchul Song;Sang-Wook Kim
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.392-393 / 2023
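  • The sharp yearly increase in malware is causing growing damage even to organizations with many PCs, such as companies and public institutions. A technique that identifies processes that behaved abnormally from the traces of a malware intrusion could contribute to cybersecurity, for example in judging whether a PC has been compromised and in post-incident response. This study presents a method that uses the nearest neighbor approach to detect suspicious processes in system memory data. Experiments show that the proposed method achieves strong performance in accuracy and several other metrics.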

Prototype based Classification by Generating Multidimensional Spheres per Class Area (클래스 영역의 다차원 구 생성에 의한 프로토타입 기반 분류)

  • Shim, Seyong;Hwang, Doosung
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.21-28 / 2015
  • In this paper, we propose a prototype-based classification learning method that uses the nearest-neighbor rule. The nearest-neighbor rule is applied to segment the class area of the training data into spheres, each of which contains only data from the same class. Prototypes are the centers of the spheres, and each radius is the midpoint of two distances: the distance to the farthest point of the same class and the distance to the nearest point of another class. We then transform the prototype selection problem into a set covering problem in order to determine the smallest set of prototypes that covers all the training data. The proposed prototype selection method is based on a greedy algorithm that is applied to the training data of each class. The proposed method has low complexity and is well suited to parallel implementation. The prototype-based classifier takes the set of prototypes and predicts the class of test data by the nearest-neighbor rule. In experiments, the generalization performance of our prototype classifier is superior to that of the nearest-neighbor classifier, the Bayes classifier, and another prototype classifier.
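The sphere construction and greedy set-cover selection described above might look like the following minimal sketch. It rests on several assumptions the abstract does not confirm: every training point is tried as a candidate center, the "farthest same-class point" is taken among the points closer than the nearest other-class point, and prediction uses plain distance to prototype centers. All function names are hypothetical, and at least two classes are assumed.

```python
import math

def build_spheres(X, y):
    """One candidate sphere per training point: (center, radius, label, covered)."""
    spheres = []
    for i, (c, label) in enumerate(zip(X, y)):
        # Distance to the nearest point of another class.
        d_other = min(math.dist(c, x) for x, l in zip(X, y) if l != label)
        # Same-class points strictly inside that distance (plus the center).
        covered = {i} | {j for j, (x, l) in enumerate(zip(X, y))
                         if l == label and math.dist(c, x) < d_other}
        d_same = max((math.dist(c, X[j]) for j in covered), default=0.0)
        radius = (d_same + d_other) / 2  # midpoint of the two distances
        spheres.append((c, radius, label, covered))
    return spheres

def select_prototypes(X, y):
    """Greedy set cover: repeatedly pick the sphere covering most uncovered points."""
    spheres = build_spheres(X, y)
    uncovered = set(range(len(X)))
    chosen = []
    while uncovered:
        best = max(spheres, key=lambda s: len(s[3] & uncovered))
        chosen.append(best[:3])
        uncovered -= best[3]
    return chosen

def predict(prototypes, x):
    """Nearest-neighbor rule over prototype centers (an assumed reading)."""
    _, _, label = min(prototypes, key=lambda p: math.dist(p[0], x))
    return label
```

Because each sphere covers at least its own center, the greedy loop always terminates.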

k-Nearest Neighbor Learning with Varying Norms (놈(Norm)에 따른 k-최근접 이웃 학습의 성능 변화)

  • Kim, Doo-Hyeok;Kim, Chan-Ju;Hwang, Kyu-Baek
    • Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.371-375 / 2008
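  • k-nearest neighbor (k-NN) learning, an instance-based learning method, is simple and relatively accurate, and is therefore widely applied as a baseline methodology for classification and regression problems. Algorithms for k-NN learning normally compute the distance between training examples using the Euclidean distance, that is, the 2-norm. This paper studies how using the p-norm, a generalization of the Euclidean distance, affects the performance of k-NN learning. Specifically, we applied various p-norms to synthetic data, a number of machine learning benchmark problems, and real data, and empirically examined the resulting generalization performance. The experiments show that when the data are very noisy or the problem is difficult, using a smaller value of p can improve performance.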
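As a small illustration of the setting studied here, the sketch below runs k-NN classification with a configurable p-norm (Minkowski) distance, where p = 2 gives the Euclidean distance. The function names and the majority-vote rule are assumptions for illustration, not the paper's code.

```python
from collections import Counter

def minkowski(a, b, p):
    """p-norm of the difference vector; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_X, train_y, query, k=3, p=2.0):
    """Majority vote among the k training examples nearest under the p-norm."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Varying `p` while holding `k` fixed reproduces the kind of comparison the abstract describes.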


Fast Access Method of Neighboring Particles Using Bitonic Sort Based GPU Hashing, and Its Applications (바이토닉 정렬 기반의 GPU 해싱을 이용한 인접 입자의 빠른 접근 기법과 그 응용 사례)

  • Lee, SuBin;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.357-360 / 2022
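  • This paper examines how to design, on the GPU, a bitonic sort-based hash table that quickly solves the nearest neighbor search (NNS) problem, that is, fast access to neighboring data in large datasets, and how this structure can accelerate particle-based physics simulation. The hash table is designed with the CUDA architecture. By optimizing the data sorting stage, which has the highest computational cost, it achieves faster results than the CUDA hash table provided by NVIDIA, and integrating this data structure into particle-based simulation makes it easy to build high-performance simulations.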
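The general pattern behind sort-based neighbor lookup can be sketched on the CPU as below. This is not the paper's CUDA design: it is a plain Python illustration that assumes a uniform 2D grid, uses the cell coordinates themselves as the key, and pads the key array to a power of two so that a textbook bitonic sorting network applies. All names are hypothetical.

```python
import math

def bitonic_sort(items):
    """Bitonic sorting network; len(items) must be a power of two."""
    a = list(items)
    n = len(a)
    k = 2
    while k <= n:
        j = k // 2
        while j > 0:
            for i in range(n):  # on a GPU, each i would be one thread
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (ascending and a[i] > a[partner]) or \
                       (not ascending and a[i] < a[partner]):
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

def cell_of(point, cell):
    return (math.floor(point[0] / cell), math.floor(point[1] / cell))

def build_grid(points, cell):
    """Sort (cell key, particle index) pairs and record where each cell starts."""
    keyed = [(cell_of(p, cell), i) for i, p in enumerate(points)]
    size = 1
    while size < len(keyed):
        size *= 2
    sentinel = ((math.inf, math.inf), -1)  # padding, sorts after real keys
    keyed = bitonic_sort(keyed + [sentinel] * (size - len(keyed)))[:len(points)]
    starts = {}
    for pos, (key, _) in enumerate(keyed):
        starts.setdefault(key, pos)
    return keyed, starts

def neighbors(points, keyed, starts, query, cell, radius):
    """Indices of points within radius of query; assumes radius <= cell."""
    cx, cy = cell_of(query, cell)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            key = (cx + dx, cy + dy)
            pos = starts.get(key)
            while pos is not None and pos < len(keyed) and keyed[pos][0] == key:
                idx = keyed[pos][1]
                if math.dist(points[idx], query) <= radius:
                    found.append(idx)
                pos += 1
    return sorted(found)
```

Sorting by cell key places particles of the same cell contiguously, so a neighbor query only scans the 3x3 block of cells around the query point.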


Location Estimation Method Employing Fingerprinting Scheme based on K-Nearest Neighbor Algorithm under WLAN Environment of Ship (선박의 WLAN 환경에서 K-최근접 이웃 알고리즘 기반 Fingerprinting 방식을 적용한 위치 추정 방법)

  • Kim, Beom-Mu;Jeong, Min A;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.10 / pp.2530-2536 / 2014
  • Many studies have addressed location estimation in indoor environments that GPS signals do not reach, and a variety of estimation methods have been proposed as a result. In this paper, we examine the problem of location estimation in a ship with a multi-story structure, and investigate a location estimation method that uses the fingerprinting scheme based on the K-Nearest Neighbor algorithm. To employ the fingerprinting scheme, a reliable DB is constructed by measuring 100 received signals at each of 39 reference points (RPs), and, based on this DB, a simulation is performed to estimate the location of a randomly positioned terminal. The simulation result confirms that the location estimation performance of the fingerprinting scheme is quite satisfactory.
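A minimal sketch of K-NN fingerprinting follows, under assumptions the abstract does not spell out: the repeated measurements at each RP are averaged into one signal-strength vector, similarity is Euclidean distance in signal space, and the estimate is the unweighted centroid of the K nearest RPs. The function names and data layout are hypothetical.

```python
import math

def build_fingerprint_db(measurements):
    """Map each RP position to the mean of its received-signal vectors."""
    db = {}
    for position, samples in measurements.items():
        n = len(samples)
        db[position] = [sum(s[j] for s in samples) / n
                        for j in range(len(samples[0]))]
    return db

def estimate_location(db, observed, k=3):
    """Average the positions of the k RPs closest to `observed` in signal space."""
    nearest = sorted(db.items(),
                     key=lambda item: math.dist(item[1], observed))[:k]
    dims = len(nearest[0][0])
    return tuple(sum(pos[d] for pos, _ in nearest) / len(nearest)
                 for d in range(dims))
```

In the setting of the abstract, `measurements` would hold 100 sample vectors for each of the 39 RPs.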

A study on the spatial neighborhood in spatial regression analysis (공간이웃정보를 고려한 공간회귀분석)

  • Kim, Sujung
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.505-513 / 2017
  • Recently, numerous small area estimation studies have been conducted to obtain more detailed and accurate estimation results. Most of these studies have employed spatial regression models, which require a clear definition of spatial neighborhoods. In this study, we introduce Delaunay triangulation as a method for defining spatial neighborhoods and compare it with the k-nearest neighbor method. A simulation was conducted to determine which of the two methods defines spatial neighborhoods more efficiently, and we demonstrate the performance of the proposed method using land price data.

k-Nearest Neighbor Query Processing using Approximate Indexing in Road Network Databases (도로 네트워크 데이타베이스에서 근사 색인을 이용한 k-최근접 질의 처리)

  • Lee, Sang-Chul;Kim, Sang-Wook
    • Journal of KIISE:Databases / v.35 no.5 / pp.447-458 / 2008
  • In this paper, we address an efficient processing scheme for k-nearest neighbor queries that retrieve k static objects in road network databases. Existing methods cannot speed up query processing with index structures in road network databases, because an index cannot be built on the network distance: it does not satisfy the triangle inequality, a requirement for index creation, which is possible only in a totally ordered set. These methods therefore suffer from serious performance degradation in query processing. Another method, which uses pre-computed network distances, suffers from a serious storage overhead because it must maintain a huge number of pre-computed distances. To solve these performance and storage problems at the same time, this paper proposes a novel approach that creates an index for moving objects by approximating their network distances and efficiently processes k-nearest neighbor queries by means of the approximate index. For this approach, we propose a systematic way of mapping each moving object on a road network to a corresponding absolute position in an m-dimensional space. To satisfy the triangle inequality, this paper proposes a new notion of average network distance and uses FastMap to map moving objects to their corresponding points in the m-dimensional space. We then present an approximate indexing algorithm that builds an R*-tree, a multidimensional index, on the m-dimensional points of the moving objects. The proposed scheme includes a query processing algorithm that efficiently evaluates k-nearest neighbor queries by finding the k nearest points (i.e., the k nearest moving objects) in the m-dimensional index. Finally, extensive experiments, in particular on real-life road network databases, verify the performance improvement of the proposed approach.

A Movie Recommender Systems using Personal Disposition in Hadoop (하둡에서 개인 성향을 이용한 영화 추천시스템)

  • Kim, Sun-Ho;Kim, Se-Jun;Mo, Ha-Young;Kim, Chae-Reen;Park, Gyu-Tae;Park, Doo-Soon
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.642-644 / 2014
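  • The explosive growth of information has, paradoxically, made it harder for users to find the information they want quickly. Various new services have therefore been offered to solve this problem. Among recommender systems, collaborative filtering is the most successful algorithm used for movie recommendation. Collaborative filtering uses the preference ratings that users voluntarily enter to find people judged to have tastes similar to the target user, that is, the nearest neighbors, and recommends movies to the user based on the nearest neighbors' preference ratings. However, collaborative filtering has several well-known problems: sparsity, scalability, and transparency. To mitigate the sparsity problem of collaborative filtering in movie recommender systems, this paper proposes an efficient recommendation method that reflects individual dispositions, and evaluates its performance on Hadoop.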
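The baseline user-based collaborative filtering described in this abstract can be sketched as below. It does not include the paper's personal-disposition extension or any Hadoop code. The cosine similarity measure, the similarity-weighted average, and all names are assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two users' {movie: rating} dictionaries."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(ratings, user, k=2, top_n=3):
    """Rank unseen movies by the similarity-weighted ratings of the k nearest users."""
    sims = {other: cosine(ratings[user], ratings[other])
            for other in ratings if other != user}
    nearest = sorted(sims, key=sims.get, reverse=True)[:k]
    totals, weights = {}, {}
    for other in nearest:
        if sims[other] <= 0:
            continue
        for movie, rating in ratings[other].items():
            if movie not in ratings[user]:
                totals[movie] = totals.get(movie, 0.0) + sims[other] * rating
                weights[movie] = weights.get(movie, 0.0) + sims[other]
    ranked = sorted(totals, key=lambda m: totals[m] / weights[m], reverse=True)
    return ranked[:top_n]
```

The sparsity problem mentioned in the abstract shows up here when few movies are rated in common, which makes the similarities unreliable.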

Efficient Nearest Neighbor Search on Moving Object Trajectories (이동객체궤적에 대한 효율적인 최근접이웃검색)

  • Kim, Gyu-Jae;Park, Young-Hee;Cho, Woo-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.12 / pp.2919-2925 / 2014
  • With the rapid growth of mobile and wireless communication, location-based services are used in many applications, and the management and analysis of spatio-temporal data have become a hot issue in database research. Index structures and query processing for such data are very important for these applications. This paper addresses algorithms that build an index structure using the Douglas-Peucker algorithm and efficiently process nearest neighbor search queries on moving object trajectories. We compare and analyze our algorithms through experiments. Our algorithms produce a small index structure and process queries more efficiently.
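For reference, the Douglas-Peucker simplification step named in this abstract can be sketched as follows. How the paper builds its index from the simplified trajectories is not described, so only the simplification is shown. The sketch assumes 2D points and measures distance to the chord as a segment, a common variant of the perpendicular-distance formulation.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, eps):
    """Keep the endpoints and, recursively, any point farther than eps from the chord."""
    if len(points) < 3:
        return list(points)
    dmax, idx = max(
        (point_segment_distance(points[i], points[0], points[-1]), i)
        for i in range(1, len(points) - 1)
    )
    if dmax <= eps:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:idx + 1], eps)
    right = douglas_peucker(points[idx:], eps)
    return left[:-1] + right  # drop the duplicated split point
```

A larger `eps` keeps fewer points per trajectory, which is what would make an index built on the simplified trajectories smaller.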