• Title/Summary/Keyword: k-최근접 이웃 알고리즘

Search Result 82, Processing Time 0.025 seconds

A Batch Processing Algorithm for Moving k-Nearest Neighbor Queries in Dynamic Spatial Networks

  • Cho, Hyung-Ju
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.4
    • /
    • pp.63-74
    • /
    • 2021
  • Location-based services (LBSs) are expected to process a large number of spatial queries, such as shortest path and k-nearest neighbor queries that arrive simultaneously at peak periods. Deploying more LBS servers to process these simultaneous spatial queries is a potential solution. However, this significantly increases service operating costs. Recently, batch processing solutions have been proposed to process a set of queries using shareable computation. In this study, we investigate the problem of batch processing moving k-nearest neighbor (MkNN) queries in dynamic spatial networks, where the travel time of each road segment changes frequently based on the traffic conditions. LBS servers based on one-query-at-a-time processing often fail to process simultaneous MkNN queries because of the significant number of redundant computations. We aim to improve the efficiency algorithmically by processing MkNN queries in batches and reusing sharable computations. Extensive evaluation using real-world roadmaps shows the superiority of our solution compared with state-of-the-art methods.

Efficient Path Finding Based on the $A^*$ algorithm for Processing k-Nearest Neighbor Queries in Road Network Databases (도로 네트워크에서 $A^*$ 알고리즘을 이용한 k-최근접 이웃 객체에 대한 효과적인 경로 탐색 방법)

  • Shin, Sung-Hyun;Lee, Sang-Chul;Kim, Sang-Wook;Lee, Jung-Hoon;Im, Eul-Kyu
    • Journal of KIISE:Databases
    • /
    • v.36 no.5
    • /
    • pp.405-410
    • /
    • 2009
  • This paper proposes an efficient path finding scheme capable of searching the paths to k static objects from a given query point, aiming at both improving the legacy k-nearest neighbor search and making it easily applicable to the road network environment. To the end of improving the speed of finding one-to-many paths, the modified A* obviates the duplicated part of node scans involved in the multiple executions of a one-to-one path finding algorithm. Additionally, the cost to the each object found in this step makes it possible to finalize the k objects according to the network distance from the candidate set as well as to order them by the path cost. Experiment results show that the proposed scheme has the accuracy of around 100% and improves the search speed by $1.3{\sim}3.0$ times of k-nearest neighbor searches, compared with INE, post-Dijkstra, and $na{\ddot{i}}ve$ method.

K Nearest Neighbor Joins for Big Data Processing based on Spark (Spark 기반 빅데이터 처리를 위한 K-최근접 이웃 연결)

  • JIAQI, JI;Chung, Yeongjee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.9
    • /
    • pp.1731-1737
    • /
    • 2017
  • K Nearest Neighbor Join (KNN Join) is a simple yet effective method in machine learning. It is widely used in small dataset of the past time. As the number of data increases, it is infeasible to run this model on an actual application by a single machine due to memory and time restrictions. Nowadays a popular batch process model called MapReduce which can run on a cluster with a large number of computers is widely used for large-scale data processing. Hadoop is a framework to implement MapReduce, but its performance can be further improved by a new framework named Spark. In the present study, we will provide a KNN Join implement based on Spark. With the advantage of its in-memory calculation capability, it will be faster and more effective than Hadoop. In our experiments, we study the influence of different factors on running time and demonstrate robustness and efficiency of our approach.

Location Estimation Method Employing Fingerprinting Scheme based on K-Nearest Neighbor Algorithm under WLAN Environment of Ship (선박의 WLAN 환경에서 K-최근접 이웃 알고리즘 기반 Fingerprinting 방식을 적용한 위치 추정 방법)

  • Kim, Beom-Mu;Jeong, Min A;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.10
    • /
    • pp.2530-2536
    • /
    • 2014
  • Many studies have been made on location estimation under indoor environments which GPS signals do not reach, and, as a result, a variety of estimation methods have been proposed. In this paper, we deeply consider a problem of location estimation in a ship with a multi-story structure, and investigate a location estimation method using the fingerprint scheme based on the K-Nearest Neighbor algorithm. A reliable DB is constructed by measuring 100 received signals at each of 39 RPs in order to employ the fingerprint scheme, and, based on the DB, a simulation to estimate the location of a randomly-positioned terminal is performed. The simulation result confirms that the performance of location estimation by the fingerprint scheme is quite satisfactory.

Malware Classification System to Support Decision Making of App Installation on Android OS (안드로이드 OS에서 앱 설치 의사결정 지원을 위한 악성 앱 분류 시스템)

  • Ryu, Hong Ryeol;Jang, Yun;Kwon, Taekyoung
    • Journal of KIISE
    • /
    • v.42 no.12
    • /
    • pp.1611-1622
    • /
    • 2015
  • Although Android systems provide a permission-based access control mechanism and demand a user to decide whether to install an app based on its permission list, many users tend to ignore this phase. Thus, an improved method is necessary for users to intuitively make informed decisions when installing a new app. In this paper, with regard to the permission-based access control system, we present a novel approach based on a machine-learning technique in order to support a user decision-making on the fly. We apply the K-NN (K-Nearest Neighbors) classification algorithm with necessary weighted modifications for malicious app classification, and use 152 Android permissions as features. Our experiment shows a superior classification result (93.5% accuracy) compared to other previous work. We expect that our method can help users make informed decisions at the installation step.

Prototype based Classification by Generating Multidimensional Spheres per Class Area (클래스 영역의 다차원 구 생성에 의한 프로토타입 기반 분류)

  • Shim, Seyong;Hwang, Doosung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.2
    • /
    • pp.21-28
    • /
    • 2015
  • In this paper, we propose a prototype-based classification learning by using the nearest-neighbor rule. The nearest-neighbor is applied to segment the class area of all the training data into spheres within which the data exist from the same class. Prototypes are the center of spheres and their radii are computed by the mid-point of the two distances to the farthest same class point and the nearest another class point. And we transform the prototype selection problem into a set covering problem in order to determine the smallest set of prototypes that include all the training data. The proposed prototype selection method is based on a greedy algorithm that is applicable to the training data per class. The complexity of the proposed method is not complicated and the possibility of its parallel implementation is high. The prototype-based classification learning takes up the set of prototypes and predicts the class of test data by the nearest neighbor rule. In experiments, the generalization performance of our prototype classifier is superior to those of the nearest neighbor, Bayes classifier, and another prototype classifier.

k-Nearest Neighbor Querv Processing using Approximate Indexing in Road Network Databases (도로 네트워크 데이타베이스에서 근사 색인을 이용한 k-최근접 질의 처리)

  • Lee, Sang-Chul;Kim, Sang-Wook
    • Journal of KIISE:Databases
    • /
    • v.35 no.5
    • /
    • pp.447-458
    • /
    • 2008
  • In this paper, we address an efficient processing scheme for k-nearest neighbor queries to retrieve k static objects in road network databases. Existing methods cannot expect a query processing speed-up by index structures in road network databases, since it is impossible to build an index by the network distance, which cannot meet the triangular inequality requirement, essential for index creation, but only possible in a totally ordered set. Thus, these previous methods suffer from a serious performance degradation in query processing. Another method using pre-computed network distances also suffers from a serious storage overhead to maintain a huge amount of pre-computed network distances. To solve these performance and storage problems at the same time, this paper proposes a novel approach that creates an index for moving objects by approximating their network distances and efficiently processes k-nearest neighbor queries by means of the approximate index. For this approach, we proposed a systematic way of mapping each moving object on a road network into the corresponding absolute position in the m-dimensional space. To meet the triangular inequality this paper proposes a new notion of average network distance, and uses FastMap to map moving objects to their corresponding points in the m-dimensional space. After then, we present an approximate indexing algorithm to build an R*-tree, a multidimensional index, on the m-dimensional points of moving objects. The proposed scheme presents a query processing algorithm capable of efficiently evaluating k-nearest neighbor queries by finding k-nearest points (i.e., k-nearest moving objects) from the m-dimensional index. Finally, a variety of extensive experiments verifies the performance enhancement of the proposed approach by performing especially for the real-life road network databases.

k-Nearest Neighbor Learning with Varying Norms (놈(Norm)에 따른 k-최근접 이웃 학습의 성능 변화)

  • Kim, Doo-Hyeok;Kim, Chan-Ju;Hwang, Kyu-Baek
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2008.06c
    • /
    • pp.371-375
    • /
    • 2008
  • 예제 기반 학습(instance-based learning) 방법 중 하나인 k-최근접 이웃(k-nearest reighbor, k-NN) 학습은 간단하고 예측 정확도가 비교적 높아 분류 및 회귀 문제 해결을 위한 기반 방법론으로 널리 적용되고 있다. k-NN 학습을 위한 알고리즘은 기본적으로 유클리드 거리 혹은 2-놈(norm)에 기반하여 학습예제들 사이의 거리를 계산한다. 본 논문에서는 유클리드 거리를 일반화한 개념인 p-놈의 사용이 k-NN 학습의 성능에 어떠한 영향을 미치는지 연구하였다. 구체적으로 합성데이터와 다수의 기계학습 벤치마크 문제 및 실제 데이터에 다양한 p-놈을 적용하여 그 일반화 성능을 경험적으로 조사하였다. 실험 결과, 데이터에 잡음이 많이 존재하거나 문제가 어려운 경우에 p의 값을 작게 하는 것이 성능을 향상시킬 수 있었다.

  • PDF

Machine Learning Model for Predicting the Residual Useful Lifetime of the CNC Milling Insert (공작기계의 절삭용 인서트의 잔여 유효 수명 예측 모형)

  • Won-Gun Choi;Heungseob Kim;Bong Jin Ko
    • Journal of Advanced Navigation Technology
    • /
    • v.27 no.1
    • /
    • pp.111-118
    • /
    • 2023
  • For the implementation of a smart factory, it is necessary to collect data by connecting various sensors and devices in the manufacturing environment and to diagnose or predict failures in production facilities through data analysis. In this paper, to predict the residual useful lifetime of milling insert used for machining products in CNC machine, weight k-NN algorithm, Decision Tree, SVR, XGBoost, Random forest, 1D-CNN, and frequency spectrum based on vibration signal are investigated. As the results of the paper, the frequency spectrum does not provide a reliable criterion for an accurate prediction of the residual useful lifetime of an insert. And the weighted k-nearest neighbor algorithm performed best with an MAE of 0.0013, MSE of 0.004, and RMSE of 0.0192. This is an error of 0.001 seconds of the remaining useful lifetime of the insert predicted by the weighted-nearest neighbor algorithm, and it is considered to be a level that can be applied to actual industrial sites.

A Movie Recommender Systems using Personal Disposition in Hadoop (하둡에서 개인 성향을 이용한 영화 추천시스템)

  • Kim, Sun-Ho;Kim, Se-Jun;Mo, Ha-Young;Kim, Chae-Reen;Park, Gyu-Tae;Park, Doo-Soon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2014.04a
    • /
    • pp.642-644
    • /
    • 2014
  • 정보의 폭발적인 증가로 인해 사용자들은 오히려 원하는 정보를 빠른 시간에 얻는 것이 힘들어졌다. 따라서 이 문제를 해결하기 위한 다양한 방식의 새로운 서비스들이 제공되고 있다. 추천 시스템 중에서 영화를 추천해주는 방법에는 사용되는 알고리즘에는 협업필터링 방법이 가장 성공한 알고리즘으로 사용되고 있다. 협업 필터링 방법은 사용자가 자발적으로 입력한 선호도 평가치를 바탕으로 추천 하고자 하는 사용자와 취향이 비슷하다고 판단되는 사람들 즉, 최근접 이웃을 구하고 최근접 이웃의 선호도 평가치를 바탕으로 사용자에게 영화를 추천을 해주는 기법이다. 그러나 협업 필터링에는 몇 가지 대표적인 문제점이 있으며 희박성 및 확장성, 투명성이 있다. 본 논문에서는 영화 추천 시스템에서의 협업필터링의 희박성 문제를 보완하고자 개개인의 성향을 반영하여 효율이 좋은 추천 방법을 제안하고 하둡에서 성능평가를 하였다.