• Title/Summary/Keyword: k-nearest neighbor

Personalized Expert-Based Recommendation (개인화된 전문가 그룹을 활용한 추천 시스템)

  • Chung, Yeounoh;Lee, Sungwoo;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.1 / pp.7-11 / 2013
  • Using experts' knowledge to recommend items has shown promising results in recommender system research. To improve the performance of existing recommendation algorithms, previous research on expert-based recommender systems has exploited the knowledge of a common expert group for all users. In this paper, we study the problem of identifying personalized experts within a user group, assuming that each user needs different kinds and levels of expert help. To demonstrate this idea, we present a framework that uses a Support Vector Machine (SVM) to find varying expert groups for users. An experiment shows that the proposed SVM approach can identify personalized experts, and that personalized expert-based collaborative filtering (CF) can yield better results than the k-Nearest Neighbor (kNN) algorithm.

A Method of Highspeed Similarity Retrieval based on Self-Organizing Maps (자기 조직화 맵 기반 유사화상 검색의 고속화 수법)

  • Oh, Kun-Seok;Yang, Sung-Ki;Bae, Sang-Hyun;Kim, Pan-Koo
    • The KIPS Transactions:PartB / v.8B no.5 / pp.515-522 / 2001
  • Feature-based similarity retrieval has become an important research issue in image database systems. The features of image data are useful for discriminating between images. In this paper, we propose a high-speed k-Nearest Neighbor search algorithm based on Self-Organizing Maps. A Self-Organizing Map (SOM) provides a mapping from high-dimensional feature vectors onto a two-dimensional space. The resulting topological feature map preserves the mutual relations (similarity) of the input data in feature space and clusters mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a node vector and the similar images that are closest to that node vector. We implemented k-NN search for similar-image classification in two steps: (1) accessing the topological feature map, and (2) applying a pruning strategy for high-speed search. We evaluate the performance of our algorithm using color feature vectors extracted from images. Promising results have been obtained in experiments.
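
The node-and-bucket search described in this abstract can be illustrated with a minimal sketch. It assumes the node vectors are already trained (SOM training is omitted) and uses a triangle-inequality bound as the pruning rule; that rule, and the names `build_buckets` and `knn_search`, are illustrative assumptions, not the authors' published strategy.

```python
import heapq
import math

def build_buckets(node_vectors, items):
    """Assign each item to its closest node vector and record each bucket's radius."""
    buckets = [[] for _ in node_vectors]
    radii = [0.0] * len(node_vectors)
    for item in items:
        j = min(range(len(node_vectors)),
                key=lambda i: math.dist(node_vectors[i], item))
        buckets[j].append(item)
        radii[j] = max(radii[j], math.dist(node_vectors[j], item))
    return buckets, radii

def knn_search(query, k, node_vectors, buckets, radii):
    """Exact k-NN: visit nodes nearest-first and skip buckets that cannot help."""
    best = []  # max-heap of (-distance, item), holding at most k entries
    order = sorted(range(len(node_vectors)),
                   key=lambda i: math.dist(node_vectors[i], query))
    for i in order:
        # Triangle inequality: no item in bucket i is closer than this bound.
        lower_bound = math.dist(node_vectors[i], query) - radii[i]
        if len(best) == k and lower_bound >= -best[0][0]:
            continue  # assumed pruning rule: bucket cannot beat the k-th best
        for item in buckets[i]:
            d = math.dist(item, query)
            if len(best) < k:
                heapq.heappush(best, (-d, item))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, item))
    # Return items ordered from nearest to farthest.
    return [item for _, item in sorted(best, key=lambda t: -t[0])]
```

Because the bound is conservative, the sketch returns the same neighbors as a full scan; the saving comes from the buckets it never opens.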

Reverse k-Nearest Neighbor Query Processing Method for Continuous Query Processing in Bigdata Environments (빅데이터 환경에서 연속 질의 처리를 위한 리버스 k-최근접 질의 처리 기법)

  • Lim, Jongtae;Park, Sunyong;Seo, Kiwon;Lee, Minho;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.14 no.10 / pp.454-462 / 2014
  • With the development of location-aware technologies and mobile devices, location-based services have been widely studied. To provide location-based services, many researchers have proposed methods for processing various query types with MapReduce (MR). One of the proposed methods is a Reverse k-nearest neighbor (RkNN) query processing method with MR. However, the existing methods incur too high a cost when processing continuous RkNN queries. In this paper, we propose an efficient continuous RkNN query processing method with MR that resolves the problems of the existing methods. The proposed method uses the 60-degree-pruning method. It does not need to reprocess the query for continuous query processing, because it draws and monitors a monitoring area that includes the candidate objects of an RkNN query. To show the superiority of the proposed method, we compare its query processing performance with that of the existing method.
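
As a rough illustration of 60-degree pruning, the sketch below uses the classical six-sector idea: the plane around the query is split into six 60-degree sectors, and in each sector only the k objects nearest to the query can be reverse k-nearest neighbors. It is a single-machine, one-shot sketch that assumes distinct 2D points; the MapReduce distribution and the continuous monitoring area from the abstract are not modeled, and the function names are hypothetical.

```python
import math

def sector_of(q, p):
    """Index (0-5) of the 60-degree sector around q that contains p."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    return min(int(angle // (math.pi / 3)), 5)

def rknn_candidates(q, points, k):
    """Pruning step: keep only the k points nearest to q in each sector."""
    sectors = [[] for _ in range(6)]
    for p in points:
        sectors[sector_of(q, p)].append(p)
    candidates = []
    for members in sectors:
        members.sort(key=lambda p: math.dist(q, p))
        candidates.extend(members[:k])
    return candidates

def rknn(q, points, k):
    """Verification step: c is an RkNN of q if fewer than k points beat q."""
    result = []
    for c in rknn_candidates(q, points, k):
        d_q = math.dist(c, q)
        # Count the other objects that lie strictly closer to c than q does.
        closer = sum(1 for o in points if o != c and math.dist(c, o) < d_q)
        if closer < k:
            result.append(c)
    return result
```

At most 6k candidates reach the verification step, no matter how many objects are stored.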

Design of an Efficient Parallel High-Dimensional Index Structure (효율적인 병렬 고차원 색인구조 설계)

  • Park, Chun-Seo;Song, Seok-Il;Sin, Jae-Ryong;Yu, Jae-Su
    • Journal of KIISE:Databases / v.29 no.1 / pp.58-71 / 2002
  • Generally, multi-dimensional data such as image and spatial data require a large amount of storage space. A single workstation can store and manage only a limited amount of such data. Managing the data in a parallel computing environment, an area of active research, can improve performance substantially. In this paper, we propose a parallel high-dimensional index structure that exploits the parallelism of the parallel computing environment. The proposed index structure is an nP(processor)-n$\times$mD(disk) architecture, which is a hybrid of the nP-nD and 1P-nD types. Its node structure increases fan-out and reduces the height of an index tree. In addition, a range search algorithm that maximizes I/O parallelism is devised and applied to K-nearest neighbor queries. Various experiments show that the proposed method outperforms other parallel index structures.

Estimation of Aboveground Biomass Carbon Stock in Danyang Area using kNN Algorithm and Landsat TM Seasonal Satellite Images (kNN 알고리즘과 계절별 Landsat TM 위성영상을 이용한 단양군 지역의 지상부 바이오매스 탄소저장량 추정)

  • Jung, Jae-Hoon;Heo, Joon;Yoo, Su-Hong;Kim, Kyung-Min;Lee, Jung-Bin
    • Journal of Korean Society for Geospatial Information Science / v.18 no.4 / pp.119-129 / 2010
  • The joint use of remotely sensed data and field measurements has been widely adopted to estimate aboveground carbon stock in many countries. Recently, the Korea Forest Research Institute developed new carbon emission factors for each tree species, making more accurate estimates possible. In this study, the aboveground carbon stock of the Danyang area in South Korea was estimated using the k-Nearest Neighbor (kNN) algorithm with the 5th National Forest Inventory (NFI) data. Considering the spectral response of forested areas under the climate of the Korean Peninsula, which has four distinct seasons, seasonal Landsat TM satellite images were collected. As a result, the estimated total carbon stock of the Danyang area ranged from 3329037.51 tonC to 3542768.49 tonC, but no seasonal trend was found.

K Nearest Neighbor Joins for Big Data Processing based on Spark (Spark 기반 빅데이터 처리를 위한 K-최근접 이웃 연결)

  • JIAQI, JI;Chung, Yeongjee
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.9 / pp.1731-1737 / 2017
  • K Nearest Neighbor Join (KNN Join) is a simple yet effective method in machine learning. It has been widely used on small datasets. As data volumes grow, running this model on a single machine in a real application becomes infeasible because of memory and time restrictions. MapReduce, a popular batch processing model that runs on a cluster with a large number of computers, is now widely used for large-scale data processing. Hadoop is a framework that implements MapReduce, but its performance can be further improved by a newer framework named Spark. In the present study, we provide a KNN Join implementation based on Spark. Thanks to its in-memory computation capability, it is faster and more effective than Hadoop. In our experiments, we study the influence of different factors on running time and demonstrate the robustness and efficiency of our approach.
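
For readers unfamiliar with the operation itself, a KNN join pairs every point of one dataset R with its k nearest points in another dataset S. The brute-force sketch below shows only that definition on a single machine; it is not the Spark implementation from the paper, and `knn_join` is a hypothetical name.

```python
import heapq
import math

def knn_join(R, S, k):
    """For each point r in R, return the k points of S nearest to r."""
    return {
        r: heapq.nsmallest(k, S, key=lambda s: math.dist(r, s))
        for r in R
    }

# Small usage example with 2D points.
R = [(0, 0), (10, 10)]
S = [(1, 1), (2, 2), (9, 9), (20, 20)]
joined = knn_join(R, S, k=2)
# joined[(0, 0)] holds (1, 1) and (2, 2), nearest first.
```

The cost is |R| x |S| distance computations, which is why distributed versions partition the data so that each worker handles only a part of R and the relevant part of S.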

Development of Data Mining Algorithm for Implementation of Fine Dust Numerical Prediction Model (미세먼지 수치 예측 모델 구현을 위한 데이터마이닝 알고리즘 개발)

  • Cha, Jinwook;Kim, Jangyoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.4 / pp.595-601 / 2018
  • Recently, as fine dust levels have risen rapidly, interest in the problem has grown. Exposure to fine dust is associated with the development of respiratory and cardiovascular diseases and has been reported to increase the death rate. In addition, damage caused by fine dust continues at industrial sites. However, exposure to fine dust is inevitable in modern life. Therefore, predicting and minimizing exposure to fine dust is the most efficient way to reduce health and industrial damage. Existing fine dust prediction models report a category (good, normal, poor, or very bad) based on the concentration range of the fine dust rather than the concentration value itself. In this paper, we study and implement PM10 level prediction by applying two machine learning algorithms, the artificial neural network and the K-Nearest Neighbor algorithm, to actual weather and air quality data.

Machine Learning Model for Predicting the Residual Useful Lifetime of the CNC Milling Insert (공작기계의 절삭용 인서트의 잔여 유효 수명 예측 모형)

  • Won-Gun Choi;Heungseob Kim;Bong Jin Ko
    • Journal of Advanced Navigation Technology / v.27 no.1 / pp.111-118 / 2023
  • To implement a smart factory, it is necessary to collect data by connecting various sensors and devices in the manufacturing environment and to diagnose or predict failures in production facilities through data analysis. In this paper, to predict the residual useful lifetime of a milling insert used for machining products in a CNC machine, we investigate the weighted k-NN algorithm, Decision Tree, SVR, XGBoost, Random Forest, 1D-CNN, and the frequency spectrum of the vibration signal. The results show that the frequency spectrum does not provide a reliable criterion for accurately predicting the residual useful lifetime of an insert. The weighted k-nearest neighbor algorithm performed best, with an MAE of 0.0013, MSE of 0.004, and RMSE of 0.0192. This corresponds to an error of 0.001 seconds in the remaining useful lifetime of the insert predicted by the weighted k-nearest neighbor algorithm, a level considered applicable to actual industrial sites.
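
A weighted k-nearest neighbor regressor and the three error metrics quoted above can be sketched as follows. The abstract does not state the weighting scheme, so inverse-distance weighting is assumed here, and the function names and the `eps` guard are illustrative choices.

```python
import math

def weighted_knn_predict(train_X, train_y, x, k, eps=1e-12):
    """Predict a value as the inverse-distance-weighted mean of the k nearest targets."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    # eps avoids division by zero when x coincides with a training point.
    weights = [1.0 / (math.dist(xi, x) + eps) for xi, _ in nearest]
    weighted_sum = sum(w * yi for w, (_, yi) in zip(weights, nearest))
    return weighted_sum / sum(weights)

def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for paired true and predicted values."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return mae, mse, math.sqrt(mse)
```

RMSE is by definition the square root of MSE, so the three metrics can be computed in one pass over the predictions.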

Metalevel Data Mining through Multiple Classifier Fusion (다수 분류기를 이용한 메타레벨 데이터마이닝)

  • 김형관;신성우
    • Proceedings of the Korean Information Science Society Conference / 1999.10b / pp.551-553 / 1999
  • This paper explores the utility of a new classifier fusion approach to discrimination. Multiple classifier fusion, a popular approach in the field of pattern recognition, uses estimates of each individual classifier's local accuracy on training data sets. In this paper we investigate the effectiveness of fusion methods compared to individual algorithms, including the artificial neural network and k-nearest neighbor techniques. Moreover, we propose an efficient meta-classifier architecture based on an approximation of the posterior Bayes probabilities for learning the oracle.

A Nonparametric Procedure for Bioassay by using Conditional Quantile Processes

  • Kim, Ho
    • Communications for Statistical Applications and Methods / v.3 no.3 / pp.179-186 / 1996
  • Bioequivalence models typically arise in bioassays when new preparations are compared against standard ones by means of responses on some biological organisms. Relative potency measures provide natural interpretations of such bioequivalence, and their estimation is the prime interest of such studies. A conditional quantile process based on the k-nearest neighbor method is proposed for this purpose. An alternative procedure based on a Kolmogorov-Smirnov type estimator is also considered. ARIC ultrasound data are analyzed as an example.
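
One common way to build a k-nearest neighbor conditional quantile estimate is to take the empirical quantile of the responses observed at the k covariate values closest to the point of interest. The sketch below shows that generic construction for a scalar covariate; it is an assumption about the general idea, not the specific process or the bioassay procedure defined in the paper.

```python
import math

def knn_conditional_quantile(xs, ys, x0, k, p):
    """Empirical p-quantile of the responses at the k covariates nearest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    responses = sorted(y for _, y in nearest)
    n = len(responses)
    # Smallest order statistic whose empirical CDF value is at least p.
    index = max(0, math.ceil(p * n) - 1)
    return responses[index]
```

Evaluating the function over a grid of `p` values at a fixed `x0` traces out an estimated conditional quantile curve.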