• 제목/요약/키워드: k nearest neighbor approach

검색결과 95건 처리시간 0.027초

A Stochastic Approach for Prediction of Partially Measured Concentrations of Benzo[a]pyrene in the Ambient Air in Korea

  • Kim, Yongku;Seo, Young-Kyo;Baek, Kyung-Min;Kim, Min-Ji;Baek, Sung-Ok
    • Asian Journal of Atmospheric Environment
    • /
    • 제10권4호
    • /
    • pp.197-207
    • /
    • 2016
  • Large quantities of air pollutants are released into the atmosphere and hence, must be monitored and routinely assessed for their health implications. This paper proposes a stochastic technique to predict unobserved hazardous air pollutants (HAPs), especially Benzo[a]pyrene (BaP), which can have negative effects on human health. The proposed approach constructs a nearest-neighbor structure by incorporating the linkage between BaP and meteorology and meteorological effects. This approach is adopted in order to predict unobserved BaP concentrations based on observed (or forecasted) meteorological conditions, including temperature, precipitation, wind speed, and air quality. The effects of BaP on human health are examined by characterizing the cancer risk. The efficient prediction provides useful information relating to the optimal monitoring period and projections of future BaP concentrations for both industrial and residential areas within Korea.

기어의 이상검지 및 진단에 관한 연구 -Wavelet Transform해석과 KDI의 비교- (A Study on Fault Detection and Diagnosis of Gear Damages - A Comparison between Wavelet Transform Analysis and Kullback Discrimination Information -)

  • 김태구;김광일
    • 한국안전학회지
    • /
    • 제15권2호
    • /
    • pp.1-7
    • /
    • 2000
  • This paper presents the approach involving fault detection and diagnosis of gears using pattern recognition and Wavelet transform. It describes result of the comparison between KDI (Kullback Discrimination Information) with the nearest neighbor classification rule as one of pattern recognition methods and Wavelet transform to know a way to detect and diagnosis of gear damages experimentally. To model the damages 1) Normal (no defect), 2) one tooth is worn out, 3) All teeth faces are worn out 4) One tooth is broken. The vibration sensor was attached on the bearing housing. This produced the total time history data that is 20 pieces of each condition. We chose the standard data and measure distance between standard and tested data. In Wavelet transform analysis method, the time series data of magnitude in specified frequency (rotary and mesh frequency) were earned. As a result, the monitoring system using Wavelet transform method and KDI with nearest neighbor classification rule successfully detected and classified the damages from the experimental data.

  • PDF

범주형 시퀀스 데이터의 K-Nearest Neighbor알고리즘 (A K-Nearest Neighbor Algorithm for Categorical Sequence Data)

  • 오승준
    • 한국컴퓨터정보학회논문지
    • /
    • 제10권2호
    • /
    • pp.215-221
    • /
    • 2005
  • 최근에는 단백질 시퀀스, 소매점 거래 데이터, 웹 로그 등과 같은 상업적이거나 과학적인 데이터의 폭발적인 증가를 볼 수 있다. 이런 데이터들은 순서적인 면을 가지고 있는 시퀀스 데이터들이다. 본 논문에서는 이런 시퀀스 데이터들을 분류하는 문제를 다룬다. 분류 기법 으로는 의사결정 나무나 베이지안 분류기, K-NN방법 등 석러 종류가 있는데, 본 연구에서는 또-U방법을 이용하여 시퀀스들을 분류한다. 또한, 시퀀스들간의 유사도를 구하기 위한 새로운 계산 방법과 효율적인 계산 방법도 제안한다.

  • PDF

사각형 특징 기반 분류기와 AdaBoost 를 이용한 실시간 얼굴 검출 및 인식 (Real-time Face Detection and Recognition using Classifier Based on Rectangular Feature and AdaBoost)

  • 김종민;이웅기
    • 통합자연과학논문집
    • /
    • 제1권2호
    • /
    • pp.133-139
    • /
    • 2008
  • Face recognition technologies using PCA(principal component analysis) recognize faces by deciding representative features of faces in the model image, extracting feature vectors from faces in a image and measuring the distance between them and face representation. Given frequent recognition problems associated with the use of point-to-point distance approach, this study adopted the K-nearest neighbor technique(class-to-class) in which a group of face models of the same class is used as recognition unit for the images inputted on a continual input image. This paper proposes a new PCA recognition in which database of faces.

  • PDF

드론 배달 경로를 위한 효율적인 휴리스틱 알고리즘 (Efficient Heuristic Algorithms for Drone Package Delivery Route)

  • 요나탄;테메스겐;김재훈
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2016년도 춘계학술발표대회
    • /
    • pp.168-170
    • /
    • 2016
  • Drone package delivery routing problem is realistic problem used to find efficient route of drone package delivery service. In this paper, we present an approach for solving drone routing problem for package delivery service using two different heuristics algorithms, genetic and nearest neighbor. We implement and analyze both heuristics algorithms for solving the problem efficiently with respect to cost and time. The respective experimental results show that for the range of customers 10 to 50 nearest neighbor and genetic algorithms can reduce the tour length on average by 34% and 40% respectively comparing to FIFO algorithm.

Guitar Tab Digit Recognition and Play using Prototype based Classification

  • Baek, Byung-Hyun;Lee, Hyun-Jong;Hwang, Doosung
    • 한국컴퓨터정보학회논문지
    • /
    • 제21권9호
    • /
    • pp.19-25
    • /
    • 2016
  • This paper is to recognize and play tab chords from guitar musical sheets. The musical chord area of an input image is segmented by changing the image in saturation and applying the Grabcut algorithm. Based on a template matching, our approach detects tab starting sections on a segmented musical area. The virtual block method is introduced to search blanks over chord lines and extract tab fret segments, which doesn't cause the computation loss to remove tab lines. In the experimental tests, the prototype based classification outperforms Bayesian method and the nearest neighbor rule with the whole set of training data and its performance is similar to that of the support vector machine. The experimental result shows that the prediction rate is about 99.0% and the number of selected prototypes is below 3.0%.

머신러닝을 활용한 모돈의 생산성 예측모델 (Forecasting Sow's Productivity using the Machine Learning Models)

  • 이민수;최영찬
    • 농촌지도와개발
    • /
    • 제16권4호
    • /
    • pp.939-965
    • /
    • 2009
  • The Machine Learning has been identified as a promising approach to knowledge-based system development. This study aims to examine the ability of machine learning techniques for farmer's decision making and to develop the reference model for using pig farm data. We compared five machine learning techniques: logistic regression, decision tree, artificial neural network, k-nearest neighbor, and ensemble. All models are well performed to predict the sow's productivity in all parity, showing over 87.6% predictability. The model predictability of total litter size are highest at 91.3% in third parity and decreasing as parity increases. The ensemble is well performed to predict the sow's productivity. The neural network and logistic regression is excellent classifier for all parity. The decision tree and the k-nearest neighbor was not good classifier for all parity. Performance of models varies over models used, showing up to 104% difference in lift values. Artificial Neural network and ensemble models have resulted in highest lift values implying best performance among models.

  • PDF

Plurality Rule-based Density and Correlation Coefficient-based Clustering for K-NN

  • Aung, Swe Swe;Nagayama, Itaru;Tamaki, Shiro
    • IEIE Transactions on Smart Processing and Computing
    • /
    • 제6권3호
    • /
    • pp.183-192
    • /
    • 2017
  • k-nearest neighbor (K-NN) is a well-known classification algorithm, being feature space-based on nearest-neighbor training examples in machine learning. However, K-NN, as we know, is a lazy learning method. Therefore, if a K-NN-based system very much depends on a huge amount of history data to achieve an accurate prediction result for a particular task, it gradually faces a processing-time performance-degradation problem. We have noticed that many researchers usually contemplate only classification accuracy. But estimation speed also plays an essential role in real-time prediction systems. To compensate for this weakness, this paper proposes correlation coefficient-based clustering (CCC) aimed at upgrading the performance of K-NN by leveraging processing-time speed and plurality rule-based density (PRD) to improve estimation accuracy. For experiments, we used real datasets (on breast cancer, breast tissue, heart, and the iris) from the University of California, Irvine (UCI) machine learning repository. Moreover, real traffic data collected from Ojana Junction, Route 58, Okinawa, Japan, was also utilized to lay bare the efficiency of this method. By using these datasets, we proved better processing-time performance with the new approach by comparing it with classical K-NN. Besides, via experiments on real-world datasets, we compared the prediction accuracy of our approach with density peaks clustering based on K-NN and principal component analysis (DPC-KNN-PCA).

비모수 추정방법을 활용한 kNNDD의 이상치 탐지 기법 (kNNDD-based One-Class Classification by Nonparametric Density Estimation)

  • 손정환;김성범
    • 대한산업공학회지
    • /
    • 제38권3호
    • /
    • pp.191-197
    • /
    • 2012
  • One-class classification (OCC) is one of the recent growing areas in data mining and pattern recognition. In the present study we examine a k-nearest neighbors data description (kNNDD) algorithm, one of the OCC algorithms widely used. In particular, we propose to use nonparametric estimation methods to determine the threshold of the kNNDD algorithm. A simulation study has been conducted to explore the characteristics of the proposed approach and compare it with the existing approach that determines the threshold. The results demonstrate the usefulness and flexibility of the proposed approach.

부도예측 개선을 위한 하이브리드 언더샘플링 접근법 (A Hybrid Under-sampling Approach for Better Bankruptcy Prediction)

  • 김태훈;안현철
    • 지능정보연구
    • /
    • 제21권2호
    • /
    • pp.173-190
    • /
    • 2015
  • 부도는 막대한 사회적, 경제적 손실을 야기할 수 있으므로, 미리 부도여부를 정확하게 예측하여 선제 대응하는 것은 경영분야에서 대단히 중요한 의사결정문제 중 하나이다. 이에 지능정보시스템 분야에서도 그간 기업의 재무 데이터에 기반해 부도예측을 개선하기 위한 노력을 기울여왔는데, 안타깝게도 기존의 연구들은 대부분 분류모형의 성능 개선을 통해 예측 정확도를 개선하는 것에만 주로 초점을 맞추어 다른 요소들을 충분히 고려하지 못했다는 한계가 있다. 이러한 배경에서 본 연구는 부도예측 모형의 정확도를 개선하기 위한 방편으로 새로운 데이터 전처리 방법, 그 중에서도 효과적인 표본추출 방법을 제안하고자 한다. 일반적으로 부도예측을 위해 사용되는 데이터들은 극심한 데이터 불균형 문제에 노출되어 있는데, 본 연구에서는 k-reverse nearest neighbor(k-RNN)와 one-class support vector machine(OCSVM) 방법을 결합한 하이브리드 언더샘플링(hybrid under-sampling) 접근법을 통해 이같은 데이터 불균형 문제를 해결하고자 하였다. 본 연구에서 제안한 접근법에서 k-RNN은 이상치를 효과적으로 제거할 수 있으며, OCSVM은 다수를 구성하는 등급의 데이터로부터 정보량이 풍부한 표본만 효과적으로 선택할 수 있는 수단으로 활용될 수 있다. 제안된 기법의 성능을 검증하기 위해, 본 연구에서는 국내 한 은행의 비외감기업 부도예측모형 구축에 제안 기법을 적용해 본 뒤, 일반적으로 많이 사용되는 랜덤샘플링(random sampling)과 제안 기법의 성능을 비교해 보았다. 그 결과, 로지스틱 회귀분석, 판별분석, 의사결정나무, SVM 등 대다수의 분류모형에 있어 분류 정확도가 개선됨을 확인할 수 있었으며, 모든 분류모형에 있어 부정 오류, 즉 부실기업을 정상으로 예측하는 오류율이 크게 감소함을 확인할 수 있었다.