• Title/Summary/Keyword: k-NN Method


Short-term Traffic States Prediction Using k-Nearest Neighbor Algorithm: Focused on Urban Expressway in Seoul (k-NN 알고리즘을 활용한 단기 교통상황 예측: 서울시 도시고속도로 사례)

  • KIM, Hyungjoo;PARK, Shin Hyoung;JANG, Kitae
    • Journal of Korean Society of Transportation / v.34 no.2 / pp.158-167 / 2016
  • This study evaluates potential sources of error in the k-NN (k-nearest neighbor) algorithm, such as procedures, variables, and input data. Previous research was thoroughly reviewed to understand the fundamentals of the k-NN algorithm, which has been widely used for short-term traffic state prediction. The framework of this algorithm commonly includes historical data smoothing, a pattern database, a similarity measure, the k-value, and the prediction horizon. The outcomes of this study suggest that: i) historical data smoothing is recommended to reduce random noise in measured traffic data; ii) the historical database should contain traffic state information for both normal and event conditions; and iii) a trial-and-error method can improve prediction accuracy by better searching for the optimum input time series and k-value. The results also demonstrate that prediction error increases with the length of the prediction horizon and with rapidly changing traffic states.
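The framework components listed in this abstract (smoothing, pattern database, similarity measure, k-value, prediction horizon) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `moving_average` and `knn_forecast`, the trailing moving-average smoother, the Euclidean similarity measure, and the sample values are all assumptions chosen for illustration.

```python
import math


def moving_average(series, window=3):
    """Smooth a series with a trailing moving average.

    The window is shorter for the first few points, where fewer
    past values are available.
    """
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def knn_forecast(history, recent, k=3, horizon=1):
    """Predict the value `horizon` steps after the `recent` pattern.

    Every window of len(recent) values in `history` acts as an entry
    in the pattern database. The forecast is the mean of the values
    that followed the k most similar windows.
    """
    m = len(recent)
    candidates = []
    for start in range(len(history) - m - horizon + 1):
        pattern = history[start:start + m]
        future = history[start + m + horizon - 1]
        # Euclidean distance is the (assumed) similarity measure.
        candidates.append((math.dist(pattern, recent), future))
    if not candidates:
        raise ValueError("history is too short for this pattern and horizon")
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(future for _, future in nearest) / len(nearest)


# Hypothetical speed readings with a repeating pattern.
history = [10, 20, 30, 10, 20, 30, 10, 20, 30]
print(knn_forecast(history, recent=[10, 20], k=2, horizon=1))
```

In practice the input pattern length, `k`, and `horizon` would be tuned by trial and error, as the abstract recommends.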

Implementation of DTW-kNN-based Decision Support System for Discriminating Emerging Technologies (DTW-kNN 기반의 유망 기술 식별을 위한 의사결정 지원 시스템 구현 방안)

  • Jeong, Do-Heon;Park, Ju-Yeon
    • Journal of Industrial Convergence / v.20 no.8 / pp.77-84 / 2022
  • This study aims to present a method for implementing a decision support system that can be used to select emerging technologies by applying a machine learning-based automatic classification technique. To conduct the research, the architecture of the entire system was built and the research proceeded in detailed steps. First, emerging technology candidate items were selected and trend data were automatically generated using a big data system. After defining the conceptual model and pattern classification structure of technological development, an efficient machine learning method was presented through an automatic classification experiment. Finally, the analysis results of the system were interpreted and methods for utilization were derived. In a DTW-kNN-based classification experiment that combines the Dynamic Time Warping (DTW) method and the k-Nearest Neighbors (kNN) classification model proposed in this study, the identification performance was up to 87.7%, and in the 'eventual' section in particular, where the trend fluctuates strongly, the maximum performance difference was 39.4 percentage points compared to the Euclidean Distance (ED) algorithm. In addition, the analysis results presented by the system confirmed that this decision support system can be used effectively to automatically classify and filter large amounts of trend data by type.
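A minimal sketch of how a DTW distance can be combined with a k-NN classifier is given below. It uses the textbook dynamic-programming form of DTW with an absolute-difference local cost. The function names, the labels "rising" and "falling", and the sample series are hypothetical and are not taken from the paper.

```python
from collections import Counter


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def knn_classify(train, query, k=1, distance=dtw_distance):
    """Majority vote among the k training series closest to `query`.

    `train` is a list of (series, label) pairs.
    """
    ranked = sorted(train, key=lambda item: distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical trend series of different lengths.
train = [
    ([0, 1, 2, 3], "rising"),
    ([3, 2, 1, 0], "falling"),
    ([0, 0, 1, 2, 3], "rising"),
]
print(knn_classify(train, [0, 1, 1, 2, 3], k=1))
```

Unlike a point-by-point Euclidean distance, DTW can align series of different lengths or with shifted peaks, which is one plausible reason for the gap the abstract reports on strongly fluctuating trends.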

Electrocardiographic characteristics of significant factors of detected atrial fibrillation using WEMS

  • Kim, Min Soo;Kim, Yoon Nyun;Cho, Young Chang
    • Journal of Korea Society of Industrial Information Systems / v.20 no.6 / pp.37-46 / 2015
  • The wireless electrocardiographic monitoring system (WEMS) is designed for long-term monitoring and early detection of cardiac disorders. The current version of the WEMS can identify two types of cardiac rhythm in real time, atrial fibrillation (AF) and normal sinus rhythm (NSR), which are very important for tracking cardiac-rhythm disorders. In this study, we propose an analysis method that discriminates between AF and NSR using statistically evaluated characteristics in both the time and frequency domains, based on various heart rate variability (HRV) parameters. We applied various ECG detection methods (e.g., the difference operation method) and compared the results with those of the discrete wavelet transform (DWT) method. From the statistical results, we found that in the time domain the parameters STD RR, STD HR, RMSSD, NN50, pNN50, RR Trian, and TNN differ significantly (p<0.05) between AF and NSR patients. The frequency domain analysis showed significant differences in VLF power ($ms^2$), LF power ($ms^2$), HF power ($ms^2$), VLF (%), LF (%), and HF (%). In particular, STD RR, RMSSD, NN50, pNN50, VLF power, LF power, and HF power were considered the most useful parameters for both the AF and NSR patient groups. The proposed method can be applied efficiently to the early detection of abnormal conditions and can help prevent such abnormalities from becoming serious.
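Several of the time-domain parameters named in this abstract have standard definitions that can be computed directly from a list of RR intervals. The sketch below assumes those standard definitions (sample standard deviation for STD RR, successive differences for RMSSD, and a 50 ms threshold for NN50 and pNN50). The function name `hrv_time_domain` and the sample intervals are hypothetical, and the paper's own preprocessing is not reproduced.

```python
import math
import statistics


def hrv_time_domain(rr_ms):
    """Compute basic time-domain HRV parameters from RR intervals in ms."""
    if len(rr_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "std_rr": statistics.stdev(rr_ms),  # sample standard deviation
        "rmssd": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "nn50": nn50,                        # count of |diff| > 50 ms
        "pnn50": 100.0 * nn50 / len(diffs),  # percentage of such diffs
    }


# Hypothetical RR intervals in milliseconds.
print(hrv_time_domain([800, 860, 800, 810]))
```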

A K-Nearest Neighbor Algorithm for Categorical Sequence Data (범주형 시퀀스 데이터의 K-Nearest Neighbor알고리즘)

  • Oh Seung-Joon
    • Journal of the Korea Society of Computer and Information / v.10 no.2 s.34 / pp.215-221 / 2005
  • Recently, there has been enormous growth in the amount of commercial and scientific data, such as protein sequences, retail transactions, and web logs. Such datasets consist of sequence data that have an inherently sequential nature. In this paper, we study how to classify these sequence datasets. There are several kinds of techniques for data classification, such as decision tree induction, Bayesian classification, and K-NN. In our approach, we use a K-NN algorithm to classify sequences. In addition, we propose a new similarity measure for computing the similarity between two sequences and an efficient method for measuring similarity.


Monitoring Continuous k-Nearest Neighbor Queries, using c-MBR

  • Jung Ha-Rim;Kang Sang-Won;Song Moon-Bae;Im Seok-Jin;Kim Jong-Wan;Hwang Chong-Sun
    • Proceedings of the Korean Information Science Society Conference / 2006.06c / pp.46-48 / 2006
  • This paper addresses the problem of monitoring continuous k-nearest neighbor (k-NN) queries. Given a set of moving (or static) objects and a set of moving (or static) query points, a continuous k-NN monitoring query continually retrieves and updates the k objects closest to a query point. To support location-based services (LBSs) in highly dynamic environments, where objects and/or queries move frequently, monitoring continuous queries requires results that are updated in real time whenever objects and/or queries change their locations. Thus, it is important to minimize the time delay in keeping the results up to date. In this paper, we present a monitoring method that shortens the time delay for updating continuous k-NN queries, based on the notion of a result region and the minimum bounding rectangle enclosing all objects in each cell of the grid index structure, referred to as c-MBR. Simulations are conducted to show the efficiency of the proposed method.


OHC Algorithm for RPA Memory Based Reasoning (RPA분류기의 성능 향상을 위한 OHC알고리즘)

  • 이형일
    • Journal of Korea Multimedia Society / v.6 no.5 / pp.824-830 / 2003
  • The RPA (Recursive Partition Averaging) method was proposed to improve the storage requirement and classification rate of Memory Based Reasoning. The algorithm works well in many areas; however, its major drawback is its pattern averaging mechanism. We propose an adaptive OHC algorithm that uses the FPD (Feature-based Population Densimeter) to increase the classification rate of RPA. The proposed algorithm required only approximately 40% of the memory space needed by the k-NN classifier and showed classification performance superior to that of RPA. In addition, while reducing the number of stored patterns, it showed excellent classification results compared with k-NN.


Efficient k-Nearest Neighbor Query Processing Method for a Large Location Data (대용량 위치 데이터에서 효율적인 k-최근접 질의 처리 기법)

  • Choi, Dojin;Lim, Jongtae;Yoo, Seunghun;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.17 no.8 / pp.619-630 / 2017
  • With the growing popularity of smart devices, various location-based services have been provided to users. Recently, location-based social applications that combine social services and location-based services have emerged. Demand for k-nearest neighbor (k-NN) queries, which find the k locations closest to a user's location, is increasing in location-based social network services. In this paper, we propose an approximate k-NN query processing method that achieves fast response times in environments with a large number of users. The proposed method performs efficient stream processing using big data distributed processing technologies. We also propose a modified grid index method for indexing a large amount of location data. The proposed query processing method first retrieves the related cells by considering the user's movement, and thereby builds an approximate set of k results. To show the superiority of the proposed method, we conduct various performance evaluations against the existing method.
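The general idea of answering an approximate k-NN query from a grid index can be sketched as follows. This is a generic single-machine illustration, not the paper's modified grid index: it ignores user movement and distributed stream processing, and the class `GridIndex`, its methods, and the `max_rings` parameter are assumptions introduced here.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2-D points; each cell stores the objects inside it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, obj_id, x, y):
        self.cells[self._cell(x, y)].append((obj_id, x, y))

    def approx_knn(self, x, y, k, max_rings=5):
        """Return up to k object ids near (x, y).

        Cells are visited ring by ring around the query cell, and the
        search stops as soon as k candidates have been collected. A
        closer object in the next ring can be missed, which is why the
        result is only approximate.
        """
        cx, cy = self._cell(x, y)
        found = []
        for ring in range(max_rings + 1):
            for i in range(cx - ring, cx + ring + 1):
                for j in range(cy - ring, cy + ring + 1):
                    # Keep only cells on the border of the current ring.
                    if max(abs(i - cx), abs(j - cy)) == ring:
                        found.extend(self.cells.get((i, j), []))
            if len(found) >= k:
                break
        found.sort(key=lambda o: math.hypot(o[1] - x, o[2] - y))
        return [obj_id for obj_id, _, _ in found[:k]]


# Hypothetical object locations.
index = GridIndex(cell_size=1.0)
for obj_id, x, y in [("a", 0.5, 0.5), ("b", 0.6, 0.6), ("c", 3.5, 3.5), ("d", 1.2, 0.5)]:
    index.insert(obj_id, x, y)
print(index.approx_knn(0.5, 0.5, k=3))
```

Stopping early trades exactness for response time: only a handful of cells are touched per query instead of the whole dataset.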

Assessment of Forest Biomass using k-Neighbor Techniques - A Case Study in the Research Forest at Kangwon National University - (k-NN기법을 이용한 산림바이오매스 자원량 평가 - 강원대학교 학술림을 대상으로 -)

  • Seo, Hwanseok;Park, Donghwan;Yim, Jongsu;Lee, Jungsoo
    • Journal of Korean Society of Forest Science / v.101 no.4 / pp.547-557 / 2012
  • This study aimed to estimate forest biomass using the k-Nearest Neighbor (k-NN) algorithm. Multiple data sources were used for the analysis, including a forest type map, field survey data, and Landsat TM data. The accuracy of the forest biomass estimates was evaluated with respect to forest stratification, horizontal reference area (HRA), and spatial filtering. Forests were divided into three types: coniferous, broadleaved, and Korean pine (Pinus koraiensis) forests. The applied HRA radii were 4 km, 5 km, and 10 km. For coniferous forests, the estimated biomass and mean bias were 222 t/ha and 1.8 t/ha, respectively, with k=8, an HRA radius of 4 km, and a $5{\times}5$ modal filter. The estimated forest biomass of Korean pine was 245 t/ha with k=8 and an HRA radius of 4 km. For broadleaved forests, the estimated mean biomass and mean bias were 251 t/ha and -1.6 t/ha, respectively, with k=6 and an HRA radius of 10 km. The total forest biomass estimated by the k-NN method was 799,000 t (237 t/ha). The mean biomass estimated by the k-NN method was about 1 t/ha higher than that from the field survey data.

Outlier Analysis of Learner's Learning Behaviors Data using k-NN Method (k-NN 기법을 이용한 학습자의 학습 행위 데이터의 이상치 분석)

  • Yoon, Tae-Bok;Jung, Young-Mo;Lee, Jee-Hyong;Cha, Hyun-Jin;Park, Seon-Hee;Kim, Yong-Se
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2007.02a / pp.524-529 / 2007
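  • An intelligent learning system analyzes data collected during a learner's learning process in order to establish a strategy suited to the learner and provide appropriate services. Providing suitable services first requires learner modeling, and to build this model, data generated during the learner's learning process are collected and analyzed. However, if the collected data include inconsistent behavior or unpredictable learning tendencies, the resulting model is difficult to trust. In this paper, outliers in the data collected from learners are identified using k-NN, a distance-based outlier detection method. In the experiment, we used DOLLS-HI, which diagnoses learning tendencies from learners' learning behaviors on home-interior content, to classify outliers in the collected learner data and to build a model for diagnosing learning tendencies. The resulting model was confirmed to be more reliable than the model built before outlier classification.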
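One common form of distance-based outlier detection with k-NN scores each point by its distance to its k-th nearest neighbor and flags points whose score exceeds a threshold. The sketch below assumes that formulation. The function names, the threshold value, and the sample points are hypothetical, and the paper's exact scoring rule is not stated in the abstract.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor."""
    if k < 1 or k > len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def flag_outliers(points, k=2, threshold=1.0):
    """Return the indices of points whose k-NN score exceeds `threshold`."""
    scores = knn_outlier_scores(points, k)
    return [i for i, score in enumerate(scores) if score > threshold]


# Hypothetical 2-D feature vectors; the last one lies far from the rest.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(flag_outliers(points, k=2, threshold=5.0))
```

Flagged records would be removed before fitting the downstream model, which matches the before-and-after comparison the abstract describes.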


Electronic Commerce Agent using Multi-Estimation Method (다중추정방법에 의한 전자상거래 에이전트)

  • 김우정;이수원
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.310-312 / 2000
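  • Various methods can be applied to estimation, including K-NN, regression analysis, and neural networks. However, K-NN estimates results from distance alone, so the weight of each attribute is determined by the spacing of its values. Regression analysis expresses the trend of the data with a single line, so attribute weights are taken into account, but the estimate contains large errors when the data are widely distributed. Both methods are therefore data-dependent. This study proposes a multi-analysis method that combines these methods to compensate for the data-dependence problem.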
