• Title/Summary/Keyword: k-nearest neighbor smoothing


The Rank Transform Method in Nonparametric Fuzzy Regression Model

  • Choi, Seung-Hoe;Lee, Myung-Sook
    • Journal of the Korean Data and Information Science Society / v.15 no.3 / pp.617-624 / 2004
  • This article introduces the rank of a fuzzy number and the fuzzy rank transformation method in order to analyse nonparametric fuzzy regression models, which cannot be described by a specific functional form, with crisp data as the independent variables and fuzzy data as the dependent variables. The effectiveness of the fuzzy rank transformation method is compared with that of other methods through numerical examples.


Short-term Traffic States Prediction Using k-Nearest Neighbor Algorithm: Focused on Urban Expressway in Seoul (k-NN 알고리즘을 활용한 단기 교통상황 예측: 서울시 도시고속도로 사례)

  • KIM, Hyungjoo;PARK, Shin Hyoung;JANG, Kitae
    • Journal of Korean Society of Transportation / v.34 no.2 / pp.158-167 / 2016
  • This study evaluates potential sources of error in the k-NN (k-nearest neighbor) algorithm, such as procedures, variables, and input data. Previous research was thoroughly reviewed to understand the fundamentals of the k-NN algorithm, which has been widely used for short-term traffic state prediction. The framework of this algorithm commonly includes historical data smoothing, a pattern database, a similarity measure, the k-value, and the prediction horizon. The outcomes of this study suggest that: i) historical data smoothing is recommended to reduce random noise in measured traffic data; ii) the historical database should contain traffic state information on both normal and event conditions; and iii) a trial-and-error method can improve prediction accuracy by better searching for the optimum input time series and k-value. The results also demonstrate that the prediction error increases with the length of the prediction horizon and with rapidly changing traffic states.
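
The framework components listed in this abstract (smoothing, pattern database, similarity measure, k-value, prediction horizon) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smooth` and `knn_forecast`, the trailing moving average, the Euclidean distance, and the plain averaging of neighbor outcomes are all assumptions chosen for illustration.

```python
import math


def smooth(series, window=3):
    """Trailing moving average; early points average whatever is available."""
    out = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        chunk = series[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def knn_forecast(history, recent, k=3, horizon=1):
    """Forecast the value `horizon` steps after `recent`.

    Every window of len(recent) values in `history` acts as one stored
    pattern. The forecast is the mean of the values that followed the k
    patterns closest to `recent` (Euclidean distance, an assumption).
    """
    m = len(recent)
    candidates = []
    for start in range(len(history) - m - horizon + 1):
        pattern = history[start:start + m]
        outcome = history[start + m + horizon - 1]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(pattern, recent)))
        candidates.append((dist, outcome))
    if len(candidates) < k:
        raise ValueError("not enough historical patterns for the chosen k")
    candidates.sort(key=lambda pair: pair[0])
    return sum(outcome for _, outcome in candidates[:k]) / k
```

In practice the history would first be passed through `smooth`, and the length of `recent` and the value of `k` would be tuned by trial and error, as the abstract recommends.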

The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market prediction. However, these statistical methods have not produced superior performance. In recent years, machine learning techniques, including artificial neural networks, SVMs, and genetic algorithms, have been widely used in stock market prediction. In particular, a case-based reasoning method known as k-nearest neighbor is also widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs, and combines the class labels of the similar cases to create a classification for the new problem. However, case-based reasoning has some problems. First, it tends to search for a fixed number of neighbors in the observation space and always selects the same number of neighbors rather than the most similar neighbors for the target case. As a result, it may have to take more cases into account even when fewer cases are applicable to the subject. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for various target cases, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability with k-nearest neighbor, and compares the predictability of k-nearest neighbor with that of the random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted by dividing the learning dataset into two types. To predict the next day's closing price, we used four variables: opening value, daily high, daily low, and daily close.
    In the first experiment, data from January 1, 2000 to December 31, 2017 were used for the learning process. In the second experiment, data from January 1, 2015 to December 31, 2017 were used for the learning process. The test data are from January 1, 2018 to August 31, 2018 for both experiments. We compared the performance of k-NN with the random walk model using the two learning datasets. The mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN for the first experiment when the learning data was small. However, the MAPE was 1.3497 for the random walk model and 1.2928 for k-NN for the second experiment when the learning data was large. These results show that the prediction power is higher when more learning data are used than when less learning data are used. This paper also shows that k-NN generally produces better predictive power than the random walk model for larger learning datasets, but not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening, low, high, and closing prices. Also, to produce better results, it is recommended that k-nearest neighbor find the nearest neighbors using a second-step filtering method that considers fundamental economic variables as well as a sufficient amount of learning data.
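
The comparison described in this abstract rests on three pieces: a random walk baseline, a k-NN predictor over the four daily price variables, and the MAPE metric. The sketch below illustrates them under stated assumptions and is not the paper's code: the helper names are hypothetical, MAPE is assumed to be reported in percent, the distance is Euclidean on unscaled features, and the k-NN forecast is a plain mean of the neighbors' next-day closes.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def random_walk_forecast(closes):
    """Random walk baseline: the next close is predicted as today's close.

    Returns predictions aligned with closes[1:].
    """
    return closes[:-1]


def knn_next_close(train_rows, query, k=3):
    """Average the next-day close of the k training days closest to `query`.

    train_rows: list of ((open, high, low, close), next_close) pairs.
    query: (open, high, low, close) for the day being forecast from.
    """
    if k > len(train_rows):
        raise ValueError("k exceeds the number of training rows")
    ranked = sorted(train_rows, key=lambda row: math.dist(row[0], query))
    return sum(next_close for _, next_close in ranked[:k]) / k
```

Scoring both forecasts with `mape` on the same test days gives the kind of side-by-side figures the abstract reports.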

Intensity and Ambient Enhanced Lidar-Inertial SLAM for Unstructured Construction Environment (비정형의 건설환경 매핑을 위한 레이저 반사광 강도와 주변광을 활용한 향상된 라이다-관성 슬램)

  • Jung, Minwoo;Jung, Sangwoo;Jang, Hyesu;Kim, Ayoung
    • The Journal of Korea Robotics Society / v.16 no.3 / pp.179-188 / 2021
  • Construction monitoring is one of the key modules in smart construction. Unlike structured urban environments, construction sites are challenging to map because of their unstructured nature. For example, irregular feature points and matching make it difficult to create a map for management. To tackle this issue, we propose a system for data acquisition in unstructured environments and a framework for Intensity and Ambient Enhanced Lidar Inertial Odometry via Smoothing and Mapping, IA-LIO-SAM, that achieves highly accurate robot trajectories and mapping. IA-LIO-SAM uses the same factor graph as Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping (LIO-SAM). Enhancing the existing LIO-SAM, IA-LIO-SAM leverages each point's intensity and ambient value to remove unnecessary feature points. These additional values also serve as a new factor in the K-Nearest Neighbor (KNN) algorithm, allowing accurate comparisons between stored points and scanned points. The performance was verified in three different environments and compared with LIO-SAM.

Noncontact Sleep Efficiency and Stage Estimation for Sleep Apnea Patients Using an Ultra-Wideband Radar (UWB 레이더를 사용한 수면무호흡환자에 대한 비접촉방식 수면효율 및 수면 단계 추정)

  • Park, Sang-Bae;Kim, Jung-Ha
    • Journal of the Korean Society of Industry Convergence / v.23 no.3 / pp.433-444 / 2020
  • This study proposes a method to improve sleep stage and sleep efficiency estimation for sleep apnea patients using a UWB (Ultra-Wideband) radar. Motion and respiration extracted from the radar signal were used. Respiratory signal disturbances caused by motion artifacts and the irregular respiration patterns of sleep apnea patients are compensated for in the preprocessing stage. Preprocessing calculates the standard deviation of the respiration signal over a sliding window of 15 seconds to estimate thresholds for compensation and applies them to the breathing signal. The method for estimating the sleep stage is based on the difference in amplitude between two kinds of smoothed respiration signals. In smoothing, the window sizes are set to 10 seconds and 34 seconds, respectively. The estimated feature was processed by a k-nearest neighbor classifier and a feature filtering model to discriminate between rapid eye movement (REM) and non-rapid eye movement (NREM) sleep periods. The feature filtering model reflects the characteristics of REM sleep, which occurs continuously and mainly in the latter part of sleep. Sleep efficiency is estimated using the sleep onset time and motion events. The sleep onset time is derived from features estimated from gradient changes in the breathing signal. Motion events are determined from the estimated energy change in the UWB signal. Sleep efficiency and sleep stage accuracy were assessed against polysomnography. The average sleep efficiency and sleep stage accuracy were estimated to be about 96.3% and 88.8%, respectively, in 18 sleep apnea subjects.
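
The sleep stage feature described in this abstract, the difference between two smoothed versions of the respiration signal, can be sketched as follows. Only the 10-second and 34-second window lengths come from the abstract. The trailing moving average, the signed difference, the sampling-rate parameter `fs`, and the function names are assumptions, since the abstract does not specify the smoothing filter or how amplitude is measured.

```python
def trailing_mean(signal, window):
    """Trailing moving average over `window` samples (shorter at the start)."""
    out = []
    for i in range(len(signal)):
        start = max(0, i - window + 1)
        chunk = signal[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def smoothing_difference(resp, fs, short_s=10.0, long_s=34.0):
    """Per-sample difference between short- and long-window smoothed signals.

    resp: respiration samples; fs: sampling rate in Hz (assumed parameter).
    """
    short_n = max(1, round(short_s * fs))
    long_n = max(1, round(long_s * fs))
    short = trailing_mean(resp, short_n)
    long_ = trailing_mean(resp, long_n)
    return [s - l for s, l in zip(short, long_)]
```

For a steady signal the two smoothed versions coincide and the feature is zero. It grows when the breathing level changes faster than the long window can follow, and that value is what a downstream classifier such as k-NN would receive.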

A Study on Target Acquisition and Tracking to Develop ARPA Radar (ARPA 레이더 개발을 위한 물표 획득 및 추적 기술 연구)

  • Lee, Hee-Yong;Shin, Il-Sik;Lee, Kwang-Il
    • Journal of Navigation and Port Research / v.39 no.4 / pp.307-312 / 2015
  • ARPA (Automatic Radar Plotting Aid) is a device that calculates the CPA (closest point of approach), TCPA (time to CPA), true course, and speed of targets by vector operations on relative courses and speeds. The purpose of this study is to develop target acquisition and tracking technology for implementing an ARPA radar. After examining previous studies, applicable algorithms and technologies were developed and combined, and the basic ARPA functions were developed as a result. As for the main research contents, a sequential image processing pipeline combining grayscale conversion, Gaussian smoothing, binary image conversion, and labeling was devised to achieve proper target acquisition; the NNS (Nearest Neighbor Search) algorithm was applied to identify which target in the previous image each target corresponds to; and finally a Kalman filter was used to calculate the true course and speed of targets as an analysis of target behavior. All of the technologies stated above were implemented as a software program and installed onboard, and the basic ARPA functions were verified to be operable in practical use through an onboard test.
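
The CPA/TCPA calculation mentioned at the start of this abstract follows from the relative position and relative velocity of the target. The sketch below uses the standard constant-velocity formula in a flat Cartesian frame. The function name, the tuple layout, and the units (nautical miles and knots, giving TCPA in hours) are assumptions for illustration, not details taken from the paper.

```python
import math


def cpa_tcpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (cpa, tcpa) for two vessels moving at constant velocity.

    Positions are (x, y) and velocities are (vx, vy) in one Cartesian frame.
    A negative tcpa means the closest approach has already occurred.
    """
    # Relative position and velocity of the target as seen from own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # No relative motion: the range never changes.
        return math.hypot(rx, ry), 0.0
    # Time at which the relative range is minimal.
    tcpa = -(rx * vx + ry * vy) / speed_sq
    cpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return cpa, tcpa
```

For example, with own ship at the origin heading north at 10 knots and a target 10 miles to the east heading west at 10 knots, `cpa_tcpa((0, 0), (0, 10), (10, 0), (-10, 0))` gives a TCPA of 0.5 hours and a CPA of about 7.07 miles.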

Comparison of Forest Carbon Stocks Estimation Methods Using Forest Type Map and Landsat TM Satellite Imagery (임상도와 Landsat TM 위성영상을 이용한 산림탄소저장량 추정 방법 비교 연구)

  • Kim, Kyoung-Min;Lee, Jung-Bin;Jung, Jaehoon
    • Korean Journal of Remote Sensing / v.31 no.5 / pp.449-459 / 2015
  • The conventional National Forest Inventory (NFI)-based forest carbon stock estimation method is suitable for national-scale estimation, but not for regional-scale estimation because of the lack of NFI plots. In this study, for the purpose of regional-scale carbon stock estimation, we created grid-based forest carbon stock maps using spatial ancillary data and two types of up-scaling methods. Chungnam province was chosen as the study area, and the 5th NFI (2006-2009) data were collected for it. The first method (method 1) uses a forest type map as ancillary data and a regression model for forest carbon stock estimation, whereas the second method (method 2) uses satellite imagery and the k-Nearest Neighbor (k-NN) algorithm. Additionally, to account for uncertainty effects, the final AGB carbon stock maps were generated by performing 200 iterations of Monte Carlo simulation. As a result, compared with the NFI-based estimate (21,136,911 tonC), the total carbon stock was over-estimated by method 1 (22,948,151 tonC) but under-estimated by method 2 (19,750,315 tonC). In a paired t-test with 186 independent data points, the average carbon stock estimate of the NFI-based method was statistically different from that of method 2 (p<0.01), but not from that of method 1 (p>0.01). In particular, the Monte Carlo simulation showed that the smoothing effect of the k-NN algorithm and the mis-registration error between NFI plots and the satellite image can lead to large uncertainty in carbon stock estimation. Although method 1 was found suitable for carbon stock estimation of forest stands that feature heterogeneous trees in Korea, a satellite-based method is still needed to provide periodic estimates for large, un-investigated forest areas. In these respects, future work will focus on extending the spatial and temporal extent of the study area and on robust carbon stock estimation with various satellite images and estimation methods.