• Title/Summary/Keyword: Nearest neighborhood


The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market prediction, but they have not produced superior performance. In recent years, machine learning techniques, including artificial neural networks, SVMs, and genetic algorithms, have been widely used for stock market prediction. In particular, a case-based reasoning method known as k-nearest neighbor (k-NN) is widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs and combines the class labels of those cases to classify the new problem. However, case-based reasoning has some problems. First, it searches for a fixed number of neighbors in the observation space, so it always selects the same number of neighbors rather than the most similar neighbors for the target case. As a result, it may take more cases into account even when fewer cases are applicable to the subject. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for every target case, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability with k-NN and compares the predictability of k-NN with that of the random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted using two learning datasets of different sizes. To predict the next day's closing price, we used four variables: the opening value, daily high, daily low, and daily close.
In the first experiment, data from January 1, 2000 to December 31, 2017 were used for the learning process. In the second experiment, data from January 1, 2015 to December 31, 2017 were used for the learning process. The test data covered January 1, 2018 to August 31, 2018 in both experiments. We compared the performance of k-NN with that of the random walk model using the two learning datasets. The mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN in the first experiment, when the learning data was small. However, the MAPE was 1.3497 for the random walk model and 1.2928 for k-NN in the second experiment, when the learning data was large. These results show that predictive power is higher when more learning data are used than when less learning data are used. This paper also shows that k-NN generally produces better predictive power than the random walk model for larger learning datasets but not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening, low, high, and closing prices. To produce better results, it is also recommended that k-NN find its nearest neighbors using a second-step filtering method that considers fundamental economic variables, together with a sufficient amount of learning data.
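
The comparison described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance, an unweighted average of the neighbors' next-day closes, and a random walk baseline that repeats the current close; the abstract does not state these details, and the function names and price rows are invented for illustration.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training rows nearest to `query` (Euclidean)."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def random_walk_predict(query):
    """Random walk baseline: tomorrow's close equals today's close."""
    return query[3]  # features are ordered (open, high, low, close)

# Invented (open, high, low, close) rows, for illustration only.
days = [
    (100.0, 102.0, 99.0, 101.0),
    (101.0, 103.0, 100.0, 102.5),
    (102.5, 104.0, 101.0, 101.5),
    (101.5, 103.5, 100.5, 103.0),
    (103.0, 105.0, 102.0, 104.0),
    (104.0, 104.5, 101.5, 102.0),
    (102.0, 103.0, 100.0, 102.5),
]

# Each day's four values predict the next day's close.
features = days[:-1]
targets = [d[3] for d in days[1:]]

split = 4  # first four pairs for learning, the rest for testing
train_x, train_y = features[:split], targets[:split]
test_x, test_y = features[split:], targets[split:]

knn_preds = [knn_predict(train_x, train_y, q, k=2) for q in test_x]
rw_preds = [random_walk_predict(q) for q in test_x]

print("k-NN MAPE:", mape(test_y, knn_preds))
print("random walk MAPE:", mape(test_y, rw_preds))
```

Varying `split` and `k` in this sketch mirrors the two axes the study compares: the size of the learning data and the number of neighbors.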

Depth map Resolution and Quality Enhancement based on Edge preserving interpolation (경계 보존 보간법을 이용한 깊이 영상의 해상도 및 품질 개선)

  • Kim, Ji-Hyun;Choi, Jin-Wook;Sohn, Kwang-Hoon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.07a / pp.39-41 / 2011
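  • This paper proposes a method for enhancing the resolution and quality of depth maps. In 3D content with a 2D-plus-Depth structure, the quality of the depth map is generally very important. Time-of-Flight (TOF) depth sensors have recently been widely used to acquire depth maps, but because the depth maps they provide are low-resolution, upsampling the depth map is essential for producing high-resolution 3D content. To obtain a high-quality depth map, it is also important to precisely preserve the boundaries between objects. Joint Bilateral Upsampling (JBU) has recently been widely used for depth map upsampling. To raise the depth map resolution, this paper improves performance by first performing interpolation to fill the empty holes created during upsampling and then applying bilateral filtering. Among the various ways to upsample an image, this paper uses Nearest Neighborhood (NN), Gaussian, and edge-preserving interpolation, as well as an interpolation that combines edge-preserving interpolation with Fast Curvature Based Interpolation (FCBI). Experimental results show that the proposed method outperforms existing methods. They also show that upsampling with the combined edge-preserving and FCBI interpolation gives better results than the other interpolation methods.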
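
Of the interpolation methods this abstract lists, nearest-neighbor upsampling is the simplest to sketch. The example below assumes an integer scale factor and a depth map stored as a list of rows; it does not implement the edge-preserving, FCBI, or bilateral filtering steps, and `upsample_nearest` is a hypothetical name.

```python
def upsample_nearest(depth, factor):
    """Upsample a 2D depth map by an integer factor.

    Each high-resolution pixel copies the low-resolution sample whose
    block it falls in, so no new depth values are introduced.
    """
    rows, cols = len(depth), len(depth[0])
    return [
        [depth[r // factor][c // factor] for c in range(cols * factor)]
        for r in range(rows * factor)
    ]

low_res = [
    [1, 2],
    [3, 4],
]
for row in upsample_nearest(low_res, 2):
    print(row)
```

Because values are only copied, object boundaries stay sharp but become blocky, which is one reason edge-aware interpolation is of interest for depth maps.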


A Spatial Analysis of the Causal Factors Influencing China's Air Pollution

  • Kim, Yoomi;Tanaka, Katsuya;Zhang, Xinxin
    • Asian Journal of Atmospheric Environment / v.11 no.3 / pp.194-201 / 2017
  • This study investigates the factors that affect China's air pollution using city-level panel data and spatial econometric models. We address three air pollutants ($PM_{10}$, $SO_2$, and $NO_2$) present in 30 cities in China between 2004 and 2012 using global OLS and spatial models. For the spatial econometric analysis, we create a spatial weights matrix that defines spatial patterns based on two neighborhood criteria: queen contiguity and k-nearest neighbors. The results show that the estimated coefficients are relatively consistent across the different spatial weight criteria. The OLS models indicate that the effect of green spaces is statistically significant in decreasing the concentrations of all air pollutants. In the $PM_{10}$ and $SO_2$ analyses, the OLS models find that the number of buses and population density are also positively related to a reduction in the concentration of air pollutants. In addition, an increase in temperature and the presence of secondary industries increase $SO_2$ and $NO_2$ concentrations, respectively. All spatial models capture a positive and significant effect of green spaces on reducing the concentration of each air pollutant. Our results suggest that green spaces in cities should receive priority consideration in local planning aimed at sustainable development. Furthermore, policymakers need to be able to discern the differences among pollutants when establishing environmental policies.
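
A k-nearest-neighbor spatial weights matrix of the kind mentioned in this abstract can be sketched as follows. The sketch assumes planar coordinates, Euclidean distance, and row-standardized weights; the abstract specifies none of these, and `knn_weights` and the coordinates are hypothetical.

```python
import math

def knn_weights(coords, k):
    """Build a row-standardized k-nearest-neighbor spatial weights matrix.

    W[i][j] is 1/k when location j is among the k nearest neighbors of
    location i, and 0 otherwise. Requires k <= len(coords) - 1.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in others[:k]:
            weights[i][j] = 1.0 / k
    return weights

# Invented city coordinates, for illustration only.
cities = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 1.0)]
for row in knn_weights(cities, k=2):
    print(row)
```

Unlike contiguity-based weights, the k-nearest-neighbor criterion gives every location the same number of neighbors, and the resulting matrix is generally not symmetric.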

Rapid Stitching Method of Digital X-ray Images Using Template-based Registration (템플릿 기반 정합 기법을 이용한 디지털 X-ray 영상의 고속 스티칭 기법)

  • Cho, Hyunji;Kye, Heewon;Lee, Jeongjin
    • Journal of Korea Multimedia Society / v.18 no.6 / pp.701-709 / 2015
  • Image stitching is a technique for obtaining a high-resolution image by combining two or more images. In X-ray imaging for clinical diagnosis, the size of the region captured in one shot is limited by the field of view of the equipment. Therefore, to obtain a high-resolution image that includes a large region such as the whole body, multiple X-ray images must be synthesized. In this paper, we propose a rapid stitching method for digital X-ray images using template-based registration. The proposed algorithm uses principal component analysis (PCA) and k-nearest neighborhood (k-NN) to determine the location of the input images before performing template-based matching. After detecting the overlapping position using template-based matching, we synthesize the input images by alpha blending. To improve computational efficiency, reduced images are used for the PCA and k-NN analysis. Experimental results showed that our method was more accurate than the previous method while also improving registration speed. Our stitching method could be usefully applied to the stitching of multiple 2D or 3D images.
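
The final alpha blending step in this abstract can be sketched for a single row of pixels. The sketch assumes the overlap width is already known from the registration step and uses a linear alpha ramp; the abstract does not say how alpha is chosen, and `blend_rows` is a hypothetical name.

```python
def blend_rows(left, right, overlap):
    """Stitch two pixel rows that share `overlap` columns.

    Inside the overlap, the weight of the right image (alpha) grows
    linearly, so the seam fades from one image into the other.
    """
    start = len(left) - overlap
    out = list(left[:start])
    for i in range(overlap):
        alpha = (i + 1) / (overlap + 1)
        out.append((1 - alpha) * left[start + i] + alpha * right[i])
    out.extend(right[overlap:])
    return out

# Two invented rows whose last/first two pixels cover the same region.
print(blend_rows([10, 10, 10, 10], [20, 20, 20, 20], overlap=2))
```

For a full image, the same function would be applied to each pair of corresponding rows (or columns, for vertical stitching).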

Region-Segmental Scheme in Local Normalization Process of Digital Image (디지털영상 국부정규화처리의 영역분할 구도)

  • Hwang, Jung-Won;Hwang, Jae-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.4 s.316 / pp.78-85 / 2007
  • This paper presents a segmentation scheme for images composed of regions in the local normalization process. The scheme is based on local statistics computed through a moving window. The normalization algorithm uses linear or nonlinear functions to transform the pixel distribution and the homogeneous affine of regions corrupted by additive noise. It adjusts the mean and standard deviation for the nearest-neighbor interpoint distance between the current and normalized image signals, and it adaptively changes the segmentation performance according to local statistics and parameter variation. The performance of the newly advanced local normalization algorithm is evaluated and compared with that of conventional normalization methods. Experimental results are presented to show the region segmentation properties of these approaches.
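
The idea of normalizing with local statistics from a moving window can be sketched in one dimension. This is a generic local normalization (subtract the window mean, divide by the window standard deviation), not the paper's algorithm; the function name, the `radius` and `eps` parameters, and the sample signal are assumptions.

```python
import math

def local_normalize(signal, radius, eps=1e-8):
    """Normalize each sample by the mean and standard deviation of its window.

    The window spans `radius` samples on each side and is clipped at the
    signal borders. `eps` avoids division by zero in flat regions.
    """
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        mean = sum(window) / len(window)
        var = sum((v - mean) ** 2 for v in window) / len(window)
        out.append((signal[i] - mean) / (math.sqrt(var) + eps))
    return out

# Invented signal with two regions of different brightness.
print(local_normalize([10, 11, 10, 50, 51, 50], radius=1))
```

After this step, samples inside a homogeneous region sit near zero while samples near a region boundary stand out, which is what makes local statistics useful for segmentation.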

A Strategy for Neighborhood Selection in Collaborative Filtering-based Recommender Systems (협력 필터링 기반의 추천 시스템을 위한 이웃 선정 전략)

  • Lee, Soojung
    • Journal of KIISE / v.42 no.11 / pp.1380-1385 / 2015
  • Collaborative filtering is one of the most successful methods for recommender systems and has been used in various areas such as books and music. The key to this method is selecting the most suitable recommenders, for which various similarity measures have been studied. To improve recommendation performance, this study analyzes the problems of existing similarity-based recommender selection methods and presents a method that dynamically determines recommenders based on the rate of co-rated items as well as similarity. Experiments with varying thresholds showed that the proposed method greatly improved both prediction and recommendation quality and, in particular, that it improved performance with only a few recommenders satisfying the given thresholds.
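
Selecting recommenders by two thresholds, one on similarity and one on the rate of co-rated items, can be sketched as follows. The sketch assumes Pearson correlation over co-rated items and defines the co-rated rate as the share of co-rated items among all items either user rated; the paper's exact definitions are not given in the abstract, and all names and ratings here are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den = math.sqrt(
        sum((u[i] - mean_u) ** 2 for i in common)
        * sum((v[i] - mean_v) ** 2 for i in common)
    )
    return num / den if den else 0.0

def corated_rate(u, v):
    """Share of co-rated items among all items rated by either user."""
    union = set(u) | set(v)
    return len(set(u) & set(v)) / len(union) if union else 0.0

def select_neighbors(ratings, target, min_sim, min_rate):
    """Keep users who pass both the similarity and the co-rated rate thresholds."""
    chosen = []
    for user, items in ratings.items():
        if user == target:
            continue
        if (pearson(ratings[target], items) >= min_sim
                and corated_rate(ratings[target], items) >= min_rate):
            chosen.append(user)
    return chosen

# Invented ratings, for illustration only.
ratings = {
    "a": {"x": 1, "y": 2, "z": 3},
    "b": {"x": 2, "y": 4, "z": 6},
    "c": {"x": 3, "y": 1, "w": 5},
    "d": {"x": 1, "y": 2, "p": 1, "q": 1, "r": 1, "s": 1},
}
print(select_neighbors(ratings, "a", min_sim=0.5, min_rate=0.5))
```

In this toy data, user "d" correlates perfectly with "a" on only two shared items out of seven, so the co-rated threshold filters out a neighbor that similarity alone would accept.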

Face Recognition Based on PCA and LDA Combining Clustering (Clustering을 결합한 PCA와 LDA 기반 얼굴 인식)

  • Guo, Lian-Hua;Kim, Pyo-Jae;Chang, Hyung-Jin;Choi, Jin-Young
    • Proceedings of the IEEK Conference / 2006.06a / pp.387-388 / 2006
  • In this paper, we propose an efficient algorithm based on PCA and LDA combined with K-means clustering, which achieves better face recognition accuracy than Eigenface and Fisherface. In this algorithm, PCA is first used to reduce the dimensionality of the original face images. Second, the truncated face image data are sub-clustered by the K-means clustering method based on Euclidean distance, and all small subclusters are labeled in sequence. Then the LDA method projects the data into a low-dimensional feature space and groups the data so that they are easier to classify. Finally, we use the nearest neighborhood method to determine the label of the test data. To show the recognition accuracy of the proposed algorithm, we performed several simulations using the Yale and ORL (Olivetti Research Laboratory) databases. Simulation results show that the proposed method achieves better recognition accuracy.
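
The last step of this pipeline, assigning a label by nearest neighbor, can be sketched on its own. The sketch assumes the feature vectors have already been projected by PCA and LDA, which are not implemented here; `nearest_neighbor_label`, the vectors, and the labels are hypothetical.

```python
import math

def nearest_neighbor_label(train_vectors, train_labels, query):
    """Return the label of the training vector closest to `query` (Euclidean)."""
    best = min(
        range(len(train_vectors)),
        key=lambda i: math.dist(train_vectors[i], query),
    )
    return train_labels[best]

# Invented low-dimensional features standing in for projected face images.
train_vectors = [(0.1, 0.2), (0.0, 0.3), (0.9, 0.8), (1.0, 0.7)]
train_labels = ["person_1", "person_1", "person_2", "person_2"]

print(nearest_neighbor_label(train_vectors, train_labels, (0.8, 0.9)))
```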


Study on the Take-over Performance of Level 3 Autonomous Vehicles Based on Subjective Driving Tendency Questionnaires and Machine Learning Methods

  • Hyunsuk Kim;Woojin Kim;Jungsook Kim;Seung-Jun Lee;Daesub Yoon;Oh-Cheon Kwon;Cheong Hee Park
    • ETRI Journal / v.45 no.1 / pp.75-92 / 2023
  • Level 3 autonomous vehicles require conditional autonomous driving in which autonomous and manual driving are alternately performed; whether the driver can resume manual driving within a limited time should be examined. This study investigates whether the demographics and subjective driving tendencies of drivers affect the take-over performance. We measured and analyzed the reengagement and stabilization time after a take-over request from the autonomous driving system to manual driving using a vehicle simulator that supports the driver's take-over mechanism. We discovered that the driver's reengagement and stabilization time correlated with the speeding and wild driving tendency as well as driving workload questionnaires. To verify the efficiency of subjective questionnaire information, we tested whether the driver with slow or fast reengagement and stabilization time can be detected based on machine learning techniques and obtained results. We expect to apply these results to training programs for autonomous vehicles' users and personalized human-vehicle interfaces for future autonomous vehicles.

Medical Image Classification and Retrieval Using Ensemble Combination of Visual Descriptors (시각 기술자들의 앙상블 결합을 이용한 의료 영상 분류와 검색)

  • Ki-Hee Park;Jeong-Hee Shim;Byoung-Chul Ko;Jae-Yeal Nam
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.96-99 / 2008
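  • This paper proposes a new algorithm for effectively classifying and retrieving medical images. Because X-ray images, among medical images, have a bright foreground against a dark background, visual descriptors are extracted only from the salient parts of the foreground. First, color features are extracted with the color structure descriptor (H-CSD) at interest points detected by the Harris corner detector, and global and local texture features of the image are extracted with the edge histogram descriptor (EHD). The extracted feature vectors are fed to a multi-class SVM to obtain membership scores for each image. The SVM membership scores for H-CSD and EHD are then combined as an ensemble into a single feature vector, and the K-nearest neighborhood method is used to retrieve the top-K images. Experiments on imageCLEFmed2007 confirmed improved retrieval performance compared with other global-feature or classification-based retrieval methods.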
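
The retrieval step in this abstract can be sketched under simple assumptions: the two sets of SVM membership scores are combined by concatenation, and images are ranked by Euclidean distance to the query vector. The abstract does not state how the ensemble combination or the distance is defined, and the function names and scores below are hypothetical.

```python
import math

def combine_scores(scores_a, scores_b):
    """Concatenate two membership score vectors into one feature vector."""
    return list(scores_a) + list(scores_b)

def retrieve_top_k(database, query, k):
    """Return the names of the k database images closest to `query`."""
    ranked = sorted(database.items(), key=lambda item: math.dist(item[1], query))
    return [name for name, _ in ranked[:k]]

# Invented membership scores for a two-class problem, for illustration only.
database = {
    "img1": combine_scores([0.9, 0.1], [0.8, 0.2]),
    "img2": combine_scores([0.1, 0.9], [0.2, 0.8]),
    "img3": combine_scores([0.8, 0.2], [0.9, 0.1]),
}
query = combine_scores([0.9, 0.1], [0.8, 0.2])

print(retrieve_top_k(database, query, k=2))
```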

Effects of Types and Locational Characteristics of Urban Parks on the Apartment Price (도시공원의 유형 및 입지적 특성이 공동주택가격에 미치는 영향)

  • Lee, Go Eun;Choi, Yeol
    • KSCE Journal of Civil and Environmental Engineering Research / v.36 no.5 / pp.927-936 / 2016
  • This research analyzes the effect of different types of urban parks and their locational characteristics on apartment prices across the entire metropolitan area of Busan, Korea. Although an urban park is an environmental good that influences its surroundings in many ways, most previous studies have underestimated its impact on the value of the surrounding area. This research focuses on the economic value of urban parks by examining their relationship with the value of apartments in the surrounding area, with attention to their physical and objective characteristics. Furthermore, the research emphasizes the different typological characteristics of urban parks in the analysis. In summary, the number of levels (stories) and units of the apartment complex, the ranking of the contractor, the age of a park, and accessibility to sub-central areas are positively related to the price of apartment units. On the other hand, the total area of the apartment complex, the age of the apartments, the distance to the nearest park, and accessibility to the civic-central or regional district are negatively related to the price of apartment units. Having a plan for constructing a park is also positively related to the price. As for the typological characteristics of a park, neighborhood parks, small-sized parks, and sports parks are positively related to the price, while children's parks are negatively related to the price of apartment units. Considering that the price increases as the distance to the nearest park decreases, people prefer to live near the benefits that urban parks provide. To maximize the value and benefits that parks provide, it is necessary to approach them creatively.