• Title/Summary/Keyword: K-nearest neighbor technique


A Stochastic Approach for Prediction of Partially Measured Concentrations of Benzo[a]pyrene in the Ambient Air in Korea

  • Kim, Yongku;Seo, Young-Kyo;Baek, Kyung-Min;Kim, Min-Ji;Baek, Sung-Ok
    • Asian Journal of Atmospheric Environment / v.10 no.4 / pp.197-207 / 2016
  • Large quantities of air pollutants are released into the atmosphere and must therefore be monitored and routinely assessed for their health implications. This paper proposes a stochastic technique to predict unobserved hazardous air pollutants (HAPs), especially Benzo[a]pyrene (BaP), which can have negative effects on human health. The proposed approach constructs a nearest-neighbor structure that incorporates the linkage between BaP and meteorological effects. It predicts unobserved BaP concentrations from observed (or forecasted) meteorological conditions, including temperature, precipitation, wind speed, and air quality. The effects of BaP on human health are examined by characterizing the cancer risk. Efficient prediction provides useful information on the optimal monitoring period and on projections of future BaP concentrations for both industrial and residential areas within Korea.
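
The nearest-neighbor idea in this abstract can be illustrated with a minimal sketch. It is not the paper's stochastic model: it simply averages the BaP values of the k most similar days. The feature choice, the numbers, and the function name `knn_predict` are made up for illustration, and in practice the features would need to be scaled to comparable ranges first.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training rows closest to `query`."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical days: (temperature in deg C, precipitation in mm, wind speed in m/s)
met = [(2.0, 0.0, 1.5), (4.0, 0.0, 2.0), (25.0, 10.0, 3.0), (27.0, 5.0, 2.5)]
bap = [1.2, 1.0, 0.2, 0.3]  # made-up BaP concentrations for those days

# Predict BaP for a day with no measurement from its two most similar days.
estimate = knn_predict(met, bap, query=(3.0, 0.0, 1.8), k=2)
```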

Optimization of Case-based Reasoning Systems using Genetic Algorithms: Application to Korean Stock Market (유전자 알고리즘을 이용한 사례기반추론 시스템의 최적화: 주식시장에의 응용)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul;Han, In-Goo
    • Asia pacific journal of information systems / v.16 no.1 / pp.71-84 / 2006
  • Case-based reasoning (CBR) is a reasoning technique that reuses past cases to find a solution to a new problem. It often shows significant promise for improving the effectiveness of complex and unstructured decision making, and for this reason it has been applied to various problem-solving areas including manufacturing, finance, and marketing. However, designing appropriate case indexing and retrieval mechanisms to improve the performance of CBR is still a challenging issue. Most previous studies on CBR have focused on the similarity function or on the optimization of case features and their weights. According to some prior research, however, finding the optimal k parameter for k-nearest neighbor (k-NN) retrieval is also crucial for improving the performance of a CBR system. Despite this, there have been few attempts to optimize the number of neighbors, especially using artificial intelligence (AI) techniques. In this study, we introduce a genetic algorithm (GA) to optimize the number of neighbors to combine, and we apply this novel approach to the Korean stock market. Experimental results show that the GA-optimized k-NN approach outperforms other AI techniques for stock market prediction.
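
As a rough illustration of searching for k with a genetic algorithm, the sketch below evolves a population of candidate k values and scores each by leave-one-out accuracy. The encoding (one integer gene), the operators, the toy data, and all names are assumptions made for this example; the abstract does not describe the paper's actual GA design.

```python
import math
import random
from collections import Counter

def loo_accuracy(points, labels, k):
    """Leave-one-out accuracy of a majority-vote k-NN classifier."""
    hits = 0
    for i, p in enumerate(points):
        others = sorted((math.dist(p, q), labels[j])
                        for j, q in enumerate(points) if j != i)
        votes = Counter(lab for _, lab in others[:k])
        hits += votes.most_common(1)[0][0] == labels[i]
    return hits / len(points)

def ga_select_k(points, labels, max_k, pop_size=6, generations=10, seed=0):
    """Evolve integer k values; fitness is leave-one-out accuracy."""
    rng = random.Random(seed)

    def fitness(k):
        return loo_accuracy(points, labels, k)

    population = [rng.randint(1, max_k) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]                       # elitism: keep the two fittest
        while len(next_gen) < pop_size:
            a = max(rng.sample(population, 2), key=fitness)  # tournament selection
            b = max(rng.sample(population, 2), key=fitness)
            child = (a + b) // 2                    # arithmetic crossover
            if rng.random() < 0.3:                  # mutation: small random step
                child += rng.choice([-2, -1, 1, 2])
            next_gen.append(min(max_k, max(1, child)))
        population = next_gen
    return max(population, key=fitness)

# Made-up two-class cases; the labels stand in for "down" and "up" market moves.
points = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.4, 0.3), (0.5, 0.5),
          (0.9, 0.8), (1.1, 1.0), (1.0, 1.2), (1.2, 0.9), (0.8, 1.1)]
labels = ["down"] * 5 + ["up"] * 5
best_k = ga_select_k(points, labels, max_k=7)
```

With a search space this small, trying every k would be simpler; a GA becomes more attractive when k is optimized jointly with feature weights or other retrieval parameters.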

Improvement of location positioning using KNN, Local Map Classification and Bayes Filter for indoor location recognition system

  • Oh, Seung-Hoon;Maeng, Ju-Hyun
    • Journal of the Korea Society of Computer and Information / v.26 no.6 / pp.29-35 / 2021
  • In this paper, we propose a method that combines KNN (K-Nearest Neighbor), Local Map Classification, and a Bayes Filter to increase the accuracy of location positioning. First, Local Map Classification divides the actual map into several clusters, and the clusters are then classified by KNN. The Bayes Filter then calculates the posterior probability from the probability of each cluster, and this posterior probability is used to find the cluster in which the robot is located. For performance evaluation, the positioning results obtained by applying KNN, Local Map Classification, and the Bayes Filter were analyzed. The analysis confirmed that, even when the RSSI changes, the location estimate stays fixed to one cluster and the accuracy of location positioning increases.
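
The interaction between the KNN vote and the Bayes filter can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a static robot (no motion model), uses the smoothed fraction of KNN votes per cluster as the measurement likelihood, and all fingerprints, readings, and function names are invented.

```python
import math
from collections import Counter

def knn_cluster_likelihood(fingerprints, clusters, rssi, k, n_clusters, eps=0.05):
    """Share of the k nearest fingerprints in each cluster, smoothed by eps."""
    ranked = sorted(zip((math.dist(f, rssi) for f in fingerprints), clusters))
    votes = Counter(c for _, c in ranked[:k])
    return [votes.get(c, 0) / k + eps for c in range(n_clusters)]

def bayes_update(prior, likelihood):
    """Posterior is proportional to likelihood times prior, then normalized."""
    unnorm = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Made-up RSSI fingerprints (dBm from two access points) and their map clusters.
fingerprints = [(-40, -70), (-42, -68), (-55, -55), (-57, -53), (-70, -40), (-68, -42)]
clusters = [0, 0, 1, 1, 2, 2]

# The third reading is a hypothetical noisy jump that KNN alone assigns to cluster 1.
readings = [(-41, -69), (-43, -67), (-54, -56), (-41, -70)]

belief = [1 / 3] * 3          # uniform prior over the three clusters
estimates = []
for rssi in readings:
    likelihood = knn_cluster_likelihood(fingerprints, clusters, rssi, k=2, n_clusters=3)
    belief = bayes_update(belief, likelihood)
    estimates.append(belief.index(max(belief)))
```

Because the belief accumulated from earlier readings outweighs the single outlier, the estimate stays in cluster 0 throughout, which mirrors the abstract's observation that the location stays fixed to one cluster when the RSSI fluctuates.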

Improving Web Service Recommendation using Clustering with K-NN and SVD Algorithms

  • Weerasinghe, Amith M.;Rupasingha, Rupasingha A.H.M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1708-1727 / 2021
  • At the advent of the twenty-first century, human beings began to interact closely with technology. As technology has developed, the World Wide Web (WWW) has come to occupy a very important place on the Internet, and Web services fulfill a significant role in it. Because so many Web services are available on the Internet, it is difficult to find the ones that match a user's needs, and recommendation systems can help solve this problem. This paper starts from the observation that recommendation methods such as collaborative filtering (CF) suffer from the data sparsity and cold-start problems. To overcome these problems, we first applied ontology-based clustering and then the k-nearest neighbor (KNN) algorithm within each cluster, which effectively increased the data density using past user interests. User ratings were then predicted with a model-based approach, singular value decomposition (SVD), and the predictions were used for recommendation. The evaluation results showed that the proposed approach achieves a lower prediction error rate and higher accuracy than existing recommendation methods.

Study of Localization Based on Fingerprinting Technique Using Uplink CSI in Cloud Radio Access Network (클라우드 무선접속 네트워크에서 상향링크 채널 상태 정보를 이용한 핑거프린팅 기반 실내 측위에 관한 연구 시스템)

  • Woo, Sangwoo;Lee, Sangheon;Mun, Cheol
    • The Journal of Korean Institute of Information Technology / v.17 no.2 / pp.71-77 / 2019
  • With 5G standardization proceeding in earnest and demand for indoor localization services increasing, indoor location recognition is being studied in various industrial fields, with fingerprinting techniques based on Wireless Local Area Network (WLAN) as a representative example. In this paper, we propose an indoor positioning system based on a fingerprinting technique that uses the Cloud Radio Access Network (C-RAN) architecture and Channel State Information (CSI). To improve indoor positioning performance, we combined the existing fingerprinting method with K-nearest neighbor (KNN), which is a machine learning technique. The performance improvement of the proposed indoor positioning system was verified by comparative experiments against the existing localization technique in an indoor localization testbed.

Estimation of Forest Biomass based upon Satellite Data and National Forest Inventory Data (위성영상자료 및 국가 산림자원조사 자료를 이용한 산림 바이오매스 추정)

  • Yim, Jong-Su;Han, Won-Sung;Hwang, Joo-Ho;Chung, Sang-Young;Cho, Hyun-Kook;Shin, Man-Yong
    • Korean Journal of Remote Sensing / v.25 no.4 / pp.311-320 / 2009
  • This study was carried out to estimate forest biomass and to produce a forest biomass thematic map for Muju county by combining field data from the 5th National Forest Inventory (2006-2007) with satellite data. To estimate forest biomass, two methods were examined using a Landsat TM-5 image (taken on April 28, 2005) and field data: multi-variant regression modeling and the k-Nearest Neighbor (k-NN) technique. The biomass estimates from the two methods were compared by cross-validation. The results showed that the two methods provide comparably accurate estimates, with similar RMSE (63.75 to 67.26 ton/ha) and mean bias (±1 ton/ha). However, the k-NN method was concluded to be superior to the regression model in terms of estimation efficiency. The total forest biomass of the study site is estimated by the k-NN technique at 8.4 million tons, or 149 ton/ha.
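
The two accuracy measures quoted here, RMSE and mean bias from cross-validation, can be computed as in the following sketch. It assumes leave-one-out cross-validation and a plain (unweighted) k-NN mean; the abstract does not state which cross-validation scheme was used, and the sample values and function names are invented.

```python
import math

def knn_estimate(train, query, k):
    """Mean target of the k training samples nearest to `query` in feature space."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return sum(y for _, y in nearest) / k

def leave_one_out(samples, k):
    """Return (RMSE, mean bias), predicting each sample from all the others."""
    errors = []
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        errors.append(knn_estimate(rest, x, k) - y)   # predicted minus observed
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return rmse, bias

# Made-up field plots: (two spectral band values) -> biomass in ton/ha.
plots = [((0.21, 0.35), 140.0), ((0.25, 0.33), 155.0), ((0.40, 0.20), 90.0),
         ((0.38, 0.22), 100.0), ((0.30, 0.28), 120.0)]
rmse, bias = leave_one_out(plots, k=2)
```

A mean bias near zero with a large RMSE, as reported in the abstract, indicates that errors on individual plots are sizeable but largely cancel out over the study area.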

Comparison of Forest Growing Stock Estimates by Distance-Weighting and Stratification in k-Nearest Neighbor Technique (거리 가중치와 층화를 이용한 최근린기반 임목축적 추정치의 정확도 비교)

  • Yim, Jong Su;Yoo, Byung Oh;Shin, Man Yong
    • Journal of Korean Society of Forest Science / v.101 no.3 / pp.374-380 / 2012
  • The k-Nearest Neighbor (kNN) technique is widely applied to assess forest resources at the county level and to provide spatial information by combining large-area forest inventory data with remote sensing data. In this study, two approaches, distance weighting and stratification of the training dataset, were compared as ways to improve kNN-based estimates of forest growing stock. Across five distance weights (0 to 2 in steps of 0.5), the accuracy of the kNN-based estimates was very similar, with mean deviations within ±0.6 m³/ha. The training dataset was stratified by horizontal reference area (HRA) and by forest cover type, applied separately and in combination. Although combining forest cover type with an HRA of 100 km slightly improved the accuracy of the estimates, stratification by forest cover type alone was more efficient, given a sufficient number of training data. The mean forest growing stock estimated by kNN with an HRA of 100 km and stratification by forest cover type at k=7 was somewhat lower (by 5 m³/ha) than the figure in the 2011 Statistical Yearbook of Forestry.
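
Distance weighting in kNN is commonly implemented by weighting each neighbor by the inverse of its distance raised to a power t. The sketch below assumes that form, with t taking the five values 0, 0.5, 1, 1.5, and 2; the abstract does not give the exact weighting function, and the data and the name `weighted_knn` are illustrative.

```python
import math

def weighted_knn(train, query, k, t):
    """Weight each of the k nearest neighbors by 1 / d**t (t = 0 is a plain mean)."""
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    # Guard against division by zero when the query coincides with a training point.
    weights = [1.0 / max(d, 1e-9) ** t for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

# Made-up plots: (single spectral feature) -> growing stock in m^3/ha.
train = [((0.0,), 10.0), ((3.0,), 40.0), ((10.0,), 90.0)]

# Larger t pulls the estimate toward the closest neighbor.
estimates = {t: weighted_knn(train, (1.0,), k=2, t=t) for t in (0, 0.5, 1, 1.5, 2)}
```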

Malicious Code Detection using the Effective Preprocessing Method Based on Native API (Native API 의 효과적인 전처리 방법을 이용한 악성 코드 탐지 방법에 관한 연구)

  • Bae, Seong-Jae;Cho, Jae-Ik;Shon, Tae-Shik;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.22 no.4 / pp.785-796 / 2012
  • In this paper, we propose an effective behavior-based detection technique that uses the frequency of system calls to detect malicious code when the number of training samples is smaller than the number of system-call properties. We collect Native API calls, which are Windows kernel data generated by running program code, and adopt the normalized frequency of Native API calls as the basic properties. These basic properties are then transformed into new properties by GLDA (Generalized Linear Discriminant Analysis), an effective method for discriminating between malicious and normal code even when the number of training samples is smaller than the number of properties. To detect malicious code, kNN (k-Nearest Neighbor) classification, described in the paper as one of the Bayesian classification techniques, was used. We compared the proposed detection method with other methods on the collected Native API data to verify its efficiency. The results show that the proposed method has a lower false positive rate than the other methods at the threshold where the detection rate is 100%.
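
The feature-extraction step can be sketched as below: each program trace is turned into a vector of normalized call frequencies, and a new trace is labeled by its nearest training vector. The GLDA transformation is omitted, k is fixed to 1 for brevity, and the traces, labels, three-call vocabulary, and function names are all made up for illustration.

```python
import math
from collections import Counter

def normalized_frequencies(calls, vocabulary):
    """Fraction of the trace made up by each call in `vocabulary`."""
    counts = Counter(calls)
    total = len(calls) or 1            # avoid dividing by zero on an empty trace
    return [counts[name] / total for name in vocabulary]

def nearest_label(train, features):
    """1-NN: return the label of the closest training vector."""
    return min(train, key=lambda s: math.dist(s[0], features))[1]

vocabulary = ["NtCreateFile", "NtWriteFile", "NtOpenKey"]

# Hypothetical labeled traces; calls outside the vocabulary still count toward the total.
traces = [
    (["NtCreateFile", "NtWriteFile", "NtWriteFile", "NtClose"], "normal"),
    (["NtOpenKey", "NtOpenKey", "NtOpenKey", "NtCreateFile"], "malicious"),
]
train = [(normalized_frequencies(calls, vocabulary), label) for calls, label in traces]

unknown = ["NtOpenKey", "NtOpenKey", "NtWriteFile", "NtClose"]
verdict = nearest_label(train, normalized_frequencies(unknown, vocabulary))
```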

Method for Assessing Landslide Susceptibility Using SMOTE and Classification Algorithms (SMOTE와 분류 기법을 활용한 산사태 위험 지역 결정 방법)

  • Yoon, Hyung-Koo
    • Journal of the Korean Geotechnical Society / v.39 no.6 / pp.5-12 / 2023
  • Proactive assessment of landslide susceptibility is necessary for minimizing casualties. This study proposes a methodology for classifying the landslide safety factor using machine learning classification algorithms. The high-risk area model is adopted to perform the classification, with eight geotechnical parameters as inputs. Four classification algorithms (decision tree, k-nearest neighbor, logistic regression, and random forest) are compared in terms of classification accuracy for safety factors ranging from 1.2 to 2.0. High accuracy is obtained in the safety factor range of 1.2 to 1.7, but relatively low accuracy in the range of 1.8 to 2.0. To overcome this issue, the synthetic minority over-sampling technique (SMOTE) is adopted to generate additional data. Applying SMOTE improves the average accuracy by ~250% in the safety factor range of 1.8 to 2.0. The results demonstrate that the SMOTE algorithm improves the accuracy of classification algorithms when applied to geotechnical data.
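
SMOTE generates a synthetic minority sample by picking a minority point, choosing one of its k nearest minority neighbors, and interpolating at a random position on the segment between them. The following is a minimal sketch of that core step, using two invented parameters per sample instead of the study's eight; the function name and data are illustrative.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points on segments between minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the chosen point, excluding itself
        neighbors = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()             # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Made-up minority-class samples, e.g. the sparsely populated high safety factor range.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority, n_new=5)
```

The synthetic points are added to the training set only; evaluation should still use real observations so that the reported accuracy is not inflated.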

Forest Thematic Maps and Forest Statistics Using the k-Nearest Neighbor Technique for Pyeongchang-Gun, Gangwon-Do (kNN 기법을 이용한 강원도 평창군의 산림 주제도 작성과 산림통계량 추정)

  • Yim, Jong-Su;Kong, Gee Su;Kim, Sung Ho;Shin, Man Yong
    • Journal of Korean Society of Forest Science / v.96 no.3 / pp.259-268 / 2007
  • This study was conducted to produce forest thematic maps and estimate forest statistics for Pyeongchang-Gun using the kNN technique. In forest inventories, this technique has been applied to produce thematic maps of variables of interest, including for unobserved plots, by combining field plot data, remotely sensed data, and other digital map data. The estimation errors for three horizontal reference areas (HRAs), with radii of 20, 40, and 60 km, were compared. Although the precision for the 40 km radius was lower than that for the 60 km radius, the 40 km radius was found to be an efficient HRA because the difference in precision was modest. With k=5 nearest neighbors for the selected HRA, the overall accuracy was high. Accordingly, thematic maps of the number of trees, basal area, and growing stock per hectare were generated using k=5 neighbors within the HRA of 40 km radius. Compared with the forest statistics based on field sample plots, the means of each parameter estimated from the produced maps were underestimates.
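
A horizontal reference area restricts the neighbor search to field plots within a given ground distance of the target pixel before the usual spectral-distance ranking. The sketch below shows that two-stage selection under simple assumptions: planar coordinates in km, an unweighted mean, and invented plot records and names.

```python
import math

def knn_within_hra(plots, target_xy, target_spectral, k, radius_km):
    """k-NN mean using only field plots within radius_km of the target pixel."""
    eligible = [p for p in plots if math.dist(p["xy"], target_xy) <= radius_km]
    nearest = sorted(
        eligible, key=lambda p: math.dist(p["spectral"], target_spectral)
    )[:k]
    if not nearest:
        raise ValueError("no field plots inside the horizontal reference area")
    return sum(p["value"] for p in nearest) / len(nearest)

# Made-up field plots: ground position (km), one spectral feature, growing stock (m^3/ha).
plots = [
    {"xy": (0.0, 0.0), "spectral": (0.30,), "value": 100.0},
    {"xy": (30.0, 0.0), "spectral": (0.35,), "value": 120.0},
    {"xy": (60.0, 0.0), "spectral": (0.31,), "value": 200.0},
]

# The spectrally closest plot lies 50 km away, so it is used only with the larger HRA.
near = knn_within_hra(plots, (10.0, 0.0), (0.31,), k=2, radius_km=40)
wide = knn_within_hra(plots, (10.0, 0.0), (0.31,), k=2, radius_km=60)
```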