• Title/Summary/Keyword: K-nearest neighbor classification


Random projection ensemble adaptive nearest neighbor classification (랜덤 투영 앙상블 기법을 활용한 적응 최근접 이웃 판별분류기법)

  • Kang, Jongkyeong; Jhun, Myoungshic
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.401-410 / 2021
  • K-nearest neighbor classification, though popular in discriminant analysis, has the limitation that it uses a fixed number of neighbors and therefore does not reflect the local characteristics of the data. The adaptive nearest neighbor method was developed to select the number of neighbors according to the local structure of the data. In the analysis of high-dimensional data, it is common to perform dimension reduction, such as random projection, before applying k-nearest neighbor classification, and an ensemble technique has recently been developed that carefully combines the results of such randomly projected classifiers and makes the final assignment by voting. In this paper, we propose a novel discriminant classification technique that combines the adaptive nearest neighbor method with the random projection ensemble technique for the analysis of high-dimensional data. Through simulations and real-data analyses, we confirm that the proposed method outperforms previously developed methods in classification accuracy.
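
The abstract describes projecting the data at random many times, running a nearest-neighbor classifier on each projection, and voting on the final label. A minimal sketch of that ensemble idea follows, using scikit-learn's GaussianRandomProjection and a plain KNeighborsClassifier as a stand-in for the paper's adaptive neighbor-selection rule; function and parameter names are illustrative, not taken from the paper.

```python
# Minimal sketch: ensemble of k-NN classifiers, each trained on a random
# projection of the data, with the final label decided by majority vote.
# A fixed-k KNeighborsClassifier stands in for the paper's adaptive
# neighbor-selection rule; assumes integer class labels 0..C-1.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.neighbors import KNeighborsClassifier

def rp_knn_ensemble_predict(X_train, y_train, X_test,
                            n_projections=50, target_dim=5, k=5, seed=0):
    rng = np.random.RandomState(seed)
    votes = []
    for _ in range(n_projections):
        proj = GaussianRandomProjection(n_components=target_dim,
                                        random_state=rng.randint(1 << 30))
        Z_train = proj.fit_transform(X_train)
        Z_test = proj.transform(X_test)
        clf = KNeighborsClassifier(n_neighbors=k).fit(Z_train, y_train)
        votes.append(clf.predict(Z_test))
    votes = np.asarray(votes)                     # shape: (n_projections, n_test)
    # majority vote across the ensemble for each test point
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```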

Fuzzy Kernel K-Nearest Neighbor Algorithm for Image Segmentation (영상 분할을 위한 퍼지 커널 K-nearest neighbor 알고리즘)

  • Choi Byung-In; Rhee Chung-Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.7 / pp.828-833 / 2005
  • Kernel methods have been shown to improve the performance of conventional linear classification algorithms on data sets with complex distributions by mapping the data from the input space into a higher-dimensional feature space [7]. In this paper, we propose a fuzzy kernel K-nearest neighbor (fuzzy kernel K-NN) algorithm, which applies a kernel-based distance measure in the feature space to the fuzzy K-nearest neighbor (fuzzy K-NN) algorithm. With an appropriate choice of kernel function, the proposed algorithm can enhance the performance of the conventional algorithm. Results on several data sets and segmentation results for real images are given to show the validity of the proposed algorithm.
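
The key step described above is replacing the Euclidean distance in fuzzy K-NN with a distance computed in the kernel-induced feature space. A minimal sketch of a fuzzy kernel K-NN decision rule under that idea is given below; the RBF kernel, the fuzzifier m, and all names are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a fuzzy kernel K-NN decision rule (illustrative only):
# neighbor distances are computed in the kernel-induced feature space, and
# class memberships are fuzzy-weighted by inverse distance.
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_distance_sq(x, y, kernel):
    # ||phi(x) - phi(y)||^2 = K(x, x) - 2 K(x, y) + K(y, y)
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def fuzzy_kernel_knn(X_train, y_train, x, k=5, m=2.0, kernel=rbf_kernel):
    X_train, y_train = np.asarray(X_train), np.asarray(y_train)
    d2 = np.array([kernel_distance_sq(x, xi, kernel) for xi in X_train])
    nn = np.argsort(d2)[:k]                       # k nearest neighbors in feature space
    weights = 1.0 / (d2[nn] ** (1.0 / (m - 1)) + 1e-12)
    classes = np.unique(y_train)
    # fuzzy membership of x in each class: distance-weighted neighbor votes
    memberships = np.array([weights[y_train[nn] == c].sum() for c in classes])
    memberships /= memberships.sum()
    return classes[np.argmax(memberships)], dict(zip(classes, memberships))
```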

Enhancement of Text Classification Method (텍스트 분류 기법의 발전)

  • Shin, Kwang-Seong; Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.155-156 / 2019
  • Traditional machine learning-based sentiment analysis methods such as Classification and Regression Trees (CART), Support Vector Machines (SVM), and k-nearest neighbor classification (kNN) have limited accuracy. In this paper, we propose an improved kNN classification method; the improved method, combined with data normalization, achieves higher accuracy. The three classification algorithms and the improved algorithm are then compared on experimental data.
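
The abstract names data normalization as part of the improvement; as one hypothetical illustration of that step (not necessarily the authors' exact method), min-max scaling can be fitted on the training data and applied before k-NN:

```python
# Minimal sketch: min-max normalization before k-NN so that no single feature
# dominates the distance computation. Illustrative only; not necessarily the
# paper's exact "improved kNN" method.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def minmax_fit(X):
    lo, hi = X.min(axis=0), X.max(axis=0)
    return lo, np.where(hi > lo, hi - lo, 1.0)    # guard against zero-range features

def minmax_apply(X, lo, span):
    return (X - lo) / span

def normalized_knn_predict(X_train, y_train, X_test, k=5):
    lo, span = minmax_fit(X_train)
    clf = KNeighborsClassifier(n_neighbors=k).fit(minmax_apply(X_train, lo, span), y_train)
    return clf.predict(minmax_apply(X_test, lo, span))
```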


Performance Improvement of Nearest-neighbor Classification Learning through Prototype Selections (프로토타입 선택을 이용한 최근접 분류 학습의 성능 개선)

  • Hwang, Doo-Sung
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.2 / pp.53-60 / 2012
  • Nearest-neighbor classification predicts the class of an input instance as the most frequent class among the training data nearest to it. Although nearest-neighbor classification has no training stage, all of the training data are needed at prediction time, and generalization performance depends on the quality of the training data. Consequently, as the size of the training data increases, nearest-neighbor classification requires a large amount of memory and long computation times for prediction. In this paper, we propose a prototype selection algorithm that predicts the class of test data using a new set of prototypes drawn from near-boundary training data. Based on Tomek links and a distance metric, the proposed algorithm selects boundary data and decides whether each selected point is added to the prototype set by considering class and distance relationships. In the experiments, the number of prototypes is much smaller than the size of the original training data, yielding reduced storage and fast prediction in nearest-neighbor classification.
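
The abstract builds its prototype set from near-boundary points identified with Tomek links. The sketch below shows the Tomek-link part only, assuming Euclidean distance; the paper's additional class/distance criteria for accepting a prototype are not reproduced, and a simple fallback to the full training set is added when no links are found.

```python
# Minimal sketch: collect Tomek links (pairs of mutual nearest neighbors with
# different classes) as near-boundary prototypes, then run 1-NN on them.
# Illustrative only; the paper applies further class/distance criteria before
# adding a point to the prototype set.
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.neighbors import KNeighborsClassifier

def tomek_link_prototypes(X, y):
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)
    nn = D.argmin(axis=1)                          # each point's nearest neighbor
    keep = set()
    for i, j in enumerate(nn):
        # Tomek link: i and j are each other's nearest neighbor and their classes differ
        if nn[j] == i and y[i] != y[j]:
            keep.update((i, j))
    if not keep:                                   # fallback: no boundary pairs found
        return X, y
    idx = np.array(sorted(keep))
    return X[idx], y[idx]

def prototype_nn_predict(X_train, y_train, X_test):
    P, yp = tomek_link_prototypes(np.asarray(X_train), np.asarray(y_train))
    return KNeighborsClassifier(n_neighbors=1).fit(P, yp).predict(X_test)
```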

Fuzzy K-Nearest Neighbor Algorithm based on Kernel Method (커널 기반의 퍼지 K-Nearest Neighbor 알고리즘)

  • Choi Byung-In; Rhee Frank Chung-Hoon
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2005.11a / pp.267-270 / 2005
  • Kernel functions can improve the performance of conventional linear classification algorithms on data with complex distributions by mapping the data into a high-dimensional feature space. In this paper, we propose a fuzzy kernel K-nearest neighbor (FKKNN) algorithm that applies a kernel-based distance measure in the feature space to the fuzzy K-nearest neighbor algorithm, in place of the conventional Euclidean distance measure. With an appropriate choice of kernel function for the data, the proposed algorithm can improve the performance of the existing algorithm. Experimental results on several data sets are analyzed to show the validity of the proposed algorithm.
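
The replacement of the Euclidean distance rests on the standard kernel-trick identity, which lets the feature-space distance be evaluated without computing the mapping φ explicitly (a general fact, stated here in generic notation rather than the paper's):

```latex
% Feature-space distance computed through the kernel, with no explicit \phi:
\[
\lVert \phi(\mathbf{x}) - \phi(\mathbf{y}) \rVert^{2}
  = K(\mathbf{x},\mathbf{x}) - 2\,K(\mathbf{x},\mathbf{y}) + K(\mathbf{y},\mathbf{y}).
\]
```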


Optimal dwelling time prediction for package tour using K-nearest neighbor classification algorithm

  • Aria Bisma Wahyutama; Mintae Hwang
    • ETRI Journal / v.46 no.3 / pp.473-484 / 2024
  • We introduce a machine learning-based web application that helps travel agents plan package tour schedules. K-nearest neighbor (KNN) classification predicts tourists' optimal dwelling times from a variety of information so that a convenient tour schedule can be generated automatically. A database collected in collaboration with an established travel agency is fed into the KNN algorithm, implemented in Python, and the predicted dwelling times are sent to the web application via a RESTful application programming interface provided by the Flask framework. The web application displays a page on which agents can configure the initial data, predict the optimal dwelling times, and automatically update the tour schedule. In a performance evaluation that simulated this scenario on a computer running the Windows operating system, the average response time was 1.762 s and the prediction consistency was 100% over 100 iterations.
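
The abstract describes a KNN model implemented in Python and exposed to the web application through a Flask REST API. A minimal, hypothetical sketch of that serving pattern is given below; the endpoint path, feature fields, and toy training data are assumptions, not taken from the paper.

```python
# Minimal sketch: serve a trained KNN model behind a Flask REST endpoint,
# in the spirit of the setup described in the abstract. The endpoint path,
# feature encoding, and toy data are hypothetical.
import numpy as np
from flask import Flask, request, jsonify
from sklearn.neighbors import KNeighborsClassifier

app = Flask(__name__)

# Toy training data: [group_size, attraction_type_code, season_code] -> dwelling time (min)
X_train = np.array([[20, 0, 1], [35, 1, 2], [15, 2, 0], [40, 0, 3]])
y_train = np.array([60, 90, 45, 120])              # discrete dwelling-time labels
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

@app.route("/predict_dwelling_time", methods=["POST"])
def predict_dwelling_time():
    payload = request.get_json()                   # e.g. {"features": [25, 1, 2]}
    features = np.array(payload["features"]).reshape(1, -1)
    minutes = int(model.predict(features)[0])
    return jsonify({"predicted_dwelling_time_min": minutes})

if __name__ == "__main__":
    app.run()                                      # development server only
```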

Fast Automatic Modulation Classification by MDC and kNNC (MDC와 kNNC를 이용한 고속 자동변조인식)

  • Park, Cheol-Sun; Yang, Jong-Won; Nah, Sun-Phil; Jang, Won
    • Journal of the Korea Institute of Military Science and Technology / v.10 no.4 / pp.88-96 / 2007
  • This paper discusses fast modulation classifiers capable of classifying both analog and digital modulation signals in wireless communication applications. A total of seven statistical signal features are extracted and used to classify nine types of modulated signals. We investigate the performance of two types of fast modulation classifiers (two nearest neighbor classifiers and two minimum distance classifiers) and compare them with existing state-of-the-art classification methods such as an SVM classifier. Computer simulations indicate good performance on an AWGN channel, even at low signal-to-noise ratios, for the minimum distance classifiers (MDC) and the k-nearest neighbor classifiers (kNNC). Besides good accuracy, these classifiers are ideal candidates for real-time software radio because of their fast modulation classification capability.
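
As a rough illustration of the two fast classifiers compared in the abstract, the sketch below pairs a minimum distance classifier (nearest class centroid) with a k-nearest-neighbor classifier over per-signal feature vectors; the seven statistical features themselves are not reproduced here, and all names are illustrative.

```python
# Minimal sketch of the two fast classifiers discussed in the abstract:
# a minimum distance classifier (MDC, nearest class centroid) and a k-NN
# classifier (kNNC), both operating on statistical feature vectors extracted
# from each received signal (feature extraction itself is not shown).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def mdc_fit(X, y):
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def mdc_predict(X, classes, centroids):
    # assign each feature vector to the class with the nearest centroid
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]

def knnc_predict(X_train, y_train, X_test, k=5):
    return KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train).predict(X_test)
```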

An Improved Text Classification Method for Sentiment Classification

  • Wang, Guangxing; Shin, Seong Yoon
    • Journal of information and communication convergence engineering / v.17 no.1 / pp.41-48 / 2019
  • In recent years, sentiment analysis has become a popular research topic, and its results have been applied successfully in practice, for example in Amazon's book recommendation system and North American movie box-office evaluation systems. Analyzing big data based on user preferences and evaluations, and recommending best-selling books and highly rated movies to users in a targeted manner, greatly improves book sales and movie attendance [1, 2]. However, traditional machine learning-based sentiment analysis methods such as Classification and Regression Trees (CART), Support Vector Machines (SVM), and k-nearest neighbor classification (kNN) perform poorly in terms of accuracy. In this paper, an improved kNN classification method is proposed. Through the improved method and normalization of the data, accuracy is increased. The three classification algorithms and the improved algorithm were then compared on experimental data. Experiments show that the improved kNN classification method performs best, with the accuracy rate improved by 11.5% and the precision rate by 20.3%.
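
A minimal sketch of the kind of comparison the abstract describes is given below: CART, SVM, and plain kNN against a kNN pipeline with feature normalization, all evaluated on the same split. The dataset (iris, standing in for the sentiment feature vectors) and the normalized-kNN variant are placeholders, not the authors' data or exact method.

```python
# Minimal sketch of the comparison described in the abstract: CART, SVM, and
# plain k-NN versus k-NN with feature normalization, on one train/test split.
# The dataset and the "improved" variant are placeholders.
from sklearn.datasets import load_iris            # stand-in for sentiment feature vectors
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "CART": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "kNN + normalization": make_pipeline(MinMaxScaler(),
                                         KNeighborsClassifier(n_neighbors=5)),
}
for name, model in models.items():
    acc = accuracy_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
    print(f"{name}: accuracy = {acc:.3f}")
```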

Probabilistic K-nearest neighbor classifier for detection of malware in android mobile (안드로이드 모바일 악성 앱 탐지를 위한 확률적 K-인접 이웃 분류기)

  • Kang, Seungjun; Yoon, Ji Won
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.4 / pp.817-827 / 2015
  • In modern society, people have a close relationship with their smartphones, which makes it easier for hackers to obtain users' information by installing malware on a smartphone without the user's consent. Such actions threaten users' privacy. Malware differs from general applications in the permissions it requests. In this paper, we propose a new method for classifying applications by the permissions each application requests, using Principal Component Analysis (PCA) and a Probabilistic K-Nearest Neighbor (PKNN) classifier. Combining these methods improves the separation of malware from general applications. Using K-fold cross-validation, the precision of PKNN is improved compared with the conventional K-Nearest Neighbor (KNN) classifier, and classification problems that are difficult for KNN can be solved by PKNN by optimizing the parameters k and β. The samples used in this experiment are based on the Contagio dataset.
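
The pipeline described above reduces app permission vectors with PCA and then applies a nearest-neighbor rule. In the sketch below, a standard distance-weighted KNeighborsClassifier stands in for the paper's probabilistic K-NN (PKNN), and the permission matrix is synthetic.

```python
# Minimal sketch: reduce binary app-permission vectors with PCA, then classify
# malware vs. benign with a nearest-neighbor rule. A distance-weighted k-NN
# stands in for the paper's probabilistic k-NN (PKNN); the data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_apps, n_permissions = 200, 50
X = rng.integers(0, 2, size=(n_apps, n_permissions)).astype(float)  # 1 = permission requested
y = rng.integers(0, 2, size=n_apps)                                  # 1 = malware, 0 = benign

pipeline = make_pipeline(PCA(n_components=10),
                         KNeighborsClassifier(n_neighbors=5, weights="distance"))
pipeline.fit(X, y)
print(pipeline.predict(X[:5]))                     # predicted labels for the first five apps
```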

Classification of Traffic Flows into QoS Classes by Unsupervised Learning and KNN Clustering

  • Zeng, Yi; Chen, Thomas M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.2 / pp.134-146 / 2009
  • Traffic classification seeks to assign packet flows to an appropriate quality of service (QoS) class based on flow statistics, without the need to examine packet payloads. Classification proceeds in two steps: classification rules are first built by analyzing traffic traces, and the rules are then evaluated on test data. In this paper, we use self-organizing maps and K-means clustering as unsupervised machine learning methods to identify the inherent classes in traffic traces. Three clusters were discovered, corresponding to transactional, bulk data transfer, and interactive applications. The K-nearest neighbor classifier was found to be highly accurate on the traffic data and significantly better than a minimum mean distance classifier.
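
A minimal sketch of the two-step procedure described above: an unsupervised clustering step discovers QoS classes from flow statistics, and a K-nearest-neighbor classifier trained on the cluster labels assigns new flows. K-means stands in for both the SOM and K-means steps, and the flow-statistic vectors are synthetic.

```python
# Minimal sketch of the two-step procedure in the abstract:
# 1) unsupervised clustering of flow statistics discovers QoS classes,
# 2) a k-NN classifier trained on those cluster labels assigns new flows.
# K-means stands in for both the SOM and K-means steps; the flow features
# (e.g. mean packet size, duration, bytes transferred) are synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
flows = rng.random((300, 3))                       # synthetic flow-statistic vectors

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(flows)
qos_labels = kmeans.labels_                        # e.g. transactional / bulk transfer / interactive

knn = KNeighborsClassifier(n_neighbors=5).fit(flows, qos_labels)
new_flows = rng.random((5, 3))
print(knn.predict(new_flows))                      # QoS class assignments for unseen flows
```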