• Title/Summary/Keyword: Improved kNN classification method

Search Result 14, Processing Time 0.025 seconds

An Improved Text Classification Method for Sentiment Classification

  • Wang, Guangxing;Shin, Seong Yoon
    • Journal of information and communication convergence engineering
    • /
    • v.17 no.1
    • /
    • pp.41-48
    • /
    • 2019
  • In recent years, sentiment analysis research has become popular. The research results of sentiment analysis have achieved remarkable results in practical applications, such as in Amazon's book recommendation system and the North American movie box office evaluation system. Analyzing big data based on user preferences and evaluations and recommending hot-selling books and hot-rated movies to users in a targeted manner greatly improve book sales and attendance rate in movies [1, 2]. However, traditional machine learning-based sentiment analysis methods such as the Classification and Regression Tree (CART), Support Vector Machine (SVM), and k-nearest neighbor classification (kNN) had performed poorly in accuracy. In this paper, an improved kNN classification method is proposed. Through the improved method and normalizing of data, the purpose of improving accuracy is achieved. Subsequently, the three classification algorithms and the improved algorithm were compared based on experimental data. Experiments show that the improved method performs best in the kNN classification method, with an accuracy rate of 11.5% and a precision rate of 20.3%.

Enhancement of Text Classification Method (텍스트 분류 기법의 발전)

  • Shin, Kwang-Seong;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2019.05a
    • /
    • pp.155-156
    • /
    • 2019
  • Traditional machine learning based emotion analysis methods such as Classification and Regression Tree (CART), Support Vector Machine (SVM), and k-nearest neighbor classification (kNN) are less accurate. In this paper, we propose an improved kNN classification method. Improved methods and data normalization achieve the goal of improving accuracy. Then, three classification algorithms and an improved algorithm were compared based on experimental data.

  • PDF

An Improved Text Classification (향상된 텍스트 분류)

  • Wang, Guangxing;Shin, Seong-Yoon;Shin, Kwang-Weong;Lee, Hyun-Chang
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2019.01a
    • /
    • pp.125-126
    • /
    • 2019
  • In this paper, we propose an improved kNN classification method. Through improved the mothed and normalizing the data, the purpose of improving the accuracy is achieved. Then we compared the three classification algorithms and the improved algorithm by experimental data.

  • PDF

Automatic Document Classification Based on k-NN Classifier and Object-Based Thesaurus (k-NN 분류 알고리즘과 객체 기반 시소러스를 이용한 자동 문서 분류)

  • Bang Sun-Iee;Yang Jae-Dong;Yang Hyung-Jeong
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.9
    • /
    • pp.1204-1217
    • /
    • 2004
  • Numerous statistical and machine learning techniques have been studied for automatic text classification. However, because they train the classifiers using only feature vectors of documents, ambiguity between two possible categories significantly degrades precision of classification. To remedy the drawback, we propose a new method which incorporates relationship information of categories into extant classifiers. In this paper, we first perform the document classification using the k-NN classifier which is generally known for relatively good performance in spite of its simplicity. We employ the relationship information from an object-based thesaurus to reduce the ambiguity. By referencing various relationships in the thesaurus corresponding to the structured categories, the precision of k-NN classification is drastically improved, removing the ambiguity. Experiment result shows that this method achieves the precision up to 13.86% over the k-NN classification, preserving its recall.

An Improved Genetic Algorithm for Fast Face Detection Using Neural Network as Classifier

  • Sugisaka, Masanori;Fan, Xinjian
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2005.06a
    • /
    • pp.1034-1038
    • /
    • 2005
  • This paper presents a novel method to speed up neural network (NN) based face detection systems. NN-based face detection can be viewed as a classification and search problem. The proposed method formulates the search problem as an integer nonlinear optimization problem (INLP) and develops an improved genetic algorithm (IGA) to solve it. Each individual in the IGA represents a subwindow in an input image. The subwindows are evaluated by how well they match a NN-based face filter. A face is indicated when the filter response of the best particle is above a given threshold. Experimental results show that the proposed method leads to a speedup of 83 on $320{\times}240$ images compared to the traditional exhaustive search method.

  • PDF

Research on Fault Diagnosis of Wind Power Generator Blade Based on SC-SMOTE and kNN

  • Peng, Cheng;Chen, Qing;Zhang, Longxin;Wan, Lanjun;Yuan, Xinpan
    • Journal of Information Processing Systems
    • /
    • v.16 no.4
    • /
    • pp.870-881
    • /
    • 2020
  • Because SCADA monitoring data of wind turbines are large and fast changing, the unbalanced proportion of data in various working conditions makes it difficult to process fault feature data. The existing methods mainly introduce new and non-repeating instances by interpolating adjacent minority samples. In order to overcome the shortcomings of these methods which does not consider boundary conditions in balancing data, an improved over-sampling balancing algorithm SC-SMOTE (safe circle synthetic minority oversampling technology) is proposed to optimize data sets. Then, for the balanced data sets, a fault diagnosis method based on improved k-nearest neighbors (kNN) classification for wind turbine blade icing is adopted. Compared with the SMOTE algorithm, the experimental results show that the method is effective in the diagnosis of fan blade icing fault and improves the accuracy of diagnosis.

A Robust Method for Partially Occluded Face Recognition

  • Xu, Wenkai;Lee, Suk-Hwan;Lee, Eung-Joo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.7
    • /
    • pp.2667-2682
    • /
    • 2015
  • Due to the wide application of face recognition (FR) in information security, surveillance, access control and others, it has received significantly increased attention from both the academic and industrial communities during the past several decades. However, partial face occlusion is one of the most challenging problems in face recognition issue. In this paper, a novel method based on linear regression-based classification (LRC) algorithm is proposed to address this problem. After all images are downsampled and divided into several blocks, we exploit the evaluator of each block to determine the clear blocks of the test face image by using linear regression technique. Then, the remained uncontaminated blocks are utilized to partial occluded face recognition issue. Furthermore, an improved Distance-based Evidence Fusion approach is proposed to decide in favor of the class with average value of corresponding minimum distance. Since this occlusion removing process uses a simple linear regression approach, the completely computational cost approximately equals to LRC and much lower than sparse representation-based classification (SRC) and extended-SRC (eSRC). Based on the experimental results on both AR face database and extended Yale B face database, it demonstrates the effectiveness of the proposed method on issue of partial occluded face recognition and the performance is satisfactory. Through the comparison with the conventional methods (eigenface+NN, fisherfaces+NN) and the state-of-the-art methods (LRC, SRC and eSRC), the proposed method shows better performance and robustness.

An Optimizing Hyperrectangle method for Nearest Hyperrectangle Learning (초월평면 최적화를 이용한 최근접 초월평면 학습법의 성능 향상 방법)

  • Lee, Hyeong-Il
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.3
    • /
    • pp.328-333
    • /
    • 2003
  • NGE (Nested Generalized Exemplars) proposed by Salzberg improved the storage requirement and classification rate of the Memory Based Reasoning. It constructs hyperrectangles during training and performs classification tasks. It worked not bad in many area, however, the major drawback of NGE is constructing hyperrectangles because its hyperrectangle is extended so as to cover the error data and the way of maintaining the feature weight vector. We proposed the OH (Optimizing Hyperrectangle) algorithm which use the feature weight vectors and the ED(Exemplar Densimeter) to optimize resulting Hyperrectangles. The proposed algorithm, as well as the EACH, required only approximately 40% of memory space that is needed in k-NN classifier, and showed a superior classification performance to the EACH. Also, by reducing the number of stored patterns, it showed excellent results in terms of classification when we compare it to the k-NN and the EACH.

Malware Classification System to Support Decision Making of App Installation on Android OS (안드로이드 OS에서 앱 설치 의사결정 지원을 위한 악성 앱 분류 시스템)

  • Ryu, Hong Ryeol;Jang, Yun;Kwon, Taekyoung
    • Journal of KIISE
    • /
    • v.42 no.12
    • /
    • pp.1611-1622
    • /
    • 2015
  • Although Android systems provide a permission-based access control mechanism and demand a user to decide whether to install an app based on its permission list, many users tend to ignore this phase. Thus, an improved method is necessary for users to intuitively make informed decisions when installing a new app. In this paper, with regard to the permission-based access control system, we present a novel approach based on a machine-learning technique in order to support a user decision-making on the fly. We apply the K-NN (K-Nearest Neighbors) classification algorithm with necessary weighted modifications for malicious app classification, and use 152 Android permissions as features. Our experiment shows a superior classification result (93.5% accuracy) compared to other previous work. We expect that our method can help users make informed decisions at the installation step.

Classification of Imbalanced Data Based on MTS-CBPSO Method: A Case Study of Financial Distress Prediction

  • Gu, Yuping;Cheng, Longsheng;Chang, Zhipeng
    • Journal of Information Processing Systems
    • /
    • v.15 no.3
    • /
    • pp.682-693
    • /
    • 2019
  • The traditional classification methods mostly assume that the data for class distribution is balanced, while imbalanced data is widely found in the real world. So it is important to solve the problem of classification with imbalanced data. In Mahalanobis-Taguchi system (MTS) algorithm, data classification model is constructed with the reference space and measurement reference scale which is come from a single normal group, and thus it is suitable to handle the imbalanced data problem. In this paper, an improved method of MTS-CBPSO is constructed by introducing the chaotic mapping and binary particle swarm optimization algorithm instead of orthogonal array and signal-to-noise ratio (SNR) to select the valid variables, in which G-means, F-measure, dimensionality reduction are regarded as the classification optimization target. This proposed method is also applied to the financial distress prediction of Chinese listed companies. Compared with the traditional MTS and the common classification methods such as SVM, C4.5, k-NN, it is showed that the MTS-CBPSO method has better result of prediction accuracy and dimensionality reduction.