• Title/Summary/Keyword: k-NN classification

Search Result 188, Processing Time 0.035 seconds

Comparison of Classification Rate Between BP and ANFIS with FCM Clustering Method on Off-line PD Model of Stator Coil

  • Park Seong-Hee;Lim Kee-Joe;Kang Seong-Hwa;Seo Jeong-Min;Kim Young-Geun
    • KIEE International Transactions on Electrophysics and Applications
    • /
    • v.5C no.3
    • /
    • pp.138-142
    • /
    • 2005
  • In this paper, we compared recognition rates between NN(neural networks) and clustering method as a scheme of off-line PD(partial discharge) diagnosis which occurs at the stator coil of traction motor. To acquire PD data, three defective models are made. PD data for classification were acquired from PD detector. And then statistical distributions are calculated to classify model discharge sources. These statistical distributions were applied as input data of two classification tools, BP(Back propagation algorithm) and ANFIS(adaptive network based fuzzy inference system) pre-processed FCM(fuzzy c-means) clustering method. So, classification rate of BP were somewhat higher than ANFIS. But other items of ANFIS were better than BP; learning time, parameter number, simplicity of algorithm.

Classification Protein Subcellular Locations Using n-Gram Features (단백질 서열의 n-Gram 자질을 이용한 세포내 위치 예측)

  • Kim, Jinsuk
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2007.11a
    • /
    • pp.12-16
    • /
    • 2007
  • The function of a protein is closely co-related with its subcellular location(s). Given a protein sequence, therefore, how to determine its subcellular location is a vitally important problem. We have developed a new prediction method for protein subcellular location(s), which is based on n-gram feature extraction and k-nearest neighbor (kNN) classification algorithm. It classifies a protein sequence to one or more subcellular compartments based on the locations of top k sequences which show the highest similarity weights against the input sequence. The similarity weight is a kind of similarity measure which is determined by comparing n-gram features between two sequences. Currently our method extract penta-grams as features of protein sequences, computes scores of the potential localization site(s) using kNN algorithm, and finally presents the locations and their associated scores. We constructed a large-scale data set of protein sequences with known subcellular locations from the SWISS-PROT database. This data set contains 51,885 entries with one or more known subcellular locations. Our method show very high prediction precision of about 93% for this data set, and compared with other method, it also showed comparable prediction improvement for a test collection used in a previous work.

  • PDF

A Comparative Study on Classification Methods of Sleep Stages by Using EEG

  • Kim, Jinwoo
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.2
    • /
    • pp.113-123
    • /
    • 2014
  • Electrophysiological recordings are considered a reliable method of assessing a person's alertness. Sleep medicine is asked to offer objective methods to measure daytime alertness, tiredness and sleepiness. As EEG signals are non-stationary, the conventional method of frequency analysis is not highly successful in recognition of alertness level. In this paper, EEG signals have been analyzed using wavelet transform as well as discrete wavelet transform and classification using statistical classifiers such as euclidean and mahalanobis distance classifiers and a promising method SVM (Support Vector Machine). As a result of simulation, the average values of accuracies for the Linear Discriminant Analysis (LDA)-Quadratic, k-Nearest Neighbors (k-NN)-Euclidean, and Linear SVM were 48%, 34.2%, and 86%, respectively. The experimental results show that SVM classification method offer the better performance for reliable classification of the EEG signal in comparison with the other classification methods.

Malware Classification System to Support Decision Making of App Installation on Android OS (안드로이드 OS에서 앱 설치 의사결정 지원을 위한 악성 앱 분류 시스템)

  • Ryu, Hong Ryeol;Jang, Yun;Kwon, Taekyoung
    • Journal of KIISE
    • /
    • v.42 no.12
    • /
    • pp.1611-1622
    • /
    • 2015
  • Although Android systems provide a permission-based access control mechanism and demand a user to decide whether to install an app based on its permission list, many users tend to ignore this phase. Thus, an improved method is necessary for users to intuitively make informed decisions when installing a new app. In this paper, with regard to the permission-based access control system, we present a novel approach based on a machine-learning technique in order to support a user decision-making on the fly. We apply the K-NN (K-Nearest Neighbors) classification algorithm with necessary weighted modifications for malicious app classification, and use 152 Android permissions as features. Our experiment shows a superior classification result (93.5% accuracy) compared to other previous work. We expect that our method can help users make informed decisions at the installation step.

Efficient Malware Detector for Android Devices (안드로이드 모바일 단말기를 위한 효율적인 악성앱 감지법)

  • Lee, Hye Lim;Jang, Soohee;Yoon, Ji Won
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.24 no.4
    • /
    • pp.617-624
    • /
    • 2014
  • Smart phone usage has increased exponentially and open source based Android OS occupy significant market share. However, various malicious applications that use the characteristic of Android threaten users. In this paper, we construct an efficient malicious application detector by using the principle component analysis and the incremental k nearest neighbor algorithm, which consider an required permission, of Android applications. The cross validation is exploited in order to find a critical parameter of the algorithm. For the performance evaluation of our approach, we simulate a real data set of Contagio Mobile.

Random Forest Based Abnormal ECG Dichotomization using Linear and Nonlinear Feature Extraction (선형-비선형 특징추출에 의한 비정상 심전도 신호의 랜덤포레스트 기반 분류)

  • Kim, Hye-Jin;Kim, Byeong-Nam;Jang, Won-Seuk;Yoo, Sun-K.
    • Journal of Biomedical Engineering Research
    • /
    • v.37 no.2
    • /
    • pp.61-67
    • /
    • 2016
  • This paper presented a method for random forest based the arrhythmia classification using both heart rate (HR) and heart rate variability (HRV) features. We analyzed the MIT-BIH arrhythmia database which contains half-hour ECG recorded from 48 subjects. This study included not only the linear features but also non-linear features for the improvement of classification performance. We classified abnormal ECG using mean_NN (mean of heart rate), SD1/SD2 (geometrical feature of poincare HRV plot), SE (spectral entropy), pNN100 (percentage of a heart rate longer than 100 ms) affecting accurate classification among combined of linear and nonlinear features. We compared our proposed method with Neural Networks to evaluate the accuracy of the algorithm. When we used the features extracted from the HRV as an input variable for classifier, random forest used only the most contributed variable for classification unlike the neural networks. The characteristics of random forest enable the dimensionality reduction of the input variables, increase a efficiency of classifier and can be obtained faster, 11.1% higher accuracy than the neural networks.

Dynamic Emotion Classification through Facial Recognition (얼굴 인식을 통한 동적 감정 분류)

  • Han, Wuri;Lee, Yong-Hwan;Park, Jeho;Kim, Youngseop
    • Journal of the Semiconductor & Display Technology
    • /
    • v.12 no.3
    • /
    • pp.53-57
    • /
    • 2013
  • Human emotions are expressed in various ways. It can be expressed through language, facial expression and gestures. In particular, the facial expression contains many information about human emotion. These vague human emotion appear not in single emotion, but in combination of various emotion. This paper proposes a emotional expression algorithm using Active Appearance Model(AAM) and Fuzz k- Nearest Neighbor which give facial expression in similar with vague human emotion. Applying Mahalanobis distance on the center class, determine inclusion level between center class and each class. Also following inclusion level, appear intensity of emotion. Our emotion recognition system can recognize a complex emotion using Fuzzy k-NN classifier.

A dominant hyperrectangle generation technique of classification using IG partitioning (정보이득 분할을 이용한 분류기법의 지배적 초월평면 생성기법)

  • Lee, Hyeong-Il
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.1
    • /
    • pp.149-156
    • /
    • 2014
  • NGE(Nested Generalized Exemplar Method) can increase the performance of the noisy data at the same time, can reduce the size of the model. It is the optimal distance-based classification method using a matching rule. NGE cross or overlap hyperrectangles generated in the learning has been noted to inhibit the factors. In this paper, We propose the DHGen(Dominant Hyperrectangle Generation) algorithm which avoids the overlapping and the crossing between hyperrectangles, uses interval weights for mixed hyperrectangles to be splited based on the mutual information. The DHGen improves the classification performance and reduces the number of hyperrectangles by processing the training set in an incremental manner. The proposed DHGen has been successfully shown to exhibit comparable classification performance to k-NN and better result than EACH system which implements the NGE theory using benchmark data sets from UCI Machine Learning Repository.

A Feature Analysis of the Power Quality Problem by PCA (PCA를 이용한 전력품질 특징분석)

  • Lee, Jin-Mok;Hong, Duc-Pyo;Kim, Soo-Cheol;Choi, Jae-Ho;Hong, Hyun-Mun
    • Proceedings of the KIPE Conference
    • /
    • 2005.07a
    • /
    • pp.192-194
    • /
    • 2005
  • Development of nonlinear loads and compensation instruments make PQ(Power Quality) problem into important issue. Few studies by signal processing and pattern classification as NN(Neural Network), Wavelet Transform, and Fuzzy present feature extraction. A lot of Input features make not always good result and they are difficult to make realtime system. Thus, The dimentionality reduction is indispensable process. PCA(Principal Component Analysis) reduces high-dimensional input features onto a lower-dimensional subspace effectively. It will be useful to apply to realtime system and NN.

  • PDF

사례기반추론 모델의 최근접 이웃 설정을 위한 Similarity Threshold의 사용

  • Lee, Jae-Sik;Lee, Jin-Cheon
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2005.11a
    • /
    • pp.588-594
    • /
    • 2005
  • 사례기반추론(Case-Based Reasoning)은 다양한 예측 문제에 있어서 성공적으로 활용되고 있는 데이터마이닝 기법 중 하나이다. 사례기반추론 시스템의 예측 성능은 예측에 사용되는 최근접이웃(Nearest Neighbor)을 어떻게 설정하느냐에 따라 영향을 받게 된다. 따라서 최근접 이웃을 결정짓는 k 값의 설정은 성공적인 사례기반추론 시스템을 구축하기 위한 중요 요인 중 하나가 된다. 최근접 이웃의 설정에 있어서 대부분의 선행 연구들은 고정된 k 값을 사용하는 사례기반추론 시스템은 k 값을 크게 설정할 경우 최근접 이웃 안에 주어진 오류를 일으킬 수 있으며, k 값이 작게 설정된 경우에는 유사 사례 중 일부만을 예측에 사용하기 때문에 예측 결과의 왜곡을 초래할 수 있다. 본 이웃을 결정함에 있어서 Similarity Threshold를 이용하는 s-NN 방법을 제안하였다. 본 연구의 실험을 위해 UCI(University of california, Irvine) Machine Learning Repository에서 제공하는 두 개의 신용 데이터 셋을 사용하였으며, 실험 결과 s-NN 적용한 CBR 모델이 고정된 k 값을 적용한 전통적인 CBR 모델보다 더 우수한 성능을 보여주었다.

  • PDF