• Title/Summary/Keyword: kNN classifier


Early Multiple Fault Identification of Low-Speed Rolling Element Bearings (저속 구름 베어링의 다중 결함 조기 검출)

  • Kang, Hyunjun;Jeong, In-Kyu;Kang, Myeongsu;Kim, Jong-Myon
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.749-752 / 2014
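  • This paper proposes a fault diagnosis scheme for the early detection of multiple faults in low-speed rolling element bearings. The scheme has three stages: fault feature extraction, effective feature selection, and fault classification using the selected features. In the first stage, the discrete wavelet transform is used to extract statistical fault features from the detail components; a distance evaluation technique (DET) then selects, from the extracted features, those that are effective for detecting multiple bearing faults. Finally, the selected features are fed to a k-NN (k-Nearest Neighbors) classifier to diagnose the faults. Evaluated in terms of classification accuracy, the proposed scheme achieved a high accuracy of 95.14%.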
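
The feature extraction stage described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the wavelet or the statistics used, so the single-level Haar transform and the RMS and kurtosis features below are assumptions, as are the function names.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (assumed wavelet).

    Returns (approximation, detail); a trailing odd sample is dropped.
    """
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def statistical_features(coeffs):
    """RMS and kurtosis of a coefficient sequence (illustrative feature choices)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    rms = math.sqrt(sum(c * c for c in coeffs) / n)
    # Kurtosis is the fourth central moment divided by the squared variance.
    kurt = (sum((c - mean) ** 4 for c in coeffs) / n) / (var ** 2) if var > 0 else 0.0
    return {"rms": rms, "kurtosis": kurt}
```

Features computed from the detail coefficients of each vibration signal would then go through feature selection and into the k-NN classifier.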

An Experimental Study on Feature Selection Using Wikipedia for Text Categorization (위키피디아를 이용한 분류자질 선정에 관한 연구)

  • Kim, Yong-Hwan;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.29 no.2 / pp.155-171 / 2012
  • In text categorization, core terms of an input document are rarely selected as classification features if they do not occur in the training document set. In addition, synonymous terms that express the same concept are usually treated as different features. This study aims to improve text categorization performance by integrating synonyms into a single feature and by using Wikipedia to replace input terms that are not in the training document set with the most similar term that does occur in the training documents. For the selection of classification features, experiments were performed in various settings composed of three conditions: the use of category information for non-training terms, the part of Wikipedia used for measuring term-term similarity, and the type of similarity measure. The categorization performance of a kNN classifier improved by 0.35~1.85% in $F_1$ value in all the experimental settings when non-training terms were replaced by the training term with the highest similarity above the threshold value. Although the improvement is not as high as expected, several semantic and structural devices of Wikipedia could be used to select more effective classification features.
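
The replacement step in this abstract can be sketched as below. The similarity table is a hypothetical stand-in for the Wikipedia-derived term-term similarity, and the function name, the default threshold, and the choice to drop terms with no match above the threshold are all assumptions made for illustration.

```python
def map_to_training_terms(doc_terms, training_vocab, similarity, threshold=0.5):
    """Replace terms unseen in training with their most similar training term.

    similarity: dict mapping (unseen_term, training_term) -> score; a
    hypothetical stand-in for a Wikipedia-based similarity measure.
    Unseen terms with no candidate above the threshold are dropped (assumption).
    """
    mapped = []
    for term in doc_terms:
        if term in training_vocab:
            mapped.append(term)
            continue
        best_term, best_score = None, threshold
        # Sorted iteration keeps tie-breaking deterministic.
        for candidate in sorted(training_vocab):
            score = similarity.get((term, candidate), 0.0)
            if score > best_score:
                best_term, best_score = candidate, score
        if best_term is not None:
            mapped.append(best_term)
    return mapped
```

The mapped term list can then be weighted and passed to the kNN classifier in the usual way.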

Threatening privacy by identifying appliances and the pattern of the usage from electric signal data (스마트 기기 환경에서 전력 신호 분석을 통한 프라이버시 침해 위협)

  • Cho, Jae yeon;Yoon, Ji Won
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.5 / pp.1001-1009 / 2015
  • In a smart grid, a smart meter sends a household's electric signal data to the power supplier's main server in real time. However, the more efficient the management of power loads becomes, the more likely the user's usage patterns are to leak. This paper points out the privacy threat and the need for security measures in a smart device environment by showing that it is possible to identify appliances and users' specific usage patterns from smart meter data. PCA is used to reduce the dimension of the feature space, and a k-NN classifier is used to infer the appliances and their states. Accuracy is validated with 10-fold cross-validation.
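
The classification and validation steps named above can be sketched as follows. This is a plain k-NN with Euclidean distance and a deterministic fold assignment, both assumptions; the PCA step is omitted, and the inputs are taken to be already reduced feature vectors.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to query (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def cross_validated_accuracy(xs, ys, k=3, folds=10):
    """Accuracy pooled over all folds; sample i is held out in fold i % folds."""
    correct = 0
    for fold in range(folds):
        train_idx = [i for i in range(len(xs)) if i % folds != fold]
        test_idx = [i for i in range(len(xs)) if i % folds == fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
    return correct / len(xs)
```

In practice the samples would usually be shuffled before folds are assigned, so that each fold holds a representative mix of classes.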

Film Line Scratch Detection using a Neural Network based Texture Classifier (신경망 기반의 텍스처 분류기를 이용한 스크래치 검출)

  • Kim, Kyung-Tai;Kim, Eun-Yi
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.6 s.312 / pp.26-33 / 2006
  • Film restoration detects the location and extent of defective regions in a given movie film and, where defects are present, reconstructs the lost information in each region. It has gained increasing attention from researchers as a way to support high-quality multimedia services. In general, old film is degraded by dust, scratches, flicker, and so on; among these, scratches are the most frequent form of degradation. Techniques for scratch restoration have been developed, but they have limited applicability across all kinds of scratches. To fully support automatic scratch restoration, a system is needed that can detect all kinds of scratches in a given frame of old film. This paper presents a neural network (NN)-based texture classifier that automatically detects all kinds of scratches in frames of old films. To facilitate the detection of scratches of various sizes, we use a pyramid of images generated from the original frames at three resolution levels. The image at each level is scanned by the NN-based classifier, which divides the input image into scratch regions and non-scratch regions. To reduce the computational cost, the NN-based classifier is applied only to edge pixels. To assess the validity of the proposed method, experiments were performed on old films and animations with all kinds of scratches, and the results show the effectiveness of the proposed method.
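
A three-level image pyramid of the kind mentioned above might be built as in the sketch below. The abstract does not say how the lower resolutions are produced, so halving each dimension by averaging 2x2 blocks is an assumption, and the function names are illustrative.

```python
def downsample(image):
    """Halve width and height by averaging each 2x2 block (odd edges dropped)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def build_pyramid(image, levels=3):
    """Return [original, half-size, quarter-size, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

A classifier with a fixed input window scanned over every level then responds to scratches of different widths without changing the window size.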

Learning Bayesian Networks for Text Documents Classification (텍스트 문서 분류를 위한 베이지안망 학습)

  • 황규백;장병탁;김영택
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.262-264 / 2000
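  • Text document classification is the task of sorting documents given in text form by category, and it is a foundational task that can be applied to areas such as web page retrieval, newsgroup retrieval, and mail filtering. Various machine learning techniques, including k-NN and neural networks, have been used to classify documents. In this paper, we classify text documents using Bayesian networks. A Bayesian network is a graphical model that represents probabilistic relationships among many variables; it consists of a network structure in the form of a DAG and a local probability distribution associated with each node. An advantage of using a graphical model is that the relationships among the attributes used for learning can be learned in a form that is easy for people to read. The Reuters-21578 document classification data set was used for the experiments, and the performance of the Bayesian network was similar to that of a naive Bayes classifier.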


A simulation on fall detection system for the elders (노인의 낙상 검출 시스템에 관한 연구)

  • Kim, Dong-Wan;Ryu, Jong-Hyun;Beack, Seung-Hwa
    • Journal of IKEEE / v.17 no.1 / pp.22-28 / 2013
  • According to a survey, more than 50% of falls among the elderly, the most frequent daily safety accident for this group, take place at home. Furthermore, falls among the elderly are expected to increase as more elderly people live alone, since 67.1% of those aged 65 or over do not wish to live with their children. This research aims to verify falls by measuring and analyzing floor vibration; the hardware system was designed with a piezo film sensor, an op-amp, and a DAQ. The system consists of a signal processing part for measuring floor vibration and an alarm part for checking the consciousness of the user when a fall occurs. Fall detection from the vibration signals was verified using the k-Nearest Neighbor method, and the results showed an error rate of 3.8%.

Malware Detection Method using Opcode and Windows API Calls (Opcode와 Windows API를 사용한 멀웨어 탐지)

  • Ahn, Tae-Hyun;Oh, Sang-Jin;Kwon, Young-Man
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.6 / pp.11-17 / 2017
  • We propose a malware detection method that uses a feature vector consisting of opcodes (operation codes) and Windows API calls extracted from executable files. We implemented the feature vector and measured its performance using Bernoulli Naïve Bayes and K-Nearest Neighbor classifiers. In the experiments, the K-NN classifier with the proposed method achieved 95.21% malware detection accuracy, which is better than existing methods that use only opcodes or only Windows API calls.

A preliminary Study on Text Categorization of Book using Table of Contents and Book Description (목차, 책 소개를 이용한 단행본 문서 범주화에 관한 기초연구)

  • Do, Hyun-Ho;Lee, Yong-Gu
    • Proceedings of the Korean Society for Information Management Conference / 2014.08a / pp.127-130 / 2014
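  • This study examined whether automatic classification can be applied to monographs, which make up the main part of library collections. The metadata fields title, table of contents, and book description were used as classification features, and the performance of a kNN classifier was measured on 581 monographs under various feature weighting schemes. The experiments showed that using all of these metadata fields together gave the best classification performance and that, although the test collection is limited by its small size, the weighting method based on log TF performed well.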
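
Log TF weighting can be illustrated with the short sketch below. The abstract does not give the exact formula, so the common variant 1 + ln(tf) is assumed here, and the function name is illustrative.

```python
import math
from collections import Counter

def log_tf_weights(tokens):
    """Map each term to 1 + ln(tf), where tf is its raw count in tokens.

    The 1 + ln(tf) form is an assumed variant of log TF weighting.
    """
    counts = Counter(tokens)
    return {term: 1.0 + math.log(tf) for term, tf in counts.items()}
```

Because the logarithm grows slowly, a term repeated many times in a long table of contents does not dominate terms that appear once in the title.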


Classification Protein Subcellular Locations Using n-Gram Features (단백질 서열의 n-Gram 자질을 이용한 세포내 위치 예측)

  • Kim, Jinsuk
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.12-16 / 2007
  • The function of a protein is closely correlated with its subcellular location(s). Given a protein sequence, therefore, determining its subcellular location is a vitally important problem. We have developed a new prediction method for protein subcellular location(s) based on n-gram feature extraction and the k-nearest neighbor (kNN) classification algorithm. It assigns a protein sequence to one or more subcellular compartments based on the locations of the top k sequences that show the highest similarity weights against the input sequence. The similarity weight is a similarity measure determined by comparing n-gram features between two sequences. Currently our method extracts penta-grams as features of protein sequences, computes scores for the potential localization site(s) using the kNN algorithm, and finally presents the locations and their associated scores. We constructed a large-scale data set of protein sequences with known subcellular locations from the SWISS-PROT database. This data set contains 51,885 entries with one or more known subcellular locations. Our method shows a very high prediction precision of about 93% on this data set and, compared with other methods, it also showed a comparable prediction improvement on a test collection used in a previous work.
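
The scoring scheme in this abstract can be sketched as below. The abstract does not define the similarity weight, so Jaccard overlap of n-gram sets is used as a stand-in, and summing neighbor similarities per location is likewise an assumption; all function names are illustrative.

```python
from collections import defaultdict

def ngrams(sequence, n=5):
    """Set of all length-n substrings (penta-grams by default)."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}

def similarity(a, b, n=5):
    """Jaccard overlap of n-gram sets; a stand-in for the paper's similarity weight."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0

def score_locations(query, labeled, k=3, n=5):
    """Sum the similarity of the top-k neighbors into each of their locations.

    labeled: list of (sequence, [locations]) pairs with known annotations.
    Returns (location, score) pairs sorted by descending score.
    """
    ranked = sorted(labeled, key=lambda item: similarity(query, item[0], n), reverse=True)
    scores = defaultdict(float)
    for seq, locations in ranked[:k]:
        weight = similarity(query, seq, n)
        for loc in locations:
            scores[loc] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Since each neighbor can carry several locations, the returned list naturally supports sequences that belong to more than one compartment.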


A Method of Highspeed Similarity Retrieval based on Self-Organizing Maps (자기 조직화 맵 기반 유사화상 검색의 고속화 수법)

  • Oh, Kun-Seok;Yang, Sung-Ki;Bae, Sang-Hyun;Kim, Pan-Koo
    • The KIPS Transactions:PartB / v.8B no.5 / pp.515-522 / 2001
  • Feature-based similarity retrieval has become an important research issue in image database systems. The features of image data are useful for discriminating between images. In this paper, we propose a high-speed k-Nearest Neighbor search algorithm based on Self-Organizing Maps. A Self-Organizing Map (SOM) provides a mapping from high-dimensional feature vectors onto a two-dimensional space. A topological feature map preserves the mutual relations (similarity) of the input data in feature space and clusters mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a node vector and the similar images that are closest to that node vector. We implemented k-NN search for similar image classification by (1) accessing the topological feature map and (2) applying a pruning strategy for high-speed search. We evaluated the performance of our algorithm using color feature vectors extracted from images, and the experiments gave promising results.
