• 제목/요약/키워드: k-NN classifier

Search Result 101, Processing Time 0.027 seconds

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB
    • /
    • v.14B no.4
    • /
    • pp.287-294
    • /
    • 2007
  • Many classification algorithms for real world data suffer from a data class imbalance problem. To solve this problem, various methods have been proposed such as altering the training balance and designing better sampling strategies. The previous methods are not satisfy in the distribution of the input data and the constraint. In this paper, we propose a focused sampling method which is more superior than previous methods. To solve the problem, we must select some useful data set from all training sets. To get useful data set, the proposed method devide the region according to scores which are computed based on the distribution of SOM over the input data. The scores are sorted in ascending order. They represent the distribution or the input data, which may in turn represent the characteristics or the whole data. A new training dataset is obtained by eliminating unuseful data which are located in the region between an upper bound and a lower bound. The proposed method gives a better or at least similar performance compare to classification accuracy of previous approaches. Besides, it also gives several benefits : ratio reduction of class imbalance; size reduction of training sets; prevention of over-fitting. The proposed method has been tested with kNN classifier. An experimental result in ecoli data set shows that this method achieves the precision up to 2.27 times than the other methods.

Hand Gesture Recognition Regardless of Sensor Misplacement for Circular EMG Sensor Array System (원형 근전도 센서 어레이 시스템의 센서 틀어짐에 강인한 손 제스쳐 인식)

  • Joo, SeongSoo;Park, HoonKi;Kim, InYoung;Lee, JongShill
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.11 no.4
    • /
    • pp.371-376
    • /
    • 2017
  • In this paper, we propose an algorithm that can recognize the pattern regardless of the sensor position when performing EMG pattern recognition using circular EMG system equipment. Fourteen features were extracted by using the data obtained by measuring the eight channel EMG signals of six motions for 1 second. In addition, 112 features extracted from 8 channels were analyzed to perform principal component analysis, and only the data with high influence was cut out to 8 input signals. All experiments were performed using k-NN classifier and data was verified using 5-fold cross validation. When learning data in machine learning, the results vary greatly depending on what data is learned. EMG Accuracy of 99.3% was confirmed when using the learning data used in the previous studies. However, even if the position of the sensor was changed by only 22.5 degrees, it was clearly dropped to 67.28% accuracy. The accuracy of the proposed method is 98% and the accuracy of the proposed method is about 98% even if the sensor position is changed. Using these results, it is expected that the convenience of the users using the circular EMG system can be greatly increased.

Anomalous Trajectory Detection in Surveillance Systems Using Pedestrian and Surrounding Information

  • Doan, Trung Nghia;Kim, Sunwoong;Vo, Le Cuong;Lee, Hyuk-Jae
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.5 no.4
    • /
    • pp.256-266
    • /
    • 2016
  • Concurrently detected and annotated abnormal events can have a significant impact on surveillance systems. By considering the specific domain of pedestrian trajectories, this paper presents two main contributions. First, as introduced in much of the work on trajectory-based anomaly detection in the literature, only information about pedestrian paths, such as direction and speed, is considered. Differing from previous work, this paper proposes a framework that deals with additional types of trajectory-based anomalies. These abnormal events take places when a person enters prohibited areas. Those restricted regions are constructed by an online learning algorithm that uses surrounding information, including detected pedestrians and background scenes. Second, a simple data-boosting technique is introduced to overcome a lack of training data; such a problem particularly challenges all previous work, owing to the significantly low frequency of abnormal events. This technique only requires normal trajectories and fundamental information about scenes to increase the amount of training data for both normal and abnormal trajectories. With the increased amount of training data, the conventional abnormal trajectory classifier is able to achieve better prediction accuracy without falling into the over-fitting problem caused by complex learning models. Finally, the proposed framework (which annotates tracks that enter prohibited areas) and a conventional abnormal trajectory detector (using the data-boosting technique) are integrated to form a united detector. Such a detector deals with different types of anomalous trajectories in a hierarchical order. The experimental results show that all proposed detectors can effectively detect anomalous trajectories in the test phase.

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility
    • /
    • v.13 no.1
    • /
    • pp.47-60
    • /
    • 2010
  • Most of the researches about classification usually have used kNN(k-Nearest Neighbor), SVM(Support Vector Machine), which are known as learn-based model, and Bayesian classifier, NNA(Neural Network Algorithm), which are known as statistics-based methods. However, there are some limitations of space and time when classifying so many web pages in recent internet. Moreover, most studies of classification are using uni-gram feature representation which is not good to represent real meaning of words. In case of Korean web page classification, there are some problems because of korean words property that the words have multiple meanings(polysemy). For these reasons, LSA(Latent Semantic Analysis) is proposed to classify well in these environment(large data set and words' polysemy). LSA uses SVD(Singular Value Decomposition) which decomposes the original term-document matrix to three different matrices and reduces their dimension. From this SVD's work, it is possible to create new low-level semantic space for representing vectors, which can make classification efficient and analyze latent meaning of words or document(or web pages). Although LSA is good at classification, it has some drawbacks in classification. As SVD reduces dimensions of matrix and creates new semantic space, it doesn't consider which dimensions discriminate vectors well but it does consider which dimensions represent vectors well. It is a reason why LSA doesn't improve performance of classification as expectation. In this paper, we propose new LSA which selects optimal dimensions to discriminate and represent vectors well as minimizing drawbacks and improving performance. This method that we propose shows better and more stable performance than other LSAs' in low-dimension space. In addition, we derive more improvement in classification as creating and selecting features by reducing stopwords and weighting specific values to them statistically.

  • PDF

Fault Diagnosis for the Nuclear PWR Steam Generator Using Neural Network (신경회로망을 이용한 원전 PWR 증기발생기의 고장진단)

  • Lee, In-Soo;Yoo, Chul-Jong;Kim, Kyung-Youn
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.6
    • /
    • pp.673-681
    • /
    • 2005
  • As it is the most important to make sure security and reliability for nuclear Power Plant, it's considered the most crucial issues to develop a fault detective and diagnostic system in spite of multiple hardware redundancy in itself. To develop an algorithm for a fault diagnosis in the nuclear PWR steam generator, this paper proposes a method based on ART2(adaptive resonance theory 2) neural network that senses and classifies troubles occurred in the system. The fault diagnosis system consists of fault detective part to sense occurred troubles, parameter estimation part to identify changed system parameters and fault classification part to understand types of troubles occurred. The fault classification part Is composed of a fault classifier that uses ART2 neural network. The Performance of the proposed fault diagnosis a18orithm was corroborated by applying in the steam generator.

Artificial Neural Network for Prediction of Distant Metastasis in Colorectal Cancer

  • Biglarian, Akbar;Bakhshi, Enayatollah;Gohari, Mahmood Reza;Khodabakhshi, Reza
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.13 no.3
    • /
    • pp.927-930
    • /
    • 2012
  • Background and Objectives: Artificial neural networks (ANNs) are flexible and nonlinear models which can be used by clinical oncologists in medical research as decision making tools. This study aimed to predict distant metastasis (DM) of colorectal cancer (CRC) patients using an ANN model. Methods: The data of this study were gathered from 1219 registered CRC patients at the Research Center for Gastroenterology and Liver Disease of Shahid Beheshti University of Medical Sciences, Tehran, Iran (January 2002 and October 2007). For prediction of DM in CRC patients, neural network (NN) and logistic regression (LR) models were used. Then, the concordance index (C index) and the area under receiver operating characteristic curve (AUROC) were used for comparison of neural network and logistic regression models. Data analysis was performed with R 2.14.1 software. Results: The C indices of ANN and LR models for colon cancer data were calculated to be 0.812 and 0.779, respectively. Based on testing dataset, the AUROC for ANN and LR models were 0.82 and 0.77, respectively. This means that the accuracy of ANN prediction was better than for LR prediction. Conclusion: The ANN model is a suitable method for predicting DM and in that case is suggested as a good classifier that usefulness to treatment goals.

Automatic Facial Expression Recognition using Tree Structures for Human Computer Interaction (HCI를 위한 트리 구조 기반의 자동 얼굴 표정 인식)

  • Shin, Yun-Hee;Ju, Jin-Sun;Kim, Eun-Yi;Kurata, Takeshi;Jain, Anil K.;Park, Se-Hyun;Jung, Kee-Chul
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.12 no.3
    • /
    • pp.60-68
    • /
    • 2007
  • In this paper, we propose an automatic facial expressions recognition system to analyze facial expressions (happiness, disgust, surprise and neutral) using tree structures based on heuristic rules. The facial region is first obtained using skin-color model and connected-component analysis (CCs). Thereafter the origins of user's eyes are localized using neural network (NN)-based texture classifier, then the facial features using some heuristics are localized. After detection of facial features, the facial expression recognition are performed using decision tree. To assess the validity of the proposed system, we tested the proposed system using 180 facial image in the MMI, JAFFE, VAK DB. The results show that our system have the accuracy of 93%.

  • PDF

A Personalized Retrieval System Based on Classification and User Query (분류와 사용자 질의어 정보에 기반한 개인화 검색 시스템)

  • Kim, Kwang-Young;Shim, Kang-Seop;Kwak, Seung-Jin
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.43 no.3
    • /
    • pp.163-180
    • /
    • 2009
  • In this paper, we describe a developmental system for establishing personal information tendency based on user queries. For each query, the system classified it based on the category information using a kNN classifier. As category information, we used DDC field which is already assigned to each record in the database. The system accumulates category information for all user queries and the user's personalized feature for the target database. We then developed a personalized retrieval system reflecting the personalized feature to produce search result. Our system re-ranks the result documents by adding more weights to the documents for which categories match with the user's personalized feature. By using user's tendency information, the ambiguity problem of the word could be solved. In this paper, we conducted experiments for personalized search and word sense disambiguation (WSD) on a collection of Korean journal articles of science and technology arena. Our experimental result and user's evaluation show that the performance of the personalized search system and WSD is proved to be useful for actual field services.

Application and Performance Analysis of Machine Learning for GPS Jamming Detection (GPS 재밍탐지를 위한 기계학습 적용 및 성능 분석)

  • Jeong, Inhwan
    • The Journal of Korean Institute of Information Technology
    • /
    • v.17 no.5
    • /
    • pp.47-55
    • /
    • 2019
  • As the damage caused by GPS jamming has been increased, researches for detecting and preventing GPS jamming is being actively studied. This paper deals with a GPS jamming detection method using multiple GPS receiving channels and three-types machine learning techniques. Proposed multiple GPS channels consist of commercial GPS receiver with no anti-jamming function, receiver with just anti-noise jamming function and receiver with anti-noise and anti-spoofing jamming function. This system enables user to identify the characteristics of the jamming signals by comparing the coordinates received at each receiver. In this paper, The five types of jamming signals with different signal characteristics were entered to the system and three kinds of machine learning methods(AB: Adaptive Boosting, SVM: Support Vector Machine, DT: Decision Tree) were applied to perform jamming detection test. The results showed that the DT technique has the best performance with a detection rate of 96.9% when the single machine learning technique was applied. And it is confirmed that DT technique is more effective for GPS jamming detection than the binary classifier techniques because it has low ambiguity and simple hardware. It was also confirmed that SVM could be used only if additional solutions to ambiguity problem are applied.

Welfare Interface using Multiple Facial Features Tracking (다중 얼굴 특징 추적을 이용한 복지형 인터페이스)

  • Ju, Jin-Sun;Shin, Yun-Hee;Kim, Eun-Yi
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.1
    • /
    • pp.75-83
    • /
    • 2008
  • We propose a welfare interface using multiple fecial features tracking, which can efficiently implement various mouse operations. The proposed system consist of five modules: face detection, eye detection, mouth detection, facial feature tracking, and mouse control. The facial region is first obtained using skin-color model and connected-component analysis(CCs). Thereafter the eye regions are localized using neutral network(NN)-based texture classifier that discriminates the facial region into eye class and non-eye class, and then mouth region is localized using edge detector. Once eye and mouth regions are localized they are continuously and correctly tracking by mean-shift algorithm and template matching, respectively. Based on the tracking results, mouse operations such as movement or click are implemented. To assess the validity of the proposed system, it was applied to the interface system for web browser and was tested on a group of 25 users. The results show that our system have the accuracy of 99% and process more than 21 frame/sec on PC for the $320{\times}240$ size input image, as such it can supply a user-friendly and convenient access to a computer in real-time operation.