• Title/Summary/Keyword: k nearest neighbor method

Improvements of Multi-features Extraction for EMG for Estimating Wrist Movements (근전도 신호기반 손목 움직임의 추정을 위한 다중 특징점 추출 기법 알고리즘)

  • Kim, Seo-Jun;Jeong, Eui-Chul;Lee, Sang-Min;Song, Young-Rok
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.5 / pp.757-762 / 2012
  • In this paper, a multi-feature extraction algorithm for estimating wrist movements from electromyogram (EMG) signals is proposed. To extract precise features from the EMG signals, four features that capture their amplitude characteristics are used: the difference absolute mean value (DAMV), the mean absolute value (MAV), the root mean square (RMS), and the difference absolute standard deviation value (DASDV). Because multi-feature extraction is more precise than a single-feature method, we identify a more accurate feature set by combining two of these features. For EMG-based motion classification, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and k-nearest neighbor (k-NN) are used. We tested twenty adult males to measure the accuracy of EMG pattern classification for the wrist movements up, down, right, left, and rest. With the feature set of MAV and DASDV, the LDA, QDA, and k-NN classifiers achieved accuracies of 87.59%, 89.06%, and 91.75%, respectively.
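
The four amplitude features named in this abstract can be sketched as follows. The formulas are the definitions commonly used in the EMG literature and are an assumption here, since the abstract does not state them; `emg_features` and the sample window are hypothetical.

```python
import math

def emg_features(window):
    """Amplitude features for one EMG window (a list of samples).

    Assumed definitions (not taken from the abstract):
      MAV   = mean of |x_i|
      RMS   = sqrt(mean of x_i^2)
      DAMV  = mean of |x_{i+1} - x_i|
      DASDV = sqrt(mean of (x_{i+1} - x_i)^2)
    """
    n = len(window)
    diffs = [window[i + 1] - window[i] for i in range(n - 1)]
    mav = sum(abs(x) for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    damv = sum(abs(d) for d in diffs) / (n - 1)
    dasdv = math.sqrt(sum(d * d for d in diffs) / (n - 1))
    return {"MAV": mav, "RMS": rms, "DAMV": damv, "DASDV": dasdv}

# Hypothetical four-sample window, for illustration only.
window = [1.0, -1.0, 2.0, -2.0]
features = emg_features(window)

# A two-feature set such as (MAV, DASDV) would be the classifier input.
feature_pair = (features["MAV"], features["DASDV"])
```

In practice the features would be computed per channel over sliding windows before being passed to LDA, QDA, or k-NN.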

A Distributed High Dimensional Indexing Structure for Content-based Retrieval of Large Scale Data (대용량 데이터의 내용 기반 검색을 위한 분산 고차원 색인 구조)

  • Cho, Hyun-Hwa;Lee, Mi-Young;Kim, Young-Chang;Chang, Jae-Woo;Lee, Kyu-Chul
    • Journal of KIISE:Databases / v.37 no.5 / pp.228-237 / 2010
  • Although conventional index structures provide various nearest-neighbor search algorithms for high-dimensional data, large-scale data imposes additional requirements: better search performance and index scalability. To meet these requirements, we propose a distributed high-dimensional indexing structure for cluster systems, called the Distributed Vector Approximation-tree (DVA-tree), which is a two-level structure consisting of a hybrid spill-tree and VA-files. We also describe the algorithms used to construct the DVA-tree over multiple machines and to perform distributed k-nearest neighbor (k-NN) searches. To evaluate the performance of the DVA-tree, we conduct an experimental study using both real and synthetic datasets. The results show that the proposed method offers significant performance advantages over existing index structures on different kinds of datasets.
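
The general idea of a distributed k-NN search can be illustrated with a generic scatter-gather sketch. This is not the DVA-tree: each partition is scanned exhaustively here, whereas the paper uses a hybrid spill-tree and VA-files. The function names and data are hypothetical.

```python
import heapq
import math

def local_knn(points, query, k):
    """Return the k closest (distance, point) pairs within one partition."""
    scored = [(math.dist(p, query), p) for p in points]
    return heapq.nsmallest(k, scored)

def distributed_knn(partitions, query, k):
    """Collect each partition's local top-k, then keep the global top-k.

    The global k nearest neighbors are always contained in the union of
    the local top-k lists, so merging them gives the exact answer.
    """
    candidates = []
    for points in partitions:  # in a real cluster these calls run in parallel
        candidates.extend(local_knn(points, query, k))
    return heapq.nsmallest(k, candidates)

# Hypothetical data split over three machines.
partitions = [
    [(0.0, 0.0), (5.0, 5.0)],
    [(1.0, 1.0), (9.0, 9.0)],
    [(0.5, 0.0), (7.0, 1.0)],
]
result = distributed_knn(partitions, (0.0, 0.0), k=2)
```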

Statistical Approach to Noisy Band Removal for Enhancement of HIRIS Image Classification

  • Huan, Nguyen Van;Kim, Hak-Il
    • Proceedings of the KSRS Conference / 2008.03a / pp.195-200 / 2008
  • The accuracy of pixel classification in HIRIS images is usually degraded by noisy bands, since noisy bands may deform the typical shape of the spectral reflectance. This paper proposes a statistical method for noisy band removal that relies mainly on the correlation coefficients between bands. Treating each band as a random variable, the correlation coefficient measures the strength and direction of the linear relationship between two random variables. While the correlation between two signal bands is high, the presence of a noisy band produces a low correlation owing to its ill-correlativeness and undirectedness. The correlation coefficient is applied as a measure for detecting noisy bands within a two-pass screening scheme. The method requires no prior knowledge of the sensor or of the cause of the noise. The classification in this experiment uses the unsupervised k-nearest neighbor algorithm with the well-accepted Euclidean distance measure and the spectral angle mapper measure. This paper also proposes a hierarchical combination of these measures for spectral matching. Finally, a separability assessment based on the between-class and within-class scatter matrices is performed to evaluate the performance.
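
A minimal sketch of correlation-based noisy band detection follows. It assumes each band is compared only with its adjacent bands and uses a fixed threshold of 0.5. Both choices are assumptions, and the paper's two-pass screening scheme is not reproduced. All names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def flag_noisy_bands(bands, threshold=0.5):
    """Flag bands whose best |correlation| with an adjacent band is low."""
    noisy = []
    for i, band in enumerate(bands):
        neighbors = [bands[j] for j in (i - 1, i + 1) if 0 <= j < len(bands)]
        best = max(abs(pearson(band, nb)) for nb in neighbors)
        if best < threshold:
            noisy.append(i)
    return noisy

# Hypothetical image with 5 bands and 6 pixels per band; band 2 is noise.
bands = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],
    [2.0, 5.0, 1.0, 1.0, 5.0, 2.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    [0.5, 1.0, 1.5, 2.0, 2.5, 3.0],
]
noisy_bands = flag_noisy_bands(bands)
```

A signal band next to the noisy one is not flagged, because it still correlates well with its other neighbor.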

Simulation of 27Al MQMAS NMR Spectra of Mordenites Using Point Charge Model with First Layer Only and Multiple Layers of Atoms

  • Chae, Seen-Ae;Han, Oc-Hee;Lee, Sang-Yeon
    • Bulletin of the Korean Chemical Society / v.28 no.11 / pp.2069-2074 / 2007
  • The 27Al multiple quantum magic angle spinning (MQMAS) nuclear magnetic resonance (NMR) spectra of mordenite zeolites were simulated using the point charge model (PCM). The spectra simulated by the PCM considering nearest neighbor atoms only (PCM-n) or including atoms up to the 3rd layer (PCM-m) were not different from those generated by the Hartree-Fock (HF) molecular orbital calculation method. In contrast to the HF and density functional theory methods, the PCM is simple and convenient to use and requires neither sophisticated, expensive computer programs nor specialists to run them. Thus, our results indicate that simulating 27Al MQMAS NMR spectra with the PCM-n is useful, despite its simplicity, especially for porous samples such as zeolites with large unit cells and a high volume density of pores. However, it should be pointed out that this conclusion might apply only to atomic sites with small quadrupole coupling constants.

Improving minority prediction performance of support vector machine for imbalanced text data via feature selection and SMOTE (단어선택과 SMOTE 알고리즘을 이용한 불균형 텍스트 데이터의 소수 범주 예측성능 향상 기법)

  • Jongchan Kim;Seong Jun Chang;Won Son
    • The Korean Journal of Applied Statistics / v.37 no.4 / pp.395-410 / 2024
  • Text data is usually made up of a wide variety of unique words. Even in standard text data, it is common to find tens of thousands of different words. In text data analysis, usually, each unique word is treated as a variable. Thus, text data can be regarded as a dataset with a large number of variables. On the other hand, in text data classification, we often encounter class label imbalance problems. In the cases of substantial imbalances, the performance of conventional classification models can be severely degraded. To improve the classification performance of support vector machines (SVM) for imbalanced data, algorithms such as the Synthetic Minority Over-sampling Technique (SMOTE) can be used. The SMOTE algorithm synthetically generates new observations for the minority class based on the k-Nearest Neighbors (kNN) algorithm. However, in datasets with a large number of variables, such as text data, errors may accumulate. This can potentially impact the performance of the kNN algorithm. In this study, we propose a method for enhancing prediction performance for the minority class of imbalanced text data. Our approach involves employing variable selection to generate new synthetic observations in a reduced space, thereby improving the overall classification performance of SVM.
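
The core SMOTE step described in this abstract, interpolating between a minority observation and one of its k nearest minority neighbors, can be sketched as below. This is a simplified illustration rather than the authors' implementation. The function name, the Euclidean distance, and the toy data are assumptions.

```python
import math
import random

def smote(minority, k, n_new, seed=0):
    """Generate n_new synthetic minority points by kNN interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen base point
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbor = rng.choice(neighbors)
        # new point lies on the segment between base and neighbor
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (nb - b) for b, nb in zip(base, neighbor))
        )
    return synthetic

# Hypothetical minority-class observations in a reduced 2-variable space.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, k=2, n_new=5)
```

Because the neighbor search depends on distances, running it in a space reduced by variable selection, as the abstract proposes, changes which neighbors are chosen and therefore which synthetic points are produced.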

Classification of Fall Direction Before Impact Using Machine Learning Based on IMU Raw Signals (IMU 원신호 기반의 기계학습을 통한 충격전 낙상방향 분류)

  • Lee, Hyeon Bin;Lee, Chang June;Lee, Jung Keun
    • Journal of Sensor Science and Technology / v.31 no.2 / pp.96-101 / 2022
  • As the elderly population gradually increases, the risk of fatal fall accidents among the elderly is also increasing. One way to cope with a fall accident is to determine the fall direction before impact using a wearable inertial measurement unit (IMU). In this context, a previous study proposed a method of classifying fall directions using a support vector machine with sensor velocity, acceleration, and tilt angle as input parameters. However, in that method the IMU signals pass through several processing steps, including a Kalman filter and the integration of acceleration, which involve a large amount of computation and introduce error factors. Therefore, this paper proposes a machine learning-based method that classifies the fall direction before impact using IMU raw signals rather than processed data. In this study, we investigated the effects of two factors on the classification performance: (1) the use of processed versus raw signals and (2) the choice of machine learning technique. First, in the comparison of processed and raw signals, the difference in sensitivity between the two was within 5%, indicating an equivalent level of classification performance. Second, in the comparison of six machine learning techniques, k-nearest neighbor and naive Bayes exhibited excellent performance, with sensitivities of 86.0% and 84.1%, respectively.

Robust Object Tracking based on Kernelized Correlation Filter with multiple scale scheme (다중 스케일 커널화 상관 필터를 이용한 견실한 객체 추적)

  • Yoon, Jun Han;Kim, Jin Heon
    • Journal of IKEEE / v.22 no.3 / pp.810-815 / 2018
  • The kernelized correlation filter algorithm has yielded meaningful accuracy in object tracking. However, because it uses a fixed-size template, it cannot cope with scale changes of the tracked object. In this paper, we propose a method that tracks objects by finding the best scale for each frame from the correlation filter response values at multiple scales, using nearest neighbor interpolation and Gaussian normalization. The candidate scale values for the next frame are updated using the optimal scale value of the previous frame, and the optimal scale value for the next frame is then searched again. For the accuracy comparison, the proposed method is validated on the VOT2014 data used with the existing kernelized correlation filter algorithm.
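
Nearest neighbor interpolation, which the abstract uses to resample at multiple scales, can be sketched as follows. Only the interpolation step is shown; the correlation filter and Gaussian normalization are omitted. The floor-based index mapping and the function name are assumptions.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D list of pixels by nearest neighbor interpolation.

    Each output pixel copies the source pixel at the floor-scaled index,
    so no new intensity values are created.
    """
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

# Hypothetical 2x2 patch scaled up by a factor of 2.
patch = [[1, 2],
         [3, 4]]
scaled = resize_nearest(patch, 4, 4)
```

In a multi-scale tracker, a patch cropped at each candidate scale would be resized back to the fixed template size in this way before the filter response is computed.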

Selection of Signal Strength and Detection Threshold for Optimal Tracking with Nearest Neighbor Filter (NN 필터 추적을 위한 최적 신호 강도 및 검출 문턱값 선택)

  • Jeong, Yeong-Heon;Gwon, Il-Hwan;Hong, Sun-Mok
    • Journal of the Institute of Electronics Engineers of Korea SC / v.37 no.3 / pp.1-8 / 2000
  • In this paper, we formulate an optimal control problem to obtain the optimal signal strength and detection threshold for tracking with the nearest neighbor (NN) filter. First, we predict the tracking performance of the NN filter using the HYCA method. Based on this method, the predicted tracking performance is expressed as a function of signal strength and detection threshold. Using this relation, we find the optimal parameters for the following three examples: 1) the sequence of optimal detection thresholds that minimizes the sum of position estimation errors; 2) the sequence of optimal detection thresholds that minimizes the sum of validation gate volumes; and 3) the sequence of optimal signal strengths and detection thresholds that minimizes the sum of signal strengths.
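
The data association step of an NN filter, picking the measurement closest to the predicted position inside a validation gate, can be sketched as below. A circular Euclidean gate is assumed for simplicity; practical trackers usually gate on a statistical distance derived from the innovation covariance. The names and values are hypothetical.

```python
import math

def nearest_neighbor_measurement(predicted, measurements, gate_radius):
    """Return the gated measurement nearest to the prediction, or None."""
    in_gate = [
        m for m in measurements if math.dist(m, predicted) <= gate_radius
    ]
    if not in_gate:
        return None  # no detection validated in this scan
    return min(in_gate, key=lambda m: math.dist(m, predicted))

# Hypothetical scan: one target-like return and two clutter points.
predicted = (0.0, 0.0)
scan = [(3.0, 4.0), (1.0, 1.0), (0.5, -0.2)]
chosen = nearest_neighbor_measurement(predicted, scan, gate_radius=2.0)
missed = nearest_neighbor_measurement(predicted, scan, gate_radius=0.1)
```

The detection threshold decides which returns enter the scan in the first place, and the gate size decides which of them are candidates. This is why the abstract treats both as quantities to optimize.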

Identification of Differentially Expressed Genes Using Tests Based on Multiple Imputations

  • Kim, Sang Cheol;Yu, Donghyeon
    • Quantitative Bio-Science / v.36 no.1 / pp.23-31 / 2017
  • Datasets from DNA microarray experiments, which take the form of large matrices of gene expression levels, often have missing values. However, existing statistical methods, including principal component analysis (PCA) and Hotelling's t-test, are not directly applicable to datasets with missing values because they generally assume that the observed dataset is complete. Many methods have been proposed in the literature to impute the missing values in the observed data. Troyanskaya et al. [1] study k-nearest neighbor (kNN) imputation, Kim et al. [2] propose the local least squares (LLS) method, and Rubin [3] proposes multiple imputation (MI) for missing values. To identify differentially expressed genes, we propose a new testing procedure for observed data that contain missing values. The proposed procedure uses Stouffer's z-scores and combines the test results of the individual imputed samples, which are dependent on each other. We show numerically, by comparing ROC curves, that the proposed test procedure based on MI performs better than existing test procedures based on single imputation (SI). We apply the proposed method to the analysis of a public microarray dataset.
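
As an illustration of the kNN imputation mentioned above, the sketch below fills each missing entry with the unweighted mean of the k most similar rows (genes) that have that entry observed. The unweighted mean and the distance over shared observed columns are simplifying assumptions, not the cited method's exact definition. The names and the toy matrix are hypothetical.

```python
import math

def knn_impute(matrix, k):
    """Fill None entries using the mean of the k most similar rows."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_index, other in enumerate(matrix):
                if other_index == i or other[j] is None:
                    continue
                # compare rows only on columns observed in both
                shared = [
                    (a, b) for a, b in zip(row, other)
                    if a is not None and b is not None
                ]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            nearest = sorted(candidates)[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical expression matrix: rows are genes, columns are samples.
genes = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [5.0, 6.0, 9.0],
]
imputed = knn_impute(genes, k=2)
```

This yields a single imputed dataset. Multiple imputation, as used in the paper, would instead produce several imputed datasets whose test results are then combined.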

Development of methodology for daily rainfall simulation considering distribution of rainfall events in each duration (강우사상의 지속기간별 분포 특성을 고려한 일강우 모의 기법 개발)

  • Jung, Jaewon;Kim, Soojun;Kim, Hung Soo
    • Journal of Korea Water Resources Association / v.52 no.2 / pp.141-148 / 2019
  • When daily rainfall is simulated with an existing Markov chain model, it is common to simulate rainfall occurrence first and then draw the rainfall amount at random, using Monte Carlo simulation, from a distribution that resembles the observed daily rainfall distribution. A limitation of this approach is that the characteristics of rainfall intensity and its temporal distribution according to rainfall duration are not reflected in the results. In this study, rainfall events are classified into 1-day, 2-day, 3-day, and 4-day events, and the rainfall amount is estimated for each duration. Specifically, the distribution of the total rainfall amount per event is set for each duration using kernel density estimation (KDE), and the daily rainfall on each day is estimated from the distribution for that duration. The total rainfall amount determined for each event is divided into daily amounts using the k-nearest neighbor (KNN) algorithm, following the daily distribution pattern of the observed rainfall event whose total amount is most similar. This study aims to overcome the limitation of the existing rainfall estimation method, and the results are expected to be useful for future rainfall estimation and as primary data in water resource design.
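
The disaggregation step described above can be sketched as follows: a simulated event total is split into daily amounts using the daily proportions of the observed event(s) with the most similar total. The KDE sampling of the total is not shown. Averaging the patterns when k > 1, the function name, and the data are assumptions.

```python
def disaggregate(total, observed_events, k=1):
    """Split an event total into daily amounts via k nearest observed events.

    observed_events: list of tuples of daily rainfall, all with the same
    duration as the simulated event.
    """
    # k observed events whose totals are closest to the simulated total
    ranked = sorted(observed_events, key=lambda ev: abs(sum(ev) - total))[:k]
    n_days = len(ranked[0])
    # average daily share of the event total across the selected events
    pattern = [
        sum(ev[d] / sum(ev) for ev in ranked) / len(ranked)
        for d in range(n_days)
    ]
    return [total * share for share in pattern]

# Hypothetical observed 2-day events (mm per day) and a simulated total.
observed = [(10.0, 30.0), (40.0, 40.0), (5.0, 5.0)]
daily = disaggregate(50.0, observed, k=1)
```

Here the nearest observed event is (10, 30), so the simulated total inherits its 25% / 75% split across the two days.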