• Title/Summary/Keyword: feature reduction


Finding the best suited autoencoder for reducing model complexity

  • Ngoc, Kien Mai;Hwang, Myunggwon
    • Smart Media Journal
    • /
    • v.10 no.3
    • /
    • pp.9-22
    • /
    • 2021
  • Basically, machine learning models use input data to produce results. Sometimes the input data is too complicated for the models to learn useful patterns, so feature engineering is a crucial data preprocessing step for constructing a proper feature set and improving the performance of such models. One of the most efficient methods for automating feature engineering is the autoencoder, which transforms data from its original space into a latent space. However, certain factors, including the dataset, the machine learning model, and the number of dimensions of the latent space (denoted by k), should be carefully considered when using an autoencoder. In this study, we design a framework to compare two data preprocessing approaches, with and without an autoencoder, and to observe the impact of these factors on the autoencoder. We then conduct experiments using autoencoders with classifiers on popular datasets. The empirical results provide a perspective on the best suited autoencoder for these factors.
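As a rough illustration of the latent-space idea (not the authors' framework), the sketch below trains a minimal purely linear autoencoder with plain gradient descent on synthetic data; all names, sizes, and the choice of k are illustrative assumptions.

```python
import numpy as np

# Minimal linear autoencoder: project d-dimensional data into a k-dimensional
# latent space and back, trained by gradient descent on reconstruction error.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))               # toy dataset: 200 samples, d = 8
d, k = X.shape[1], 3                        # k = latent dimensionality

W_enc = rng.normal(scale=0.1, size=(d, k))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(k, d))  # decoder weights

def loss(X, W_enc, W_dec):
    R = X @ W_enc @ W_dec - X               # reconstruction residual
    return float((R ** 2).mean())

initial_loss = loss(X, W_enc, W_dec)
lr = 0.01
for _ in range(500):
    Z = X @ W_enc                           # latent codes
    R = Z @ W_dec - X                       # residual
    g_dec = 2 * Z.T @ R / X.shape[0]        # gradient w.r.t. decoder
    g_enc = 2 * X.T @ (R @ W_dec.T) / X.shape[0]
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

final_loss = loss(X, W_enc, W_dec)
```

A nonlinear autoencoder would add activation functions between the two maps; the comparison framework in the paper then evaluates a downstream classifier on `Z` versus on `X`.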

A Study on Human Training System for Prosthetic Arm Control (의수제어를 위한 인체학습시스템에 관한 연구)

  • 장영건;홍승홍
    • Journal of Biomedical Engineering Research
    • /
    • v.15 no.4
    • /
    • pp.465-474
    • /
    • 1994
  • This study is concerned with a method that helps humans generate EMG signals accurately and consistently, so as to produce reliable design samples for the function discriminator of a prosthetic arm controller. We aim to ensure signal accuracy and consistency by training the human as a signal generation source. For this purpose, we construct a human training system on a digital computer, which generates visual graphs comparing the real motion trajectory with the desired one and displaying the EMG signals and their features. To evaluate how the human training system affects feature variance and feature separability between motion classes, we select four features: integral absolute value, zero-crossing counts, AR coefficients, and LPC cepstrum coefficients. We performed the experiment four times over two months. The experimental results show that the human training system is effective for accurate and consistent EMG signal generation and for reducing feature variance, but shows no correlation with feature separability. Among the features used, the cepstrum coefficients are the most preferable in terms of variance reduction, class separability, and robustness to the time-varying property of EMG signals.
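Two of the four features compared above are simple to compute directly; a small sketch on a synthetic stand-in signal (the AR and LPC cepstrum coefficients, which require a model fit, are omitted):

```python
import numpy as np

# Integral absolute value (IAV) and zero-crossing count of a windowed signal,
# computed on random noise as a stand-in for a real EMG window.
rng = np.random.default_rng(1)
emg = rng.normal(size=512)

def integral_absolute_value(x):
    # sum of absolute sample values over the window
    return float(np.abs(x).sum())

def zero_crossings(x):
    # count sign changes between consecutive samples
    return int(np.sum(np.signbit(x[:-1]) != np.signbit(x[1:])))

iav = integral_absolute_value(emg)
zc = zero_crossings(emg)
```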


Relational Discriminant Analysis Using Prototype Reduction Schemes and Mahalanobis Distances (Prototype Reduction Schemes와 Mahalanobis 거리를 이용한 Relational Discriminant Analysis)

  • Kim Sang-Woon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.43 no.1 s.307
    • /
    • pp.9-16
    • /
    • 2006
  • RDA (Relational Discriminant Analysis) is a way of finding classifiers based on dissimilarity measures to prototypes extracted from the feature vectors, instead of on the feature vectors themselves. The accuracy of an RDA classifier therefore depends on how the prototypes are selected and how proximities are measured. In this paper, we propose to utilize PRS (Prototype Reduction Schemes) and Mahalanobis distances to devise a method of increasing classification accuracy. Our experimental results demonstrate that the proposed mechanism increases classification accuracy compared with the conventional approaches on both real-life and artificial data sets.
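The Mahalanobis half of the proposal can be sketched in a few lines; the prototype set below is synthetic, and the PRS prototype-selection step itself is not shown:

```python
import numpy as np

# Mahalanobis distance of a sample to a prototype set: Euclidean distance
# after whitening by the prototypes' inverse covariance.
rng = np.random.default_rng(2)
prototypes = rng.normal(size=(50, 4))        # illustrative prototype set
mean = prototypes.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(prototypes, rowvar=False))

def mahalanobis(x, mean, cov_inv):
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

d_center = mahalanobis(mean, mean, cov_inv)  # the mean is at distance 0
d_far = mahalanobis(mean + 5.0, mean, cov_inv)
```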

Dimensionality Reduction in Speech Recognition by Principal Component Analysis (음성인식에서 주 성분 분석에 의한 차원 저감)

  • Lee, Chang-Young
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.9
    • /
    • pp.1299-1305
    • /
    • 2013
  • In this paper, we investigate a method of reducing the computational cost of speech recognition by dimensionality reduction of MFCC feature vectors. Eigendecomposition of the feature vectors yields a linear transformation that orders the vector components by variance. The first component has the largest variance and hence is the most important one for the pattern classification at hand. We may therefore reduce the computational cost, without degrading recognition performance, by excluding the least-variance components. Experimental results show that the MFCC components can be reduced by about half without a significant adverse effect on the recognition error rate.
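The variance-ordered truncation described above can be sketched with a plain eigendecomposition of the covariance; the data here is synthetic rather than real MFCC vectors, and the reduced dimension k is an arbitrary choice:

```python
import numpy as np

# PCA by eigendecomposition of the feature covariance, keeping only the
# highest-variance components.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 12)) * np.linspace(3.0, 0.1, 12)  # unequal variances
Xc = X - X.mean(axis=0)                      # center the features

cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
order = np.argsort(eigvals)[::-1]            # re-sort by variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 6                                        # keep half the dimensions
X_reduced = Xc @ eigvecs[:, :k]              # project onto top-k components
```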

Feature selection for text data via sparse principal component analysis (희소주성분분석을 이용한 텍스트데이터의 단어선택)

  • Won Son
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.6
    • /
    • pp.501-514
    • /
    • 2023
  • When analyzing high dimensional data such as text data, if we input all the variables as explanatory variables, statistical learning procedures may suffer from over-fitting problems. Furthermore, computational efficiency can deteriorate with a large number of variables. Dimensionality reduction techniques such as feature selection or feature extraction are useful for dealing with these problems. The sparse principal component analysis (SPCA) is one of the regularized least squares methods which employs an elastic net-type objective function. The SPCA can be used to remove insignificant principal components and identify important variables from noisy observations. In this study, we propose a dimension reduction procedure for text data based on the SPCA. Applying the proposed procedure to real data, we find that the reduced feature set maintains sufficient information in text data while the size of the feature set is reduced by removing redundant variables. As a result, the proposed procedure can improve classification accuracy and computational efficiency, especially for some classifiers such as the k-nearest neighbors algorithm.
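A much-simplified sketch of the select-by-sparse-loadings idea: ordinary PCA loadings are soft-thresholded so that most entries become exactly zero, and only the words (columns) with a nonzero loading are kept. The paper's elastic-net SPCA is more principled; this is only an illustration on a random term matrix, and the threshold level is an arbitrary tuning choice.

```python
import numpy as np

# Toy document-term matrix with five high-variance "informative" words.
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 30))
X[:, :5] *= 4.0
Xc = X - X.mean(axis=0)

cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
loading = eigvecs[:, -1]              # leading principal component loadings

lam = 0.2                             # soft-threshold level (tuning choice)
sparse = np.sign(loading) * np.maximum(np.abs(loading) - lam, 0.0)
selected = np.flatnonzero(sparse)     # indices of retained words
```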

PCA-Based Feature Reduction for Depth Estimation (깊이 추정을 위한 PCA기반의 특징 축소)

  • Shin, Sung-Sik;Gwun, Ou-Bong
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.47 no.3
    • /
    • pp.29-35
    • /
    • 2010
  • This paper discusses a method of enhancing the accuracy of depth estimation from an image through PCA (Principal Component Analysis)-based feature reduction in a learning algorithm. To estimate the depth of an image, features such as the energy of pixels and their gradients are extracted, and these features and their relationships are used for depth estimation. Many features are obtained through various filter operations, and if all of them are used equally, without considering their contribution to depth estimation, the efficiency of depth estimation suffers. This paper proposes a method that enhances both the accuracy and the processing speed of depth estimation by using PCA to account for each feature's contribution. The experiments show that the proposed method, using 30% of the feature vector, is more accurate (by 0.4% on average, up to 2.5%) than using all of the image data for depth estimation.

Feature Analysis of Multi-Channel Time Series EEG Based on Incremental Model (점진적 모델에 기반한 다채널 시계열 데이터 EEG의 특징 분석)

  • Kim, Sun-Hee;Yang, Hyung-Jeong;Ng, Kam Swee;Jeong, Jong-Mun
    • The KIPS Transactions:PartB
    • /
    • v.16B no.1
    • /
    • pp.63-70
    • /
    • 2009
  • BCI technology controls communication systems or machines using brain signals, among other biological signals, after signal processing. Implementing a BCI system requires that the characteristics of the brain signal be learned and analyzed in real time and that the learned characteristics be applied. In this paper, we detect feature vectors of EEG signals recorded during left- and right-hand movements based on an incremental approach, and reduce the dimensionality using the detected feature vectors. In addition, we show that the reduced dimensionality can improve classification performance by removing unnecessary features: retaining sufficient features of the input data while removing unwanted ones reduces processing time and boosts classification performance. Our experiments with a K-NN classifier show that the proposed approach outperforms PCA-based dimension reduction by 5%.
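The incremental flavour can be illustrated with streaming summary statistics: update the mean and scatter one sample at a time instead of recomputing from the full batch. This Welford-style update is only a stand-in for the paper's incremental feature extraction, and the data is random rather than real EEG.

```python
import numpy as np

# Incremental (one-sample-at-a-time) mean and covariance over a stream.
rng = np.random.default_rng(5)
stream = rng.normal(size=(500, 8))    # stand-in for multi-channel EEG samples

n = 0
mean = np.zeros(8)
M2 = np.zeros((8, 8))                 # running scatter matrix
for x in stream:
    n += 1
    delta = x - mean                  # deviation from the old mean
    mean += delta / n                 # update the mean
    M2 += np.outer(delta, x - mean)   # Welford update with the new mean

cov_incremental = M2 / (n - 1)
cov_batch = np.cov(stream, rowvar=False)   # batch result for comparison
```

The incremental and batch results agree, which is the point: features derived from such statistics can be maintained in real time as new samples arrive.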

Parts-based Feature Extraction of Speech Spectrum Using Non-Negative Matrix Factorization (Non-Negative Matrix Factorization을 이용한 음성 스펙트럼의 부분 특징 추출)

  • 박정원;김창근;허강인
    • Proceedings of the IEEK Conference
    • /
    • 2003.11a
    • /
    • pp.49-52
    • /
    • 2003
  • In this paper, we propose a new speech feature parameter using NMF (Non-Negative Matrix Factorization). NMF can represent multi-dimensional data through effective dimensionality reduction by matrix factorization under a non-negativity constraint, and the reduced data present parts-based features of the input. We verify the usefulness of the NMF algorithm for speech feature extraction by applying feature parameters obtained with NMF to Mel-scaled filter bank outputs. The recognition experiments confirm that the proposed feature parameters are superior in recognition performance to the commonly used MFCC (mel frequency cepstral coefficient) features.
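The factorization itself can be sketched with the classic multiplicative updates of Lee and Seung; the matrix below is random rather than a real Mel filter bank output, and the rank r is an arbitrary choice:

```python
import numpy as np

# NMF: factor a non-negative matrix V ≈ W @ H with W, H >= 0, so the columns
# of W act as parts-based basis vectors and H holds the reduced features.
rng = np.random.default_rng(6)
V = rng.random((40, 60))              # non-negative "spectrogram" stand-in
r = 5                                 # reduced rank
W = rng.random((40, r)) + 0.1
H = rng.random((r, 60)) + 0.1

def frob_err(V, W, H):
    return float(np.linalg.norm(V - W @ H))

err_before = frob_err(V, W, H)
eps = 1e-9                            # guard against division by zero
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative update for H
    W *= (V @ H.T) / (W @ H @ H.T + eps)   # multiplicative update for W

err_after = frob_err(V, W, H)
```

The multiplicative form keeps both factors non-negative at every step, which is what produces the additive, parts-based character of the learned basis.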


Noise Robust Speaker Identification using Reliable Sub-Band Selection in Multi-Band Approach (신뢰성 높은 서브밴드 선택을 이용한 잡음에 강인한 화자식별)

  • Kim, Sung-Tak;Ji, Mi-Gyeong;Kim, Hoi-Rin
    • Proceedings of the KSPS conference
    • /
    • 2007.05a
    • /
    • pp.127-130
    • /
    • 2007
  • The conventional feature recombination technique is very effective in band-limited noise conditions, but in broad-band noise conditions it does not produce a notable performance improvement over the full-band system. To cope with this drawback, we introduce a new technique of sub-band likelihood computation in feature recombination, and propose a new feature recombination method based on it. Furthermore, reliable sub-band selection based on the signal-to-noise ratio is used to improve the performance of the proposed feature recombination. Experimental results show that the average error reduction rate across various noise conditions is more than 27% compared with the conventional full-band speaker identification system.


A Study on Optimal Feature Extraction and a Complex Adaptive Filter for Speech Recognition (음성인식을 위한 복합형잡음제거필터와 최적특징추출에 관한 연구)

  • Cha, T.H.;Jang, S.K.;Choi, U.S.;Choi, I.H.;Kim, C.S.
    • Speech Sciences
    • /
    • v.4 no.2
    • /
    • pp.55-68
    • /
    • 1998
  • In this paper, a novel method of speech noise reduction based on a complex adaptive noise canceler, together with a method of optimal feature extraction, is proposed. The complex adaptive noise canceler requires only noise detection, and the LMS algorithm is used to compute the adaptive filter coefficients. The optimal feature extraction method requires the variance of the noise. The experimental results show that the proposed method effectively reduces noise in noisy speech, and the optimal feature extraction shows characteristics similar to those of noise-free speech.
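A minimal LMS noise canceler in the spirit described above (the paper's "complex" filter structure and its feature extraction are not reproduced): a reference noise input is adaptively filtered to estimate the noise in the primary channel, and subtracting that estimate leaves the speech. The signals and the noise path below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)
n_samples, order, mu = 4000, 8, 0.01

clean = np.sin(2 * np.pi * 0.02 * np.arange(n_samples))   # "speech" stand-in
noise_ref = rng.normal(size=n_samples)                    # reference noise
# primary channel: speech plus a filtered version of the reference noise
h_true = np.array([0.6, -0.3, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0])
primary = clean + np.convolve(noise_ref, h_true)[:n_samples]

w = np.zeros(order)                   # adaptive filter coefficients
out = np.zeros(n_samples)
for i in range(order, n_samples):
    x = noise_ref[i - order + 1:i + 1][::-1]   # most recent sample first
    y = w @ x                                  # noise estimate
    e = primary[i] - y                         # canceler output ≈ speech
    w += mu * e * x                            # LMS weight update
    out[i] = e

# mean squared error vs the clean signal, with and without cancelation
late = np.mean((out[-1000:] - clean[-1000:]) ** 2)
baseline = np.mean((primary[-1000:] - clean[-1000:]) ** 2)
```

After adaptation the residual error falls well below the uncanceled noise power, which is the behaviour the paper reports for noisy speech.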
