• Title/Summary/Keyword: 평균 분산 정규화

Search Result 24, Processing Time 0.02 seconds

Compromised feature normalization method for deep neural network based speech recognition (심층신경망 기반의 음성인식을 위한 절충된 특징 정규화 방식)

  • Kim, Min Sik;Kim, Hyung Soon
    • Phonetics and Speech Sciences
    • /
    • v.12 no.3
    • /
    • pp.65-71
    • /
    • 2020
  • Feature normalization is a method to reduce the effect of environmental mismatch between the training and test conditions through the normalization of statistical characteristics of acoustic feature parameters. It demonstrates excellent performance improvement in the traditional Gaussian mixture model-hidden Markov model (GMM-HMM)-based speech recognition system. However, in a deep neural network (DNN)-based speech recognition system, minimizing the effects of environmental mismatch does not necessarily lead to the best performance improvement. In this paper, we attribute the cause of this phenomenon to information loss due to excessive feature normalization. We investigate whether there is a feature normalization method that maximizes the speech recognition performance by properly reducing the impact of environmental mismatch, while preserving useful information for training acoustic models. To this end, we introduce the mean and exponentiated variance normalization (MEVN), which is a compromise between the mean normalization (MN) and the mean and variance normalization (MVN), and compare the performance of DNN-based speech recognition system in noisy and reverberant environments according to the degree of variance normalization. Experimental results reveal that a slight performance improvement is obtained with the MEVN over the MN and the MVN, depending on the degree of variance normalization.

A Study on the Comparison and Evaluation of Spectrum Flattening Techniques (스펙트럼 평탄화 기법의 비교평가에 관한 연구)

  • 강은영;한상일;배명진
    • Proceedings of the IEEK Conference
    • /
    • 2001.09a
    • /
    • pp.797-800
    • /
    • 2001
  • 스펙트럼의 평탄화는 스펙트럼 신호로부터 포만트의 영향이나 천이진폭의 영향을 제거하는 것이다. 따라서 정확한 피치검출과 포만트검출에 적용할 수 있다. 본 논문에서는 새로운 스펙트럼 평탄화 기법을 제안하고 기존의 방법인 LPC법, Cepstrum법과 비교하여 어느 정도의 우수성을 보이는지 평가하였다. 평가 방법은 각각의 평탄화된 신호의 분산을 구하여 평탄화의 정도를 측정하였다. 이때 핑탄화된 신호는 최고점이 영이 되도 록 정규화 시키고 평균이 영인 분산을 계산하였다. 실험 결과는 제안한 방법이 기존의 방법보다 우수함을 보여 준다.

  • PDF

Robust Feature Normalization Scheme Using Separated Eigenspace in Noisy Environments (분리된 고유공간을 이용한 잡음환경에 강인한 특징 정규화 기법)

  • Lee Yoonjae;Ko Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4
    • /
    • pp.210-216
    • /
    • 2005
  • We Propose a new feature normalization scheme based on eigenspace for achieving robust speech recognition. In general, mean and variance normalization (MVN) is Performed in cepstral domain. However, another MVN approach using eigenspace was recently introduced. in that the eigenspace normalization Procedure Performs normalization in a single eigenspace. This Procedure consists of linear PCA matrix feature transformation followed by mean and variance normalization of the transformed cepstral feature. In this method. 39 dimensional feature distribution is represented using only a single eigenspace. However it is observed to be insufficient to represent all data distribution using only a sin91e eigenvector. For more specific representation. we apply unique na independent eigenspaces to cepstra, delta and delta-delta cepstra respectively in this Paper. We also normalize training data in eigenspace and get the model from the normalized training data. Finally. a feature space rotation procedure is introduced to reduce the mismatch of training and test data distribution in noisy condition. As a result, we obtained a substantial recognition improvement over the basic eigenspace normalization.

A Study on the Pitch Alteration Technique by Sub-band Linear Approximation in Spectrum (서브밴드 선형근사에 의한 피치변경법에 관한 연구)

  • 김영규;김봉영;배명진
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2423-2426
    • /
    • 2003
  • 음성합성은 합성방식에 따라 파형부호화법, 신호원부호화법, 혼성부호화법으로 분류할 수 있다. 특히 고음질 합성을 위해서는 파형부호화를 이용한 합성방식이 적합하다 하지만 파형부호화를 이용한 합성법은 여기 성분과 여파기 성분을 분리하지 않고 처리하기 때문에 음절단위나 음소단위의 합성기법으로는 바람직하지 못하다. 따라서 파형부호화법을 규칙에 의한 합성에 적용되도록 음원피치를 변경시키기 위한 피치 변경법이 필요하게 된다. 본 논문에서는 스펙트럼 왜곡을 최소화하기 위해 서브 선형근사에 의하여 스펙트럼 평탄화 시킨 후 스펙트럼 스케일링을 이용하여 피치를 변경하는 방법에 대하여 제안하였다. 기존 방법인 LPC법, Cepstrum법과 비교하여 어느 정도의 우수성을 보이는지 평가하였고 평가방법은 각각의 평탄화 된 신호의 분산을 구하여 평탄화의 정도를 측정하였다. 이때 평탄화 된 신호는 최고점이 영이 되도록 정규화 시키고 평균이 영인 분산을 계산하였다. 제안한 방법의 성능을 평가하기 위해 스펙트럼 왜곡율을 측정하여 본 결과 평균 스펙트럼 왜곡율은 평균 2.12% 이하로 유지되었으며 실험결과 제안한 방법이 기존의 방법보다 우수함을 보여주었다.

  • PDF

Characteristic Analysis of Normalized D-QR-RLS Algorithm (II) (정규화된 D-QR-RLS 알고리즘의 특성 분석(II))

  • Ahn, Bong-Man;Hwang, Jee-Won;Cho, Ju-Phil
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.11C
    • /
    • pp.1127-1133
    • /
    • 2007
  • This paper proposes one of normalized QR-typed LMS (Least Mean Square) algorithms with computational complexity of O(N). This proposed algorithm shows the normalized property in terms of theoretical characteristics. This proposed algorithm is one of algorithms which normalize variance of input signal in terms of mean because QR-typed LMS is proportional to variance of input signal. In this paper, convergence characteristic analysis of normalized algorithm was made. Computer simulation was made by the algorithms used for echo canceller. Proposed algorithm has similar performance to theoretical value. And, we can see that proposed method shows similar one to performance of NLMS.by comparison among different algorithms.

Cepstral Feature Normalization Methods Using Pole Filtering and Scale Normalization for Robust Speech Recognition (강인한 음성인식을 위한 극점 필터링 및 스케일 정규화를 이용한 켑스트럼 특징 정규화 방식)

  • Choi, Bo Kyeong;Ban, Sung Min;Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.34 no.4
    • /
    • pp.316-320
    • /
    • 2015
  • In this paper, the pole filtering concept is applied to the Mel-frequency cepstral coefficient (MFCC) feature vectors in the conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN) frameworks. Additionally, performance of the cepstral mean and scale normalization (CMSN), which uses scale normalization instead of variance normalization, is evaluated in speech recognition experiments in noisy environments. Because CMN and CMVN are usually performed on a per-utterance basis, in case of short utterance, they have a problem that reliable estimation of the mean and variance is not guaranteed. However, by applying the pole filtering and scale normalization techniques to the feature normalization process, this problem can be relieved. Experimental results using Aurora 2 database (DB) show that feature normalization method combining the pole-filtering and scale normalization yields the best improvements.

Cepstrum PDF Normalization Method for Speech Recognition in Noise Environment (잡음환경에서의 음성인식을 위한 켑스트럼의 확률분포 정규화 기법)

  • Suk Yong Ho;Lee Hwang-Soo;Choi Seung Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4
    • /
    • pp.224-229
    • /
    • 2005
  • In this paper, we Propose a novel cepstrum normalization method which normalizes the probability density function (pdf) of cepstrum for robust speech recognition in additive noise environments. While the conventional methods normalize the first- and/or second-order statistics such as the mean and/or variance of the cepstrum. the proposed method fully normalizes the statistics of cepstrum by making the pdfs of clean and noisy cepstrum identical to each other For the target Pdf, the generalized Gaussian distribution is selected to consider various densities. In recognition phase, we devise a table lookup method to save computational costs. From the speaker-independent isolated-word recognition experiments, we show that the Proposed method gives improved Performance compared with that of the conventional methods, especially in heavy noise environments.

Applying feature normalization based on pole filtering to short-utterance speech recognition using deep neural network (심층신경망을 이용한 짧은 발화 음성인식에서 극점 필터링 기반의 특징 정규화 적용)

  • Han, Jaemin;Kim, Min Sik;Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.1
    • /
    • pp.64-68
    • /
    • 2020
  • In a conventional speech recognition system using Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), the cepstral feature normalization method based on pole filtering was effective in improving the performance of recognition of short utterances in noisy environments. In this paper, the usefulness of this method for the state-of-the-art speech recognition system using Deep Neural Network (DNN) is examined. Experimental results on AURORA 2 DB show that the cepstral mean and variance normalization based on pole filtering improves the recognition performance of very short utterances compared to that without pole filtering, especially when there is a large mismatch between the training and test conditions.

Identification of Cluster with Composite Mean and Variance (합성된 평균과 분산을 가진 군집 식별)

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.3
    • /
    • pp.391-401
    • /
    • 2011
  • Consider a cluster, so called a 'son cluster', whose mean and variance is composed of the means and variances of both clusters called as a 'father cluster' and a 'mother cluster'. In this paper, a method for identifying each of three clusters is provided by modeling the relationship with father and mother clusters. Under the normal mixture model, the parameters are estimated via EM algorithm. We were able to overcome the problems of estimation using ECM approximation. Numerical examples show that our method can effectively identify the three clusters, so called a 'family of clusters'.

An Improved Image Classification Using Batch Normalization and CNN (배치 정규화와 CNN을 이용한 개선된 영상분류 방법)

  • Ji, Myunggeun;Chun, Junchul;Kim, Namgi
    • Journal of Internet Computing and Services
    • /
    • v.19 no.3
    • /
    • pp.35-42
    • /
    • 2018
  • Deep learning is known as a method of high accuracy among several methods for image classification. In this paper, we propose a method of enhancing the accuracy of image classification using CNN with a batch normalization method for classification of images using deep CNN (Convolutional Neural Network). In this paper, we propose a method to add a batch normalization layer to existing neural networks to enhance the accuracy of image classification. Batch normalization is a method to calculate and move the average and variance of each batch for reducing the deflection in each layer. In order to prove the superiority of the proposed method, Accuracy and mAP are measured by image classification experiments using five image data sets SHREC13, MNIST, SVHN, CIFAR-10, and CIFAR-100. Experimental results showed that the CNN with batch normalization is better classification accuracy and mAP rather than using the conventional CNN.