• Title/Summary/Keyword: Cepstral envelope

Formant Synthesis of Haegeum Sounds Using Cepstral Envelope (캡스트럼 포락선을 이용한 해금 소리의 포만트 합성)

  • Hong, Yeon-Woo;Cho, Sang-Jin;Kim, Jong-Myon;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea / v.28 no.6 / pp.526-533 / 2009
  • This paper proposes a formant synthesis method for Haegeum sounds that uses the cepstral envelope for spectral modeling. Spectral modeling synthesis (SMS) is a technique that models time-varying spectra as a combination of sinusoids (the "deterministic" part) and a time-varying filtered noise component (the "stochastic" part). SMS is well suited to synthesizing the sounds of string and wind instruments, whose harmonics are evenly distributed over the whole frequency band. Formants extracted from the cepstral envelope are parameterized for the synthesis of the sinusoids. A resonator designed by the Impulse Invariant Transform (IIT) is applied to synthesize the sinusoids, and the results are bandpass filtered to adjust their magnitude. The noise is calculated by first generating the sinusoids with formant synthesis, subtracting them from the original sound, and then removing any harmonics that remain. Linear interpolation is used to model the noise. The synthesized sounds, made by summing the sinusoids, are shown to be similar to the original Haegeum sounds.

Speech Recognition Using Noise Processing in Spectral Dimension (스펙트럴 차원의 잡음처리를 이용한 음성인식)

  • Lee, Gwang-seok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.738-741 / 2009
  • This research aims to improve speech recognition results for noisy speech. It is known that spectral subtraction and the recovery of valleys in the spectral envelope obtained from noisy speech are more effective for improving recognition. In this research, the averaged spectral envelope obtained from vowel spectra is used to emphasize the valleys. The vocalic spectral information in the lower frequency range is emphasized, and the spectrum obtained from consonants is left unchanged. In simulation, the emphasis coefficients are varied in the cepstral domain. The method is applied to the recognition of noisy digits and improves the recognition result.

An Effective Feature Extraction Method for Fault Diagnosis of Induction Motors (유도전동기의 고장 진단을 위한 효과적인 특징 추출 방법)

  • Nguyen, Hung N.;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.18 no.7 / pp.23-35 / 2013
  • This paper proposes an effective technique for automatically extracting feature vectors from vibration signals for fault classification systems. Conventional mel-frequency cepstral coefficients (MFCCs) are sensitive to noise in vibration signals, which degrades classification accuracy. To solve this problem, this paper proposes spectral envelope cepstral coefficients (SECC) analysis, which uses a 4-step filter bank based on the spectral envelopes of vibration signals: (1) a linear predictive coding (LPC) algorithm is used to obtain the spectral envelopes of all faulty vibration signals, (2) all envelopes are averaged to get a general spectral shape, (3) a gradient descent method is used to find the extrema of the average envelope and their frequencies, (4) non-overlapping filters are used, with centers calculated from the distances between the valley frequencies of the envelope. This 4-step filter bank is then used in the computation of cepstral coefficients to extract feature vectors. Finally, a multi-layer support vector machine (MLSVM) with various sigma values uses these parameters to identify the fault types of induction motors. Experimental results indicate that the proposed extraction method outperforms other feature extraction algorithms, yielding a classification accuracy above roughly 99.65%.
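
Step (1) of the filter bank, obtaining a spectral envelope by LPC, can be sketched as below. This is only an illustration under stated assumptions: it uses the textbook autocorrelation method, the function name and default parameters are hypothetical, and the test signal is a synthetic second-order resonance rather than motor vibration data. Steps (2) to (4) are not covered.

```python
import numpy as np

def lpc_envelope(frame, order=12, n_fft=512):
    """LPC spectral envelope (linear magnitude) via the autocorrelation method."""
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(frame) - 1 : len(frame) + order]        # lags 0..order
    # Normal equations R a = r[1:], with R the Toeplitz autocorrelation matrix.
    toeplitz = np.array([[r[abs(i - j)] for j in range(order)]
                         for i in range(order)])
    a = np.linalg.solve(toeplitz, r[1:])
    gain = np.sqrt(r[0] - a @ r[1:])                     # prediction-error gain
    # Envelope is gain / |A(e^jw)| with A(z) = 1 - sum_k a_k z^-k.
    denom = np.fft.rfft(np.concatenate(([1.0], -a)), n_fft)
    return gain / np.abs(denom)

# Hypothetical test signal: white noise through a resonance at 0.1 cycles/sample.
rng = np.random.default_rng(0)
noise = rng.standard_normal(2048)
pole_radius, pole_freq = 0.95, 0.1
signal = np.zeros(2048)
for i in range(2048):
    signal[i] = (noise[i]
                 + 2 * pole_radius * np.cos(2 * np.pi * pole_freq) * signal[i - 1]
                 - pole_radius ** 2 * signal[i - 2])

envelope = lpc_envelope(signal, order=2)
peak_freq = np.argmax(envelope) / 512                    # cycles per sample
```

Averaging such envelopes over many signals and locating the valleys of the average would then correspond to steps (2) and (3) of the abstract.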

Spectral Modeling of Haegeum Using Cepstral Analysis (캡스트럼 분석을 이용한 해금의 스펙트럼 모델링)

  • Hong, Yeon-Woo;Kang, Myeong-Su;Cho, Sang-Jin;Kim, Jong-Myon;Lee, Jung-Chul;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea / v.29 no.4 / pp.243-250 / 2010
  • This paper proposes a spectral model of the Korean traditional instrument Haegeum that uses cepstral analysis to describe time-varying Haegeum sounds naturally. To obtain a precise cepstral analysis, we set the frame size to three periods of the input signal and use more cepstral coefficients to extract formants. Performance is enhanced by flexibly controlling the cutoff frequency of the bandpass filter according to the resonances during the synthesis of the sinusoidal components, and by deleting the peaks that remain in the residual signal. To detect pitch changes, we divide the input frames into silence, attack, and sustain regions and determine which region the current frame belongs to. The proposed method then readjusts the frame size according to the fundamental frequency when the current frame is in the attack region, and corrects fundamental-frequency extraction errors for frames in the sustain region. With these processes, the synthesized sounds are much more similar to the originals. In a listening test, a Haegeum player judged the synthesized sounds to be almost the same as the originals (96-100% similar to the original sounds).

Normalization of Spectral Magnitude and Cepstral Transformation for Compensation of Lombard Effect (롬바드 효과의 보정을 위한 스펙트럼 크기의 정규화와 켑스트럼 변환)

  • Chi, Sang-Mun;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea / v.15 no.4 / pp.83-92 / 1996
  • This paper describes Lombard effect compensation and noise suppression for reducing speech recognition errors in noisy environments. The Lombard effect is represented by the variation of the spectral envelope of energy-normalized words and the variation of overall vocal intensity. The variation of the spectral envelope can be compensated by a linear transformation in the cepstral domain. The variation of vocal intensity is canceled by spectral magnitude normalization. Spectral subtraction is used to suppress noise contamination, and band-pass filtering is used to emphasize dynamic features. To understand the Lombard effect and verify the effectiveness of the proposed method, speech data were collected in simulated noisy environments. Recognition experiments were conducted with contamination by noise from automobile cabins, an exhibition hall, downtown telephone booths, crowded streets, and computer rooms. The experiments confirmed the effectiveness of the proposed method.
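
The spectral subtraction step mentioned in this abstract can be illustrated with a short sketch. It assumes the basic magnitude-domain form with a spectral floor; the function names and the `floor` parameter are hypothetical, and the abstract does not say which variant the authors used.

```python
import numpy as np

def estimate_noise(noise_frames_mag):
    """Average magnitude spectrum over frames assumed to contain only noise."""
    return np.mean(noise_frames_mag, axis=0)

def spectral_subtraction(noisy_mag, noise_mag, floor=0.02):
    """Subtract the noise magnitude per bin, keeping a small spectral floor."""
    # The floor keeps bins from going negative, which limits musical noise.
    return np.maximum(noisy_mag - noise_mag, floor * noisy_mag)

# Toy magnitudes for three frequency bins.
noisy = np.array([1.0, 0.5, 0.1])
noise = estimate_noise(np.array([[0.1, 0.2, 0.3],
                                 [0.3, 0.2, 0.1]]))
cleaned = spectral_subtraction(noisy, noise)
```

In a full system the cleaned magnitudes would be recombined with the noisy phase, or passed directly to cepstral feature extraction.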

Acoustic Signal based Optimal Route Selection Problem: Performance Comparison of Multi-Attribute Decision Making methods

  • Borkar, Prashant;Sarode, M.V.;Malik, L. G.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.2 / pp.647-669 / 2016
  • Using multiple attributes in decision making, including user preference, increases the complexity of the route selection process. Various approaches have been proposed to solve the optimal route selection problem. In this paper, multi-attribute decision making (MADM) algorithms such as Simple Additive Weighting (SAW), the Weighted Product Method (WPM), the Analytic Hierarchy Process (AHP), and Total Order Preference by Similarity to the Ideal Solution (TOPSIS) are proposed for acoustic-signature-based optimal route selection, to provide users with better quality of service. The traffic density state (very low, low, below medium, medium, above medium, high, and very high) of a road segment, which reflects the occurrence and mixture weightings of traffic noise signals (tyre, engine, air turbulence, exhaust, honks, etc.), is considered as one of the attributes in the decision-making process. The short-term spectral envelope features of the cumulative acoustic signals are extracted using Mel-Frequency Cepstral Coefficients (MFCC), and an Adaptive Neuro-Fuzzy Classifier (ANFC) is used to model the seven traffic density states. The simple point method and AHP are used to calculate the weights of the decision parameters. Numerical results show that WPM, AHP, and TOPSIS provide similar performance.
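
Three of the MADM methods named in this abstract can be sketched in their textbook forms. This is a minimal illustration that assumes every attribute is a benefit criterion (larger is better) with strictly positive values; the function names, the route matrix, and the weights are hypothetical and are not taken from the paper. AHP is omitted.

```python
import numpy as np

def saw_scores(matrix, weights):
    """Simple Additive Weighting: weighted sum of max-normalized attributes."""
    return (matrix / matrix.max(axis=0)) @ weights

def wpm_scores(matrix, weights):
    """Weighted Product Method: product of normalized attributes ** weights."""
    return np.prod((matrix / matrix.max(axis=0)) ** weights, axis=1)

def topsis_scores(matrix, weights):
    """TOPSIS: relative closeness to the ideal solution (1.0 is best)."""
    weighted = matrix / np.linalg.norm(matrix, axis=0) * weights
    best, worst = weighted.max(axis=0), weighted.min(axis=0)
    d_best = np.linalg.norm(weighted - best, axis=1)
    d_worst = np.linalg.norm(weighted - worst, axis=1)
    # Undefined if all alternatives are identical (both distances are zero).
    return d_worst / (d_best + d_worst)

# Rows are candidate routes, columns are attributes (hypothetical values).
routes = np.array([[4.0, 4.0],
                   [2.0, 2.0],
                   [1.0, 3.0]])
weights = np.array([0.5, 0.5])

saw = saw_scores(routes, weights)
wpm = wpm_scores(routes, weights)
topsis = topsis_scores(routes, weights)
```

Cost attributes such as travel time would need to be inverted or normalized differently before applying these scores.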

A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments (네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계)

  • Lee, Gil-Ho;Yoon, Jae-Sam;Oh, Yoo-Rhee;Kim, Hong-Kook
    • MALSORI / no.54 / pp.27-43 / 2005
  • Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that use the speech reconstructed by the coders. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for speech recognition performance. For example, mel-frequency cepstral coefficients (MFCC) are generally known to provide better speech recognition performance than linear prediction coefficients (LPC), the typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCC instead of LPC to improve the performance of a server-based speech recognition system in network environments. The main difficulty in using MFCC, however, is developing efficient MFCC quantization at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of MFCC. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed speech coder has speech quality comparable to 8 kbps G.729, and speech recognition performance with the proposed coder is better than with G.729.
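
The idea of predictive quantization that exploits interframe correlation can be illustrated with a very small sketch. It assumes a first-order predictor and a uniform scalar quantizer on the prediction residual; the coder in the paper is not described at this level, so the function names, the predictor coefficient `alpha`, and the step size `step` are hypothetical.

```python
import numpy as np

def predictive_quantize(frames, alpha=0.5, step=0.1):
    """Quantize each frame's residual against a prediction from the last frame."""
    prev = np.zeros(frames.shape[1])
    indices = np.empty(frames.shape, dtype=int)
    recon = np.empty(frames.shape, dtype=float)
    for t, frame in enumerate(frames):
        pred = alpha * prev                       # first-order interframe prediction
        idx = np.round((frame - pred) / step).astype(int)
        prev = pred + idx * step                  # what the decoder will rebuild
        indices[t], recon[t] = idx, prev
    return indices, recon

def predictive_dequantize(indices, alpha=0.5, step=0.1):
    """Decoder: rebuild frames from the transmitted residual indices only."""
    prev = np.zeros(indices.shape[1])
    recon = np.empty(indices.shape, dtype=float)
    for t, idx in enumerate(indices):
        prev = alpha * prev + idx * step
        recon[t] = prev
    return recon

# Hypothetical input: 50 frames of 13 slowly varying coefficients.
rng = np.random.default_rng(1)
mfcc = np.cumsum(0.05 * rng.standard_normal((50, 13)), axis=0)
indices, encoder_recon = predictive_quantize(mfcc)
decoder_recon = predictive_dequantize(indices)
max_error = np.max(np.abs(mfcc - decoder_recon))
```

Because the encoder predicts from its own reconstruction, encoder and decoder stay in step, and the error per coefficient is bounded by half the step size as long as no indices are lost in the channel.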

New Temporal Features for Cardiac Disorder Classification by Heart Sound (심음 기반의 심장질환 분류를 위한 새로운 시간영역 특징)

  • Kwak, Chul;Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea / v.29 no.2 / pp.133-140 / 2010
  • We improve the performance of cardiac disorder classification by adding new temporal features extracted from continuous heart sound signals. We add three kinds of novel temporal features to a conventional feature based on mel-frequency cepstral coefficients (MFCC): heart sound envelope, murmur probabilities, and murmur amplitude variation. In cardiac disorder classification and detection experiments, we evaluate the contribution of the proposed features to classification accuracy and select appropriate temporal features using the sequential feature selection method. The selected features are shown to improve classification accuracy significantly and consistently for neural network-based pattern classifiers such as the multi-layer perceptron (MLP), support vector machine (SVM), and extreme learning machine (ELM).