• Title/Summary/Keyword: Speech detection

Optimization of Detection Method Using a Moving Average Estimator for Speech Enhancement (음성강화를 위한 이동 평균 예측량 기반의 검출방법 최적화)

  • Lee, Soo-Jeong;Shin, Kye-Hyeon;Kim, Soon-Hyob
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.3 / pp.97-104 / 2007
  • The adaptive echo canceller (AEC) has become an important component in speech communication systems, including mobile phones and speech recognition. In these applications, the acoustic echo path has a long impulse response. We propose a moving-average least mean square (MVLMS) algorithm with a detection method for acoustic echo cancellation. Tests with colored input models clearly show that the MVLMS detection algorithm converges better than the least mean square (LMS) detection algorithm alone. Although the computational complexity of the new MVLMS algorithm is only slightly greater than that of the standard LMS detection algorithm, it confers a significant improvement in stability.
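
For orientation, the baseline that this abstract compares against, a standard LMS adaptive filter used as an echo canceller, can be sketched as follows. This is a minimal illustration only: the abstract does not specify the moving-average modification or the detection step, so neither is reproduced here. The function name, the tap count, the step size, the white (not colored) test input, and the four-tap echo path are all assumptions chosen for the demonstration.

```python
import random


def lms_echo_canceller(far_end, mic, num_taps=4, step=0.05):
    """Adapt an FIR estimate of the echo path with the standard LMS update."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # residual after removing the estimated echo
        weights = [w + step * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return weights, errors


# Hypothetical noise-free test: the microphone hears only the echoed far-end signal.
random.seed(0)
true_path = [0.5, -0.3, 0.1, 0.0]
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(true_path[k] * far_end[n - k] for k in range(len(true_path)) if n - k >= 0)
    for n in range(len(far_end))
]
weights, errors = lms_echo_canceller(far_end, mic)
```

In this idealized setting the adapted weights approach `true_path` and the residual error decays toward zero. Real echo paths are far longer, which is where convergence speed and stability become the concerns the abstract addresses.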

Voice Activity Detection Algorithm using Fuzzy Membership Shifted C-means Clustering in Low SNR Environment (낮은 신호 대 잡음비 환경에서의 퍼지 소속도 천이 C-means 클러스터링을 이용한 음성구간 검출 알고리즘)

  • Lee, G.H.;Lee, Y.J.;Cho, J.H.;Kim, M.N.
    • Journal of Korea Multimedia Society / v.17 no.3 / pp.312-323 / 2014
  • Voice activity detection (VAD) is an important process that finds voice activity in a noisy speech signal for noise cancellation and speech enhancement. Many studies on voice activity detection have been carried out over the past few years, but existing methods perform poorly on sentence-form speech signals in low-SNR environments. This paper proposes a new voice activity detection algorithm consisting of an initial VAD stage that uses entropy and a main VAD stage that uses fuzzy membership shifted c-means clustering. Experiments in white-noise environments at various SNRs confirm that the proposed algorithm performs well.
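
The abstract does not say which entropy measure the initial stage uses. One common choice in VAD work is spectral entropy, sketched below under that assumption: a frame whose power is concentrated in a few frequency bins scores low, while a white-noise frame scores high. The function name, the 64-sample frame length, and the naive DFT are illustrative choices, and the fuzzy c-means stage is not reproduced.

```python
import cmath
import math
import random


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum of one frame."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):  # non-negative frequencies only
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0.0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical frames: a pure tone (strongly structured) versus white Gaussian noise.
random.seed(1)
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
noise = [random.gauss(0.0, 1.0) for _ in range(64)]
tone_entropy = spectral_entropy(tone)
noise_entropy = spectral_entropy(noise)
```

A threshold between the two values would separate these idealized frames. Real speech lies between the extremes, which is presumably why the abstract adds a clustering stage after the entropy stage.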

Robust End Point Detection for Robot Speech Recognition Using Double Talk Detection (음성인식 로봇을 위한 동시통화검출 기반의 강인한 음성 끝점 검출)

  • Moon, Sung-Kyu;Park, Jin-Soo;Ko, Han-Seok
    • The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.161-169 / 2012
  • This paper presents a robust speech end-point detector that uses double talk detection for a speech recognition robot operating under echoic conditions. The proposed method combines the result of a conventional end-point detector with the result of a double talk detector. We tested the proposed method in an isolated word recognition system in an echoic environment. The proposed algorithm outperforms the available techniques by 30% in speech recognition rate.

Voice Activity Detection Using Global Speech Absence Probability Based on Teager Energy in Noisy Environments (잡음환경에서 Teager Energy 기반의 전역 음성부재확률을 이용하는 음성검출)

  • Park, Yun-Sik;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.1 / pp.97-103 / 2012
  • In this paper, we propose a novel voice activity detection (VAD) algorithm to effectively distinguish speech from nonspeech in various noisy environments. Global speech absence probability (GSAP) derived from the likelihood ratio (LR) of a statistical model is widely used as the feature parameter for VAD. However, the feature parameter based on conventional GSAP is not sufficient to distinguish speech from noise at low signal-to-noise ratios (SNRs). The presented VAD algorithm uses GSAP based on Teager energy (TE) as the feature parameter to improve the detection of speech segments in noisy environments. The performance of the proposed VAD algorithm is evaluated by objective tests under various environments, and it yields better results than the conventional methods.
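
The Teager energy mentioned above is usually computed with the discrete Teager-Kaiser operator, x[n]² − x[n−1]·x[n+1]. The sketch below shows only that operator. How the paper combines it with GSAP is not described in the abstract and is not reproduced; the function name and the test tone are assumptions.

```python
import math


def teager_energy(signal):
    """Discrete Teager-Kaiser energy x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


# For a sinusoid A*cos(omega*n) the operator returns the constant A^2 * sin(omega)^2,
# so it tracks both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

Because the output grows with frequency as well as amplitude, the operator tends to emphasize structured, oscillating components, which is one plausible reason it is used as a feature at low SNR.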

Performance Comparison of Automatic Detection of Laryngeal Diseases by Voice (후두질환 음성의 자동 식별 성능 비교)

  • Kang Hyun Min;Kim Soo Mi;Kim Yoo Shin;Kim Hyung Soon;Jo Cheol-Woo;Yang Byunggon;Wang Soo-Geun
    • MALSORI / no.45 / pp.35-45 / 2003
  • Laryngeal diseases cause significant changes in the quality of speech production. Automatic detection of laryngeal diseases by voice is attractive because of its nonintrusive nature. In this paper, we apply speech recognition techniques to the detection of laryngeal cancer, and investigate which feature parameters and classification methods are appropriate for this purpose. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are examined as feature parameters, and parameters reflecting the periodicity of speech and its perturbation are also considered. As classifiers, multilayer perceptron neural networks and Gaussian Mixture Models (GMM) are employed. According to our experiments, higher-order LPCC combined with the periodicity parameters yields the best performance.

On a Detection of V-UV Segments of Speech Spectrum for the MBE Coding (MBE 부호화용 스펙트럼 V-UV 구간 검출에 관한 연구)

  • 김을제
    • Proceedings of the Acoustical Society of Korea Conference / 1992.06a / pp.43-48 / 1992
  • In the area of speech vocoder systems, the MBE vocoder offers high quality at a low bit rate. In MBE parameter detection, the V/UV region decision methods proposed so far depend heavily on other parameters, namely the fundamental frequency and formant information. In this paper, we therefore propose a new V/UV detection method that uses the zero-crossing rate of the flattened harmonic spectrum. This method reduces the influence of the other parameters on V/UV region detection.
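
As a generic illustration of the zero-crossing rate named in this abstract, the sketch below counts sign changes along any sequence of values. It could be applied to a mean-removed (flattened) spectrum as well as to a waveform. The abstract gives neither the flattening procedure nor the V/UV decision rule, so both are left out; the function name and the convention that zero counts as non-negative are assumptions.

```python
def zero_crossing_rate(values):
    """Fraction of adjacent pairs whose signs differ (zero counts as non-negative)."""
    if len(values) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(values, values[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(values) - 1)


# Hypothetical inputs: an alternating sequence, a monotone one, and a slower pattern.
alternating_rate = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])
monotone_rate = zero_crossing_rate([1.0, 2.0, 3.0])
slow_rate = zero_crossing_rate([1.0, 1.0, -1.0, -1.0])
```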

Coding History Detection of Speech Signal using Deep Neural Network (심층 신경망을 이용한 음성 신호의 부호화 이력 검출)

  • Cho, Hyo-Jin;Jang, Won;Shin, Seong-Hyeon;Park, Hochong
    • Journal of Broadcast Engineering / v.23 no.1 / pp.86-92 / 2018
  • In this paper, we propose a method for detecting the coding history of digital speech signals. In digital speech communication and storage, the signal is encoded to reduce the number of bits. Therefore, when a speech signal waveform is given, we need to detect its coding history so that we can determine whether the signal is an original or a coded one and, if coded, how many times it has been coded. We propose a coding history detection method for the 12.2 kbps AMR codec that distinguishes original, single-coded, and double-coded signals. The proposed method extracts a speech-specific feature vector from the given speech and models the feature vector using a deep neural network. We confirm that the proposed feature vector provides better performance in coding history detection than a feature vector computed from the general spectrogram.

On the Frequency Domain Pitch Detection of Noise Corrupted Speech Signals -Minimizing the Effects of the F1 by the Spectral AMDF- (배경잡음하에서 주파수영역 피치검출에 관한 연구 -스펙트럼 AMDF에 의한 제 1포먼트 영향 제거법-)

  • Bae, Myung-Jin;Park, Chan-Sou;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.12-18 / 1991
  • Detecting the fundamental frequency (F0) of the speech signal is a problem in many speech applications. Frequency-domain pitch detection methods are disturbed by the first formant and by background noise. In this paper, we therefore propose a frequency-domain pitch detection algorithm that uses the spectral AMDF to reduce the effects of the first formant and the background noise. Computer simulation results show that the proposed algorithm is very effective for fundamental frequency detection.
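
The average magnitude difference function (AMDF) behind this method can be illustrated on a toy magnitude spectrum. When harmonics are evenly spaced, the AMDF of the spectrum dips at a lag equal to the harmonic spacing, which corresponds to the fundamental frequency in bins. The sketch below is a generic AMDF and not the paper's algorithm: the abstract does not describe its formant and noise handling, and the function names, lag range, and synthetic spectrum are assumptions.

```python
def amdf(values, lag):
    """Average magnitude difference between a sequence and itself shifted by lag."""
    pairs = len(values) - lag
    return sum(abs(values[n] - values[n + lag]) for n in range(pairs)) / pairs


def best_lag(values, min_lag, max_lag):
    """Smallest lag in [min_lag, max_lag] that minimizes the AMDF."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(values, lag))


# Hypothetical magnitude spectrum with a harmonic peak every 5 bins.
spectrum = [1.0 if k % 5 == 0 else 0.0 for k in range(60)]
spacing = best_lag(spectrum, 2, 20)
```

Multiples of the true spacing also give deep minima, so this sketch keeps the smallest minimizing lag. A practical detector would need a more careful rule for noisy spectra.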

A New Least Mean Square Algorithm Using a Running Average Process for Speech Enhancement

  • Lee, Soo-Jeong;Ahn, Chan-Sik;Yun, Jong-Mu;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.25 no.3E / pp.123-130 / 2006
  • The adaptive echo canceller (AEC) has become an important component in speech communication systems, including mobile stations. In these applications, the acoustic echo path has a long impulse response. We propose a running-average least mean square (RALMS) algorithm with a detection method for acoustic echo cancellation. Using colored input models, the results clearly show that the RALMS detection algorithm has convergence performance superior to that of the least mean square (LMS) detection algorithm alone. The computational complexity of the new RALMS algorithm is only slightly greater than that of the standard LMS detection algorithm, but it confers a major improvement in stability.

Reduction of Environmental Background Noise using Speech and Noise Recognition (음성 및 잡음 인식 알고리즘을 이용한 환경 배경잡음의 제거)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.4 / pp.817-822 / 2011
  • This paper first proposes a speech recognition algorithm that detects the speech and noise sections at each frame using a neural network trained with the back-propagation algorithm, and then proposes a spectral subtraction method that removes the noise at each frame according to the detected speech and noise sections. In the experiments, the performance of the proposed recognition system was evaluated by its recognition rate on various speech signals degraded by white noise and car noise. Noise reduction results are also presented for the spectral subtraction method driven by the per-frame speech and noise sections that the recognition algorithm detects. Signal-to-noise ratio measurements confirm that the proposed algorithm is effective for speech corrupted by noise.
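
The second stage described above can be sketched as plain magnitude spectral subtraction driven by a per-frame speech/noise flag. In this sketch the flag is simply passed in, where the paper would obtain it from the neural network detector. The function names, the spectral floor, and the smoothing factor are assumptions, and the abstract does not state the exact subtraction rule used.

```python
def spectral_subtract(noisy_mag, noise_mag, floor=0.01):
    """Subtract the noise magnitude per bin, keeping at least floor * noisy magnitude."""
    return [max(y - n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


def update_noise(noise_mag, frame_mag, is_speech, smoothing=0.9):
    """Refresh the noise estimate only on frames flagged as noise."""
    if is_speech:
        return list(noise_mag)
    return [smoothing * n + (1.0 - smoothing) * f for n, f in zip(noise_mag, frame_mag)]


# Hypothetical three-bin magnitude spectra.
noise_estimate = [0.3, 0.3, 0.3]
cleaned = spectral_subtract([1.0, 0.5, 0.2], noise_estimate)
kept = update_noise(noise_estimate, [1.0, 0.5, 0.2], is_speech=True)
adapted = update_noise(noise_estimate, [0.5, 0.3, 0.1], is_speech=False)
```

The floor keeps a bin from going negative when the noise estimate exceeds the observed magnitude, as in the third bin of this example.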