• Title/Summary/Keyword: 음성검출기 (speech detector)


Speech Feature based Double-talk Detector for Acoustic Echo Cancellation (반향제거를 위한 음성특징 기반의 동시통화 검출 기법)

  • Park, Jun-Eun;Lee, Yoon-Jae;Kim, Ki-Hyeon;Ko, Han-Seok
    • Journal of IKEEE / v.13 no.2 / pp.132-139 / 2009
  • In this paper, a speech-feature-based double-talk detection method is proposed for acoustic echo cancellation in hands-free communication systems. The double-talk detector is an important element because it controls the update of the adaptive filter used for acoustic echo cancellation. In previous research, double-talk detection was treated at the signal-processing stage without taking speech characteristics into account. In the proposed method, by contrast, speech features used for speech recognition serve as discriminative features between far-end and near-end speech. We obtained a substantial improvement over previous double-talk detection methods that use only the time-domain signal.


Hybrid Companding Delta Modulation with Silence Detection (묵음 검출 기능을 사용한 하이브리드 압신 델타 변조기)

  • 조동호;은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.19 no.6 / pp.84-90 / 1982
  • In this paper we exploit the intermittent nature of speech to reduce the transmission rate or to increase the signal-to-quantization noise ratio (SQNR) when coding speech by hybrid companding delta modulation (HCDM). In this scheme, silence in speech is detected by a speech/silence discriminator, and HCDM coding is done only for the speech portions. For silence, which is detected in every block of 5 ms, only the information indicating that the block is silent is transmitted. Since the HCDM coder transmits a binary signal synchronously at a fixed rate, the use of a buffer and its efficient control is essential. By using HCDM with silence detection in coding speech, we could improve SQNR by as much as 6 dB over conventional HCDM, or reduce the transmission rate by one third of the HCDM rate.


A Study on the Pulse-Train Code Excited Linear Prediction Coder: PT-CELP (Pulse-Train code 여기 선형 예측 (PT-CELP) 부호화기에 관한 연구)

  • 김흥국
    • Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.246-249 / 1995
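  • This paper describes the structure of a speech coder with a transmission rate of 4.16 kbps. The proposed coder is a CELP coder that has an open-loop pitch detector and uses the pulse train generated from it as its codebook. The pulse-train codebook is generated per analysis frame at both the encoder and the decoder, and it carries the pitch and formant information of the speech. Compared with random-codebook CELP, the implemented PT-CELP can build its codebook at a smaller size, and because the codebook sufficiently reflects the characteristics of the speech, it can improve the quality of the synthesized speech.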


Performance Improvement of Double-talk Detector Using Normalized Error Signal Power (정규화된 오차신호 전력을 이용한 동시통화 검출기의 성능 개선)

  • Heo, Won-Chul;Bae, Keun-Sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.5C / pp.478-486 / 2007
  • Double-talk detection errors can result in either large residual echo or distortion of the near-end talker's input speech. Accurate double-talk detection is therefore important for improving speech quality in an acoustic echo canceller. In double-talk detection algorithms that use a cross-correlation coefficient, detection errors can occur during the initial convergence period of the adaptive filter or in a noisy environment, since the cross-correlation coefficient becomes large in such situations. In this paper, we propose a new double-talk detection algorithm, based on the cross-correlation method, that uses a normalized error signal power to reduce double-talk detection errors. The experimental results show improved performance of the acoustic echo canceller as well as noise robustness of the proposed double-talk detector.

Double Talk Detection before the Convergence of Echo Canceller (반향제거기의 수렴전 동시통화검출)

  • Yoo, Jae-Ha;Kim, Soo-Chan;Kim, Dong-Yon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.5 / pp.203-208 / 2013
  • In this paper, we propose a method for improving the performance of a double-talk detector that can operate before the echo canceller converges. The microphone input signal is filtered by a linear prediction filter, and this filtered signal is used for detection. The coefficients of the linear prediction filter are obtained from the far-end talker signal. During single talk, the filtered signal has low power, since the characteristics of the echo signal are similar to those of the far-end talker signal. During double talk, however, the filtered signal does not have low power, because a signal with different characteristics is included in the microphone signal. Double talk is detected from this difference. Simulations using real speech signals verified that the proposed method outperforms the conventional methods.
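
The detection idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prediction order, the normalization of the residual power by the microphone power, and the threshold value are all assumptions introduced here, and the function names are hypothetical.

```python
def autocorr(x, order):
    """Biased autocorrelation r[0..order] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]

def lpc(x, order):
    """Levinson-Durbin recursion. Returns the prediction-error filter
    [1, a1, ..., ap] estimated from x."""
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0] + 1e-12  # small guard against an all-zero frame
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a

def fir(a, x):
    """Apply the FIR filter a to x with zero initial state."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def power(x):
    return sum(v * v for v in x) / len(x)

def residual_ratio(far_end, mic, order=10):
    """Power of the microphone signal after filtering with the far-end
    prediction-error filter, relative to the unfiltered microphone power.
    The normalization is an assumption of this sketch."""
    a = lpc(far_end, order)
    return power(fir(a, mic)) / (power(mic) + 1e-12)

def is_double_talk(far_end, mic, order=10, threshold=0.5):
    """Declare double talk when the filtered signal does not have low power.
    The threshold 0.5 is a placeholder, not a value from the paper."""
    return residual_ratio(far_end, mic, order) > threshold
```

Under these assumptions, a microphone frame containing only echo (a scaled, delayed copy of the far-end signal) yields a small ratio, while adding a near-end signal with a different spectrum raises it.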

Fundamental Frequency Estimation of Voiced Speech Signals Based on the Inflection Point Detection (변곡점 검출에 기반한 음성의 기본 주파수 추정)

  • Byeonggwan Iem
    • Journal of IKEEE / v.27 no.4 / pp.472-476 / 2023
  • The fundamental frequency and the pitch period are major characteristics of speech signals. They are used in many speech applications such as speech coding, speech recognition, and speaker identification. In this paper, a subset of the inflection points is used to estimate the pitch period, which is the inverse of the fundamental frequency. The inflection points are defined as points where local maxima, local minima, or slope changes occur. The speech signal is preprocessed with a low-pass filter to remove unnecessary inflection points caused by high-frequency components. Only the inflection points at local maxima are used to obtain the pitch period. While existing pitch estimation methods process speech signals block by block, the proposed method detects the inflection points sample by sample and produces pitch period/fundamental frequency estimates over time. Computer simulation shows the usefulness of the proposed method as a fundamental frequency estimator.
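
The steps described in this abstract (low-pass filtering, picking local maxima, converting the spacing between them into a frequency) can be sketched as below. This is an illustrative sketch only: the moving-average filter, the relative height threshold used to select maxima, and all function names are assumptions, since the abstract does not specify the filter or the selection rule.

```python
def moving_average(x, width=5):
    """Centered moving average, used here as a stand-in low-pass filter.
    Windows are truncated at the signal edges."""
    half = width // 2
    out = []
    for n in range(len(x)):
        lo, hi = max(0, n - half), min(len(x), n + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def local_maxima(x):
    """Indices n where x rises into n and does not rise after it."""
    return [n for n in range(1, len(x) - 1) if x[n - 1] < x[n] >= x[n + 1]]

def pitch_track(x, fs, width=5, rel_height=0.5):
    """Return (sample_index, f0_estimate_hz) pairs, one per pair of
    consecutive retained maxima. rel_height is a hypothetical selection
    rule: only maxima above rel_height * max(filtered signal) are kept."""
    y = moving_average(x, width)
    floor = rel_height * max(y)
    peaks = [n for n in local_maxima(y) if y[n] > floor]
    return [(peaks[i], fs / (peaks[i] - peaks[i - 1]))
            for i in range(1, len(peaks))]
```

Each estimate is attached to the sample index of the later peak, so the output is a track of fundamental frequency over time rather than one value per block.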

Performance Improvement of Double Talk Detection before Convergence of the Echo Canceller by Using Linear Predictive Coding Filter Gain of the Primary Input Signal (주입력신호의 LPC 필터 이득을 이용한 반향제거기의 수렴전 동시통화검출 성능 개선)

  • Yoo, Jae-Ha
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.628-633 / 2014
  • This paper proposes a method for improving the performance of the conventional double-talk detection method that can operate before the echo canceller converges. The proposed method estimates the coefficients of the linear predictive coding (LPC) filter from the primary input signal. The time-varying threshold for double-talk detection is determined from the LPC filter gain of the primary input signal. The proposed method reduces both the false detection rate (single talk wrongly detected as double talk) and the double-talk detection delay. Computer simulation was performed using long-term real speech signals. The results show that the proposed method improves on the conventional method by lowering the false detection rate and shortening the detection delay.

Unknown Word Extractor Development, for ETRI Broadcast News Caption System (ETRI 방송 뉴스 자막 처리 시스템을 위한 미등록어 검출기의 개발)

  • Yun Seung;Jung Eui-Jung;Park Jun;Lee Youngjik
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.163-166 / 2002
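  • This paper describes an unknown-word extractor developed to improve the performance of the ETRI broadcast news caption system. Unknown words are regarded as one of the major factors that degrade speech recognition performance, and to address this problem the ETRI broadcast news caption system adopts an unknown-word extractor that operates offline. Before the caption system is started, the extractor collects the latest newspaper articles and broadcast news over the Internet. Based on this material, it extracts unknown words in advance in two stages and adds them to the recognition vocabulary, thereby addressing the degradation in broadcast news recognition performance caused by unknown words.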


Dimension Reduction Method of Speech Feature Vector for Real-Time Adaptation of Voice Activity Detection (음성구간 검출기의 실시간 적응화를 위한 음성 특징벡터의 차원 축소 방법)

  • Park Jin-Young;Lee Kwang-Seok;Hur Kang-In
    • Journal of the Institute of Convergence Signal Processing / v.7 no.3 / pp.116-121 / 2006
  • In this paper, we propose a dimension reduction method for multi-dimensional speech feature vectors, intended for real-time adaptation of voice activity detection in various noisy environments. The method reduces the dimension non-linearly by mapping onto the likelihoods of the speech feature vector and the noise feature vector. The LRT (Likelihood Ratio Test) is used to classify speech and non-speech. The detection results are similar to those obtained with the multi-dimensional speech feature vector. The speech recognition results on the detected speech data are also similar to those obtained with the multi-dimensional speech feature vector (10th-order MFCC (Mel-Frequency Cepstral Coefficient)).
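
As a rough illustration of a likelihood ratio test on a reduced feature, the sketch below maps a feature vector to two log-likelihoods and compares them. The diagonal-Gaussian models, the zero threshold, and the function names are assumptions made for this example; the abstract does not state which models or mapping the authors use.

```python
import math

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def reduce_to_likelihoods(x, speech_model, noise_model):
    """Map a D-dimensional feature vector to a 2-dimensional vector of
    (speech log-likelihood, noise log-likelihood). Each model is a
    (mean, var) pair; this mapping is an assumed form of the reduction."""
    return (diag_gauss_loglik(x, *speech_model),
            diag_gauss_loglik(x, *noise_model))

def is_speech(x, speech_model, noise_model, threshold=0.0):
    """Likelihood ratio test in the log domain: speech if the
    log-likelihood ratio exceeds the (placeholder) threshold."""
    ll_speech, ll_noise = reduce_to_likelihoods(x, speech_model, noise_model)
    return (ll_speech - ll_noise) > threshold
```

With this form, adapting to a new noise environment would only require updating the noise model parameters, while the decision itself stays a comparison in two dimensions.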
