• Title/Summary/Keyword: Voice Activity Detection

Search Result 103, Processing Time 0.027 seconds

Performance Enhancement of Speech Communication System using Reverberation Rejection (잔향제거를 이용한 음성통신 시스템 성능 향상)

  • Kim, Se-Young;Kang, Suk-Youb;Kim, Ki-Man
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.10
    • /
    • pp.2211-2217
    • /
    • 2009
  • In this paper, we propose the speech enhancement algorithm using an one-microphone in a reverberant room environments. Spectral subtraction is the effective method which can reduce the reverberation element and the noise in a spectrum domain. Spectral subtraction needs correct separation of voice section and silent section therefore to improve the performance, voice activity detection(VAD) based on entropy has been applied to the proposed method. We test a performance of the proposed method by comparing with conventional method which used VAD based on energy detection. Reverberation reduction ratio with variable of SNR and a reverberation time is used as a test index. From the simulation result, proposed method shows performance better than conventional method.

Performance Comparison of CDMA and TDMA protocols in radio access system for Integrated Voice/Data Services (음성 및 데이터서비스를 위한 무선접속시스템에서 CDMA와 TDMA방식의 성능비교)

  • 고종하;양영님;이정규
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.6A
    • /
    • pp.820-831
    • /
    • 1999
  • In this paper, we have compared the performance of a D-TDMA protocol with that of a CDMA protocol, in radio access system for integrated voice/data services.The D-TDMA protocol is based on a generic dynamic channel assignment approach to be followed a combination of “circuit mode” reservation for voice calls, along with dynamic first-come-first served assignment of remaining capacity for data messages. In the CDMA protocol, we have used the voice activity detection to reduce the interface power of other mobiles in internal and external cells, and analyzed the interference power ratio. Also we have computed BER(Bit Error Rate) by using this interference power ratio and evaluated voice blocking probability(voice packet loss probability) and data transmission delay, according to average data length and average data arrival rate.We have found the CDMA protocol achieves comparatively higher performance for short data length, regardless of data arrival rate. Otherwise, the data transmission delay of D-TDMA protocol is shorter than that of the CDMA protocol for long data message.

  • PDF

Distant-talking of Speech Interface for Humanoid Robots (휴머노이드 로봇을 위한 원거리 음성 인터페이스 기술 연구)

  • Lee, Hyub-Woo;Yook, Dong-Suk
    • Proceedings of the KSPS conference
    • /
    • 2007.05a
    • /
    • pp.39-40
    • /
    • 2007
  • For efficient interaction between human and robots, speech interface is a core problem especially in noisy and reverberant conditions. This paper analyzes main issues of spoken language interface for humanoid robots, such as sound source localization, voice activity detection, and speaker recognition.

  • PDF

An Improved Voice Activity Detection Algorithm Employing Speech Enhancement Preprocessing

  • Lee, Yoon-Chang;Ahn, Sang-Sik
    • Proceedings of the IEEK Conference
    • /
    • 2000.07b
    • /
    • pp.865-868
    • /
    • 2000
  • In this paper we derive a new VAD algorithm, which combines the preprocessing algorithm and the optimum decision rule. To improve the performance of the VAD algorithm we employ the speech enhancement algorithm and then apply the maximal ratio combining technique in the preprocessing procedure, which leads to maximized output SNR. Moreover, we also perform extensive computer simulations to demonstrate the performance improvement of the proposed algorithm under various background noise environments.

  • PDF

Dimension Reduction Method of Speech Feature Vector for Real-Time Adaptation of Voice Activity Detection (음성구간 검출기의 실시간 적응화를 위한 음성 특징벡터의 차원 축소 방법)

  • Park Jin-Young;Lee Kwang-Seok;Hur Kang-In
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.7 no.3
    • /
    • pp.116-121
    • /
    • 2006
  • In this paper, we propose the dimension reduction method of multi-dimension speech feature vector for real-time adaptation procedure in various noisy environments. This method which reduces dimensions non-linearly to map the likelihood of speech feature vector and noise feature vector. The LRT(Likelihood Ratio Test) is used for classifying speech and non-speech. The results of implementation are similar to multi-dimensional speech feature vector. The results of speech recognition implementation of detected speech data are also similar to multi-dimensional(10-order dimensional MFCC(Mel-Frequency Cepstral Coefficient)) speech feature vector.

  • PDF

Improvement of VAD Performance for the Reduction of the Bit Rate Under the Noise Environment in the G.723.1 (잡음 환경에서의 전송률 감소를 위한 G.723.1 음성활동 검출기 성능 개선에 관한 연구)

  • 김정진;장경아;배명진
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.3
    • /
    • pp.42-47
    • /
    • 2001
  • This paper improves the performance of VAD (Voice Activity Detector) in G.723.1 Annex A 6.3kbps/5.3kbps dual rate speech coder, which is developed for Internet Phone and videoconferencing. The VAD decision is based on a three-level energy threshold. We evaluates for processing time, speech quality, and bit rate. The processing time is reduced due to the accuracy of VAD decision on the silence period. On subjective quality test there is almost no difference compared with the G.723.1. In order to measure the bit rate we count the active speech frame (VAD=1) and we can reduce more bit rate as silence periods are shown.

  • PDF

Speech Enhancement Algorithm Based on Teager Energy and Speech Absence Probability in Noisy Environments (잡음환경에서 Teager 에너지와 음성부재확률 기반의 음성향상 알고리즘)

  • Park, Yun-Sik;An, Hong-Sub;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.3
    • /
    • pp.81-88
    • /
    • 2012
  • In this paper, we propose a novel speech enhancement algorithm for effective noise suppression in various noisy environments. In the proposed method, to result in improved decision performance for speech and noise segments, local speech absence probability (LSAP, local SAP) based on Teager energy of noisy speech is used as the feature parameter for voice activity detection (VAD) in each frequency subband instead of conventional LSAP. In addition, The presented method utilizes global SAP (GSAP) derived in each frame as the weighting parameter for the modification of the adopted TE operator to improve the performance of TE operator. Performances of the proposed algorithm are evaluated by objective test under various environments and better results compared with the conventional methods are obtained.

Extraction of Unvoiced Consonant Regions from Fluent Korean Speech in Noisy Environments (잡음환경에서 우리말 연속음성의 무성자음 구간 추출 방법)

  • 박정임;하동경;신옥근
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.4
    • /
    • pp.286-292
    • /
    • 2003
  • Voice activity detection (VAD) is a process that separates the noise region from silence or noise region of input speech signal. Since unvoiced consonant signals have very similar characteristics to those of noise signals, it may result in serious distortion of unvoiced consonants, or in erroneous noise estimation to can out VAD without paying special attention on unvoiced consonants. In this paper, we propose a method to extract in an explicit way the boundaries between unvoiced consonant and noise in fluent speech so that more exact VAD could be performed. The proposed method is based on histogram in frequency domain which was successfully used by Hirsch for noise estimation, and a1so on similarity measure of frequency components between adjacent frames, To evaluate the performance of the proposed method, experiments on unvoiced consonant boundary extraction was performed on seven kinds of noisy speech signals of 10 ㏈ and 15 ㏈ SNR respectively.

Implementation of Adaptive Multi Rate (AMR) Vocoder for the Asynchronous IMT-2000 Mobile ASIC (IMT-2000 비동기식 단말기용 ASIC을 위한 적응형 다중 비트율 (AMR) 보코더의 구현)

  • 변경진;최민석;한민수;김경수
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.1
    • /
    • pp.56-61
    • /
    • 2001
  • This paper presents the real-time implementation of an AMR (Adaptive Multi Rate) vocoder which is included in the asynchronous International Mobile Telecommunication (IMT)-2000 mobile ASIC. The implemented AMR vocoder is a multi-rate coder with 8 modes operating at bit rates from 12.2kbps down to 4.75kbps. Not only the encoder and the decoder as basic functions of the vocoder are implemented, but VAD (Voice Activity Detection), SCR (Source Controlled Rate) operation and frame structuring blocks for the system interface are also implemented in this vocoder. The DSP for AMR vocoder implementation is a 16bit fixed-point DSP which is based on the TeakLite core and consists of memory block, serial interface block, register files for the parallel interface with CPU, and interrupt control logic. Through the implementation, we reduce the maximum operating complexity to 24MIPS by efficiently managing the memory structure. The AMR vocoder is verified throughout all the test vectors provided by 3GPP, and stable operation in the real-time testing board is also proved.

  • PDF

Voice Packet Processing Scheme for Voice Quality and Bandwidth Efficiency in VoIP (VoIP의 음성품질/대역효율 개선을 위한 음성패킷 처리)

  • Kim, Jae-Won;Sohn, Dong-Chul
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.7
    • /
    • pp.896-904
    • /
    • 2004
  • In this paper, We present an efficient variable rate speech coder for spectral efficiency and packet processing technique for packet loss compensation of a voice codec with 10msec frame in VoIP service. Through disconnecting the users from the spectral resource during silence interval of about 60% period, a variable rate voice coder based on a voice activity detection(VAD) can increase spectral gain by two times. The performance of the method was analyzed by variation of detected voice activity factor and degraded speech frame ratio under various background noise level, and compared those of G.729B of ITU-T 8kbps standard speech codec. A method to compensate lost packets utilized addition of recovery data to a main stream and error concealment scheme for speech quality enhancement, the performance is verified by reconstructed speech quality. The proposed scheme can achieve spectral gain by two times or enhance speech quality by 3dB through reserved bandwidth of VAD. Therefore, the proposed method can enhance a spectral efficiency or speech quality of VoIP.

  • PDF