• Title/Summary/Keyword: Voice codec

Improvement of Speech/Music Classification Based on RNN in EVS Codec for Hearing Aids (EVS 코덱에서 보청기를 위한 RNN 기반의 음성/음악 분류 성능 향상)

  • Kang, Sang-Ick;Lee, Sang Min
    • Journal of rehabilitation welfare engineering & assistive technology / v.11 no.2 / pp.143-146 / 2017
  • This paper proposes a novel approach to improve speech/music classification in the 3GPP enhanced voice services (EVS) codec for hearing aids using a recurrent neural network (RNN). The feature vectors applied to the RNN are selected from the relevant parameters of the EVS codec for efficient speech/music classification. The performance of the proposed algorithm is evaluated under various conditions on a large speech/music data set. The proposed algorithm yields better results than the conventional scheme implemented in EVS.

Adaptive Multi-Rate(AMR) Speech Coding Algorithm (Adaptive Multi-Rate(AMR) 음성부호화 알고리즘)

  • 서정욱;배건성
    • Proceedings of the IEEK Conference / 2000.06d / pp.92-97 / 2000
  • The AMR (Adaptive Multi-Rate) speech coding algorithm has been adopted as a standard speech codec for IMT-2000. It is based on algebraic CELP and consists of eight speech coding modes with bit rates from 4.75 kbit/s to 12.2 kbit/s. It also includes a VAD (Voice Activity Detector), SCR (Source Controlled Rate) operation, and an error concealment scheme for robustness over a radio channel. The bit rate of AMR changes on a frame basis depending on the channel condition. In this paper, we introduce the AMR speech coding algorithm and present a real-time implementation on the TMS320C6201, a fixed-point DSP from Texas Instruments. Starting from the ANSI C source code released by ETSI and 3GPP, we convert and optimize the program to run in real time using the C compiler and assembly language. We verify that the decoded output of the speech codec implemented on the DSP is identical to the PC simulation result of the ANSI C code for the test sequences. An actual sound input/output test using a microphone and speaker also demonstrates proper real-time operation without distortion or delay.
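The per-frame rate switching described in this abstract can be illustrated with a minimal sketch. The eight rates and the 20 ms frame length are those of narrowband AMR; only the endpoints (4.75 and 12.2 kbit/s) appear in the abstract itself. The `select_mode` function and its linear mapping from a normalized channel-quality value are hypothetical, introduced purely for illustration, and are not the adaptation rule of the standard or of the paper.

```python
# Narrowband AMR codec modes in kbit/s (eight modes, 4.75 to 12.2).
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]
FRAME_MS = 20  # AMR speech frame duration in milliseconds


def bits_per_frame(rate_kbps):
    """Number of speech bits carried by one 20 ms frame at the given rate."""
    return round(rate_kbps * FRAME_MS)


def select_mode(channel_quality):
    """Pick a bit rate for the next frame.

    HYPOTHETICAL rule: channel_quality is a value in [0, 1] and is mapped
    linearly onto the mode table. A better channel leaves less need for
    channel coding, so a higher source rate can be used.
    """
    quality = min(max(channel_quality, 0.0), 1.0)
    index = min(int(quality * len(AMR_MODES_KBPS)), len(AMR_MODES_KBPS) - 1)
    return AMR_MODES_KBPS[index]
```

Because the mode is chosen again for every frame, a sequence of channel-quality estimates yields a sequence of possibly different rates, one per 20 ms frame.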

Voice Packet Processing Scheme for Voice Quality and Bandwidth Efficiency in VoIP (VoIP의 음성품질/대역효율 개선을 위한 음성패킷 처리)

  • Kim, Jae-Won;Sohn, Dong-Chul
    • Journal of Korea Multimedia Society / v.7 no.7 / pp.896-904 / 2004
  • In this paper, we present an efficient variable-rate speech coder for spectral efficiency and a packet processing technique for packet loss compensation, for a voice codec with a 10 ms frame in VoIP service. By disconnecting users from the spectral resource during silence intervals, which make up about 60% of the time, a variable-rate voice coder based on voice activity detection (VAD) can double the spectral gain. The performance of the method was analyzed in terms of the variation of the detected voice activity factor and the degraded speech frame ratio under various background noise levels, and compared with that of G.729B, the ITU-T 8 kbps standard speech codec. Lost packets are compensated by adding recovery data to the main stream and applying an error concealment scheme for speech quality enhancement; the performance is verified by the reconstructed speech quality. The proposed scheme can either double the spectral gain or enhance speech quality by 3 dB using the bandwidth reserved by VAD. The proposed method can therefore enhance either the spectral efficiency or the speech quality of VoIP.
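The relation between voice activity and bandwidth saving in this abstract can be checked with simple arithmetic. The sketch below assumes an idealized model in which active frames use the full codec rate and silence frames use a fixed, lower rate. The function names and the `silence_rate_kbps` parameter are hypothetical and do not come from the paper.

```python
def average_rate_kbps(active_rate_kbps, voice_activity, silence_rate_kbps=0.0):
    """Mean bit rate when only a fraction of the frames carry active speech.

    voice_activity is the fraction of time with speech (0 < value <= 1).
    silence_rate_kbps is a HYPOTHETICAL residual rate spent during silence
    (for example on comfort-noise updates); 0.0 models perfect suppression.
    """
    if not 0.0 < voice_activity <= 1.0:
        raise ValueError("voice_activity must be in (0, 1]")
    silence_fraction = 1.0 - voice_activity
    return voice_activity * active_rate_kbps + silence_fraction * silence_rate_kbps


def bandwidth_gain(active_rate_kbps, voice_activity, silence_rate_kbps=0.0):
    """Ratio of the constant-rate bandwidth to the VAD-based average bandwidth."""
    return active_rate_kbps / average_rate_kbps(
        active_rate_kbps, voice_activity, silence_rate_kbps
    )
```

With an 8 kbps coder and 60% silence (voice activity 0.4), the idealized model gives an average of 3.2 kbps, an upper bound of 2.5 on the gain. The factor of two reported in the abstract lies below this bound.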

QCELP Implementation on TMS320C30 DSP Board (TMS320C30 DSP를 이용한 QCELP Codec의 실현)

  • Han, Kyong-Ho
    • The Journal of the Acoustical Society of Korea / v.14 no.1E / pp.83-87 / 1995
  • A voice codec is implemented on the TMS320C30, a floating-point DSP chip from Texas Instruments. QCELP (Qualcomm Code Excited Linear Prediction) is used to encode and decode the voice. The QCELP code is implemented in C for the TMS320C30. The DSP board is controlled by a PC. The PC program, which is also implemented in C, transfers the voice file to and from the DSP board. The voice is encoded by the DSP board, and the encoded data is transferred to the PC to be stored as a file. To hear the voice, the voice data file is sent to the DSP board and decoded to synthesize audible voice. Two flags are used by both programs to signal the status of the operation. By checking the flags, the DSP and the PC decide when voice data is transferred between them.

Effects of communication environment on VoIP capacity using WiFi (통신환경이 WiFi를 이용한 VoIP 서비스 용량에 미치는 영향)

  • Choi, Dae-Woo
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.6 / pp.1327-1332 / 2015
  • In this paper, we study several factors that affect the quality of VoIP over a WiFi network. The background data traffic within an AP, the end-to-end delay, and the traffic loss of the TCP/IP network clearly have a serious effect on voice quality. Some form of access control for VoIP connections within an AP is needed to maintain acceptable voice quality.

A Study on Channel Decoder MAP Estimation Based on H.264 Syntax Rule (H-264 동영상 압축의 문법적 제한요소를 이용한 MAP기반의 Channel Decoder 성능 향상에 대한 연구)

  • Jeon, Yong-Jin;Seo, Dong-Wan;Choe, Yun-Sik
    • Proceedings of the KIEE Conference / 2003.11b / pp.295-298 / 2003
  • In this paper, a novel maximum a posteriori (MAP) estimation method for the channel decoding of H.264 codes in the presence of transmission errors is presented. Arithmetic codes with a forbidden symbol and trellis search techniques are employed to estimate the best transmitted sequence. With the growing interest in communication, research on exact data transmission is increasing. Unlike in voice transmission, noise has a fatal effect on image transmission, because video coding standards use variable length coding: a single bit error affects all of the compressed video data up to the next resynchronization point. For this reason, the channel needs a channel codec that is robust to channel errors. A conventional channel decoder, however, corrects errors using only the channel error probability. Therefore, instead of designing the source codec and channel codec separately, attempts have been made to design them jointly, and many studies have used the source redundancy information in the received data. These methods, however, do not match the video coding standards, because the standards use not just one symbol but many symbols in the same data sequence. In this thesis, we design a combined source-channel codec that is compatible with video coding standards. The proposed MAP decoder adds the semantic structure and semantic constraints of the video coding standards to the redundancy-based method of previously proposed MAP decoders. As a result, we obtain better performance than a conventional channel coder.

Active Buffer Management Algorithm for Voice Communication System with Silence Suppression (무음 압축을 이용하는 음성 통신 시스템을 위한 동적 버퍼 관리 알고리즘)

  • Lee, Sung-Hyung;Lee, Hyun-Jin;Kim, Jae-Hyun;Lee, Hyung-Joo;Hoh, Mi-Jeong;Choi, Jeung-Won;Shin, Sang-Heon;Kim, Tae-Wan
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.7B / pp.528-535 / 2012
  • This paper proposes a silence drop first (SDF) active buffer management algorithm to increase voice capacity when silence suppression is used. To resolve buffer overflow, the algorithm finds and drops a silence packet in the queue rather than a voice packet. Simulations are performed with the G.729A and G.711 voice codecs. With the proposed SDF algorithm, voice capacity increases by 84.21% with G.729A and by 38.46% with G.711. Furthermore, compared with conventional algorithms, the SDF algorithm reduces the required link capacity and loosens the silence packet inter-arrival time limit needed to provide the target voice quality.
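The drop policy described in this abstract can be sketched as a bounded queue that, on overflow, evicts a queued silence packet before it rejects a voice packet. The abstract does not say which silence packet is chosen or what happens when none is queued. Dropping the oldest silence packet and falling back to dropping the arriving packet are assumptions of this sketch, and the `Packet` and `SilenceDropFirstQueue` names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    is_silence: bool  # True for a silence packet, False for voice


class SilenceDropFirstQueue:
    """Bounded FIFO that prefers to drop silence packets on overflow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.dropped = []  # record of dropped packets, for inspection

    def enqueue(self, packet):
        if len(self.packets) < self.capacity:
            self.packets.append(packet)
            return
        # Overflow: look for the oldest silence packet already queued
        # (ASSUMPTION: the oldest one is the victim).
        victim = next((p for p in self.packets if p.is_silence), None)
        if victim is not None:
            self.packets.remove(victim)
            self.dropped.append(victim)
            self.packets.append(packet)
        else:
            # ASSUMPTION: with no silence packet queued, drop the arrival.
            self.dropped.append(packet)

    def dequeue(self):
        return self.packets.popleft() if self.packets else None
```

A plain tail-drop queue would lose the arriving packet on every overflow, voice or not. Here a voice packet is lost only when the queue holds no silence packet at all.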

Comparison of Noise Suppression Methods in Voice CODEC (음성코덱에서의 잡음제거 방식 비교)

  • Lee, Jin-Geol
    • The Journal of Engineering Research / v.3 no.1 / pp.43-46 / 1998
  • Considerable research over the last three decades has examined the problem of enhancing speech degraded by additive background noise. We compare traditional methods, such as spectral subtraction and the Wiener filter, and recently proposed psychoacoustic-model-based methods, such as the perceptual filter and the noise suppression in EVRC, in terms of performance and complexity.

Improvement of Packet Loss Concealment Algorithm by Using state gain control and fixed codebook estimation (상태별 이득 제어 및 fixed codebook estimation을 이용한 G.729에서의 Packet Loss Concealment 알고리즘 개선)

  • Moon Kwang;Hahn Minsoo
    • Proceedings of the KSPS conference / 2003.10a / pp.109-112 / 2003
  • In real-time packetized voice applications, missing frames are a major source of voice quality degradation. Packet loss concealment (PLC) algorithms are therefore needed to guarantee the QoS of VoIP. Current speech codecs for VoIP still perform poorly when consecutive packet losses occur. In this paper, we propose a new PLC algorithm for the G.729 codec. Our algorithm performs better especially under consecutive packet loss, mainly because it adopts an adaptive gain controller that uses the number of missing packets, combined with a fixed codebook vector estimation algorithm and LPC bandwidth expansion.
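The idea of a gain controller driven by the number of missing packets can be illustrated with a minimal sketch. The abstract does not give the actual control law, so the exponential attenuation, the `decay` value, and the function name below are hypothetical placeholders rather than the authors' method.

```python
def concealment_gain(last_good_gain, consecutive_losses, decay=0.9, floor=0.0):
    """Gain applied to the concealed excitation of a lost frame.

    HYPOTHETICAL control law: the gain of the last correctly received frame
    is attenuated once per consecutive lost frame, so a long burst of losses
    fades toward `floor` instead of repeating the last frame at full level.
    """
    if consecutive_losses <= 0:
        return last_good_gain  # frame received, no attenuation
    attenuated = last_good_gain * decay ** consecutive_losses
    return max(attenuated, floor)
```

With the placeholder `decay` of 0.9, the gain falls to about 0.81 of its last good value after two consecutive losses and keeps shrinking as the burst continues.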

Design and Implementation of a Bluetooth Baseband Module (블루투스 기저대역 모듈의 설계 및 구현)

  • 천익재;오종환;임지숙;김보관
    • Proceedings of the IEEK Conference / 2001.06a / pp.21-24 / 2001
  • Bluetooth wireless technology is a publicly available specification for short-range, point-to-multipoint Radio Frequency (RF) voice and data transfer. It operates in the 2.4 GHz ISM (Industrial, Scientific and Medical) band and offers the potential for low-cost, broadband wireless access for various mobile and portable devices at a range of about 10 meters. In this paper, we describe the structure and test results of the Bluetooth baseband module we have developed. The module has a UART interface for HCI and an audio codec for voice. The interface between the controller and this module supports a common control interface. An FPGA implementation of the module is tested with file and bit-stream transfers between PCs.
