• Title/Summary/Keyword: sound quality evaluation (음질평가)


An ACLMS-MPC Coding Method Integrated with ACFBD-MPC and LMS-MPC at 8kbps bit rate. (8kbps 비트율을 갖는 ACFBD-MPC와 LMS-MPC를 통합한 ACLMS-MPC 부호화 방식)

  • Lee, See-woo
    • Journal of Internet Computing and Services
    • /
    • v.19 no.6
    • /
    • pp.1-7
    • /
    • 2018
  • This paper presents an 8 kbps ACLMS-MPC (Amplitude Compensation and Least Mean Square - Multi Pulse Coding) method that integrates ACFBD-MPC (Amplitude Compensation Frequency Band Division - Multi Pulse Coding) and LMS-MPC (Least Mean Square - Multi Pulse Coding). To reduce distortion of the synthesized waveform, it uses V/UV/S (Voiced / Unvoiced / Silence) switching, compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced sound using specific frequencies. When integrating several methods, it is important to bring the bit rate of the voiced and unvoiced sound sources to 8 kbps while reducing distortion of the speech waveform. At that bit rate, the speech waveform can be synthesized efficiently by restoring the individual pitch intervals from the multi-pulses of a representative interval. I implemented the ACLMS-MPC method and evaluated its SNR under an 8 kbps coding condition. The SNR of ACLMS-MPC was 15.0 dB for female voice and 14.3 dB for male voice. Compared with the existing MPC, ACFBD-MPC and LMS-MPC methods, ACLMS-MPC improved the SNR by 0.3 dB to 1.8 dB for male voice and by 0.3 dB to 1.6 dB for female voice. These methods are expected to be applicable to low-bit-rate speech coding with sound sources, as used in cellular phones and Internet phones. In future work, I will evaluate the sound quality of a 6.9 kbps speech coding method that simultaneously compensates the amplitude and position of the multi-pulse source.

Improving a Sound Localization Using 1/3-octave Band Pass Filter (1/3-옥타브 대역통과필터를 이용한 음상정위기법 성능 향상)

  • Hwang, Shin;Yang, Jin-Woo;Cheung, Wan-Sup;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.3
    • /
    • pp.98-103
    • /
    • 2001
  • The human binaural auditory system can differentiate the direction and distance of sound sources. This capability is well characterised by the inter-aural intensity difference (IID), the inter-aural time difference (ITD) and/or the spectral shape difference (SSD) arising from the acoustic transfer from a sound source to the outer ears. This paper proposes an effective way of extracting the three sound perception factors (IID, ITD, SSD) from the head-related transfer functions (HRTFs), which depend on the direction and distance of the acoustic source from the listener. It includes a method for estimating the equivalent ITD and 1/3-octave band-based IID factors, and their use in locating a sound source in space. Subjective and objective tests were carried out to examine the effectiveness of the proposed methodology and its applicability to real sound systems. The experimental results are presented in this paper.
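As a rough illustration of the two simplest cues named in this abstract, the sketch below estimates an ITD as the lag that maximizes the cross-correlation of a left/right impulse-response pair, and an IID as a broadband energy ratio. It is not the paper's estimation method: the function names, the toy impulse responses and the 48 kHz sampling rate are assumptions, and the IID here is broadband rather than computed per 1/3-octave band. The `third_octave_band` helper only shows the standard base-2 band edges around a 1 kHz reference.

```python
import math

def estimate_itd_samples(left, right):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means the right-ear response arrives later than the left.
    """
    n = len(left)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-(n - 1), n):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < len(right):
                acc += left[i] * right[j]
        if acc > best_val:
            best_lag, best_val = lag, acc
    return best_lag

def iid_db(left, right):
    """Broadband level difference of left relative to right, in dB."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)

def third_octave_band(k):
    """Lower edge, center and upper edge (Hz) of the k-th 1/3-octave band,
    counted from a 1 kHz reference (base-2 definition)."""
    center = 1000.0 * 2.0 ** (k / 3.0)
    return center * 2.0 ** (-1.0 / 6.0), center, center * 2.0 ** (1.0 / 6.0)

# Toy impulse responses (hypothetical, not measured HRTF data):
# the right ear receives a weaker copy two samples later.
left_ir = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
right_ir = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
fs = 48000  # assumed sampling rate in Hz

lag = estimate_itd_samples(left_ir, right_ir)
itd_seconds = lag / fs
level_difference = iid_db(left_ir, right_ir)
```

A per-band IID would apply the same energy ratio after band-pass filtering each ear signal to the edges returned by `third_octave_band`.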


Enhancement of Super-wideband Coder by Considering Audio Feature in MDCT Domain (MDCT 도메인에서 오디오 신호 특징을 고려한 초광대역 코덱 개선)

  • Hong, Ki-Bong;Jeong, Gyu-Hyeok;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.5
    • /
    • pp.129-136
    • /
    • 2011
  • This paper presents a multi-mode coding method that improves the efficiency of audio codecs by using features of the audio signal. The recently developed super-wideband extension codec based on the G.718 wideband codec operates in two modes, Generic and Sinusoidal, which lets it encode audio signals in the super-wideband efficiently. However, the codec is less efficient at coding the harmonic components of wind and string instruments and the individual-line components of percussion instruments. The proposed method models and encodes multiple-pitch and individual-line features using multi-mode coding. For performance evaluation, we used SNR in the MDCT domain as the objective test and the MUSHRA test as the subjective test. The proposed method outperformed the G.718 super-wideband codec in both SNR and MUSHRA score.

An IP Based Transcript System in VoIP Network (VoIP망에서 IP기반 녹취 시스템 설계 및 구현)

  • Son Min-ho;Kim Soo-hee;Kim Young-ung;Jung In-hwan
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.11a
    • /
    • pp.898-900
    • /
    • 2005
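  • With the expansion of high-speed communication networks and the rapid growth of the Internet, efforts are being made to integrate voice, video, and data. VoIP (Voice over IP) is a technology that uses IP to integrate voice and data in packet form and transmit them in real time [1]. Using VoIP signaling technology in a packet network enables efficient use of network resources, voice quality close to that of the PSTN, and support for a variety of Internet-linked voice services (including support for various signaling protocols such as H.323, SIP, and MGCP). In this paper, we design and implement an IP-based recording system for VoIP networks. The recording system automatically records and stores conversations between customers and agents so that customer requirements can be identified clearly. It supports efficient management by providing statistics on the recorded data, and it supports qualitative improvement of customer management by providing selective recording, scheduled recording, and agent evaluation data. The recording system in this paper records conversations with customers and stores and manages them in a recording DB on the server. It can be used in any network environment, and integrating it with CTI makes it possible to build an efficient and systematic recording system.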


A Study on LMS-MPC Method Considering Low Bit Rate (Low Bit Rate을 고려한 LMS-MPC 방식에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence
    • /
    • v.10 no.5
    • /
    • pp.233-238
    • /
    • 2012
  • In a speech coding system that uses voiced and unvoiced excitation sources, the speech waveform is distorted when voiced and unvoiced consonants coexist in a frame. To solve this problem, this paper presents an LMS-MPC method that uses individual pitch and LMS (Least Mean Square). I evaluated MPC and LMS-MPC. The SNRseg of LMS-MPC improved by 1.5 dB for female voice and 1.3 dB for male voice. Because the SNRseg of LMS-MPC improved over that of MPC, I was able to control the distortion of the speech waveform. I therefore expect this method to be applicable to cellular phones and smartphones that use a low-bit-rate excitation source.
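Several abstracts in this listing report SNR or segmental SNR (SNRseg) figures. As a minimal sketch of how the two measures are commonly computed, the code below compares a reference waveform with a coded one. The frame length, the rule of skipping frames with zero signal or zero error energy, and the toy data are assumptions, not details taken from these papers.

```python
import math

def snr_db(ref, test):
    """Overall SNR in dB: total signal energy over total error energy.

    Assumes ref and test differ somewhere (non-zero error energy).
    """
    signal = sum(x * x for x in ref)
    noise = sum((x - y) ** 2 for x, y in zip(ref, test))
    return 10.0 * math.log10(signal / noise)

def snrseg_db(ref, test, frame_len):
    """Segmental SNR in dB: mean of per-frame SNRs over whole frames.

    Frames with zero signal or zero error energy are skipped (an assumption;
    practical evaluations often clamp per-frame values instead).
    """
    values = []
    for start in range(0, len(ref) - frame_len + 1, frame_len):
        r = ref[start:start + frame_len]
        t = test[start:start + frame_len]
        signal = sum(x * x for x in r)
        noise = sum((x - y) ** 2 for x, y in zip(r, t))
        if signal > 0.0 and noise > 0.0:
            values.append(10.0 * math.log10(signal / noise))
    if not values:
        raise ValueError("no frame with measurable signal and error energy")
    return sum(values) / len(values)

# Toy data (hypothetical): the first frame is coded less accurately
# than the second.
reference = [1.0] * 8
coded = [0.9] * 4 + [0.99] * 4

overall = snr_db(reference, coded)
segmental = snrseg_db(reference, coded, frame_len=4)
```

In the toy data the per-frame SNRs are about 20 dB and 40 dB, so SNRseg is about 30 dB while the overall SNR is about 23 dB. The overall figure is dominated by the frame with the largest error, which is one reason segmental SNR is often reported alongside it.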

Development of Text-to-Speech System for PC (PC용 Text-to-Speech 시스템 개발)

  • Choi Muyeol;Hwang Cholgyu;Kim Soontae;Kim Junggon;Yi Sopae;Jang Seokbok;Pyo Kyungnan;Ahn Hyesun;Kim Hyung Soon
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.41-44
    • /
    • 1999
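  • In this paper, we developed a high-quality Korean text-to-speech (TTS) synthesis system for PC applications. As the synthesis method, we adopted a sinusoidal-model-based approach, which has advantages over the conventional PSOLA method in pitch control, in handling the joins between adjacent sounds, and in timbre control. For natural prosody modeling, we used the classification and regression tree (CART) method, a statistical technique. To reduce discontinuities at phoneme boundaries, we used onset-nucleus (initial consonant plus vowel) units and coda (final consonant) units as synthesis units, and we provided a timbre control function so that a variety of voice timbres can be expressed. By implementing the system as a TTS engine that conforms to the standard Speech Application Program Interface (SAPI), we made application development on the PC more convenient. Listening tests of the synthesized speech confirmed the excellent sound quality and the effectiveness of the timbre control function.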


Artificial Bandwidth Extension Based on Harmonic Structure Extension and NMF (하모닉 구조 확장과 NMF 기반의 인공 대역 확장 기술)

  • Kim, Kijun;Park, Hochong
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.12
    • /
    • pp.197-204
    • /
    • 2013
  • In this paper, we propose a new method for artificial bandwidth extension of narrow-band signals in the frequency domain. In the proposed method, a narrow-band signal is decomposed into an excitation signal and a spectral envelope, which are extended independently in the frequency domain. The excitation signal is extended such that the low-band harmonic structure is maintained in the high band, and the spectral envelope is extended based on sub-band energy using NMF. Finally, the spectral phase is determined based on signal correlation between frames in the time domain, resulting in the final wide-band signal. The subjective evaluation verified that the wide-band signal generated by the proposed method has higher quality than the original narrow-band signal.

The Assessment on the Sound Quality of Reduced Frequency Selectivity of Hearing Impaired People (난청인의 주파수 선택도 둔화현상이 음질에 미치는 영향 평가)

  • An, Hong-Sub;Park, Gyu-Seok;Jeon, Yu-Yong;Song, Young-Rok;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.60 no.6
    • /
    • pp.1196-1203
    • /
    • 2011
  • Reduced frequency selectivity is a typical phenomenon of sensorineural hearing loss. In this paper, we compared two methods for modeling the reduced frequency selectivity of hearing-impaired people. The two models were built using an LPC (linear prediction coding) algorithm and a bandwidth control algorithm based on the ERB (equivalent rectangular bandwidth) of the auditory filter, respectively. To compare the effectiveness of the two models, we compared PESQ (perceptual evaluation of speech quality) and LLR (log likelihood ratio) results using 36 two-syllable Korean words. To verify the effect under noisy conditions, we mixed white noise and babble noise into the speech words at SNRs of 0 dB and -3 dB. The results confirmed that the PESQ score of the bandwidth control algorithm is higher than that of the LPC algorithm, while the LLR score of the LPC algorithm is lower than that of the bandwidth control algorithm. This means that both the non-linearity and the widened auditory filter characteristics caused by reduced frequency selectivity are reflected better by the bandwidth control algorithm than by the LPC algorithm.
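Mixing noise into speech at a target SNR, as in the 0 dB and -3 dB conditions above, comes down to scaling the noise so that the ratio of average powers matches the target. The sketch below shows that scaling under simple assumptions (equal-length sequences, power measured over the whole sequence). The function name and toy data are hypothetical, and the paper's actual mixing procedure is not described in the abstract.

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / scaled-noise power equals the
    target SNR (in dB), then add it to `speech`.

    Returns the mixed signal and the gain applied to the noise.
    """
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    mixed = [s + gain * n for s, n in zip(speech, noise)]
    return mixed, gain

# Toy sequences (hypothetical, standing in for a speech word and a noise clip).
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]

mixed_0db, gain_0db = mix_at_snr(speech, noise, 0.0)
mixed_m3db, gain_m3db = mix_at_snr(speech, noise, -3.0)
```

At 0 dB the scaled noise has the same average power as the speech. At -3 dB the noise power is roughly twice the speech power.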

Robust, Low Delay Multi-tree Speech Coding at 9.6Kbits/sec (견실, 저지연 멀티트리 9.6Kbits/s 음성부호기에 관한 연구)

  • 우홍체;문병현;이채욱
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.3
    • /
    • pp.348-354
    • /
    • 1993
  • In this research, a multi-tree coder operating at 9.6 kbit/s is developed using a novel scheme for adapting the short-term coefficients. The overall delay of the tree coder is kept at 2.5 ms (16 samples at the 6.4 kHz sampling frequency). The coder produces good-quality speech over ideal channels, and it is very robust to channel errors up to a bit error rate (BER) of $10^{-3}$. This robustness is achieved by combining a parallel adaptation scheme with the use of a smoothed version of the received excitation sequence for adapting the short-term prediction coefficients. The reconstructed output speech of the multi-tree coder is evaluated using signal-to-quantization-noise ratios (SNR), segmental SNRs, and informal listening tests.
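The figures quoted in this abstract are mutually consistent, as a short check shows. The symbols $T_{\text{delay}}$, $R$ and $f_s$ are introduced here for illustration only, and the last line is simply the average number of bit errors per second implied by the stated bit rate and BER.

$$
\begin{aligned}
T_{\text{delay}} &= \frac{16\ \text{samples}}{f_s} = \frac{16}{6400\ \text{Hz}} = 2.5\ \text{ms},\\
\frac{R}{f_s} &= \frac{9600\ \text{bit/s}}{6400\ \text{samples/s}} = 1.5\ \text{bits per sample},\\
R \times \text{BER} &= 9600\ \text{bit/s} \times 10^{-3} = 9.6\ \text{bit errors per second on average}.
\end{aligned}
$$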


Enhanced Amplitude Panning for Virtual Source Imaging (가상 음원 이미징을 위한 향상된 진폭 패닝 기법)

  • Hyun, Dong-Il;Park, Young-Cheol;Youn, Dae Hee
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.3
    • /
    • pp.139-145
    • /
    • 2013
  • In this paper, the problems of the conventional amplitude panning method for a stereophonic panning system are analyzed. We observed that the distortion showed a feedforward comb filter response. As a remedy for this distortion, we propose a stereophonic panning system that uses a feedback comb filter. The comb filter is designed to minimize the difference between the interaural level difference (ILD) of the proposed system and that of the HRTF, because the ILD is the most salient cue for perceiving the source direction. The proposed system is configured to operate selectively on the frequency band related to the source direction. The performance of the proposed system is verified by subjective listening tests.
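The relationship between the two comb filter types mentioned above can be shown with a small sketch. A feedforward comb $y[n] = x[n] + g\,x[n-D]$ is undone exactly by a feedback comb $y[n] = x[n] - g\,y[n-D]$ that uses the same gain and delay. This only illustrates why a feedback comb is a natural remedy for feedforward comb coloration. It is not the paper's design, which fits the filter to the HRTF's ILD, and the delay, gain and input values below are arbitrary assumptions.

```python
def feedforward_comb(x, delay, gain):
    """y[n] = x[n] + gain * x[n - delay] (samples before n = 0 are zero)."""
    return [
        x[n] + (gain * x[n - delay] if n >= delay else 0.0)
        for n in range(len(x))
    ]

def feedback_comb(x, delay, gain):
    """y[n] = x[n] - gain * y[n - delay] (samples before n = 0 are zero)."""
    y = []
    for n in range(len(x)):
        fed_back = gain * y[n - delay] if n >= delay else 0.0
        y.append(x[n] - fed_back)
    return y

# Arbitrary test signal, delay and gain (hypothetical values).
source = [1.0, 0.5, -0.25, 0.0, 0.75, 0.0, 0.0, 0.0]
colored = feedforward_comb(source, delay=3, gain=0.6)
restored = feedback_comb(colored, delay=3, gain=0.6)

# Largest sample-wise deviation between the restored and original signal.
max_error = max(abs(a - b) for a, b in zip(restored, source))
```

The feedback form is stable only when the magnitude of the gain is below 1, which is one practical constraint on any design of this kind.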