• Title/Summary/Keyword: Speech quality

A Correlation Study among Pitch, Nasalance, and Voice Quality (정상 성인의 음도, 비성도, 음질 간의 상관 연구)

  • Park, Sung-Jong;Yoo, Jae-Yeon
    • Phonetics and Speech Sciences / v.1 no.4 / pp.159-163 / 2009
  • The purpose of this study is to conduct a correlational analysis among pitch, nasalance, and acoustic quality parameters estimated by two speech analysis software packages, NasalView (version 1.31) and Dr. Speech 4.5 (Tiger Electronics). Thirty females and 25 males with normal voices participated in the study. Pearson correlation coefficients were computed. The results were as follows. First, $F_0$ correlated with voice quality parameters but not with nasalance. Second, nasalance correlated with voice quality parameters.

  • PDF
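
The Pearson correlation coefficient named in the abstract above can be sketched as follows. This is a minimal illustration, not the study's analysis pipeline: the function name `pearson_r` and the sample values are hypothetical and are not data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up values for illustration only (not measurements from the study).
    f0_hz = [210.0, 198.5, 225.3, 120.4, 115.8, 131.2]
    quality_param = [0.21, 0.25, 0.19, 0.35, 0.38, 0.31]
    print(round(pearson_r(f0_hz, quality_param), 3))
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates none; a significance test would still be needed before claiming a correlation.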

EVRC Speech Quality Enhancement Using Pitch Prediction and Gradual Increase of the Decoded Speech (피치예측과 점진적 복원 기법을 이용한 EVRC 음질개선)

  • 민병준;김재원
    • The Journal of the Acoustical Society of Korea / v.18 no.6 / pp.38-43 / 1999
  • The EVRC vocoder is a toll-quality coder, but its quality degrades significantly in weak RF environments. In this paper, the speech quality degradation of the EVRC is analyzed, and two methods are proposed as solutions: pitch prediction and gradual increase of the decoded speech. Preference tests under various RF environments are performed to assess speech quality, and both methods show better performance.

  • PDF

Speech Database Design and Structuring for High Quality TTS (고품질 음성합성을 위한 합성 DB 구축)

  • Kang Dong-Gyu;Yi Sionghun;Ryu Won-Ho
    • Proceedings of the KSPS conference / 2002.11a / pp.33-36 / 2002
  • As telematics services, which integrate information technologies, approach commercialization, the necessity and importance of speech technology are growing rapidly. Speech technology occupies an important position in telematics services because it announces the start of a service and the retrieved results. Such a service must provide highly accurate speech recognition and natural speech synthesis in a driving environment, and this is especially true for fee-based services. High-quality TTS requires a speech synthesis technique that builds an optimal synthesis database and uses it efficiently. In this paper, we describe the design of phonetically balanced sentences for the speech database, the selection of a speaker suited to the service, methods for extracting accurate phoneme boundaries, and the factors considered in the prosody extraction stage. Finally, we present a real case that has been commercially implemented.

  • PDF

Performance Evaluation of Frame Erasure Concealment Algorithms in VoIP Coders (VoIP 코더들의 프레임손실은닉 알고리즘 성능평가)

  • Han, Seung-Ho;Moon, Kwang;Han, Min-Soo
    • Proceedings of the KSPS conference / 2004.05a / pp.235-238 / 2004
  • Frame erasures degrade speech quality in wireless communication networks and packet networks, and the degradation becomes worse when consecutive frame erasures occur. Speech coders include a frame erasure concealment (FEC) mechanism to compensate for frame erasures, so it is meaningful to evaluate how FEC mechanisms perform on the erasures that occur in communication networks. In this paper, various frame erasure patterns are designed, and the FEC algorithms of speech coders are evaluated and analyzed with the Perceptual Evaluation of Speech Quality (PESQ). The performance is found to vary with frame erasure type, frame erasure rate, and utterance length.

  • PDF

Bandwidth Expansion Method Using Spline Codebook Based Spectral Folding (Spline 코드북 기반의 spectral folding을 이용한 대역폭 확장 방법)

  • Park, Ji-Hoon;Han, Seung-Ho;Yang, Hee-Sik;Jeong, Sang-Bae;Hahn, Min-Soo
    • Proceedings of the KSPS conference / 2006.11a / pp.131-134 / 2006
  • The quality of narrowband speech $(0{\sim}4\,\mathrm{kHz})$ can be enhanced by bandwidth expansion, in which the high-band components are estimated. This paper proposes a bandwidth expansion method using spline codebook based spectral folding. For performance evaluation, PESQ (Perceptual Evaluation of Speech Quality) scores are measured as the objective measure. In addition, MOS (Mean Opinion Score) and preference tests are performed as the subjective measures. The results show that the proposed method outperforms the existing spline-based one.

  • PDF

Experiment of VoIP Transmission with AMR Speech Codec in Wireless LAN (무선랜 환경에서 AMR 음성부호화기를 적용한 VoIP 전송 실험)

  • Shin, Hye-Jung;Bae, Keun-Sung
    • Speech Sciences / v.11 no.4 / pp.67-73 / 2004
  • Packet loss, jitter, and delay in the Internet are caused mainly by a shortage of network bandwidth. They arise from the queuing and routing processes at the intermediate nodes of the packet network. In the Internet, where bandwidth changes very rapidly over time depending on the number of users and the data traffic, controlling the peak transmission bit rate of a VoIP system according to the channel condition can help make use of the available network bandwidth. Adapting the packet size to the channel condition can reduce packet loss and thereby improve speech quality. It has been shown in [1] that a VoIP system with an AMR speech codec provides better speech quality than VoIP systems with fixed-rate speech codecs. In this paper, using the adaptive codec mode assignment algorithm proposed in [1], we performed voice transmission experiments over a wireless LAN through the real Internet environment. The experimental results are analyzed and discussed along with our findings.

  • PDF

Synthetic Speech Quality Improvement By Glottal parameter Interpolation - Preliminary study on open quotient interpolation in the speech corpus - (성대특성 보간에 의한 합성음의 음질향상 - 음성코퍼스 내 개구간 비 보간을 위한 기초연구 -)

  • Bae, Jae-Hyun;Oh, Yung-Hwa
    • Proceedings of the KSPS conference / 2005.11a / pp.63-66 / 2005
  • For large-corpus-based TTS, the consistency of the speech corpus is very important, because inconsistent speech quality within the corpus may cause distortion at concatenation points. Because of this inconsistency, a large corpus must be tuned repeatedly. One reason for the inconsistency is that the glottal characteristics differ among the spoken sentences in the corpus. In this paper, we adjust the glottal characteristics of the speech in the corpus to prevent this distortion, and present the experimental results.

  • PDF

Multi-channel input-based non-stationary noise canceller for mobile devices (이동형 단말기를 위한 다채널 입력 기반 비정상성 잡음 제거기)

  • Jeong, Sang-Bae;Lee, Sung-Doke
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.7 / pp.945-951 / 2007
  • Noise cancellation is essential for devices that use speech as an interface. In real environments, speech quality and recognition rates are degraded by acoustic noise arriving near the microphone. In this paper, we propose a noise cancellation algorithm based on stereo microphones. The advantage of using multiple microphones is that the direction information of the target source can be exploited. The proposed noise canceller is based on the Wiener filter. To estimate the filter, the frequency responses of the noise and the target speech must be known; they are estimated by spectral classification in the frequency domain. The performance of the proposed algorithm is compared with that of the well-known Frost algorithm and the generalized sidelobe canceller (GSC) with an adaptation mode controller (AMC). As performance measures, the perceptual evaluation of speech quality (PESQ), the most widely used of the objective speech quality measures, and speech recognition rates are adopted.
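
The abstract above states that the Wiener filter needs estimates of the noise and target speech spectra. A minimal sketch of the textbook per-frequency-bin Wiener gain, H = S / (S + N), is given below. It assumes the two power spectra are already available; the paper's stereo-microphone spectral classification step is not reproduced, and `wiener_gain` and its `floor` parameter are hypothetical names chosen for illustration.

```python
def wiener_gain(speech_psd, noise_psd, floor=0.0):
    """Per-bin Wiener gain H = S / (S + N) from power spectral estimates.

    speech_psd, noise_psd: non-negative power estimates per frequency bin
    (assumed given). floor: optional lower bound on the gain, sometimes
    used to limit musical-noise artifacts (an illustrative assumption).
    """
    if len(speech_psd) != len(noise_psd):
        raise ValueError("spectra must have the same number of bins")
    gains = []
    for s, n in zip(speech_psd, noise_psd):
        total = s + n
        gain = s / total if total > 0 else 0.0
        gains.append(max(gain, floor))
    return gains


if __name__ == "__main__":
    # Illustrative values only: three frequency bins.
    print(wiener_gain([1.0, 0.0, 3.0], [1.0, 2.0, 1.0]))
```

Each gain would multiply the corresponding bin of the noisy spectrum: bins dominated by speech pass almost unchanged, and bins dominated by noise are attenuated.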

A Half Rate Speech Coder using Trellis Excitation (Trellis excitation을 이용한 half rate 음성부호화기)

  • 강상원;이형수;김영수;정진욱
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.2 / pp.88-94 / 1996
  • In this paper, we present a half rate speech coder using trellis excitation. The coder combines a code-excited linear prediction (CELP) system with a trellis quantization method using codebook expansion, and it produces higher speech quality than a typical CELP coder at the same transmission rate. A subjective comparison with 3- to 8-bit $\mu$-law PCM indicates that the half rate coder provides speech quality between that of 5-bit and 6-bit $\mu$-law PCM.

  • PDF
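
For readers unfamiliar with the $\mu$-law PCM references used in the comparison above, the following is a minimal sketch of n-bit $\mu$-law quantization using the continuous companding formula. It assumes $\mu = 255$ and a uniform midrise quantizer in the companded domain. These are illustrative choices: the abstract does not state how the reference signals were generated, and the function names are hypothetical.

```python
import math

MU = 255.0  # Assumed companding constant; not stated in the abstract.


def mu_compress(x, mu=MU):
    """Map x in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def mu_law_quantize(x, bits):
    """Quantize one sample to `bits` bits in the companded domain."""
    levels = 2 ** bits
    y = mu_compress(max(-1.0, min(1.0, x)))
    # Uniform midrise quantizer: pick a cell, then reconstruct at its center.
    index = min(int((y + 1.0) / 2.0 * levels), levels - 1)
    y_hat = (index + 0.5) / levels * 2.0 - 1.0
    return mu_expand(y_hat)


if __name__ == "__main__":
    sample = 0.3  # Illustrative input sample.
    for bits in (5, 6, 8):
        print(bits, mu_law_quantize(sample, bits))
```

Fewer bits give coarser reconstruction levels, which is why a ladder of 3- to 8-bit references can serve as a quality scale.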

Implementation of Wideband Waveform Interpolation Coder for TTS DB Compression (TTS DB 압축을 위한 광대역 파형보간 부호기 구현)

  • Yang, Hee-Sik;Hahn, Min-Soo
    • MALSORI / v.55 / pp.143-158 / 2005
  • An adequate compression algorithm is essential for a high-quality embedded TTS system. In this paper, after investigating many speech coders, we propose a waveform interpolation coder for TTS corpus compression. Unlike for speech coders in communication systems, compression rate and quality are more important in TTS DB compression than other performance criteria. We therefore select the waveform interpolation algorithm because it provides good speech quality at a high compression rate, at the cost of complexity. The implemented coder has a bit rate of 6 kbps with a quality degradation of 0.47. The performance indicates that waveform interpolation is adequate for TTS DB compression, subject to some further study.

  • PDF