• Title/Summary/Keyword: Speech quality

Search Result 805, Processing Time 0.026 seconds

A Study on the Comparison of Digital Speech Coding Performance (디지털 음성방식의 성능 비교에 대한 연구)

  • 배철수
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.17 no.8
    • /
    • pp.881-890
    • /
    • 1992
  • Resonable speech quality assessment methodologies are required for speech quality assessment model which is used at speech system and communication network. There are objectlve measuies and subjective measures and subjective measure has the variousproblems in speech quality assessment methodologies. The objective of this study is to compare objective measures with subjective measure and obtain the objective measure as close as possible to subjective measure.

  • PDF

Design of a variable rate speech codec for the W-CDMA system (W-CDMA 시스템을 위한 가변율 음성코덱 설계)

  • 정우성
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.142-147
    • /
    • 1998
  • Recently, 8 kb/s CS-ACELP coder of G.729 is atandardized by ITU-T SG15 and it has been reported that the speech quality of G729 is better than or equal to that of 32kb/s ADPCM. However G.729 is the fixed rate speech coder, and it does not consider the property of voice activity in mutual conversation. If we use the voice activity, we can reduce the average bit rate in half without any degradations of the speech quality. In this paper, we propose an efficient variable rate algorithm for G.729. The variable rate algorithm consists of two main subjects, the rate determination algorithm and algorithm, we combine the energy-thresholding method, the phonetic segmentation method by integration of various feature parameters obtained through the analysis procedure, and the variable hangover period method. Through the analysis of noise features, the 1 kb/s sub rate coder is designed for coding the background noise signal. So, we design the 4 kb/s sub rate coder for the unvoiced parts. The performance of the variable rate algorithm is evaluated by the comparison of speed quality and average bit rate with G.729. Subjective quality test is also done by MOS test. Conclusively, it is verified that the proposed variable rate CS-ACELP coder produced the same speech quality as G.729, at the average bit rate of 4.4 kb/s.

  • PDF

Enhanced source controlled variable bit-rate scheme in a waveform interpolation coder (Source controlled variable bit-rate scheme을 이용한 파형 보간 부호화기의 음질 개선 기법)

  • Cho, Keun-Seok;Yang, Hee-Sik;Jeong, Sang-Bae;Hahn, Min-Soo
    • Proceedings of the KSPS conference
    • /
    • 2007.05a
    • /
    • pp.315-318
    • /
    • 2007
  • This paper proposes the methods to enhance the speech quality of source controlled variable bit-rate coder based on the waveform interpolation. The methods are to estimate and generate the parameters that are not transmitted from encoder to decoder by the repetition and extrapolation schemes. For the performance evaluation, the PESQ(Perceptual Evaluation of Speech Quality) scores are measured. The experimental results shows that our proposed method outperforms the conventional source controlled variable bit-rate coder. Especially, the performance of the extrapolation method is better than that of the repetition method.

  • PDF

A STUDY ON THE SPEECH SYNTHESIS-BY-RULE SYSTEM APPLIED MULTIBAND EXCITATION SIGNAL

  • Kyung, Younjeong;Kim, Geesoon;Lee, Hwangsoo;Lee, Yanghee
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06a
    • /
    • pp.1098-1103
    • /
    • 1994
  • In this paper, we design and implement the Korean speech synthesis by rule system. This system is applied the multiband excitation signal on voiced sounds. The multiband excitation signal is obtained by mixing impluse spectrum and which noise spectrum. We find that the quality of synthesized speech is improved using this application. Also, we classify the voiced sounds by cepstral euclidian distance measure for reducing overhead memory. The representative excitation signal of the same group's voiced sounds is used as excitation signal on synthesis. This method does not affect the quality of synthesized speech. As the result of experiment, this method eliminates the "buzziness" of synthesized speech and reduces the spectral distortion of synthesized speech.ed speech.

  • PDF

Transmission of Channel Error Information over Voice Packet (음성 패킷을 이용한 채널의 에러 정보 전달)

  • 박호종;차성호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.4
    • /
    • pp.394-400
    • /
    • 2002
  • In digital speech communications, the quality of service can be increased by speech coding scheme that is adaptive to the error rate of voice packet transmission. However, current communication protocol in cellular and internet communications does not provide the function that transmits the channel error information. To solute this problem, in this paper, new method for real-time transmission of channel error information is proposed, where channel error information is embedded in voice packet. The proposed method utilizes the pulse positions of codevector in ACELP speech codec, which results in little degradation in speech quality and low false alarm rate. The simulations with various speech data show that the proposed method meets the requirement in speech quality, detection rate, and false alarm rate.

A Study on Voice Communication Quality Criteria Under Mobile-VoIP Environments

  • Choi, Jae-Hun;Seol, Soon-Uk;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.2E
    • /
    • pp.35-42
    • /
    • 2009
  • In this paper, we present criteria of objective measurement of speech quality to provide the mobile-VoIP services efficiently over wireless mobile internet. The mobile-VoIP service, which is based on mobility and is error-prone compared to conventional VoIP over wired network, is about to be launched, but there have not been adequate quality indexes and the Quality of Service (QoS) standards for evaluating speech quality of Mobile-VoIP. In addition, there are many factors influencing on the speech quality in packet network of which packet loss contribute directly to the overall voice communication quality. For this reason, we adopt the Gilbert-Elliot Channel Model for modeling packet network based on IP and assess the voice quality through the objective speech method of ITU-T P. 862 PESQ and ITU-T P. 862.1 MOS-LQO under various packet loss rates in the transmission channel environments. Our simulation results address the specific criteria and QoS for the mobile-VoIP services in terms of the various packet loss environments.

Non-Intrusive Speech Quality Estimation of G.729 Codec using a Packet Loss Effect Model (G.729 코덱의 패킷 손실 영향 모델을 이용한 비 침입적 음질 예측 기법)

  • Lee, Min-Ki;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.32 no.2
    • /
    • pp.157-166
    • /
    • 2013
  • This paper proposes a non-intrusive speech quality estimation method considering the effects of packet loss to perceptual quality. Packet loss is a major reason of quality degradation in a packet based speech communications network, whose effects are different according to the input speech characteristics or the performance of the embedded packet loss concealment (PLC) algorithm. For the quality estimation system that involves packet loss effects, we first observe the packet loss of G.729 codec which is one of narrowband codec in VoIP system. In order to quantify the lost packet affects, we design a classification algorithm only using speech parameters of G.729 decoder. Then, the degradation values of each class are iteratively selected that maximizes the correlation with the degradation PESQ-LQ scores, and total quality degradation is modeled by the weighted sum. From analyzing the correlation measures, we obtained correlation values of 0.8950 for the intrusive model and 0.8911 for the non-intrusive method.

The Smoothing Method of the Concatenation Parts in Speech Waveform by using the Forward/Backward LPC Technique (전, 후방향 LPC법에 의한 음성 파형분절의 연결부분 스므딩법)

  • 이미숙
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1991.06a
    • /
    • pp.15-20
    • /
    • 1991
  • In a text-to-speech system, sound units (e. q., phonemes, words, or phrases) can be concatenated together to produce required utterance. The quality of the resulting speech is dependent on factors including the phonological/prosodic contour, the quality of basic concatenation units, and how well the units join together. Thus although the quality of each basic sound unit is high, if occur the discontinuity in the concatenation part then the quality of synthesis speech is decrease. To solve this problem, a smoothing operation should be carried out in concatenation parts. But a major problem is that, as yet, no method of parameter smoothing is availalbe for joining the segment together.

  • PDF

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.6
    • /
    • pp.544-551
    • /
    • 2023
  • Speech enhancement used to improve the perceptual quality and intelligibility of noise speech has been studied as a method using a complex-valued spectrum that can improve both magnitude and phase in a method using a magnitude spectrum. In this paper, a study was conducted on how to apply attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noise speech. The attention is performed based on additive attention and allows the attention weight to be calculated in consideration of the complex-valued spectrum. In addition, the global average pooling was used to consider the importance of the feature map. Complex-valued spectrum-based speech enhancement was performed based on the Deep Complex U-Net (DCUNET) model, and additive attention was conducted based on the proposed method in the Attention U-Net model. The results of the experiments on noise speech in a living room environment showed that the proposed method is improved performance over the baseline model according to evaluation metrics such as Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short Time Object Intelligence (STOI), and consistently improved performance across various background noise environments and low Signal-to-Noise Ratio (SNR) conditions. Through this, the proposed speech enhancement system demonstrated its effectiveness in improving the intelligibility and quality of noisy speech.

Characteristics of speech intelligibility and speech acceptability connected with mouth opening condition (구강 개방 상태에 따른 말 명료도 및 말 용인도 특성)

  • Song, Yun-Kyung
    • Phonetics and Speech Sciences
    • /
    • v.3 no.3
    • /
    • pp.141-148
    • /
    • 2011
  • There are many factors that affect speech intelligibility and speech acceptability. Structural anomalies and neuromotor pathologies are known for the reasons of abnormal speech sounds. And there are minor variations related to oral mechanism. Speaking with restricted mouth opening related to therapeutic procedure or habitual speech pattern might affect the quality of speech sounds. So this study compared speech intelligibility and speech acceptability of recorded 24 words in two conditions (restricted mouth opening condition and normal mouth opening condition) by 30 normal hearing adults. The results showed that speech intelligibility and speech acceptability were significantly lower in restricted mouth opening condition. And speech acceptability was significantly lower than speech intelligibility in restricted mouth opening condition. Speech acceptability in restricted mouth opening condition was significantly lower especially in open vowel. These findings indicated that the mouth opening condition could affect vowel shape and could be an adverse effect on speech intelligibility and speech acceptability.

  • PDF