• Title/Summary/Keyword: speech quality


Multiple Average Ratings of Auditory Perceptual Analysis for Dysphonia

  • Choi, Seong-Hee;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.1 no.4 / pp.165-170 / 2009
  • This study compared a single rating with average ratings from multiple presentations of the same stimulus for measuring the voice quality of dysphonia on a 7-point equal-appearing interval (EAI) rating scale. Overall severity of voice quality for 46 /a/ vowel stimuli (23 from speakers with dysphonia, 23 from controls) was rated by 3 experienced speech-language pathologists (average 19 years; range = 7 to 40 years). For the average ratings, each stimulus was rated five times in random order, and averages were taken over two to five of those ratings. Although inter-rater reliability was higher for average ratings than for a single rating, rating scores did not differ significantly between single and multiple average ratings judged by experienced listeners. This suggests that auditory perceptual ratings by well-trained listeners agree relatively well across repeated judgments of the same stimulus. Perceptual ratings varied more for moderate voices than for mild or severe voices, even in the average ratings.
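
The averaging step described in this abstract can be sketched as follows. This is only an illustration: the function name and the rating values are hypothetical, and the abstract does not say which subset of the five ratings entered each average (the sketch assumes the first k).

```python
from statistics import mean

def running_averages(ratings):
    """Return the mean of the first k ratings for k = 1..len(ratings)."""
    return [mean(ratings[:k]) for k in range(1, len(ratings) + 1)]

# Hypothetical 7-point EAI ratings of one stimulus over five presentations.
example_ratings = [4, 5, 4, 3, 4]

# Index 0 is the single rating; indices 1..4 are the averages over
# two to five presentations (assumed here to be the first k ratings).
example_averages = running_averages(example_ratings)
```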


New filter design to replace the post and perceptual weighting filter of transcoder and performance evaluation (상호부호화기의 후처리 필터와 인지가중 필터를 대신하는 새로운 필터 설계 및 성능 평가)

  • 최진규;윤성완;강홍구;윤대희
    • Proceedings of the IEEK Conference / 2003.07e / pp.2232-2235 / 2003
  • In speech communication systems where two different speech codecs interoperate, a transcoding algorithm is a good approach because of its low complexity and improved synthesized speech quality. This paper proposes an efficient method that further improves the performance of transcoding algorithms while also reducing their complexity. In conventional transcoding algorithms, a post-filter and a perceptual weighting filter must be run sequentially because both decoding and encoding are needed. This makes the processing redundant in terms of both complexity and perceptual quality. Because the two filters have similar structures, we replaced them with a single filter. Comparing only the complexity of the filtering processes, the proposed algorithm requires 72.8% less complexity than the conventional transcoding algorithm. Both objective and subjective tests confirm that the proposed algorithm has slightly better quality than the conventional one.
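
The structural similarity mentioned in this abstract can be illustrated with a small sketch. In CELP-family codecs, both the perceptual weighting filter and the short-term post-filter are commonly written as ratios of bandwidth-expanded LPC polynomials, A(z/g1) / A(z/g2). The abstract does not give the merged filter, so the code below only shows the shared building block; the function names, LPC coefficients, and gamma values are hypothetical.

```python
def bandwidth_expand(lpc, gamma):
    """Coefficients of A(z/gamma) for A(z) = sum_k lpc[k] * z^-k, lpc[0] == 1."""
    return [a * gamma ** k for k, a in enumerate(lpc)]

def pole_zero_pair(lpc, gamma_num, gamma_den):
    """Numerator and denominator of A(z/gamma_num) / A(z/gamma_den)."""
    return bandwidth_expand(lpc, gamma_num), bandwidth_expand(lpc, gamma_den)

# Hypothetical second-order LPC polynomial and gamma values.
example_lpc = [1.0, -0.9, 0.2]
weighting_num, weighting_den = pole_zero_pair(example_lpc, 0.9, 0.6)
postfilter_num, postfilter_den = pole_zero_pair(example_lpc, 0.55, 0.7)
```

Both filters come from the same function with different gamma pairs, which is the kind of redundancy the abstract refers to.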


Low-Delay LSF FEC Technique Robust in Lossy VoIP Environment (VoIP 손실 환경에 강인한 저지연 LSF FEC 기법)

  • Yang, Hae-Yong;Lee, Kyung-Hoon;Hwang, In-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.6 / pp.687-695 / 2002
  • Media-specific FEC techniques, proposed to cope with VoIP speech packet loss, improve speech quality at the cost of an additional one-frame delay. In this paper, we propose a new media-specific FEC technique, LSF FEC, which improves speech quality with a much shorter additional delay. In the proposed technique, the LSF parameters of the future frame are used to recover a lost packet. To evaluate its performance, we use the ITU-T G.723.1 and G.729 codecs, apply the Gilbert packet loss model, and estimate the MOS at each packet loss rate with the PESQ speech quality estimation algorithm. The proposed technique shortens the delay by between 6.5 ms and 27 ms compared with existing media-specific FEC techniques. Simulation results comparing reconstructed speech quality show that the technique improves the MOS by more than 0.1 in a practical lossy environment with a 5% packet loss rate.
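
The Gilbert packet loss model named in this abstract is a two-state Markov chain. Below is a minimal sketch of such a simulator. The transition probabilities, function names, and seed are assumptions for illustration, not values from the paper.

```python
import random

def gilbert_losses(n, p, q, seed=0):
    """Simulate n packets; True marks a lost packet.

    p: probability of moving from the good (received) state to the bad (lost) state.
    q: probability of moving from the bad state back to the good state.
    """
    rng = random.Random(seed)
    bad = False
    lost = []
    for _ in range(n):
        if bad:
            bad = rng.random() >= q  # stay in the bad state with probability 1 - q
        else:
            bad = rng.random() < p   # enter the bad state with probability p
        lost.append(bad)
    return lost

def stationary_loss_rate(p, q):
    """Long-run fraction of lost packets for the two-state chain."""
    return p / (p + q)

# Hypothetical parameters chosen to give a 5% average loss rate with bursts.
example_losses = gilbert_losses(200_000, p=0.02, q=0.38)
example_rate = sum(example_losses) / len(example_losses)
```

With these parameters the mean burst length is 1/q, so losses cluster rather than falling independently.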

Korean Sentence Symbol Preprocess System for the Improvement of Speech Synthesis Quality (음성 합성 시스템의 품질 향상을 위한 한국어 문장 기호 전처리 시스템)

  • Lee, Ho-Joon
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.149-156 / 2015
  • In this paper, we propose a Korean sentence symbol preprocessor for a speech synthesis system that supports SSML (Speech Synthesis Markup Language), in order to improve the quality of the synthesized result. Based on an analysis of Korean Wikipedia documents, we propose 8 categories for the meaning of sentence symbols and 11 regular expressions for classifying each category. The resulting Korean sentence symbol preprocessing system achieved 56% precision and 71.45% recall on 63,000 sentences.
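
Regex-based classification of symbol meaning, as described in this abstract, might look like the sketch below. The paper's 8 categories and 11 regular expressions are not listed in the abstract, so the three labels and patterns here are hypothetical stand-ins.

```python
import re

# Hypothetical (label, pattern) rules, checked in order; not the paper's rules.
RULES = [
    ("date", re.compile(r"\d{4}\.\d{1,2}\.\d{1,2}")),
    ("range", re.compile(r"\d+\s*[~\-]\s*\d+")),
    ("ratio", re.compile(r"\d+\s*:\s*\d+")),
]

def classify_symbol_use(text):
    """Return the label of the first rule that matches, or 'unknown'."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "unknown"
```

A preprocessor could then map each label to a reading or an SSML tag before synthesis; that mapping is outside this sketch.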

The Invention of Reis Telephone and Its Problem of Speech Quality (라이스의 전화기 발명과 통화 음질의 문제)

  • Ku, Ja-Hyon
    • The Journal of the Acoustical Society of Korea / v.29 no.6 / pp.395-401 / 2010
  • Because Philipp Reis succeeded in sending human voices through electric wires well ahead of Elisha Gray, A. G. Bell, and others, he deserves to be acknowledged as the inventor of the telephone. Nevertheless, he received no honor for his invention during his lifetime. Because he worked in a scientific community, his work was presented not as a patentable invention but as a scientific discovery. In addition, he used intermittent electricity, in accordance with the experimental tradition of European acoustics, which left the speech quality of his telephone with a fatal shortcoming. In contrast, Bell, who was a novice in electricity and acoustics, used variable currents to transmit the sound signals, which gave better speech quality than Reis's telephone.

Complexity Reduction Algorithm of Speech Coder(EVRC) for CDMA Digital Cellular System

  • Min, So-Yeon
    • Journal of Korea Multimedia Society / v.10 no.12 / pp.1551-1558 / 2007
  • Speech coders for mobile telecommunication are evaluated broadly on channel capacity, noise immunity, encryption, complexity, and encoding delay. This study presents a complexity-reduction algorithm for CDMA (Code Division Multiple Access) mobile telecommunication systems that keeps the existing advantages of communication quality and low transmission rate. The aim is to reduce computational complexity by handling the frequency band nonuniformly when converting LPC (Linear Predictive Coding) coefficients into LSP (Line Spectrum Pairs) parameters in the EVRC (Enhanced Variable-Rate Coder, IS-127) speech coder. In experiments comparing a speech coder using the proposed algorithm with the existing EVRC speech coder, complexity decreased by 45% on average. The LSP parameter values, the synthesized speech signal, and the spectrogram test results were the same as with the existing method.
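
For context on the LPC-to-LSP conversion mentioned in this abstract: the standard first step forms a symmetric and an antisymmetric polynomial from A(z), and the LSPs are the angles of their roots on the unit circle. The sketch below covers only that first step. The root search, which is where the paper's nonuniform frequency handling applies, is not shown, and the example coefficients are hypothetical.

```python
def lsp_polynomials(lpc):
    """Return (P, Q) coefficient lists for A(z) = sum_k lpc[k] * z^-k, lpc[0] == 1.

    P(z) = A(z) + z^-(p+1) * A(1/z)   (symmetric coefficients)
    Q(z) = A(z) - z^-(p+1) * A(1/z)   (antisymmetric coefficients)
    """
    padded = list(lpc) + [0.0]            # A(z) extended to order p + 1
    mirrored = [0.0] + list(lpc)[::-1]    # coefficients of z^-(p+1) * A(1/z)
    p_poly = [a + b for a, b in zip(padded, mirrored)]
    q_poly = [a - b for a, b in zip(padded, mirrored)]
    return p_poly, q_poly

# Hypothetical second-order LPC polynomial.
example_p, example_q = lsp_polynomials([1.0, -1.2, 0.5])
```

A conventional root search evaluates P and Q on a grid of frequencies and looks for sign changes, so the grid spacing directly drives the computational cost.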


Enhancement of speech with time-variant and colored noise

  • Mine, Katsutoshi;Kitazaki, Masato;Wakabayashi, Katsuyoshi;Morimoto, Yuji
    • Institute of Control, Robotics and Systems (제어로봇시스템학회): Conference Proceedings / 1990.10b / pp.1098-1102 / 1990
  • We consider a method for enhancing a speech signal degraded by additive random noise that is time-variant and/or colored. To enhance speech corrupted by such noise, it is effective to exploit the properties of both the speech and the noise. The objective of speech enhancement is to improve the overall quality and articulation of speech degraded by time-variant and/or colored random noise. In the proposed method, a distribution model of the speech spectrum is supplied as information to the noise reduction system. The proposed system improves the SNR by about 10 dB when the input SNR is 0 dB.
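
To make the reported figure concrete, the sketch below works through the decibel arithmetic: a 10 dB gain from a 0 dB input corresponds to a tenfold drop in noise power if the signal power is assumed unchanged. That assumption and the function names are for illustration only.

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two power values."""
    return 10.0 * math.log10(signal_power / noise_power)

def noise_power_after_gain(noise_power, gain_db):
    """Noise power that raises the SNR by gain_db, assuming unchanged signal power."""
    return noise_power / (10.0 ** (gain_db / 10.0))

# 0 dB input SNR: signal and noise have equal power (normalised to 1.0 here).
input_snr = snr_db(1.0, 1.0)
output_snr = snr_db(1.0, noise_power_after_gain(1.0, 10.0))
```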


Enhancement of Speech Using the Adaptive Signal Processing (적응신호처리를 이용한 음질 개선)

  • Shin, Yoon-Ki
    • Speech Sciences / v.9 no.4 / pp.275-287 / 2002
  • In man-machine communication by speech in a noisy environment, speech quality may be degraded so severely that the machine cannot recognize it correctly. In particular, when the corrupting noise occupies the same band as the speech, conventional fixed filters cannot remove the noise effectively. To address this problem, the adaptive noise canceller (ANC), which is based on adaptive filters, has recently come into frequent use. Adaptive recursive filters perform better than adaptive nonrecursive filters because of their added poles, but their stability may be severely threatened. This paper proposes an ANC system that uses an adaptive recursive filter to enhance speech corrupted by noise. The stability of the adaptive recursive filter is guaranteed by an adaptive compensator.
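
As background for this abstract, here is a minimal sketch of an adaptive noise canceller. It uses a plain LMS nonrecursive (FIR) filter, the simpler baseline the abstract contrasts against, and not the recursive filter with compensator that the paper proposes. The signals, tap count, and step size are hypothetical.

```python
import math
import random

def lms_cancel(primary, reference, taps=4, mu=0.05):
    """Subtract an adaptive FIR estimate of the noise from the primary input."""
    weights = [0.0] * taps
    history = [0.0] * taps
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]                       # newest sample first
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate                               # error is the cleaned output
        weights = [w + 2 * mu * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned

def demo(n=2000, tail=500, seed=1):
    """Return (residual noise power, original noise power) over the last samples."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    speech = [0.1 * math.sin(0.05 * i) for i in range(n)]  # stand-in for speech
    primary = [s + 0.8 * v for s, v in zip(speech, noise)]
    cleaned = lms_cancel(primary, noise)
    residual = sum((c - s) ** 2 for c, s in zip(cleaned[-tail:], speech[-tail:])) / tail
    original = sum((0.8 * v) ** 2 for v in noise[-tail:]) / tail
    return residual, original

residual_power, original_power = demo()
```

The reference input must be correlated with the noise but not with the speech; otherwise the filter also cancels part of the speech.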


A Study on the Development of the Real-Time G.723.1 Speech Codec Using a Fixed-Point DSP(ADSP-2181) (고정소수점 DSP(ADSP-2181)을 이용한 실시간 G.723.1 음성부호화기 개발에 관한 연구)

  • Park, Jung-Jae;Chung, Ik-Joo
    • Speech Sciences / v.3 / pp.177-186 / 1998
  • This paper describes the procedure for implementing a real-time speech codec, G.723.1, which was developed by DSP Group and standardized by ITU-T, on the fixed-point DSP ADSP-2181. The codec has two bit rates, 5.3 and 6.3 kbit/s. We implemented only the 6.3 kbit/s rate, with fixed-point 32-bit precision. In our experiment, the computational load was about 55 MIPS, and the quality was similar to that of the PC simulation with floating-point arithmetic. We propose a method for using a fixed-point DSP and a procedure for developing a real-time speech codec on DSPs, and we developed a G.723.1 speech codec for the ADSP-2181.
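
Fixed-point arithmetic of the kind this abstract refers to can be sketched with a rounded, saturating multiply. The Q15 format (16-bit values with 15 fractional bits) is an assumption chosen for illustration; the abstract states only that the implementation is fixed-point with 32-bit precision.

```python
Q15_MIN, Q15_MAX = -32768, 32767

def to_q15(x):
    """Convert a float in [-1, 1) to a saturated Q15 integer."""
    return max(Q15_MIN, min(Q15_MAX, int(round(x * 32768))))

def from_q15(v):
    """Convert a Q15 integer back to a float."""
    return v / 32768

def q15_mul(a, b):
    """Multiply two Q15 values with rounding and saturation."""
    product = (a * b + (1 << 14)) >> 15  # add half an LSB, then drop 15 fractional bits
    return max(Q15_MIN, min(Q15_MAX, product))

half = to_q15(0.5)
quarter = q15_mul(half, half)
```

Saturation matters in the one overflow case, -1 times -1, whose exact result +1 is not representable in Q15.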


Syllable-Level Smoothing of Model Parameters for HMM-Based Mixed-Lingual Text-to-Speech (HMM 기반 혼용 언어 음성합성을 위한 모델 파라메터의 음절 경계에서의 평활화 기법)

  • Yang, Jong-Yeol;Kim, Hong-Kook
    • Phonetics and Speech Sciences / v.2 no.1 / pp.87-95 / 2010
  • In this paper, we address issues in mixed-lingual text-to-speech based on context-dependent HMMs, where a separate set of HMMs corresponds to each language. In particular, we propose techniques for smoothing synthesis parameters at the boundaries between languages to obtain more natural speech quality. Specifically, mel-frequency cepstral coefficients (MFCCs) at the language boundaries are smoothed with several linear and nonlinear approximation techniques. An informal listening test shows that synthesized speech smoothed by a modified version of linear least square approximation (MLLSA) or by a quadratic interpolation (QI) method is preferred over speech synthesized without any smoothing.
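
Boundary smoothing of a parameter track, as described in this abstract, can be sketched with an ordinary linear least-squares fit over a short window around the boundary. This is the plain technique, not the paper's modified version (MLLSA) or its quadratic interpolation. The window size and the example track are hypothetical.

```python
def smooth_boundary(track, boundary, half_width=2):
    """Replace values around `boundary` with a least-squares line fitted to them."""
    lo = max(0, boundary - half_width)
    hi = min(len(track), boundary + half_width)
    xs = list(range(lo, hi))
    ys = track[lo:hi]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    fitted = [mean_y + slope * (x - mean_x) for x in xs]
    return track[:lo] + fitted + track[hi:]

def max_jump(track):
    """Largest absolute difference between neighbouring frames."""
    return max(abs(b - a) for a, b in zip(track, track[1:]))

# Hypothetical track of one MFCC coefficient with a step at the language boundary.
example_track = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
example_smoothed = smooth_boundary(example_track, boundary=4)
```

In practice the same operation would be applied to each coefficient track separately.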
