• Title/Summary/Keyword: Speech quality

Speech Quality Improvement by Speech Quality Evaluation (한국어 음성합성기 성능평가에 의한 합성 음질개선)

  • Yang Hee-Sik;Hahn Minsoo;Kim Jong-Jin
    • Proceedings of the KSPS conference / 2002.11a / pp.37-40 / 2002
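  • This paper gives a brief overview of a method for evaluating the intelligibility and naturalness of Korean speech synthesizers and summarizes the results of applying it to two different Korean synthesizers. It also presents an actual case of speech quality improvement carried out on the basis of these evaluation results and proposes directions for improving the performance of Korean synthesizers in the future.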

On the Research of a Speech Coder Using a Multi-Level Amplitude Codebook (다중레벨 진폭 코드북을 이용한 음성 부호화기에 관한 연구)

  • 홍성훈;김정진;박영호;배명진
    • Proceedings of the IEEK Conference / 1998.10a / pp.1219-1222 / 1998
  • This paper analyzes the dynamic sparse algebraic codebook used to model the residual signal and proposes a new algebraic codebook structure together with a search procedure with improved performance. The proposed algorithm remedies the shortcomings of the algebraic codebook without increasing computation. First, unlike the conventional method, which simply looks up a sign bit, it allows various pulse amplitudes to be selected. In addition, two pulses can be selected on the same track. In telephone-line tests, a 5.6kbps speech coder using the proposed algorithm was equivalent to the 6.3kbps MP-MLQ in subjective speech quality. However, it showed slight speech degradation compared with the MP-MLQ where MNRU 1=15dB.

An Optimization of Speech Database in Corpus-based Speech Synthesis System (코퍼스기반 음성합성기의 데이터베이스 최적화 방안)

  • Jang Kyung-Ae;Chung Min-Hwa
    • Proceedings of the KSPS conference / 2002.11a / pp.209-213 / 2002
  • This paper describes how to reduce the DB of a corpus-based speech synthesizer for Korean without degrading speech quality. First, it proposes that the frequency of every unit in the reduced DB should reflect the frequency of units in the Korean language, so the target population of each unit is set proportional to its frequency in a large Korean corpus (780K sentences, 45 mega phonemes). Second, instances that are used frequently during synthesis should also be kept in the reduced DB. Finally, it proposes that the frequency of every instance be reflected in the clustering criterion and used as a criterion for selecting representative instances. Evaluation shows that the proposed methods yield better quality than conventional methods.
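
The proportional allocation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `proportional_targets`, the `budget` parameter (total number of instances kept in the reduced DB), and the largest-remainder rounding rule are hypothetical choices, not details taken from the paper.

```python
def proportional_targets(corpus_counts, budget):
    """Split a reduced-DB budget across units in proportion to corpus frequency.

    corpus_counts: dict mapping unit label -> occurrence count in the large corpus.
    budget: total number of instances to keep (hypothetical parameter).
    Rounding uses the largest-remainder rule so the targets sum to the budget.
    """
    total = sum(corpus_counts.values())
    if total <= 0:
        raise ValueError("corpus_counts must contain at least one occurrence")

    # Exact (fractional) share of the budget for every unit.
    exact = {unit: budget * count / total for unit, count in corpus_counts.items()}
    # Start from the floor of each share.
    targets = {unit: int(share) for unit, share in exact.items()}

    # Hand the remaining slots to the units with the largest fractional parts.
    leftover = budget - sum(targets.values())
    by_remainder = sorted(exact, key=lambda u: exact[u] - targets[u], reverse=True)
    for unit in by_remainder[:leftover]:
        targets[unit] += 1
    return targets


if __name__ == "__main__":
    # Toy counts, not data from the paper.
    print(proportional_targets({"a": 50, "b": 30, "c": 20}, budget=10))
```

With these toy counts, a unit that accounts for half of the corpus receives half of the budget; rare units may receive zero slots, which a real system would probably need to handle with a minimum-count floor.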

A Reduction of Speech Database in Corpus-based Speech Synthesis System (코퍼스기반 음성합성기의 데이터베이스 감축방안)

  • Jang Kyung-Ae;Chung Min-Hwa;Kim Jae-In;Koo Myoung-Wan
    • MALSORI / no.44 / pp.145-156 / 2002
  • This paper describes how to reduce the DB of a corpus-based speech synthesizer for the Korean language without degrading speech quality. First, it proposes that the frequency of every unit in the reduced DB reflect the frequency of units in the Korean language, so the target population of each unit is set proportional to its frequency in a large Korean corpus (780K sentences, 45 mega phones). Second, instances that are used frequently during synthesis should also be kept in the reduced DB. Finally, it proposes that the frequency of every instance be reflected in the clustering criteria and used as another important criterion for selecting representative instances. Evaluation shows that the proposed methods yield better quality than conventional methods.

A Study on a Searching, Extraction and Approximation-Synthesis of Transition Segment in Continuous Speech (연속음성에서 천이구간의 탐색, 추출, 근사합성에 관한 연구)

  • Lee, Si-U
    • The Transactions of the Korea Information Processing Society / v.7 no.4 / pp.1299-1304 / 2000
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a frame. I therefore propose a method for searching, extracting, and approximately synthesizing the TSIUVC (Transition Segment Including UnVoiced Consonant) so that voiced and unvoiced consonants do not coexist in a frame. The method is based on the zero-crossing rate and a pitch detector using an FIR-STREAK digital filter. As a result, the extraction rates of TSIUVC are 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) for female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) for male voices. I also obtain high-quality approximation-synthesis waveforms within the TSIUVC by using frequency information below 0.547kHz and above 2.813kHz. This method can be applied to low-bit-rate speech coding, speech analysis, and speech synthesis.

Harmonic Peak Picking-based MVF Estimation for Improvement of HMM-based Speech Synthesis System Using TBE Model (TBE 모델을 사용하는 HMM 기반 음성합성기 성능 향상을 위한 하모닉 선택에 기반한 MVF 예측 방법)

  • Park, Jihoon;Hahn, Minsoo
    • Phonetics and Speech Sciences / v.4 no.4 / pp.79-86 / 2012
  • In the two-band excitation (TBE) model, maximum voiced frequency (MVF) is the most important feature of the excitation parameter because the synthetic speech quality depends on MVF. Thus, this paper proposes an enhanced MVF estimation scheme based on the peak picking method. In the proposed scheme, the local peak and the peak lobe are picked from the spectrum of a linear predictive residual signal. The normalized distance between neighboring peak lobes is calculated and utilized as a feature to estimate MVF. Experimental results of both objective and subjective tests show that the proposed scheme improves synthetic speech quality compared with that of the conventional one.

Fixed Point Implementation of the QCELP Speech Coder

  • Yoon, Byung-Sik;Kim, Jae-Won;Lee, Won-Myoung;Jang, Seok-Jin;Choi, Song_in;Lim, Myoung-Seon
    • ETRI Journal / v.19 no.3 / pp.242-258 / 1997
  • The Qualcomm code excited linear prediction (QCELP) speech coder was adopted to increase the capacity of the CDMA Mobile System (CMS). In this paper, we implemented the QCELP speech coding algorithm on the TMS320C50 fixed-point DSP chip. A fixed-point simulation was also done in C. The computation complexity of QCELP on TMS320C50 was 10k words and data memory was 4k words. In the normal call test on the CMS, where a mobile-to-mobile call test was done in bypass mode without double vocoding, the mean opinion score for speech quality was 3.11.

Coding Method of Variable Threshold Dual Rate ADPCM Speech Considering the Background Noise (배경 잡음환경에서 가변 임계값에 의한 Dual Rate ADPCM 음성 부호화 기법)

  • 한경호
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.17 no.6 / pp.154-159 / 2003
  • In this paper, we propose a variable-threshold dual-rate ADPCM coding method that adapts two coding rates of the standard ITU G.726 ADPCM to improve speech quality at a comparably low coding rate. The ZCR (zero crossing rate) is computed for the speech data; in a noisy environment, noise-dominant regions show a higher ZCR and speech-dominant regions a lower ZCR. Speech data with the higher ZCR is encoded at the low coding rate to reduce the amount of coded data, and speech data with the lower ZCR is encoded at the high coding rate to improve speech quality. For coded data, 2 bits are assigned for the low coding rate of 16 kbps and 5 bits for the high coding rate of 40 kbps. Simulation shows that the variable dual-rate ADPCM coding technique achieves good speech quality at a low coding rate.
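
The ZCR-based rate switching summarized in this abstract might look roughly like the sketch below. It is not the authors' implementation: the function names are hypothetical, the threshold is passed in as a fixed number (the paper varies it, and the abstract does not say how), and the 8 kHz sampling rate is the usual G.726 assumption under which 2 and 5 bits per sample give 16 and 40 kbps. The ADPCM encoder itself is not shown.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def bits_per_sample(frame, threshold):
    """Pick the code-word size for one frame.

    High ZCR is treated as noise-dominant -> 2 bits (low rate).
    Low ZCR is treated as speech-dominant -> 5 bits (high rate).
    `threshold` is a hypothetical tuning value, not one from the paper.
    """
    return 2 if zero_crossing_rate(frame) > threshold else 5


def average_bit_rate(frames, threshold, sample_rate=8000):
    """Mean bit rate in bits per second, assuming equal-length frames."""
    if not frames:
        return 0.0
    mean_bits = sum(bits_per_sample(f, threshold) for f in frames) / len(frames)
    return mean_bits * sample_rate


if __name__ == "__main__":
    noisy = [1, -1, 1, -1, 1, -1]    # sign flips on every sample
    voiced = [1, 2, 3, 4, 3, 2]      # no sign flips
    print(average_bit_rate([noisy, voiced], threshold=0.3))
```

With one noise-like frame and one speech-like frame of equal length, the average falls halfway between the two rates, which shows how the scheme trades coded data against quality.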

Pruning Methodology for Reducing the Size of Speech DB for Corpus-based TTS Systems (코퍼스 기반 음성합성기의 데이터베이스 축소 방법)

  • 최승호;엄기완;강상기;김진영
    • The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.703-710 / 2003
  • Because of their human-like synthesized speech quality, Corpus-Based Text-To-Speech (CB-TTS) systems have recently been actively studied worldwide. However, their large speech database (DB) severely restricts their application. In this paper we propose and evaluate three DB reduction algorithms designed to overcome this drawback. The first method is based on a K-means clustering approach, which selects k representatives among multiple instances. The second method keeps only those unit instances that are selected during synthesis, using a domain-restricted text as input to the synthesizer. The third method is a hybrid of the two and uses a large text as input to the system. After the given sentences are synthesized, the unit instances used and their occurrence information are extracted. As the next step, a modified K-means clustering is applied, which also takes into account the occurrence information of the selected unit instances. Finally, we compare the three pruning methods by evaluating the synthesized speech quality at a similar DB reduction rate. Based on perceptual listening tests, we conclude that the last method performs best among the three algorithms. Moreover, the results show that the last method can reduce the DB size without loss of speech quality.
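
One possible reading of the "modified K-means" step in this abstract is a clustering in which each instance is weighted by how often it was selected during synthesis. The sketch below follows that reading only as an assumption: the weighting scheme, the seeding rule, the squared Euclidean distance, and the name `weighted_kmeans_representatives` are all hypothetical, and the abstract does not specify the paper's actual modification.

```python
def weighted_kmeans_representatives(features, weights, k, iterations=20):
    """Return sorted indices of up to k representative instances.

    features: list of equal-length tuples (one feature vector per instance).
    weights:  occurrence count of each instance during synthesis.
    Centroids are occurrence-weighted means; each cluster is represented by
    the member closest to its centroid.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")

    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Deterministic seeding: start from the k most frequently used instances.
    order = sorted(range(len(features)), key=lambda i: weights[i], reverse=True)
    centroids = [features[i] for i in order[:k]]
    dim = len(features[0])

    for _ in range(iterations):
        # Assignment step: each instance joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for i, f in enumerate(features):
            nearest = min(range(len(centroids)), key=lambda c: dist2(f, centroids[c]))
            clusters[nearest].append(i)

        # Update step: occurrence-weighted mean of each cluster.
        updated = []
        for c, members in enumerate(clusters):
            w = sum(weights[i] for i in members)
            if not members or w == 0:
                updated.append(centroids[c])  # keep an empty cluster's centroid
                continue
            updated.append(tuple(
                sum(weights[i] * features[i][d] for i in members) / w
                for d in range(dim)
            ))
        if updated == centroids:
            break
        centroids = updated

    # Keep the instance closest to each centroid as the representative.
    reps = [
        min(members, key=lambda i: dist2(features[i], centroids[c]))
        for c, members in enumerate(clusters) if members
    ]
    return sorted(reps)


if __name__ == "__main__":
    # Toy 1-D features and counts, not data from the paper.
    feats = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,)]
    counts = [1, 5, 1, 3, 1]
    print(weighted_kmeans_representatives(feats, counts, k=2))
```

Weighting the centroid pulls each cluster center toward frequently used instances, so the retained representative tends to be one the synthesizer already prefers.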

The evaluation of methodological quality of meta-analysis studies in speech language pathology using AMSTAR (AMSTAR에 기반한 국내 언어치료 분야 메타분석 논문의 방법론적 질평가)

  • Han, Minju;Byeon, Haewon
    • Journal of the Korea Convergence Society / v.11 no.2 / pp.161-165 / 2020
  • Although research using meta-analysis is increasing in the field of rehabilitation science, not all meta-analytical papers are of the same quality. In particular, although meta-analysis is a research method with the highest level of evidence, simply integrating representative values without considering heterogeneity among individual studies can lead to distorted conclusions or alternatives. This study analyzed the current status of meta-analysis papers on the subject of language arbitration published in Korea from January 2010 to June 2019, using A Measurement Tool to Assess the Methodological Quality of Systematic Review (AMSTAR). Evaluation of the methodological quality of the final five papers gave an average score of 7.4 out of 11 points, which is above average. To raise the quality of meta-analysis in speech-language pathology in the future, it is necessary to include verification of publication bias and specification of conflicts of interest.