• Title/Abstract/Keyword: Voiced

Search results: 282

인도네시아어의 파열음의 발성유형 연구 (A Study of Phonation Types of the Plosives in Bahasa Indonesia)

  • 전태현; 박한상
    • 대한음성학회지:말소리, No. 52, pp. 15-48, 2004
  • The present study investigates the phonation types of plosives in Bahasa Indonesia in terms of VOT, F0, and the durations of the intervocalic closure, the preceding vowel, and the following vowel. The results showed that the two speaker groups have distinct phonation types. Speaker Group I was characterized by a short voicing lag for voiceless plosives and a considerable voicing lead for voiced ones, whereas Speaker Group II was characterized by a short lag for both voiceless and voiced plosives. Although both groups showed significant differences in F0 and in the durations of individual segments between voiceless and voiced plosives, they differed markedly in the temporal structure of the segments. Speaker Group I showed temporal compensation between the intervocalic closure and the surrounding vowels across voicing categories, such that the shorter the intervocalic closure, the longer the surrounding vowels, while Speaker Group II did not. This means that two different phonation-type systems can exist within a single language.

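The voicing-lead vs. short-lag patterns described in this abstract can be pictured with a coarse VOT taxonomy. The sketch below is illustrative only: the category thresholds are common textbook values, and the example VOT figures are invented, not measurements from the study.

```python
def vot_category(vot_ms):
    """Coarse VOT taxonomy: negative VOT = voicing lead (prevoicing),
    0-30 ms = short lag, above 30 ms = long lag (aspirated)."""
    if vot_ms < 0:
        return "voicing lead"
    return "short lag" if vot_ms <= 30 else "long lag"

# Invented tokens echoing the two patterns the abstract reports:
# Group I: voicing lead for voiced plosives, short lag for voiceless ones;
# Group II: short lag for both.
group_i = {"b": -85.0, "p": 15.0}
group_ii = {"b": 10.0, "p": 20.0}
```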

A Noise Reduction Method with Linear Prediction Using Periodicity of Voiced Speech

  • Sasaoka, Naoto; Kawamura, Arata; Fujii, Kensaku; Itoh, Yoshio; Fukui, Yutaka
    • 대한전자공학회:학술대회논문집, 대한전자공학회 2002 ITC-CSCC -1, pp. 102-105, 2002
  • A noise reduction technique for reducing background noise in noise-corrupted speech is proposed. The proposed method is based on linear prediction and takes advantage of the periodicity of voiced speech. A voiced sound can be regarded as a periodic, stationary signal over a short time interval; therefore, the current voice signal is correlated with the voice signal delayed by one pitch period. A linear predictor can estimate only the part of the current signal that is correlated with the delayed signal, so the enhanced voice is obtained as the output of the linear predictor. Simulation results show that the proposed method reduces the background noise.

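As a rough illustration of the idea, the sketch below uses a one-tap long-term predictor at an autocorrelation-estimated pitch lag. The paper's actual predictor order, adaptation scheme, and pitch tracker are not specified here, so every parameter in this sketch is an assumption.

```python
import numpy as np

def estimate_pitch(x, min_lag=40, max_lag=200):
    """Estimate the pitch period (in samples) as the autocorrelation peak."""
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    return int(np.argmax(ac[min_lag:max_lag])) + min_lag

def pitch_predictor_enhance(x, lag):
    """One-tap long-term (pitch) predictor: y[n] = b * x[n - lag].
    The output tracks the periodic (voiced) component, while background
    noise, uncorrelated across a pitch period, is attenuated."""
    d = np.concatenate([np.zeros(lag), x[:-lag]])  # x delayed by one pitch period
    b = np.dot(d, x) / np.dot(d, d)                # least-squares predictor gain
    return b * d

# Toy demo: a periodic "voiced" signal buried in white noise.
fs = 8000
n = np.arange(2048)
clean = np.sin(2 * np.pi * 100 * n / fs)           # 100 Hz -> 80-sample pitch period
noisy = clean + 0.3 * np.random.default_rng(0).standard_normal(len(n))

lag = estimate_pitch(noisy)
enhanced = pitch_predictor_enhance(noisy, lag)
```

Because the noise is uncorrelated at a one-pitch-period delay, the predictor output is closer to the clean periodic component than the noisy input is.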

다양한 수준의 한국인 영어 학습자의 영어 파열음의 구간 신호 지각 연구 (A Perceptual Study of the Temporal Cues of English Plosives for Leveled Groups of Korean English Learners)

  • 강석한; 박한상
    • 대한음성학회지:말소리, No. 56, pp. 49-73, 2005
  • This study explores the most important temporal cues in the perception of the voiced/voiceless distinction of English plosives, in terms of newly defined measures of perception: original-signal-to-response agreement, unit-signal-to-response agreement, and robustness. Seven native speakers of English and three leveled groups of Korean English learners participated in the present study. The results showed that both the native speakers of English and the Korean groups failed to reliably perceive the voiced/voiceless distinction of English plosives, particularly alveolar plosives, in word-medial trochaic positions. The results also showed that in word-initial and word-medial iambic positions both the native speakers of English and the Korean groups rely on the information in the release burst and aspiration in perceiving the voiced/voiceless distinction of English plosives, and that in word-final positions the native speakers of English rely on the information in the preceding vowel, while the Korean groups rely on the information in the closure interval.


Detection and Synthesis of Transition Parts of The Speech Signal

  • Kim, Moo-Young
    • 한국통신학회논문지, Vol. 33, No. 3C, pp. 234-239, 2008
  • For efficient coding and transmission, the speech signal can be classified into three distinct classes: voiced, unvoiced, and transition. At low bit rates below 4 kbit/s, conventional sinusoidal transform coders synthesize high-quality speech for the purely voiced and unvoiced classes, but not for the transition class. The transition class, which includes plosive sounds and abrupt voiced onsets, lacks periodicity, so it is often classified and synthesized as unvoiced. In this paper, an efficient algorithm for transition-class detection is proposed, which demonstrates superior detection performance not only for clean speech but also for noisy speech. For a detected transition frame, phase information is transmitted instead of magnitude information for speech synthesis. A listening test showed that the proposed algorithm produces better speech quality than the conventional one.

영어 모음 발음 교육이 한국인 학습자의 어두 폐쇄음 발화에 미치는 영향에 대한 연구 (A Study on the Influence of English Vowel Pronunciation Training on Word Initial Stop Pronunciation of Korean English Learners)

  • 김지은
    • 말소리와 음성과학, Vol. 5, No. 3, pp. 31-38, 2013
  • This study investigated the influence of English vowel pronunciation training on English word-initial stop pronunciation. For that purpose, the VOT values of English stops produced by twenty Korean English learners (five Youngnam-dialect male speakers, five Youngnam-dialect female speakers, five Kangwon-dialect male speakers, and five Kangwon-dialect female speakers) were measured using Speech Analyzer, and their post-training production was compared with their pre-training production. The results show that the post-training VOT values of voiced stops became closer to those of native English speakers in all four groups. Hence, by analyzing the change in the quality of the following vowels (especially low vowels) and in the degree of stress, it can be inferred that vowel pronunciation training is effective for correcting the pronunciation of voiced stops.

서반아어 자음에 대한 음성학적 연구 -한국인의 서반아어 자음습득 과정을 중심으로- (A Phonetic Study of Spanish Consonants - On the Process of Koreans' Spanish Consonants Acquisition-)

  • 박지영
    • 대한음성학회:학술대회논문집, 대한음성학회 1996년도 10월 학술대회지, pp. 409-414, 1996
  • The aim of this paper is to investigate the actual state of Koreans' pronunciation of Spanish consonants, with an emphasis on describing the phonetic differences between Korean speakers and Spanish speakers. Forty Spanish words were chosen for the speech samples, and ten Korean students majoring in Spanish, from Seoul or Kyunggi Province, and three Spanish speakers from Castile, Spain, participated in the interview. The most noticeable phonetic differences of the Korean speakers' pronunciation compared with the Spanish speakers' are summarized as follows: 1) The voiced stops are pronounced voiceless or weakly voiced. 2) The voiced stops are slightly aspirated. 3) The voiceless consonants are considerably longer than the preceding vowel. 4) Fricatives and affricates are somewhat fronted and weaker in the degree of friction. 5) There is a strong tendency to geminate the dental lateral /l/, as in 'pelo', and to vocalize the palatal lateral /ʎ/, as in 'calle'. 6) Unlike in Spanish speech, the flap [ɾ] and the trill [r] are pronounced similarly in Korean speech.


Asymmetric effects of speaking rate on the vowel/consonant ratio conditioned by coda voicing in English

  • Ko, Eon-Suk
    • 말소리와 음성과학, Vol. 10, No. 2, pp. 45-50, 2018
  • The vowel/consonant ratio is a well-known cue for the voicing of postvocalic consonants. This study investigates how this ratio changes as a function of speaking rate. Seven speakers of North American English read sentences containing target monosyllabic words that contrasted in coda voicing at three different speaking rates. Duration measures were taken for the voice onset time (VOT) of the onset consonant, the vowel, and the coda. The results show that the durations of the onset VOT and vowel are longer before voiced codas, and that the durations of all segments increase monotonically as speaking rate decreases. Importantly, the vowel/consonant ratio, a primary acoustic cue for coda voicing, was found to pattern asymmetrically for voiced and voiceless codas; it increases for voiced codas but decreases for voiceless codas with the decrease in speaking rate. This finding suggests that there is no stable ratio in the duration of preconsonantal vowels that is maintained in different speaking styles.
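The reported asymmetry can be pictured with a toy computation. The durations below are invented for illustration and are not the study's measurements; only the direction of the ratio changes mirrors what the abstract reports.

```python
def vc_ratio(vowel_ms, coda_ms):
    """Vowel/consonant duration ratio, the classic cue to coda voicing."""
    return vowel_ms / coda_ms

# Invented durations (ms) illustrating the asymmetry: slowing down raises
# the ratio before voiced codas but lowers it before voiceless ones.
durations = {
    ("voiced", "fast"): (120, 60),   ("voiced", "slow"): (200, 80),
    ("voiceless", "fast"): (90, 80), ("voiceless", "slow"): (130, 140),
}
ratios = {cond: vc_ratio(v, c) for cond, (v, c) in durations.items()}
```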

Enhanced Maximum Voiced Frequency Estimation Scheme for HTS Using Two-Band Excitation Model

  • Park, Jihoon; Hahn, Minsoo
    • ETRI Journal, Vol. 37, No. 6, pp. 1211-1219, 2015
  • In a hidden-Markov-model-based speech synthesis system using a two-band excitation model, the maximum voiced frequency (MVF) is the most important excitation parameter because the synthetic speech quality depends on it. This paper proposes an enhanced MVF estimation scheme based on a peak-picking method. In the proposed scheme, both local peaks and peak lobes are picked from the spectrum of the linear predictive residual signal. The average of the normalized distances of the local peaks and peak lobes is then used as a feature to estimate the MVF. Experimental results of both objective and subjective tests show that the proposed scheme improves the synthetic speech quality over a conventional scheme, in mobile-device as well as PC environments.

A STUDY ON THE SPEECH SYNTHESIS-BY-RULE SYSTEM APPLIED MULTIBAND EXCITATION SIGNAL

  • Kyung, Younjeong; Kim, Geesoon; Lee, Hwangsoo; Lee, Yanghee
    • 한국음향학회:학술대회논문집, 한국음향학회 1994년도 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA, pp. 1098-1103, 1994
  • In this paper, we design and implement a Korean speech synthesis-by-rule system. The system applies a multiband excitation signal to voiced sounds; the multiband excitation signal is obtained by mixing an impulse spectrum with a white noise spectrum. We find that the quality of the synthesized speech is improved by this approach. We also classify the voiced sounds by a cepstral Euclidean distance measure to reduce memory overhead: the representative excitation signal of each group of voiced sounds is used as the excitation signal in synthesis. This method does not affect the quality of the synthesized speech. Experimental results show that the method eliminates the "buzziness" of the synthesized speech and reduces its spectral distortion.

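A minimal sketch of the mixing step, assuming a frequency-domain split at a fixed cutoff bin: the low band comes from an impulse-train (voiced) spectrum and the high band from a white-noise spectrum. The cutoff bin, frame length, and pitch period below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def two_band_excitation(n_samples, pitch_period, mvf_bin, seed=0):
    """Build an excitation whose spectrum is an impulse-train spectrum
    (voiced) below the cutoff bin and a white-noise spectrum above it."""
    pulses = np.zeros(n_samples)
    pulses[::pitch_period] = 1.0                     # impulse train at the pitch rate
    noise = np.random.default_rng(seed).standard_normal(n_samples)

    P = np.fft.rfft(pulses)
    N = np.fft.rfft(noise)
    mixed = np.where(np.arange(len(P)) < mvf_bin, P, N)  # low band: pulses, high: noise
    return np.fft.irfft(mixed, n_samples)

exc = two_band_excitation(n_samples=512, pitch_period=64, mvf_bin=128)
```

In a full synthesizer this excitation would then drive the vocal-tract (e.g., LPC) filter; here only the band-mixing idea is shown.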

Real-time implementation and performance evaluation of speech classifiers in speech analysis-synthesis

  • Kumar, Sandeep
    • ETRI Journal, Vol. 43, No. 1, pp. 82-94, 2021
  • In this work, six voiced/unvoiced speech classifiers, based on the autocorrelation function (ACF), the average magnitude difference function (AMDF), the cepstrum, the weighted ACF (WACF), the zero-crossing rate and energy of the signal (ZCR-E), and neural networks (NNs), have been simulated and implemented in real time using the TMS320C6713 DSP starter kit. These classifiers have been integrated into a linear-predictive-coding-based speech analysis-synthesis system, and their performance has been compared in terms of voiced/unvoiced classification accuracy, speech quality, and computation time. The accuracy and speech-quality results show that the NN-based classifier performs better than the ACF-, AMDF-, cepstrum-, WACF-, and ZCR-E-based classifiers in both clean and noisy environments. The computation-time results show that the AMDF-based classifier is computationally the simplest, so its computation time is the lowest, while the NN-based classifier requires the most computation time.
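Of the six classifiers compared, the ZCR-E rule is the simplest to sketch. The toy version below, with arbitrary thresholds (assumptions, not the paper's values), labels a frame voiced when it has high energy and a low zero-crossing rate.

```python
import numpy as np

def zcr(frame):
    """Zero-crossing rate: fraction of adjacent samples with a sign change."""
    return float(np.mean(np.abs(np.diff(np.signbit(frame).astype(int)))))

def energy(frame):
    """Mean squared amplitude of the frame."""
    return float(np.mean(frame ** 2))

def classify_frame(frame, zcr_thresh=0.15, energy_thresh=0.01):
    """Toy ZCR-E rule: a frame is voiced if it is high-energy with a low
    zero-crossing rate; otherwise it is treated as unvoiced."""
    if energy(frame) > energy_thresh and zcr(frame) < zcr_thresh:
        return "voiced"
    return "unvoiced"

# 30 ms frames at 8 kHz: a low-frequency periodic frame vs. a noise-like one.
fs = 8000
t = np.arange(240) / fs
voiced_like = 0.5 * np.sin(2 * np.pi * 120 * t)      # periodic, few zero crossings
unvoiced_like = 0.3 * np.random.default_rng(1).standard_normal(240)  # noisy, many
```

The noise-like frame carries plenty of energy, so it is the zero-crossing test that pushes it into the unvoiced class; this is why ZCR and energy are combined rather than used alone.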