• Title/Summary/Keyword: speech factors

Search Result 352, Processing Time 0.021 seconds

Comparison of Speech Rate and Long-Term Average Speech Spectrum between Korean Clear Speech and Conversational Speech

  • Yoo, Jeeun;Oh, Hongyeop;Jeong, Seungyeop;Jin, In-Ki
    • Korean Journal of Audiology
    • /
    • v.23 no.4
    • /
    • pp.187-192
    • /
    • 2019
  • Background and Objectives: Clear speech is an effective communication strategy used in difficult listening situations that draws on techniques such as accurate articulation, a slow speech rate, and the inclusion of pauses. Although too slow speech and improperly amplified spectral information can deteriorate overall speech intelligibility, certain amplitude of increments of the mid-frequency bands (1 to 3 dB) and around 50% slower speech rates of clear speech, when compared to those in conversational speech, were reported as factors that can improve speech intelligibility positively. The purpose of this study was to identify whether amplitude increments of mid-frequency areas and slower speech rates were evident in Korean clear speech as they were in English clear speech. Subjects and Methods: To compare the acoustic characteristics of the two methods of speech production, the voices of 60 participants were recorded during conversational speech and then again during clear speech using a standardized sentence material. Results: The speech rate and longterm average speech spectrum (LTASS) were analyzed and compared. Speech rates for clear speech were slower than those for conversational speech. Increased amplitudes in the mid-frequency bands were evident for the LTASS of clear speech. Conclusions:The observed differences in the acoustic characteristics between the two types of speech production suggest that Korean clear speech can be an effective communication strategy to improve speech intelligibility.

A Comparative Study on the Speech Rate of Advanced Korean(L2) Learners and Korean Native Speakers in Conversational Speech (자유 대화에서의 한국어 원어민 화자와 한국어 고급 학습자들의 발화 속도 비교)

  • Hong, Minkyoung
    • Journal of Korean language education
    • /
    • v.29 no.3
    • /
    • pp.345-363
    • /
    • 2018
  • The purpose of this study is to compare the speech rate of advanced Korean(L2) learners and Korean native speakers in spontaneous utterances. Specifically, the current study investigated the difference of the two groups' speech pattern according to utterance length. Eight advanced Korean(L2) learners and eight Korean native speakers participated in this study. The data were collected by recording their conversation and physical measurements (speaking rate, articulatory rates, pause and several types of speech disfluency) were taken on extracted 120 utterances from 12 out of the 16 participants. The findings show that advanced Korean learners' speech pattern is similar to that of Koreans in the short-length utterance. However, in the long-length utterance, two groups show different speech patterns; while the articulatory rate of Korean native speakers increased in the long-length utterance, that of Korean learners decreased. This suggests that the frequency of speech disfluency factors might affect this result.

A Study of Subjective Speech Quality Measurement in VoIP (VoIP 음질의 주관적 평가에 관한 연구)

  • 강영도;강진석;최연성;김장형
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.5 no.2
    • /
    • pp.279-287
    • /
    • 2001
  • In this paper, we discuss the scale of subjective speech quality measurement over VoIP(Voice over IP) network which is a component of broadband networks. Objective parameters of multimedia services like PSNR or jitter can easily measured and defined, but these factors are not easily meet the user's perceptual recognition. We suggest the speech quality measurement scale through the subjective measurement for end-to-end speech quality composed of sender-side quality, transmission quality, receiver-side quality, which provide the degree of correctness of representation of speaker, the degree of impairment caused by various factors, the degree of recognition of processed speech, respectively. Also, we examined the proposed method and verify it's availability.

  • PDF

Speech Database Design and Structuring for High Quality TTS (고품질 음성합성을 위한 합성 DB 구축)

  • Kang Dong-Gyu;Yi Sionghun;Ryu Won-Ho
    • Proceedings of the KSPS conference
    • /
    • 2002.11a
    • /
    • pp.33-36
    • /
    • 2002
  • As the telematics service that is the integration of information technology approaches commercialization, the necessity and gravity of speech technology is rapidly growing. The speech technology occupies important position in the telematics service because it informs the starting of service and the retrieved result. This service must provide high accuracy of speech recognition and natural synthesis of human speech in a driving environment and it is especially true for the fee-for-service. For high quality TTS, the speech synthesis technique that makes optimal synthesis database and uses efficiently this database is required. In this paper, we describe the design of phonetically balanced sentences used for speech database, the selection of service-suitable-speaker, the extraction methods of accurate phoneme boundary, and the factors which are taken into consideration in the extraction stage of prosody. Finally we show the real case that has commercially implemented.

  • PDF

Prosodic Properties in the Speech of Adults with Cerebral Palsy (뇌성마비 성인 발화의 운율특성)

  • Lee, Sook-Hyang;Ko, Hyun-Ju;Kim, Soo-Jin
    • MALSORI
    • /
    • no.64
    • /
    • pp.39-51
    • /
    • 2007
  • The purpose of this study is to investigate prosodic characteristics in the speech of adults with cerebral palsy through a comparison with the speech of normal speakers. Ten speakers with cerebral palsy (6 males, 4 females) and 6 normal speakers (3 males, 3 females) served as subjects. The results revealed that, compared to normal speakers, speakers with cerebral palsy showed a slower speech rate, a larger number of intonational phrases(IPs) and pauses, a larger number of accentual phrases(APs) per IP, a longer duration of pauses, and more gradual slopes of [L +H] in APs. However, the two groups showed similar tone patterns in their APs. The results also showed mild to moderate correlations between speech intelligibility and the prosodic properties which showed significant differences between the two groups, suggesting that they could be important prosodic factors to predict speech intelligibility in the speech of adults with cerebral palsy.

  • PDF

An Investigation for Design and Implementation of an Integrated Data Management System of Various Speech Corpora (다양한 음성코퍼스의 통합관리시스템의 설계 및 구현에 관한 검토)

  • Hwang Kyunghun;Jeong Changwon;Kim Youngil;Kim Bongwan;Lee Yongju
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.69-72
    • /
    • 2003
  • In this paper, we investigate various factors that are relevant to design and implementation of an integrated management system for various speech corpora. The purpose of this paper is to manage an integrated management system for various kinds of speech corpora necessary for speech research and speech corpora consrtructed in different data formats. In addition, ways are considered to allow users to search with effect for speech corpora that meet various conditions which they want, and to allow them to add with ease corpora that are constructed newly. In order to achieve this goal, we design a global schema for an integrated management of new additional information without changing old speech corpora, and construct a web-based integrated management system based on the scheme that can be accessed without any temporal and spatial restrictions. And we show the steps by which these can be implemented, and describe related future study topics, examining the system.

  • PDF

A study on the voice onset times of the Seoul Corpus males in their twenties (서울 코퍼스 20대 남성의 성대진동 개시시간 연구)

  • Lee, Yuri;Yoon, Kyuchul
    • Phonetics and Speech Sciences
    • /
    • v.8 no.4
    • /
    • pp.1-8
    • /
    • 2016
  • The purpose of this work is to examine the voice onset times (VOTs) of the three types of plosives from the Seoul Corpus male speakers in their twenties. In addition, the factors known to affect VOTs were analyzed, including the place and manner of articulation, speakers, location in words, type of following vowels and speech rates calculated from the three consecutive words. Much of the findings agreed with those from earlier studies on Korean and other languages and new discoveries were made.

Implementation of Voice Source Simulator Using Simulink (Simulink를 이용한 음원모델 시뮬레이터 구현)

  • Jo, Cheol-Woo;Kim, Jae-Hee
    • Phonetics and Speech Sciences
    • /
    • v.3 no.2
    • /
    • pp.89-96
    • /
    • 2011
  • In this paper, details of the design and implementation of a voice source simulator using Simulink and Matlab are discussed. This simulator is an implementation by model-based design concept. Voice sources can be analyzed and manipulated through various factors by choosing options from GUI input and selecting pre-defined blocks or user created ones. This kind of simulation tool can simplify the procedure of analyzing speech signals for various purposes such as voice quality analysis, pathological voice analysis, and speech coding. Also, basic analysis functions are supported to compare the original signal and the manipulated ones.

  • PDF

A Comparative Study of Voice Activity Detection Algorithms in Adverse Environments (잡음 환경에서의 음성 검출 알고리즘 비교 연구)

  • Yang Kyong-Chul;Yook Dong-Suk
    • Proceedings of the KSPS conference
    • /
    • 2006.05a
    • /
    • pp.45-48
    • /
    • 2006
  • As the speech recognition systems are used in many emerging applications, robust performance of speech recognition systems under extremely noisy conditions become more important. The voice activity detection (VAD) has been taken into account as one of the important factors for robust speech recognition. In this paper, we investigate conventional VAD algorithms and analyze the weak and the strong points of each algorithm.

  • PDF

A Study on Voice Color Control Rules for Speech Synthesis System (음성합성시스템을 위한 음색제어규칙 연구)

  • Kim, Jin-Young;Eom, Ki-Wan
    • Speech Sciences
    • /
    • v.2
    • /
    • pp.25-44
    • /
    • 1997
  • When listening the various speech synthesis systems developed and being used in our country, we find that though the quality of these systems has improved, they lack naturalness. Moreover, since the voice color of these systems are limited to only one recorded speech DB, it is necessary to record another speech DB to create different voice colors. 'Voice Color' is an abstract concept that characterizes voice personality. So speech synthesis systems need a voice color control function to create various voices. The aim of this study is to examine several factors of voice color control rules for the text-to-speech system which makes natural and various voice types for the sounding of synthetic speech. In order to find such rules from natural speech, glottal source parameters and frequency characteristics of the vocal tract for several voice colors have been studied. In this paper voice colors were catalogued as: deep, sonorous, thick, soft, harsh, high tone, shrill, and weak. For the voice source model, the LF-model was used and for the frequency characteristics of vocal tract, the formant frequencies, bandwidths, and amplitudes were used. These acoustic parameters were tested through multiple regression analysis to achieve the general relation between these parameters and voice colors.

  • PDF