• Title/Summary/Keyword: Speech level


Effects of the Orthographic Representation on Speech Sound Segmentation in Children Aged 5-6 Years (5~6세 아동의 철자표상이 말소리분절 과제 수행에 미치는 영향)

  • Maeng, Hyeon-Su;Ha, Ji-Wan
    • Journal of Digital Convergence / v.14 no.6 / pp.499-511 / 2016
  • The aim of this study was to examine the effect of orthographic representation on speech sound segmentation performance. Children's performance on the orthographic representation task and the speech sound segmentation task showed a positive correlation for words with phoneme-grapheme correspondence and a negative correlation for words with phoneme-grapheme non-correspondence. For words with phoneme-grapheme correspondence, there was no difference in performance between the high and low orthographic representation groups, whereas for words with phoneme-grapheme non-correspondence, the low-level group performed significantly better than the high-level group. The most frequent errors in both groups were orthographic conversion errors, and such errors were significantly more noticeable in the high-level group. This study suggests that, from the time they begin to learn orthographic knowledge, children make use of that knowledge when performing phonological awareness tasks.

How to Express Emotion: Role of Prosody and Voice Quality Parameters (감정 표현 방법: 운율과 음질의 역할)

  • Lee, Sang-Min;Lee, Ho-Joon
    • Journal of the Korea Society of Computer and Information / v.19 no.11 / pp.159-166 / 2014
  • In this paper, we examine the role of emotional acoustic cues, including both prosody and voice quality parameters, in modifying the sense of a word. For the extraction of prosody and voice quality parameters, we used 60 pieces of speech data spoken by six speakers in five different emotional states. We analyzed eight emotional acoustic cues and used a discriminant analysis technique to find the dominant sequence of acoustic cues. As a result, we found that anger is closely related to intensity level and the bandwidth of the second formant; joy is related to the positions of the second and third formants and the intensity level; sadness is strongly related only to prosody cues such as intensity level and pitch level; and fear is related to pitch level and the second formant value together with its bandwidth. These findings can be used as a guideline for fine-tuning an emotional spoken language generation system, because these distinct sequences of acoustic cues reveal the subtle characteristics of each emotional state.
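
The abstract does not include the analysis itself, but the discriminant-analysis step it describes can be sketched roughly as follows; the toy feature matrix, the cue labels, and the scikit-learn LDA model are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a discriminant analysis over acoustic cues (not the
# authors' code). The 60-utterance, 8-cue layout mirrors the abstract; the
# values themselves are placeholder random numbers.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))          # 60 utterances x 8 acoustic cues
y = np.repeat(["neutral", "anger", "joy", "sadness", "fear"], 12)

lda = LinearDiscriminantAnalysis()
print("mean CV accuracy:", cross_val_score(lda, X, y, cv=5).mean())

# With standardized features, the magnitude of the LDA coefficients gives a
# rough ranking of which cues dominate the separation of each emotion.
lda.fit(X, y)
print(lda.coef_.shape)                # (n_classes, n_cues)
```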

PESQ-Based Selection of Efficient Partial Encryption Set for Compressed Speech

  • Yang, Hae-Yong;Lee, Kyung-Hoon;Lee, Sang-Han;Ko, Sung-Jea
    • ETRI Journal / v.31 no.4 / pp.408-418 / 2009
  • Adopting an encryption function in a voice over Wi-Fi service incurs problems such as additional power consumption and degradation of communication quality. To overcome these problems, a partial encryption (PE) algorithm for compressed speech was recently introduced. However, from the security point of view, the partial encryption sets (PESs) of the conventional PE algorithm still have much room for improvement. This paper proposes a new selection method for finding a smaller PES while maintaining the security level of the encrypted speech. The proposed PES selection method employs the perceptual evaluation of speech quality (PESQ) algorithm to objectively measure the distortion of speech. The proposed method is applied to the ITU-T G.729 speech codec, and its content protection capability is verified by a range of tests and a reconstruction attack. The experimental results show that encrypting only 20% of the compressed bitstream is sufficient to effectively hide the entire content of speech.
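
The selection loop the abstract outlines can be sketched as below; the candidate-set representation, the decoder and PESQ callables, and the score threshold are placeholders standing in for a G.729 decoder and a PESQ implementation, not the paper's actual procedure.

```python
# Hedged sketch of PESQ-guided selection of a partial-encryption set (PES):
# among candidate sets, keep the smallest one whose decode-without-key output
# is degraded below a target PESQ score, i.e. unintelligible to an attacker.
# decode_without_key() and pesq_score() are assumed placeholders.
from typing import Callable, List, Set

import numpy as np

def select_pes(candidates: List[Set[str]],
               reference: np.ndarray,
               decode_without_key: Callable[[Set[str]], np.ndarray],
               pesq_score: Callable[[np.ndarray, np.ndarray], float],
               max_quality: float = 1.5) -> Set[str]:
    """Return the smallest candidate PES that keeps attacker quality low.

    PESQ scores run roughly from 1.0 (bad) to 4.5 (excellent); a low score
    for the key-less decode means the content stays effectively hidden.
    """
    for pes in sorted(candidates, key=len):        # try smaller sets first
        degraded = decode_without_key(pes)         # attacker's best decode
        if pesq_score(reference, degraded) <= max_quality:
            return pes                             # smallest acceptable set
    return max(candidates, key=len)                # fall back to largest set
```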

Chinese Prosody Generation Based on C-ToBI Representation for Text-to-Speech (음성합성을 위한 C-ToBI기반의 중국어 운율 경계와 F0 contour 생성)

  • Kim, Seung-Won;Zheng, Yu;Lee, Gary-Geunbae;Kim, Byeong-Chang
    • MALSORI / no.53 / pp.75-92 / 2005
  • Prosody modeling is critical in developing text-to-speech (TTS) systems, where speech synthesis is used to automatically generate natural speech. In this paper, we present a prosody generation architecture based on the Chinese Tone and Break Index (C-ToBI) representation. ToBI is a multi-tier representation system based on linguistic knowledge for transcribing events in an utterance. TTS systems that adopt ToBI as an intermediate representation are known to exhibit higher flexibility, modularity, and domain/task portability compared with direct prosody generation TTS systems. However, corpus preparation is very expensive at a practical level of performance, because ToBI-labeled corpora have been constructed manually by many prosody experts, and accurate statistical prosody modeling normally requires a large amount of data. This paper proposes a new method that transcribes C-ToBI labels automatically in Chinese speech. We model Chinese prosody generation as a classification problem and apply conditional Maximum Entropy (ME) classification to this problem. We empirically verify the usefulness of various natural language and phonology features in building well-integrated features for the ME framework.
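
Conditional Maximum Entropy classification over discrete linguistic features is equivalent to multinomial logistic regression, so the break-index prediction step can be sketched roughly as below; the feature names, toy examples, and scikit-learn pipeline are illustrative assumptions, not the authors' feature set or tool chain.

```python
# Hedged sketch of an ME-style classifier for C-ToBI-like break indices
# (not the paper's system): feature dicts per syllable, multinomial logistic
# regression as the Maximum Entropy model. Feature names are assumptions.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each syllable gets contextual features; the target is a break index
# (e.g. "B0" no break ... "B3" major break).
train_feats = [
    {"pos": "n", "syll_in_word": 1, "word_len": 2, "punct_follows": 0},
    {"pos": "n", "syll_in_word": 2, "word_len": 2, "punct_follows": 0},
    {"pos": "v", "syll_in_word": 1, "word_len": 1, "punct_follows": 1},
]
train_labels = ["B0", "B1", "B3"]

me_classifier = make_pipeline(
    DictVectorizer(),                       # map feature dicts to vectors
    LogisticRegression(max_iter=1000),      # multinomial ME model
)
me_classifier.fit(train_feats, train_labels)

test = {"pos": "v", "syll_in_word": 1, "word_len": 1, "punct_follows": 1}
print(me_classifier.predict([test]))        # predicted break index
```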


Speech Outcomes of Submucous Cleft Palate Children With Double Opposing Z-Plasty Operation (Double Opposing Z-Plasty 수술 후의 점막하 구개열 아동의 말소리 개선에 관한 연구)

  • 최홍식;홍진희;김정홍;최성희;최재남;남지인
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.13 no.2 / pp.180-187 / 2002
  • Background and Objectives : The double opposing Z-plasty operation has been used to improve velopharyngeal function in submucous cleft palate. However, few reports on its effects on speech have been presented. The purpose of this study was to compare nasality, nasalance, and parental satisfaction before and after this operation and to assess how much speech improved. Materials and Methods : Ten children with submucous cleft palate who underwent double opposing Z-plasty were analyzed. We retrospectively studied nasalance, auditory perception of hypernasality, parental satisfaction, and speech evaluation using chart review, videotapes, and telephone interviews. Results : In 8 of the 10 patients, hypernasality was reduced, speech intelligibility was higher, and velar length increased by a mean of 0.35 points after the operation. After the operation, nasality improved (by 2.0 points) and the level of nasal emission decreased. Regarding satisfaction with the operation, the mean rating was 2.8 on a 5-point scale: 8 parents were satisfied with resonance and 3 parents with articulation. The main reason for dissatisfaction was compensatory articulation. Conclusion : To improve speech in submucous cleft palate, speech therapy after this operation, as well as successful surgery, should be considered.


Robust Voice Activity Detection in Noisy Environment Using Entropy and Harmonics Detection (엔트로피와 하모닉 검출을 이용한 잡음환경에 강인한 음성검출)

  • Choi, Gab-Keun;Kim, Soon-Hyob
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.1 / pp.169-174 / 2010
  • This paper describes an end-point detection method for improving speech recognition rates. The proposed method determines speech and non-speech regions using entropy and harmonic detection. End-point detection using the entropy of the speech spectral energy performs well in high-SNR environments (SNR 15 dB). In low-SNR environments (SNR 0 dB), however, the threshold level between speech and noise varies, so precise end-point detection is difficult. Therefore, this paper introduces an end-point detection method that uses speech spectral entropy together with harmonics. Experiments show better performance than conventional entropy-based methods.
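
The spectral-entropy cue (though not the harmonic-detection stage) can be sketched as follows; the frame size, hop size, entropy threshold, and synthetic test signal are illustrative assumptions rather than the parameters used in the paper.

```python
# Hedged sketch of entropy-based voice activity detection (not the authors'
# detector): frames whose normalized spectral energy distribution has low
# entropy contain structured (speech-like) energy, while noise-only frames
# have a flatter spectrum and higher entropy.
import numpy as np

def spectral_entropy_vad(signal, frame_len=400, hop=160, threshold=0.85):
    """Return a boolean array: True where a frame is classified as speech."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    decisions = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        frame = signal[i * hop: i * hop + frame_len] * np.hanning(frame_len)
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        p = spectrum / (spectrum.sum() + 1e-12)       # probability-like bins
        entropy = -np.sum(p * np.log2(p + 1e-12))
        entropy /= np.log2(len(p))                    # normalize to [0, 1]
        decisions[i] = entropy < threshold            # low entropy -> speech
    return decisions

# Example on synthetic data: a tone embedded in noise (16 kHz sample rate).
fs = 16000
t = np.arange(fs) / fs
noisy_tone = 0.5 * np.sin(2 * np.pi * 220 * t) + 0.05 * np.random.randn(fs)
print(spectral_entropy_vad(noisy_tone)[:10])
```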

The Effects of Voice and Speech Intelligibility Improvements in Parkinson Disease by Training Loudness and Pitch: A Case Study (강도 및 음도 조절을 이용한 훈련이 파킨슨병 환자의 음성 및 발화명료도 개선에 미치는 효과: 사례연구)

  • Lee, Ok-Bun;Jeong, Ok-Ran;Ko, Do-Heung
    • Speech Sciences / v.8 no.3 / pp.173-184 / 2001
  • The purpose of this study was to examine the effects of manipulating loudness and pitch on the speech intelligibility and voice of a patient with Parkinson's disease. The subject, who had been diagnosed with Parkinson's disease 11 years earlier, demonstrated a severely breathy voice with low intensity. Consonant articulation was intelligible only at the single-word level, and overall intelligibility in continuous speech was low. The results showed that the subject's articulation accuracy and speech intelligibility improved significantly after loudness and pitch training. Habitual F0, jitter, shimmer, F0 tremor, and amplitude tremor decreased after training. In addition, the HNR value increased after training. The changes in these acoustic parameters were closely related to the decrease of breathiness in the Parkinsonian voice, and this decrease in breathiness considerably affected speech intelligibility. Based on the experimental results, it was claimed that vocal training that manipulates loudness and pitch can be highly effective in improving voice quality and speech intelligibility in Parkinson's disease.


A Study on the Intonation Contours of Students' Groups by Oral Proficiency Level (말하기 숙달도에 따른 대학생 집단별 억양곡선 고찰)

  • Yang, Byung-Gon;Seo, Jun-Young
    • Speech Sciences / v.14 no.3 / pp.77-89 / 2007
  • This paper examined the intonation contours of English sentences produced by Korean university students. Thirty students participated in speaking tasks made up of three parts: an oral interview, a picture description, and a conversational text reading. Their productions were recorded on a minidisc. Two native instructors then evaluated their proficiency level, focusing on general intelligibility and suprasegmental aspects of the speech. Based on the evaluation results, the students were divided into high- and low-proficiency groups. The pitch contours of three sentences produced by both the Korean students and a native speaker were compared in Praat to find similarities and differences in the students' intonation patterns. Results showed a moderate correlation between the proficiency scores given by the two native raters. Secondly, students who earned high proficiency scores matched the native model more closely. Thirdly, the high-group students knew more about pitch contours and tried to realize them carefully, whereas fewer students in the low group answered positively on the questionnaire. In conclusion, English learners need to know the proper intonation patterns and to practice them consciously and sufficiently to realize correct intonation contours. Further studies of students' pronunciation focusing on discourse structure would be desirable.
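
One way to reproduce a pitch-contour comparison outside the Praat GUI is sketched below, using parselmouth (a Python interface to Praat); the file names and the simple resample-and-correlate comparison are illustrative assumptions, not the study's procedure.

```python
# Hedged sketch: extract F0 contours with parselmouth and compare a student's
# contour against a native model. Paths are placeholders.
import numpy as np
import parselmouth

def pitch_contour(path):
    """Extract the F0 contour (Hz), dropping unvoiced frames (F0 == 0)."""
    pitch = parselmouth.Sound(path).to_pitch()
    f0 = pitch.selected_array["frequency"]
    return f0[f0 > 0]

def contour_similarity(f0_a, f0_b, n_points=100):
    """Resample both contours to a common length and correlate them."""
    a = np.interp(np.linspace(0, 1, n_points),
                  np.linspace(0, 1, len(f0_a)), f0_a)
    b = np.interp(np.linspace(0, 1, n_points),
                  np.linspace(0, 1, len(f0_b)), f0_b)
    return np.corrcoef(a, b)[0, 1]

student = pitch_contour("student_sentence1.wav")   # placeholder path
native = pitch_contour("native_sentence1.wav")     # placeholder path
print("contour correlation:", contour_similarity(student, native))
```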


Text to Speech System from Web Images (웹상의 영상 내의 문자 인식과 음성 전환 시스템)

  • 안희임;정기철
    • Proceedings of the IEEK Conference / 2001.06c / pp.5-8 / 2001
  • Computer programs based on graphical user interfaces (GUIs) have become commonplace with advances in computer technology. Nevertheless, programs for the visually impaired have remained at the level of text-to-speech (TTS) programs, which prevents many visually impaired users from enjoying the pleasure and convenience of the information age. Noting the importance of character recognition in images, this paper describes the configuration of a system that converts the text in an image selected by the user into speech by extracting the character regions and performing character recognition.
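
A rough modern analogue of the pipeline described here is sketched below, using off-the-shelf OCR and TTS components as stand-ins for the paper's own character-extraction and recognition modules; the libraries and the image path are assumptions, not the system the authors built.

```python
# Hedged sketch of an image-to-speech pipeline (not the paper's system):
# recognize text in a user-selected image with pytesseract, then speak it
# with pyttsx3. The image path is a placeholder.
from PIL import Image
import pytesseract
import pyttsx3

def speak_text_in_image(image_path: str, lang: str = "eng") -> str:
    """Recognize text in the given image and read it aloud."""
    text = pytesseract.image_to_string(Image.open(image_path), lang=lang)
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return text

if __name__ == "__main__":
    print(speak_text_in_image("web_banner.png"))   # placeholder image
```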


Change of Dialect after Stroke (뇌졸중 후에 나타난 방언의 변화)

  • Kwon, Mi-Seon;Kim, Jong-S
    • Proceedings of the KSPS conference / 2005.11a / pp.127-130 / 2005
  • Foreign accent syndrome refers to segmental and suprasegmental changes in speech characteristics following a brain lesion that are perceived by listeners as a foreign accent. Changes in dialect after a stroke, however, have rarely been reported. We describe a patient who showed a prominent change of accent from one Korean dialect to another and discuss the alteration of prosodic patterns and the segmental-level changes in speech.
