• Title/Summary/Keyword: Prosodic Features


A study of prosodic features of patients with idiopathic Parkinson's disease (파킨슨병 환자와 정상노인 간의 문장 읽기에 나타난 운율 특성 비교)

  • Kang, Young-Ae;Seong, Cheol-Jae;Yoon, Kyu-Chul
    • Phonetics and Speech Sciences
    • /
    • v.3 no.1
    • /
    • pp.145-151
    • /
    • 2011
  • In view of the hypothesis that the effects of Parkinson's disease on voice production can be detected before pharmacological intervention, the prosodic features of patients with idiopathic Parkinson's disease (IPD) and a healthy aging group were analyzed diagnostically, with the long-term objective of establishing early disease-progression biomarkers for clinical use. Twenty patients (8 male, 12 female) with IPD (prior to pharmacological intervention) and a healthy control group of 22 (10 male, 12 female) were selected. Ten sentences were recorded with a head-worn microphone, and one sentence was chosen for the analysis in this paper. The relevant parameters, i.e., a 3-dimensional model (F0, intensity, duration) and pitch- and intensity-related slopes (maxEnergy, maxF0, meanAbS, semiT, meanEnergy, meanF0), were analyzed by two-group discriminant analysis. The stepwise estimation method of discriminant analysis was performed separately by gender. The discriminant functions correctly predicted 83.9% of the male test data and 93.1% of the female test data. The results showed that meanF0_slope and semiT_slope were more important than the other parameters for the male group, whereas meanEnergy_slope and maxEnergy_slope were the important ones for the female group. These findings indicate that the significant parameters differ between the male and female groups. Gender-related lifestyle may be responsible for this difference. The dysprosodic features of IPD appear not simultaneously but progressively in terms of F0, intensity, and duration.

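The abstract above names slope parameters such as meanF0_slope and semiT_slope but does not define them. As a minimal sketch only, assuming that "semiT" refers to a semitone scale and that a slope is the least-squares trend of a contour over time, such values could be computed as follows. The function names, the 100 Hz reference, and the F0 track are hypothetical and are not taken from the paper.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert an F0 value in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    if den == 0:
        raise ValueError("all time points are identical")
    return num / den


# Hypothetical F0 track (illustrative values, not data from the study).
times = [0.0, 0.5, 1.0, 1.5, 2.0]               # seconds
f0_hz = [200.0, 190.0, 180.0, 170.0, 160.0]     # Hz

mean_f0_slope = slope(times, f0_hz)                                # Hz per second
semit_slope = slope(times, [hz_to_semitones(f) for f in f0_hz])    # semitones per second
```

A semitone scale is often preferred for comparing speakers because it expresses pitch movement relative to a reference rather than in absolute Hz; whether the study used it this way is an assumption here.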

Annotation of a Non-native English Speech Database by Korean Speakers

  • Kim, Jong-Mi
    • Speech Sciences
    • /
    • v.9 no.1
    • /
    • pp.111-135
    • /
    • 2002
  • An annotation model of a non-native speech database has been devised, wherein English is the target language and Korean is the native language. The proposed annotation model features overt transcription of predictable linguistic information in native speech by the dictionary entry and several predefined types of error specification found in native language transfer. In that sense, the proposed model differs from previously explored annotation models in the literature, most of which are based on native speech. The validity of the newly proposed model is shown by its consistent annotation of 1) salient linguistic features of English, 2) contrastive linguistic features of English and Korean, 3) actual errors reported in the literature, and 4) the newly collected data in this study. The annotation method in this model adopts two widely accepted conventions: the Speech Assessment Methods Phonetic Alphabet (SAMPA) and the TOnes and Break Indices (ToBI). In the proposed annotation model, SAMPA is employed exclusively for segmental transcription and ToBI for prosodic transcription. The annotation of non-native speech is used to assess the speaking ability of English as a Foreign Language (EFL) learners.


Feature Extraction Based on DBN-SVM for Tone Recognition

  • Chao, Hao;Song, Cheng;Lu, Bao-Yun;Liu, Yong-Li
    • Journal of Information Processing Systems
    • /
    • v.15 no.1
    • /
    • pp.91-99
    • /
    • 2019
  • This paper proposes an innovative tone modeling framework based on deep neural networks for tone recognition. In the framework, both prosodic features and articulatory features are first extracted as the raw input data. Then, a 5-layer deep belief network is used to obtain high-level tone features. Finally, a support vector machine is trained to recognize tones. The 863-data corpus was used in the experiments, and the results show that the proposed method significantly improves recognition accuracy for all tone patterns. The average tone recognition rate reached 83.03%, which is 8.61% higher than that of the original method.

Knowledge-driven speech features for detection of Korean-speaking children with autism spectrum disorder

  • Seonwoo Lee;Eun Jung Yeo;Sunhee Kim;Minhwa Chung
    • Phonetics and Speech Sciences
    • /
    • v.15 no.2
    • /
    • pp.53-59
    • /
    • 2023
  • Detection of children with autism spectrum disorder (ASD) based on speech has relied on predefined feature sets due to their ease of use and the capabilities of speech analysis. However, clinical impressions may not be adequately captured due to the broad range and the large number of features included. This paper demonstrates that knowledge-driven speech features (KDSFs) specifically tailored to the speech traits of ASD are more effective and efficient for distinguishing the speech of ASD children from that of children with typical development (TD) than a predefined feature set, the extended Geneva Minimalistic Acoustic Standard Parameter Set (eGeMAPS). The KDSFs encompass various speech characteristics related to frequency, voice quality, speech rate, and spectral features that have been identified as corresponding to certain distinctive attributes of ASD speech. The speech dataset used for the experiments consists of 63 ASD children and 9 TD children. To alleviate the imbalance in the number of training utterances, a data augmentation technique was applied to the TD children's utterances. The support vector machine (SVM) classifier trained with the KDSFs achieved an accuracy of 91.25%, surpassing the 88.08% obtained using the predefined set. This result underscores the importance of incorporating domain knowledge in the development of speech technologies for individuals with disorders.

A Study of Fundamental Frequency about Voice Imitation (모방발화의 기본주파수 연구)

  • Park, Mi-Young;Shin, Ji-Young;Kang, Sun-Mee
    • Proceedings of the KSPS conference
    • /
    • 2004.05a
    • /
    • pp.199-204
    • /
    • 2004
  • The purpose of this paper is to identify the prosodic characteristics of voice imitation. Speakers change various phonetic features when imitating a voice, and in most cases they change their pitch range. The pitch range is especially important for word conditions. Moreover, as imitators change their voice, the average F0 value lies closer to the high-frequency level than to the low or middle level.


Prosodic Features at "Sentence Boundaries" in Oral Presentations

  • Umesaki, Atsuko-Furuta
    • MALSORI
    • /
    • no.41
    • /
    • pp.83-96
    • /
    • 2001
  • It is generally said that falling intonation is used at the end of a declarative sentence. However, this is not the case with all stretches of spontaneous speech which are marked in transcription as sentences. The present paper examines intonation patterns appearing at the end of declarative sentences in oral presentations, and discusses instances where falling intonation does not appear. The texts used for analysis are eight oral presentations collected at international conferences in the field of physics. Quantitative and qualitative analyses are carried out. Three major factors related to discourse structure have been found for non-occurrence of falling intonation at sentence boundaries.


Prosodic Features at "Sentence Boundaries" in Oral Presentations

  • Umesaki, Atsuko-Furuta
    • Proceedings of the KSPS conference
    • /
    • 2000.07a
    • /
    • pp.149-164
    • /
    • 2000
  • It is generally said that falling intonation is used at the end of a declarative sentence. However, this is not the case with all stretches of spontaneous speech which are marked in transcription as sentences. The present paper examines intonation patterns appearing at the end of declarative sentences in oral presentations, and discusses instances where falling intonation does not appear. The texts used for analysis are eight oral presentations collected at international conferences in the field of physics. Quantitative and qualitative analyses are carried out. Three major factors related to discourse structure have been found for nonoccurrence of falling intonation at sentence boundaries.


Perceptive evaluation of Korean native speakers on the polysemic sentence final ending produced by Chinese Korean learners (KFL중국인학습자들의 한국어 동형다의 종결어미 발화문에 대한 원어민화자의 지각 평가 양상)

  • Yune, Youngsook
    • Phonetics and Speech Sciences
    • /
    • v.12 no.4
    • /
    • pp.27-36
    • /
    • 2020
  • The aim of this study is to investigate the perceptual aspects of the polysemic sentence-final ending "-(eu)lgeol" produced by Chinese learners of Korean. "-(Eu)lgeol" has two different meanings, a guess and a regret, and these meanings are expressed by different prosodic features of the last syllable of "-(eu)lgeol". To examine how native Korean speakers perceive "-(eu)lgeol" sentences produced by Chinese learners of Korean, and to identify the most salient prosodic variable for the semantic discrimination of "-(eu)lgeol" at the perceptual level, we performed a perceptual experiment. The analysed material consisted of four Korean sentences containing "-(eu)lgeol", two expressing guesses and two expressing regret. Twenty-five native Korean speakers participated in the perceptual experiment. Participants were asked to mark whether the "-(eu)lgeol" sentences they listened to were (1) definitely regrets, (2) probably regrets, (3) ambiguous, (4) probably guesses, or (5) definitely guesses, based on the prosodic features of the last syllable of "-(eu)lgeol". The analysed prosodic variables were sentence boundary tones, slopes of boundary tones, pitch difference between sentence-final and penultimate syllables, and pitch levels of boundary tones. The results show that all the analysed prosodic variables are significantly correlated with the semantic discrimination of "-(eu)lgeol" and that, among these variables, the pitch difference between the sentence-final and penultimate syllables plays the most salient role.

Computer Codes for Korean Sounds: K-SAMPA

  • Kim, Jong-mi
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.4E
    • /
    • pp.3-16
    • /
    • 2001
  • An ASCII encoding of Korean has been developed for extended phonetic transcription of the Speech Assessment Methods Phonetic Alphabet (SAMPA). SAMPA is a machine-readable phonetic alphabet used for multilingual computing. It has been under development since 1987 and has been extended to more than twenty languages. The motivation for creating Korean SAMPA (K-SAMPA) is to label Korean speech for a multilingual corpus or to transcribe the native language (L1)-interfered pronunciation of a second language learner for bilingual education. Korean SAMPA represents each Korean allophone with a particular SAMPA symbol. Sounds that closely resemble one another are represented by the same symbol, regardless of the language they are uttered in. Each of its symbols represents a speech sound that is spectrally and temporally so distinct as to be perceptually different when the components are heard in isolation. Each type of sound has a separate IPA-like designation. Korean SAMPA is superior to other transcription systems with similar objectives. It describes the cross-linguistic sound quality of Korean better than the official Romanization system, proclaimed by the Korean government in July 2000, because it uses an internationally shared phonetic alphabet. It is also phonetically more accurate than the official Romanization in that it dispenses with orthographic adjustments. It is also more convenient for computing than the International Phonetic Alphabet (IPA) because it consists of the symbols on a standard keyboard. This paper demonstrates how Korean SAMPA can express allophonic details and prosodic features by adopting the transcription conventions of the extended SAMPA (X-SAMPA) and the prosodic SAMPA (SAMPROSA).


SOME PROSODIC FEATURES OBSERVED IN THE PASSAGE READING BY JAPANESE LEARNERS OF ENGLISH

  • Kanzaki, Kazuo
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.37-42
    • /
    • 1996
  • This study examines some prosodic features of English spoken by Japanese learners of English. It focuses on speech rates, pauses, and intonation when the learners read an English passage. Three Japanese learners of English, all male university students, were asked to read the speech material, an English passage of 110 words, at their normal reading speed. Then a native speaker of English, a male American English teacher, was asked to read the same passage. The Japanese speakers were also asked to read a Japanese passage of 286 letters (Japanese Kana) to compare their reading of English with that of Japanese. Their speech was analyzed on a computerized system (KAY Computerized Speech Lab). Waveforms, spectrograms, and F0 contours were shown on the screen to measure the duration of pauses, phrases, and sentences and to observe intonation contours. One finding of the experiment was that the movement of the three speakers' speech rates showed a similar tendency in their reading of the English passage. Reading of the Japanese passage by the three learners also showed a similar tendency in the movement of speech rates. Another finding was that the frequency of pauses in the learners' speech was greater than that in the speech of the native speaker, but that the ratio of the total pause length to the whole utterance length was about the same in both the learners' and the native speaker's speech. A similar tendency was observed in the learners' reading of the Japanese passage, except that they used shorter pauses in the mid-sentence position. As to intonation contours, we found that the learners used a narrower pitch range than the native speaker in their reading of the English passage, while they used a wider pitch range when they read the Japanese passage. It was found that the learners tended to use falling intonation before pauses, whereas the native speaker used different intonation patterns. These findings are applicable to the teaching of English pronunciation at the passage level in the sense that they can show the learners, Japanese here, what their problems are and how they could be solved.

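The abstract above contrasts two pause measures: pause frequency and the ratio of total pause length to whole utterance length. A minimal sketch of how the two can diverge is given below, assuming pauses have already been measured as (start, end) intervals in seconds. The function name and all interval values are hypothetical and are not measurements from the study.

```python
def pause_stats(pauses, utterance_duration):
    """Return (pause count, total pause length / utterance length).

    pauses is a list of (start, end) times in seconds.
    """
    if utterance_duration <= 0:
        raise ValueError("utterance_duration must be positive")
    total_pause = sum(end - start for start, end in pauses)
    return len(pauses), total_pause / utterance_duration


# Hypothetical readings of the same 10-second passage (illustrative values only).
learner_pauses = [(1.0, 1.5), (3.0, 3.5), (5.0, 5.5), (7.0, 7.5)]  # many short pauses
native_pauses = [(2.0, 3.0), (6.0, 7.0)]                           # fewer, longer pauses

learner_count, learner_ratio = pause_stats(learner_pauses, 10.0)
native_count, native_ratio = pause_stats(native_pauses, 10.0)
```

In this made-up case the learner pauses twice as often as the native speaker, yet both spend the same share of the utterance in silence, which is the pattern the abstract describes.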