• Title/Summary/Keyword: read speech

Voice Similarities between Sisters

  • Ko, Do-Heung
    • Speech Sciences / v.8 no.3 / pp.43-50 / 2001
  • This paper deals with voice similarities between sisters, who are assumed to share physiological characteristics inherited from the same biological mother. Nine pairs of sisters who are believed to have similar voices participated in the experiment. The speech samples from one pair were excluded from the analysis because their perceptual score was relatively low. The words were measured both in isolation and in context, and the subjects were asked to read the text five times with an interval of about three seconds between readings. Recordings were made at a natural speaking rate in a quiet room. The data were analyzed for pitch and formant frequencies using CSL (Computerized Speech Lab) and PCquirer. The data for initial vowels were found to be much more similar and homogeneous than those for vowels in other positions. The acoustic data showed strikingly high voice similarity in both pitch and formant frequencies. The statistical data obtained from this experiment may serve as a guideline for modelling speaker identification and speaker verification.

The effect of pronunciation teaching on the realization of English rhythm by Korean learners of English

  • Choe, Wook Kyung
    • Phonetics and Speech Sciences / v.14 no.2 / pp.19-28 / 2022
  • The current study was designed to explore whether taking English pronunciation classes could improve the realization of English rhythm by Korean learners of English. Specifically, this study used various rhythm metrics to examine the extent to which the learners' speech became rhythmically similar to the target language after taking classes that focused on English pronunciation. Sixteen learners who took a 15-week English pronunciation course at a university read an English passage twice (at the beginning and at the end of the semester). Rhythm metrics such as Deltas, Varcos, and Pairwise Variability Indices were calculated for the learners' speech, as well as for that of 8 native speakers of English. The results demonstrated that the learners' speech was slower than that of the native speakers and contained more frequent within-sentence pauses, even after the classes. The analyses also indicated that the speech recorded at the beginning of the semester differed rhythmically from the target language much more than the speech recorded at the end of the semester. After the classes, however, the learners' consonantal intervals became much more target-like, while the vocalic intervals were rhythmically even further from those in the target language. Overall, the findings suggested that the pronunciation classes helped the learners to produce English speech that was rhythmically similar to that of the native speakers.
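
The rhythm metrics named in this abstract have standard definitions, and a minimal sketch of them may help when reading the results. The code below is not taken from the paper: the function names, the use of the population standard deviation, and the example durations are assumptions made for illustration. Each function takes a list of interval durations (vocalic or consonantal) in milliseconds.

```python
from statistics import mean, pstdev

def delta(durations):
    """Delta: standard deviation of the interval durations.
    Assumption: population SD; some implementations use the sample SD."""
    return pstdev(durations)

def varco(durations):
    """Varco: standard deviation divided by the mean, times 100."""
    return 100 * pstdev(durations) / mean(durations)

def raw_pvi(durations):
    """Raw PVI: mean absolute difference between successive intervals."""
    diffs = [abs(a - b) for a, b in zip(durations, durations[1:])]
    return mean(diffs)

def normalized_pvi(durations):
    """Normalized PVI: each successive difference is divided by the
    mean of the two intervals, then averaged and multiplied by 100."""
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]
    return 100 * mean(terms)

# Hypothetical vocalic interval durations in milliseconds.
vocalic = [100, 200, 100, 200]
print(delta(vocalic), varco(vocalic), raw_pvi(vocalic), normalized_pvi(vocalic))
```

Delta and raw PVI depend on speech rate, while Varco and normalized PVI divide by a mean duration, which is why studies comparing slower learner speech with native speech often report the normalized variants as well.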

How Different are Learner Speech and Loanword Phonology?

  • Kim, Jong-Mi
    • Phonetics and Speech Sciences / v.1 no.3 / pp.3-18 / 2009
  • Do loanword properties emerge in the acquisition of a foreign language and if so, how? Classic studies in adult language learning assumed loanword properties that range from near-ceiling to near-chance level of appearance depending on speech proficiency. The present research argues that such variations reflect different phonological types, rather than speech proficiency. To investigate the difference between learner speech and loanword phonology, the current research analyzes speech data from 92 Korean speakers at five different proficiency levels who read 19 pairs of English words and sentences that contained loanwords. The experimental method is primarily acoustic: it tests whether the phonological cause in the loanwords (e.g., the insertion of an epenthetic vowel at the end of the word stamp) appears in learner speech, in comparison with native speech from 11 English speakers and 11 Korean speakers. The data investigated cover segment deletion, insertion, substitution, and alternation in both learner speech and native speech. The results indicate that in many cases learner speech does not show the loanword properties, and that whether it does depends on the type of phonological cause. The relatively easy acquisition of target pronunciation is evidenced in the cases of segment deletion, insertion, substitution, and alternation, except when the loanword property involves the successful command of the target phonology, such as the de-aspiration of [p] in apple. Such a case of difficult learning contrasts sharply with the cases of easy learning in the development of learner speech, particularly beyond the intermediate level of proficiency. Overall, learner speech departs from loanword phonology and develops toward native speech values, depending on the phonological contrasts in the native and foreign languages.

The relationship between fluency levels and suprasegmentals according to the sentence types in the English read speech by Korean middle school English learners (한국 중학생의 영어 읽기 발화에서 문장유형에 따른 유창성 등급과 초분절 요소의 관계)

  • Kim, Hwa-Young
    • Phonetics and Speech Sciences / v.14 no.3 / pp.51-66 / 2022
  • This study aims to help Korean learners of English improve their pronunciation by revealing which suprasegmentals bring their production of read English sentences closer to that of native English speakers. To this end, Korean middle school English learners were selected as subjects, and research data were gathered according to sentence type (declarative, interrogative, imperative, and exclamative), as well as syllables. Among suprasegmentals, speech rate, pause frequency, pause duration, F0 range, and rhythm were used to analyze these English sentence utterances. Mean analysis, correlation analysis, and regression analysis were performed. The results showed that speech rate, pause frequency, pause duration, and F0 range affected the evaluation of fluency levels. In the regression analysis between all suprasegmentals and fluency levels, the suprasegmentals that most affected fluency levels were speech rate and F0 range. Rhythm had no meaningful relation with fluency levels. Therefore, when teaching English pronunciation, it is necessary to teach students to increase their speech rate and F0 range. In addition, students should be trained to reduce both the number and the duration of pauses during utterance to improve their fluency. It is noteworthy that, of the four sentence types, exclamative sentences were produced with a faster speech rate, fewer pauses, shorter pause duration, and higher rhythm values.
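
As a rough illustration of how the suprasegmental measures listed in this abstract could be derived from a labelled recording, here is a minimal sketch. The abstract does not give the measurement procedure, so the interval format, the function name `fluency_measures`, the choice to count pause time in the speech-rate denominator, and the example values are all assumptions.

```python
def fluency_measures(intervals, f0_values):
    """Compute simple fluency-related measures for one utterance.

    intervals: list of (kind, start_s, end_s, n_syllables), where kind is
               "speech" or "pause" (hypothetical annotation format).
    f0_values: list of F0 samples in Hz taken from the voiced frames.
    """
    total_time = sum(end - start for _, start, end, _ in intervals)
    syllables = sum(n for kind, _, _, n in intervals if kind == "speech")
    pauses = [end - start for kind, start, end, _ in intervals if kind == "pause"]
    return {
        # Assumption: speech rate is syllables per second including pause time.
        "speech_rate": syllables / total_time,
        "pause_frequency": len(pauses),
        "pause_duration": sum(pauses),
        "f0_range": max(f0_values) - min(f0_values),
    }

# Hypothetical annotation of a single read sentence.
intervals = [
    ("speech", 0.0, 1.5, 6),
    ("pause", 1.5, 2.0, 0),
    ("speech", 2.0, 3.0, 4),
]
f0_values = [180.0, 220.0, 250.0, 190.0]
measures = fluency_measures(intervals, f0_values)
print(measures)
```

Values like these, computed per utterance, are the kind of predictors that a correlation or regression analysis against fluency ratings would take as input.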

Executive function and Korean children's stop production

  • Eun Jong Kong;Hyunjung Lee;Jeffrey J. Holliday
    • Phonetics and Speech Sciences / v.15 no.3 / pp.45-52 / 2023
  • Previous studies have established a role for cognitive differences in explaining variability in speech processing across individuals. In the case of perceptual cue weighting in the context of a sound change, studies have produced conflicting results regarding the relationship between executive function and the use of redundant cues. The current study aimed to explore this relationship in acoustic cue weighting during speech production. Forty-one Korean-speaking children read a list of stop-initial words and completed two tests that assess executive function, i.e., Dimensional Change Card Sorting (DCCS) and digit n-back. Voice onset time (VOT) and fundamental frequency (F0) were measured in each word, and analyses were carried out to determine the extent to which children's executive function predicted their use of both informative and less informative cues to the three pairs comprising the Korean three-way stop laryngeal contrast. No evidence was found for a relationship between cognitive ability and acoustic cue weighting in production, which is at odds with previous, albeit conflicting, results for speech perception. While this result may be due to the lack of task demands in the production task used here, it nevertheless expands the empirical ground upon which future work in this area may proceed.

Table Structure Recognition in Images for Newspaper Reader Application for the Blind (시각 장애인용 신문 구독 프로그램을 위한 이미지에서 표 구조 인식)

  • Kim, Jee Woong;Yi, Kang;Kim, Kyung-Mi
    • Journal of Korea Multimedia Society / v.19 no.11 / pp.1837-1851 / 2016
  • Newspaper reader mobile applications that use a text-to-speech (TTS) function enable blind people to read newspaper content. However, tables cannot easily be read by the reader program because most tables are stored as images in the content. Even if OCR (optical character recognition) programs are used to recognize the letters in the table images, OCR cannot simply be applied to the table reading function because the table structure is unknown to the reader program. Therefore, the exact location of each table cell that contains text must be identified beforehand. In this paper, we propose an efficient image processing algorithm that recognizes all the cells in a table by identifying the columns and rows in table images. From the cell location data provided by the table column and row identification algorithm, we can generate table structure information and table reading scenarios. Our experimental results with table images commonly found in newspapers show that our cell identification approach has 100% accuracy for simple black and white table images and about 99.7% accuracy for colored and complicated tables.
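
The abstract does not describe the algorithm itself, so the following is only a generic sketch of one common way to locate cells from rows and columns: projection profiles over a binarized image. It is not the authors' method. The function names, the `min_fill` threshold, and the tiny example image are assumptions, and real newspaper images would need binarization and noise handling first.

```python
def line_positions(profile, length, min_fill=0.9):
    """Indices whose dark-pixel count covers at least min_fill of the span,
    treated here as ruling lines (assumed threshold)."""
    return [i for i, count in enumerate(profile) if count >= min_fill * length]

def gaps(lines):
    """(start, end) ranges between consecutive ruling lines, end exclusive.
    Adjacent indices (a thick line) produce no range."""
    ranges = []
    prev = None
    for pos in lines:
        if prev is not None and pos - prev > 1:
            ranges.append((prev + 1, pos))
        prev = pos
    return ranges

def find_cells(image):
    """Return cell boxes as (top, left, bottom, right), bottom/right exclusive.
    image is a list of rows of 0/1 values, where 1 marks a dark pixel."""
    height, width = len(image), len(image[0])
    row_profile = [sum(row) for row in image]
    col_profile = [sum(image[y][x] for y in range(height)) for x in range(width)]
    h_lines = line_positions(row_profile, width)
    v_lines = line_positions(col_profile, height)
    return [(top, left, bottom, right)
            for top, bottom in gaps(h_lines)
            for left, right in gaps(v_lines)]

# Hypothetical 2x2 table; the extra 1 at row 1, column 1 stands in for text.
image = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
cells = find_cells(image)
print(cells)
```

Once cell boxes are known, each box can be passed to OCR separately and read out in row and column order, which is the kind of table reading scenario the abstract refers to.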

A Study of an Independent Evaluation of Prosody and Segmentals: With Reference to the Difference in the Evaluation of English Pronunciation across Subject Groups (운율 및 분절음의 독립적 발음 평가 연구: 평가자 집단의 언어별 차이를 중심으로)

  • Park, Hansang
    • Phonetics and Speech Sciences / v.5 no.4 / pp.91-98 / 2013
  • This study investigates the difference in the evaluation of foreign-accentedness of English pronunciation across subject groups, evaluated accents, and compared components. It independently evaluates the prosody and segmentals of foreign-accented English sentences by pairwise difference rating. Using the prosody swapping technique, the segmentals and prosody of English sentences read by native speakers of American English (one male and one female) were combined with the corresponding segmentals and prosody of the English sentences read by native speakers of Chinese, Japanese, or Korean (one male and one female from each native language). These stimuli were evaluated by 4 different subject groups: native speakers of American English, Korean, Chinese, and Japanese. The results showed that the Japanese subject group scored higher in prosody difference than in segmental difference, while the other groups showed the opposite pattern. This study is significant in showing that the attitude toward differences in the segmentals and prosody of foreign accents of English varies with the native language of the subject group. In other words, for native speakers of some languages, the difference in prosody could have a greater influence on foreign-accentedness than the difference in segmentals, while for native speakers of other languages the reverse holds.

Elementary School Aged Children's Reading Fluency in Terms of Family Income and Receptive Vocabulary (소득수준과 언어수준에 따른 초등생의 읽기유창성 비교)

  • Ku, Kayoung;Seol, Ahyoung;Pae, Soyeong
    • Phonetics and Speech Sciences / v.7 no.2 / pp.29-38 / 2015
  • This study explores reading fluency among elementary school students in relation to language level and family income (low SES). Forty-eight students from 1st to 3rd grade participated in two paragraph reading tasks. Half of the children were from low-income families and half of the children had low lexical knowledge. Reading fluency, measured as the number of correctly read syllables per minute, together with the total error frequency and error types, was used to compare the groups. There were significant differences in the number of correctly read syllables per minute between the two income groups and between the two language groups. There was a significant difference between the low-income group and the non-low-income group in the total number of errors only when children's lexical knowledge was low. There were no group differences in the error types of repetition and omission. Substitution and insertion errors seemed to reflect the total error pattern. These results imply the importance of early screening and early involvement for children with low lexical knowledge from low-income families. Monitoring and early intervention will support these children's reading development.

The Prosodic Characteristics of Korean Read Sentences in Discourse Context (한국어 낭독체 담화문의 운율적 특징 - 단독발화문과 연속발화문의 비교를 통하여 -)

  • Seong Cheol-Jae
    • MALSORI / no.35_36 / pp.1-12 / 1998
  • This study aims to investigate the prosodic characteristics of Korean discourse sentences, focusing especially on the initial and final parts of a sentence. Fifty discourse sentences were read in two different styles: one sentence at a time, and all 50 continuously. We first derived two ratios from the acoustic results: the ratio of the final syllable to the initial syllable in the first word of a sentence, and the ratio of the final syllable to the initial syllable in the last word of a sentence. We then calculated statistics for the ratios, including the mean, standard deviation, minimum, maximum, and p-values from t-tests. With respect to duration, there was little difference between the two styles. If anything, durations at the beginning of sentences in continuous reading were slightly irregular; that is, some deviation from the standard pattern could be observed. For F0, there was a prominent statistical difference between the ratios of the last words in the two styles. This difference might play a role as a prosodic feature. Energy seems to show a pattern similar to that of F0. The results showed that the final syllable of the last word was pronounced at about 85% of the initial syllable in the same context, and that the last words in continuous speech were strongly articulated compared with those in sentence-by-sentence reading.

Performance of Korean spontaneous speech recognizers based on an extended phone set derived from acoustic data (음향 데이터로부터 얻은 확장된 음소 단위를 이용한 한국어 자유발화 음성인식기의 성능)

  • Bang, Jeong-Uk;Kim, Sang-Hun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.11 no.3 / pp.39-47 / 2019
  • We propose a method to improve the performance of spontaneous speech recognizers by extending their phone set using speech data. In the proposed method, we first extract variable-length phoneme-level segments from broadcast speech signals and convert them to fixed-length latent vectors using a long short-term memory (LSTM) classifier. We then cluster acoustically similar latent vectors and build a new phone set by choosing the number of clusters with the lowest Davies-Bouldin index. We also update the lexicon of the speech recognizer by choosing, for each word, the pronunciation sequence with the highest conditional probability. To analyze the acoustic characteristics of the new phone set, we visualize its spectral patterns and segment durations. Through speech recognition experiments using a larger training data set than in our previous work, we confirm that the new phone set yields better performance than the conventional phoneme-based and grapheme-based units in both spontaneous speech recognition and read speech recognition.
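
The cluster-count selection step mentioned in this abstract can be sketched as follows. This is only an illustration of the Davies-Bouldin criterion, not the authors' pipeline: the LSTM latent vectors are replaced by hypothetical 2-D points, the candidate labelings are written by hand rather than produced by a clustering algorithm, and all names are assumptions.

```python
from math import dist

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def davies_bouldin(points, labels):
    """Davies-Bouldin index: the average, over clusters, of the worst-case
    ratio (scatter_i + scatter_j) / distance(center_i, center_j).
    Lower values indicate compact, well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centers = {lab: centroid(ps) for lab, ps in clusters.items()}
    scatter = {lab: sum(dist(p, centers[lab]) for p in ps) / len(ps)
               for lab, ps in clusters.items()}
    total = 0.0
    for i in clusters:
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in clusters if j != i)
    return total / len(clusters)

# Hypothetical stand-ins for latent vectors: three tight groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0), (20, 1)]

# Hypothetical labelings for two candidate cluster counts.
candidates = {
    2: [0, 0, 0, 0, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
scores = {k: davies_bouldin(points, labels) for k, labels in candidates.items()}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

In practice the labelings would come from running a clustering algorithm once per candidate cluster count, and the count with the lowest index would define the size of the new phone set.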