• Title/Summary/Keyword: Speech rate

Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /ㅗ/와 /ㅜ/의 지각특성)

  • Byun, Hi-Gyung
    • Phonetics and Speech Sciences / v.12 no.3 / pp.1-14 / 2020
  • Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. As a substitute parameter for formants, it is reported that female speakers use phonation (H1-H2) differences to distinguish /o/ from /u/. This study aimed to explore whether H1-H2 values are being used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of 182 stimuli was also conducted to see if there is any correspondence between production and perception. The identification rate was 89% on average, 86% for /o/, and 91% for /u/. The results confirmed that when /o/ and /u/ cannot be distinguished in the F1/F2 space because they are too close, H1-H2 differences contribute significantly to the separation of the two vowels. However, in perception, this was not the case. H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still dominant cues. The study also showed that even though H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and both females and males do not use H1-H2 in their perception. It is presumed that H1-H2 has not yet been developed as a perceptual cue for /o/ and /u/.

Comparison of Cognitive Response Time according to Ageing and Cognitive Ability (노화 및 인지 능력에 따른 인지반응시간 비교)

  • Kim, Eun-Mi;Kim, Jung-Wan
    • Therapeutic Science for Rehabilitation / v.10 no.4 / pp.81-94 / 2021
  • Objective: Response time plays a prominent part in research on cognitive ability and the effects of aging. This study aimed to identify the impact of cognitive ability on information processing by measuring cognitive response time (CRT) with a computer program. Methods: The study included 30 normal elderly (NE) participants and 30 elderly participants with amnestic MCI (aMCI), aged 65-79 years and living in Daegu and Gyeongbuk. The results were analyzed using the statistical analysis program R 4.0.2 (University of Auckland, New Zealand). Results: In the three sub-areas of CRT, the total response time differed significantly by group or age, and the error rate differed significantly by age or group in some sub-areas. In the aMCI group, CRT performance correlated significantly with performance on the overall cognition and memory tests. Conclusion: CRT revealed differences in information processing and processing speed that depend on aging or cognitive ability. Performance on this test correlated significantly with performance on the overall cognition and memory tests. CRT could therefore serve as a simplified tool for predicting early cognitive disorder among elderly people in the community.

Increase of Spoken Number of Syllables Using MIT(Melody Intonation Therapy) : Case Studies on older adult with stroke and aphasia (MIT(Melodic Intonation Therapy) 중심의 음악활동을 이용한 실어증을 가진 뇌졸중 노인의 음절 수 증가에 대한 사례 연구)

  • Hong, Do Kyoung
    • Journal of Music and Human Behavior / v.2 no.2 / pp.57-67 / 2005
  • Most stroke patients experience not only physical difficulties but also speech and neurological disorders resulting from hemiplegia, and such unexpected changes cause psychological maladjustment and absent-mindedness. In particular, reduced physical ability can lead to serious emotional problems arising from failure or frustration in daily life. Treatment of stroke patients generally emphasizes physical rehabilitation, yet these patients often have considerable speech disorders such as aphasia or articulation disorders. Impaired cognitive function, mental disorders such as hypochondria, and even visual and auditory disorders also appear. It is therefore effective to integrate speech remediation with other treatments in the medical care setting. Because patients with language disorders very often withdraw psychologically, music therapy, which offers rich emotional experience, is well suited to patients with aphasia. This study primarily investigated the effect of 10 treatment sessions on the total number of syllables spoken, with the further aim of letting patients confirm their own worth through success on the given tasks and regain confidence in their abilities. All 10 sessions were scored according to the MIT manual, and improvement was measured by analyzing achievement within each level in order to show detailed changes in the total number of syllables spoken. The results of the program, which was organized from 2-syllable to 4-syllable items, are summarized as follows. Subject A completed Level I in the preliminary stage. On 2-syllable items, Subject A advanced to Level III in the fifth session and to Level IV in the seventh; on 3-syllable items, to Level III in the seventh session and to Level IV in the ninth; and on 4-syllable items, showed a low success rate of 8% in the first session, improved considerably by the sixth session after repeated practice, and advanced to Level III in the eighth session and to Level IV in the tenth. Subject B also completed Level I in the preliminary stage. On 2-syllable items, Subject B advanced to Level III in the fourth session and to Level IV in the sixth; on 3-syllable items, to Level III in the fifth session and to Level IV in the seventh; and on 4-syllable items, showed a low success rate of 10% in the first session, improved considerably in the fifth session, and advanced to Level III in the seventh session but did not reach Level IV by the tenth session. Although the effect of music therapy using MIT was not statistically significant, the total number of syllables spoken and the task success rate improved overall. Music intervention using MIT therefore appears to have a positive effect on the verbal ability and language rehabilitation of patients with Broca's aphasia.

A Study on Spoken Digits Analysis and Recognition (숫자음 분석과 인식에 관한 연구)

  • 김득수;황철준
    • Journal of Korea Society of Industrial Information Systems / v.6 no.3 / pp.107-114 / 2001
  • This paper describes connected digit recognition in Korean that takes acoustic features into account. The recognition rate for connected digits is usually lower than that for word recognition. We therefore employ speech feature parameters and acoustic features to build robust digit models, and we confirm the effect of considering acoustic features through recognition experiments. We used the KLE 4 connected digit database and 19 continuous-distribution HMMs as PLUs (Phoneme-Like Units) defined using phonetic rules. We tested two cases in the recognition experiments. In the first, we built the phoneme models with conventional features, namely Mel-cepstrum and regressive coefficients. In the second, we built them with expanded feature parameters and acoustic features. In both cases, we employed OPDP (One Pass Dynamic Programming) and FSA (Finite State Automata) for the recognition tests. When applying FSN for recognition, we applied various acoustic features. As a result, we obtained a recognition rate of 55.4% for Mel-cepstrum alone and 67.4% for Mel-cepstrum with regressive coefficients. We also obtained 74.3% for the expanded feature parameters and 75.4% when acoustic features were applied. Since applying acoustic features gave better results than the conventional method, we conclude that the proposed method is effective for connected digit recognition in Korean.

Arithmetic Fluctuation Effect affected by Induced Emotional Valence (유발된 정서가에 따른 계산 요동의 효과)

  • Kim, Choong-Myung
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.2 / pp.185-191 / 2018
  • This study examined the type and extent of interference between an induced emotion and a subsequent arithmetic operation. The experiment was carried out to determine the influence of induced emotions (anger, joy, and sorrow) and stimulus types (picture and sentence) on the cognitive processing load that may block interactions among the constituents of working memory. The subjects were 32 undergraduates of similar age and education, who were specifically instructed to attend to the induced emotion by imitating the facial expression and to make a correct decision during the remainder calculation task. The stimulus types showed no difference, but there was a significant difference among the induced emotion types: response time was slower for the positive emotion (joy condition) than for the other emotions (anger and sorrow). More specifically, error rates and delayed correct response rates for each emotion type were analysed to determine which phase the slower response was associated with. Delayed responses in the joy condition were associated with the error rate difference for sentence stimuli and with the delayed correct response rate for picture stimuli. These findings not only suggest that induced positive emotion increased response time compared to negative emotions, but also imply that picture stimuli readily produce arithmetic fluctuation whereas sentence stimuli result in arithmetic failure.

Derivation of Asymptotic Formulas for the Signal-to-Noise Ratio of Mismatched Optimal Laplacian Quantizers (불일치된 최적 라플라스 양자기의 신호대잡음비 점근식의 유도)

  • Na, Sang-Sin
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.5C / pp.413-421 / 2008
  • The paper derives asymptotic formulas for the MSE distortion and the signal-to-noise ratio of a mismatched fixed-rate minimum MSE Laplacian quantizer. These closed-form formulas are expressed in terms of the number $N$ of quantization points, the mean displacement $\mu$, and the ratio $\rho$ of the standard deviation of the source to that for which the quantizer is optimally designed. Numerical results show that the principal formula is accurate in that, for rate $R = \log_2 N \geq 6$, it predicts signal-to-noise ratios within 1% of the true values for a wide range of $\mu$ and $\rho$. The new findings herein include the fact that, for heavy variance mismatch of $\rho > 3/2$, the signal-to-noise ratio increases at the rate of $9/\rho$ dB/bit, which is slower than the usual 6 dB/bit, and the fact that an optimal uniform quantizer, though optimally designed, is slightly more than critically mismatched to the source. It is also found that signal-to-noise ratio loss due to $\mu$ is moderate. The derived formulas can be useful in quantization of speech or music signals, which are modeled well as Laplacian sources and have changing short-term variances.
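
To make the quoted growth rates concrete, the short sketch below evaluates the 9/ρ dB/bit figure for a few heavily mismatched values of ρ and compares it with the usual 6 dB/bit. It uses only the two numbers stated in the abstract; the helper name `heavy_mismatch_slope` and the constants are hypothetical, and the paper's full asymptotic formulas are not reproduced here.

```python
USUAL_SLOPE_DB_PER_BIT = 6.0    # the "usual 6 dB/bit" quoted in the abstract
HEAVY_MISMATCH_THRESHOLD = 1.5  # heavy variance mismatch means rho > 3/2


def heavy_mismatch_slope(rho: float) -> float:
    """Return the 9/rho dB/bit SNR growth rate quoted for rho > 3/2.

    Hypothetical helper: it only evaluates the rate stated in the abstract
    and makes no claim about the regime rho <= 3/2.
    """
    if rho <= HEAVY_MISMATCH_THRESHOLD:
        raise ValueError("the 9/rho rate is quoted only for rho > 3/2")
    return 9.0 / rho


if __name__ == "__main__":
    for rho in (2.0, 3.0, 4.5):
        slope = heavy_mismatch_slope(rho)
        print(f"rho={rho}: {slope:.2f} dB/bit (usual: {USUAL_SLOPE_DB_PER_BIT} dB/bit)")
```

Note that 9/ρ equals 6 exactly at ρ = 3/2, so the quoted rate falls below the usual figure only once the mismatch exceeds that threshold.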

A study on the lip shape recognition algorithm using 3-D Model (3차원 모델을 이용한 입모양 인식 알고리즘에 관한 연구)

  • 배철수
    • Journal of the Korea Institute of Information and Communication Engineering / v.3 no.1 / pp.59-68 / 1999
  • Recently, research and development of communication systems has moved toward using voice data together with the speaker's face image to achieve a higher recognition rate than voice data alone. We therefore present a method for lipreading in speech image sequences using a 3-D facial shape model. The method uses feature information from the face image such as the degree of lip opening, the movement of the jaw, and the protrusion height of the lips. First, we fit the 3-D face model to the image sequence of the speaking face. Then, to obtain the feature information, we compute the amount of variation from the fitted 3-D shape model over the image sequence and use this variation as the recognition parameters. We use intensity gradient values obtained from the variation of the 3-D feature points to segment the image sequence into recognition units. We then apply a discrete HMM algorithm in the recognition process, based on multiple observation sequences that fully take into account the variation of the 3-D feature points. In a recognition experiment with 8 Korean vowels and 2 Korean consonants, we obtained a recognition rate of about 80% for the plosives and vowels. Because the recognition experiment on the recognition parameters with the 10 Korean vowels yielded a high recognition rate, we propose that the feature vector is usable as a visual distinguishing factor.

Role of Animal Agriculture for the Quality of Human Life in the 21st Century - Review (Keynote Speech) -

  • Han, In K.
    • Asian-Australasian Journal of Animal Sciences / v.12 no.5 / pp.815-836 / 1999
  • The role of animal agriculture in the quality of human life was emphasized throughout the 20th century, and it is expected to become even more important in the future, both for food supply and for additional functions. The world human population has almost tripled in half a century. The world population of animals has increased 2~3 times (6 times for chickens) during the last 60 years, and the total amount of livestock products has increased 5~6 times (more than 10 times for pork), with a higher annual growth rate (9%) in developing countries. Increased personal income has clearly encouraged demand for animal products over grains, and scientific and technological advances have lowered animal production costs. Similarly, total grain production has more than doubled owing to advances in agricultural science during the latter part of the 20th century. The average human life span worldwide was only 46 years in the 1950s and will increase to almost 66 years in the year 2000. Present data clearly indicate that life span is proportional to income (GNP) and/or animal protein intake. The increase in human population indicates that the number of animals, as well as per capita consumption of animal products, will increase in the 21st century. Animals can also provide resources other than food: draft power, packing, riding, hunting, and herding. Guiding the blind, protection, and companionship are further examples of what we can expect from animals. In the very near future, animals will become major donors of organs and skin and producers of drugs or special functional foods. It may be concluded that animals are very closely associated with the quality of human life, and they are expected to remain so in the 21st century.

A Study on the Weight Allocation Method of Humanist Input Value and Multiplex Modality using Tacit Data (암묵 데이터를 활용한 인문학 인풋값과 다중 모달리티의 가중치 할당 방법에 관한 연구)

  • Lee, Won-Tae;Kang, Jang-Mook
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.14 no.4 / pp.157-163 / 2014
  • User sensitivity is recognized as a very important parameter in communication among companies, governments, and individuals. In many studies, researchers use voice tone, voice speed, facial expression, direction and speed of body movement, and gestures to recognize sensitivity. Multiplex modality is more precise than single modality; however, it has a limited recognition rate, multi-sensing overloads data processing, and a good algorithm is needed to derive the sensing value. That is, because each modality has a different concept and properties, errors may occur when converting human sensibility to standard values. To address this, the modality expressing sensibility needs to be extracted from the multiplex modality using techniques such as relational network analysis, context understanding, and digital filtering. If, in a specific situation, the priority modality and the other surrounding modalities are processed into implicit values to recognize sensibility, a robust system can be built relative to the computing resources consumed. This paper proposes a method for assigning weights to multiplex modalities using implicit data.

Vocabulary Recognition Performance Improvement using a convergence of Bayesian Method for Parameter Estimation and Bhattacharyya Algorithm Model (모수 추정을 위한 베이시안 기법과 바타차랴 알고리즘을 융합한 어휘 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.13 no.10 / pp.353-358 / 2015
  • A vocabulary recognition system built by recognizing a standard vocabulary shows degraded recognition for words outside the standard vocabulary or for similar words. In such cases, the system can be reconstructed to add to or extend the vocabulary range. This paper proposes a speech recognition learning model that combines the Bhattacharyya algorithm with Bayesian methods, which reflect parameter estimation in the scalability of the model configuration. A corrected standard model is recognized based on phoneme characteristics, using Bayesian methods for parameter estimation of the phoneme data and the Bhattacharyya algorithm for similar models. Recognition performance is evaluated with the recognition model configured by the Bhattacharyya algorithm. Applying the proposed method yielded a recognition rate of 97.3% and a learning curve of 1.2 seconds.
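
The abstract above names the Bhattacharyya algorithm as its measure for similar models but does not say how it is applied. As a minimal illustration only, the sketch below computes the standard Bhattacharyya distance between two discrete probability distributions; the function name, the use of discrete distributions, and the example values are assumptions and are not taken from the paper.

```python
import math
from typing import Sequence


def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Standard Bhattacharyya distance between two discrete distributions.

    Assumes p and q are non-negative, sum to 1, and share the same support.
    Illustrative only; this is not the paper's recognition model.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Bhattacharyya coefficient: overlap between the two distributions.
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    coefficient = min(1.0, coefficient)  # guard against rounding above 1
    if coefficient == 0.0:
        return math.inf  # no overlap at all
    return -math.log(coefficient)


if __name__ == "__main__":
    model_a = [0.5, 0.5]
    model_b = [0.9, 0.1]
    print(bhattacharyya_distance(model_a, model_a))  # identical distributions
    print(bhattacharyya_distance(model_a, model_b))  # partially overlapping
```

A smaller distance means the two distributions overlap more, which is why this kind of measure is commonly used to compare candidate models.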