• Title/Summary/Keyword: Speech Understanding


Study on Korean Proverb Comprehension in Patients with Right Hemisphere Damage (우리말 속담에 대한 우반구 손상 환자의 이해력 연구)

  • Ahn, Jong-Bok
    • Speech Sciences
    • /
    • v.15 no.3
    • /
    • pp.67-78
    • /
    • 2008
  • This study analyzed how well people with right hemisphere damage (RHD) understand Korean proverbs. A mixed between-subjects and within-subjects design was used in which all subjects participated in three experiments: 10 hospitalized patients with RHD (mean age 66.1 years) and 10 normal adults in the same age range. In Experiment I, stories related to Korean proverbs were composed and presented in writing; in Experiment II, a picture depicting a Korean proverb was presented together with the proverb itself; and in Experiment III, a story related to a Korean proverb was composed and presented auditorily. Five Korean proverbs were used in each experiment, 15 in total. The results were as follows. First, there was a significant difference in the understanding of Korean proverbs between normal adults and people with RHD. Second, there was no significant difference between the methods of presenting the proverbs. Third, people with RHD tended to interpret Korean proverbs literally: literal interpretations made up 55.36% of all incorrect responses. Taken together, these results indicate that people with RHD understand Korean proverbs less well than normal adults, that this deficit is not affected by the manner or condition of presentation, and that they are inclined to interpret Korean proverbs literally.


A Multi-Strategic Concept-Spotting Approach for Robust Understanding of Spoken Korean

  • Lee, Chang-Ki;Eun, Ji-Hyun;Jeong, Min-Woo;Lee, Gary Geun-Bae;Hwang, Yi-Gyu;Jang, Myung-Gil
    • ETRI Journal
    • /
    • v.29 no.2
    • /
    • pp.179-188
    • /
    • 2007
  • We propose a multi-strategic concept-spotting approach for robust spoken language understanding of conversational Korean in a hostile recognition environment such as in-car navigation and telebanking services. Our concept-spotting method adopts a partial semantic understanding strategy within a given specific domain since the method tries to directly extract predefined meaning representation slot values from spoken language inputs. In spite of partial understanding, we can efficiently acquire the necessary information to compose interesting applications because the meaning representation slots are properly designed for specific domain-oriented understanding tasks. We also propose a multi-strategic method based on this concept-spotting approach such as a voting method. We present experiments conducted to verify the feasibility of these methods using a variety of spoken Korean data.

  • PDF
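
The multi-strategic voting idea described in this abstract can be sketched roughly as follows: each understanding strategy proposes slot values independently, and a majority vote selects the final value per slot. The slot names and values below are invented for illustration; the paper's actual strategies and meaning representations differ.

```python
from collections import Counter

def vote_slots(candidate_frames):
    """Combine slot values proposed by several understanding strategies
    by majority vote, one decision per slot."""
    all_slots = set().union(*(frame.keys() for frame in candidate_frames))
    result = {}
    for slot in all_slots:
        values = [frame[slot] for frame in candidate_frames if slot in frame]
        result[slot] = Counter(values).most_common(1)[0][0]
    return result

# Three hypothetical strategies analyzing "Find a route to Daejeon station":
frames = [
    {"act": "search_route", "dest": "Daejeon station"},
    {"act": "search_route", "dest": "Daejeon"},
    {"act": "search_route", "dest": "Daejeon station"},
]
print(vote_slots(frames))  # majority wins per slot
```

Voting per slot rather than per whole frame means one strategy's error on a single slot does not discard its correct values elsewhere, which fits the partial-understanding strategy the abstract describes.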

A Study on the Multilingual Speech Recognition using International Phonetic Language (IPA를 활용한 다국어 음성 인식에 관한 연구)

  • Kim, Suk-Dong;Kim, Woo-Sung;Woo, In-Sung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.7
    • /
    • pp.3267-3274
    • /
    • 2011
  • Speech recognition technology has recently developed dramatically, driven by the growing range of mobile devices and the variety of speech recognition software. For multilingual speech recognition, however, a limited understanding of multilingual lexical models and limited system capacity hold back improvements in the recognition rate. It is difficult to embody speech from multiple languages in a single acoustic model, and systems that use several acoustic models lower the recognition rate. It is therefore necessary to research and develop a multilingual speech recognition system that embodies speech from various languages in a single acoustic model. Building on research into using a multilingual acoustic model on mobile devices, this paper studies a system that can recognize both Korean and English through the International Phonetic Alphabet (IPA). Focusing on finding an IPA model that satisfies both Korean and English phonemes, we obtained recognition rates of 94.8% for Korean and 95.36% for English.
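
The core idea of a shared IPA-based phone set can be sketched as follows: phonemes from each language's inventory are mapped onto common IPA symbols so that a single acoustic model covers both languages. The mappings below are illustrative toy tables, not the paper's actual Korean and English phoneme mappings.

```python
# Toy language-specific inventories mapped onto shared IPA symbols.
# These tables are invented for illustration only.
KOREAN_TO_IPA = {"ㅂ": "p", "ㄷ": "t", "ㄱ": "k", "ㅁ": "m", "ㄴ": "n", "ㅏ": "a", "ㅣ": "i"}
ENGLISH_TO_IPA = {"p": "p", "t": "t", "k": "k", "m": "m", "n": "n", "ah": "a", "iy": "i"}

def shared_phone_set(*mappings):
    """Union of IPA symbols reachable from every language-specific inventory;
    one acoustic model would be trained over this single phone set."""
    phones = set()
    for mapping in mappings:
        phones.update(mapping.values())
    return sorted(phones)

def to_ipa(units, mapping):
    """Map a language-specific phone sequence onto the shared IPA set."""
    return [mapping[u] for u in units]

print(shared_phone_set(KOREAN_TO_IPA, ENGLISH_TO_IPA))  # one inventory for both languages
print(to_ipa(["ㄱ", "ㅏ", "ㅁ"], KOREAN_TO_IPA))
```

Because both languages resolve to the same symbol set, training data from either language contributes to the same acoustic units, which is what lets a single model serve both.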

Deep Level Situation Understanding for Casual Communication in Humans-Robots Interaction

  • Tang, Yongkang;Dong, Fangyan;Yoichi, Yamazaki;Shibata, Takanori;Hirota, Kaoru
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.15 no.1
    • /
    • pp.1-11
    • /
    • 2015
  • A concept of deep level situation understanding is proposed to realize human-like natural communication (called casual communication) among multiple agents (e.g., humans and robots/machines). Deep level situation understanding consists of surface level understanding (such as gesture/posture understanding, facial expression understanding, and speech/voice understanding), emotion understanding, intention understanding, and atmosphere understanding, achieved by applying the customized knowledge of each agent and by taking thoughtfulness into consideration. The proposal aims to reduce the burden on humans in human-robot interaction and to realize harmonious communication by excluding unnecessary troubles and misunderstandings among agents, ultimately helping to create a peaceful, happy, and prosperous human-robot society. A simulated experiment is carried out to validate the deep level situation understanding system on a scenario in which a meeting-room reservation is made between a human employee and a secretary robot. The proposed system is intended for use in service robot systems, smoothing communication and avoiding misunderstanding among agents.

The Effect of Signal-to-Noise Ratio on Sentence Recognition Performance in Pre-school Age Children with Hearing Impairment (청각장애 유소아의 신호대소음비에 따른 문장인지 능력)

  • Lee, Mi-Sook
    • Phonetics and Speech Sciences
    • /
    • v.3 no.1
    • /
    • pp.117-123
    • /
    • 2011
  • Most individuals with hearing impairment have difficulty understanding speech in noisy situations. This study investigated sentence recognition ability, using the Korean Standard-Sentence Lists for Preschoolers (KS-SL-P2), in pre-school age children with cochlear implants and hearing aids. The subjects were 10 pre-school age children with hearing aids, 12 with cochlear implants, and 10 with normal hearing. Three signal-to-noise ratio (SNR) conditions (+10 dB, +5 dB, 0 dB) were applied. For all children with cochlear implants and hearing aids, sentence recognition scores increased significantly as the SNR increased, with the highest scores in speech noise obtained at +10 dB SNR. Significant differences existed between groups in sentence recognition ability, with the cochlear implant group performing better than the hearing aid group. These findings suggest that a sentence recognition test using speech noise is useful for evaluating pre-school age children's listening skills.

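
The SNR conditions used in such tests fix the ratio of speech power to noise power. A minimal sketch of how a noise masker is scaled to hit a target SNR, with toy deterministic signals standing in for recorded sentences and speech noise:

```python
import math

def scale_noise_for_snr(speech, noise, snr_db_target):
    """Scale the noise so that speech mixed with it has the target SNR,
    where SNR(dB) = 10 * log10(P_speech / P_noise)."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    target_p_noise = p_speech / (10 ** (snr_db_target / 10))
    gain = math.sqrt(target_p_noise / p_noise)
    return [n * gain for n in noise]

def measure_snr_db(speech, noise):
    """SNR of a speech/noise pair in decibels."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(p_speech / p_noise)

# Toy signals: speech power 1.0, noise power 0.25 before scaling.
speech = [1.0, -1.0] * 50
noise = [0.5, -0.5] * 50
for target in (10, 5, 0):  # the three conditions used in the study
    scaled = scale_noise_for_snr(speech, noise, target)
    print(target, round(measure_snr_db(speech, scaled), 6))
```

At 0 dB the scaled noise power equals the speech power, which is why that condition is the hardest of the three.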

The Role of Cognitive Control in Tinnitus and Its Relation to Speech-in-Noise Performance

  • Tai, Yihsin;Husain, Fatima T.
    • Journal of Audiology & Otology
    • /
    • v.23 no.1
    • /
    • pp.1-7
    • /
    • 2019
  • Self-reported difficulties in speech-in-noise (SiN) recognition are common among tinnitus patients. Whereas hearing impairment that usually co-occurs with tinnitus can explain such difficulties, recent studies suggest that tinnitus patients with normal hearing sensitivity still show decreased SiN understanding, indicating that SiN difficulties cannot be solely attributed to changes in hearing sensitivity. In fact, cognitive control, which refers to a variety of top-down processes that human beings use to complete their daily tasks, has been shown to be critical for SiN recognition, as well as the key to understand cognitive inefficiencies caused by tinnitus. In this article, we review studies investigating the association between tinnitus and cognitive control using behavioral and brain imaging assessments, as well as those examining the effect of tinnitus on SiN recognition. In addition, three factors that can affect cognitive control in tinnitus patients, including hearing sensitivity, age, and severity of tinnitus, are discussed to elucidate the association among tinnitus, cognitive control, and SiN recognition. Although a possible central or cognitive involvement has always been postulated in the observed SiN impairments in tinnitus patients, there is as yet no direct evidence to underpin this assumption, as few studies have addressed both SiN performance and cognitive control in one tinnitus cohort. Future studies should aim at incorporating SiN tests with various subjective and objective methods that evaluate cognitive performance to better understand the relationship between SiN difficulties and cognitive control in tinnitus patients.

A Study on Refusal Speech Act of Korean and Thai Learners from a Cross-Cultural Pragmatic Perspective (비교문화적 화용론의 관점에서 본 한국인과 태국인의 거절 화행 연구)

  • Hwang, Sunyoung;Noh, Ahsil;Kunghae, Samawadee
    • Journal of Korean language education
    • /
    • v.29 no.4
    • /
    • pp.225-254
    • /
    • 2018
  • The purpose of this study is to contrast how Korean and Thai learners realize and understand refusal speech acts. It addresses two questions: (1) Do Koreans and Thai learners perform refusal speech acts differently? (2) Do Koreans and Thai learners understand refusal speech acts differently? A discourse completion test (DCT) and a follow-up interview were conducted to collect data from two groups: 30 native Korean speakers and 30 native Thai speakers. For research question 1, we analyzed the refusal strategies and the reasons given by Koreans and Thai learners depending on the context. For research question 2, we ran a chi-squared test on elements of the follow-up interviews, such as the perceived burden of refusing and whether the participant would actually refuse. The differences between the refusal strategies of the two groups could be categorized by the preceding inducing speech act. In refusing a request, the difference was most prominent in the apologizing strategy; in refusing a suggestion, it lay mainly in the direct refusal strategy; and in refusing an invitation, the clearest difference was the number of refusal strategies employed. When explaining a refusal to people of high social status, Koreans gave more specific reasons, whereas Thai learners tended to use vaguer reasons. Moreover, when refusing an invitation, Koreans primarily mentioned the relationship, while Thai learners showed the spirit of Greng Jai. Asked about the burden of refusing, Koreans felt pressured to refuse a request from people of high social status and a suggestion or invitation from people with a high level of intimacy, while Thai learners found it highly difficult to refuse in all cases. Asked whether they would actually refuse, Koreans tried not to refuse people with a high level of intimacy, a trend that was not evident among the Thai participants. This study can help us better understand learners' pragmatic failures and can serve as a basis for establishing a curriculum for teaching speech acts.
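
The chi-squared comparison the study ran on the interview responses has this general shape; the contingency table below is invented for illustration and is not the study's data.

```python
def chi_squared(observed):
    """Pearson chi-squared statistic for a contingency table given as rows.
    Expected counts come from the row/column marginals under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table: rows = Korean / Thai group,
# columns = would refuse / would not refuse.
table = [[10, 20],
         [20, 10]]
print(chi_squared(table))
```

With 1 degree of freedom for a 2x2 table, the statistic is compared against the chi-squared distribution to decide whether the groups' response patterns differ significantly.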

What You Hear is What You See?

  • Moon, Seung-Jae
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.1E
    • /
    • pp.31-41
    • /
    • 2002
  • This study investigates the relationship between a voice and the image information it carries. Whenever we hear somebody talking, we form a mental image of the speaker. Is it accurate? Is there a relationship between the voice and the image it triggers? To answer these questions, speech samples from 8 males and 8 females were recorded. Two photos were taken of each speaker: a whole-body photo (W) showing physical characteristics, and a face close-up (F) revealing little physical detail. 361 subjects were asked to match the voices with the corresponding photos. The results showed that 5 males and 5 females (with W) and 2 males and 4 females (with F) were correctly identified. More interestingly, even in the mismatches there was a strong tendency for participants to agree on which voice should correspond to which photo. The participants also agreed much more readily on their favorite voice than on their favorite photo. It seems that a voice does carry certain information about the speaker's physical characteristics in a consistent manner. These findings have some bearing on understanding the mechanisms of speech production and perception, as well as on improving speech technology.

Modelling Duration In Text-to-Speech Systems

  • Chung Hyunsong
    • MALSORI
    • /
    • no.49
    • /
    • pp.159-174
    • /
    • 2004
  • The development of the durational component of prosody modelling in text-to-speech conversion of spoken English and Korean is reviewed and discussed, showing the strengths and weaknesses of each approach. The possibility of integrating linguistic feature effects into the duration modelling of TTS systems is also investigated. This paper claims that current approaches to speech timing synthesis still require an understanding of how segmental duration is affected by context. Three modelling approaches are discussed: sequential rule systems, Classification and Regression Tree (CART) models, and Sums-of-Products (SoP) models. The CART and SoP models perform well in predicting segment duration in English, but the SoP model does not for spoken Korean.

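
Two of the approaches discussed can be contrasted in a minimal sketch: a sequential rule system applies contextual scaling factors to an intrinsic duration in a fixed order, while a sums-of-products model sums terms that are each a product of factor parameters. The intrinsic durations and factors below are invented for illustration, not taken from any of the reviewed models.

```python
import math

INTRINSIC_MS = {"a": 120.0, "i": 90.0, "t": 60.0}  # hypothetical intrinsic durations

def rule_based_duration(segment, phrase_final=False, stressed=False):
    """Sequential rule system: start from the segment's intrinsic duration
    and apply contextual scaling factors one after another."""
    d = INTRINSIC_MS[segment]
    if stressed:
        d *= 1.2      # accentual lengthening (illustrative factor)
    if phrase_final:
        d *= 1.4      # phrase-final lengthening (illustrative factor)
    return d

def sop_duration(terms):
    """Sums-of-products model: the predicted duration is a sum of terms,
    each term a product of factor parameters selected by the context."""
    return sum(math.prod(term) for term in terms)

print(rule_based_duration("a", phrase_final=True))
print(sop_duration([[60.0, 1.1], [30.0, 1.4, 0.9]]))
```

In the rule system every factor multiplies the whole duration, whereas in an SoP model a factor can scale just one additive component, which is one reason the two families fit contextual effects differently.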

Statistical Korean Spoken Language Understanding System for Dialog Processing (대화처리를 위한 통계기반 한국어 음성언어이해 시스템)

  • Roh, Yoon-Hyung;Yang, Seong-II;Kim, Young-Gil
    • Annual Conference on Human and Language Technology
    • /
    • 2012.10a
    • /
    • pp.215-218
    • /
    • 2012
  • This paper describes a statistical spoken language understanding (SLU) system for Korean dialog processing. In dialog processing, the SLU system recognizes the user's intention from a speech-recognized sentence and encodes it as a semantic representation. A practical SLU system that reflects the characteristics of Korean must be robust, applicable, and extensible. Given the nature of spoken language, the system therefore performs no structural analysis and instead generates the user-intention representation using a mining technique. Feature addition and normalization were also performed to handle characteristics specific to Korean. The system is being developed for an information-service dialog system; in experiments on a training corpus for an in-vehicle information service, it achieved a sentence-level accuracy of about 89%.

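
The flat, parse-free output this abstract describes, an intent plus slot values extracted directly from the recognized sentence, looks roughly like this. The patterns below are hypothetical stand-ins for the paper's statistical mining model, and the slot names are invented.

```python
import re

# Hypothetical surface patterns standing in for the learned mining model;
# a real system would learn utterance-to-slot associations from a corpus.
SLOT_PATTERNS = {
    "destination": re.compile(r"(?:to|toward) (\w+)"),
    "poi_type": re.compile(r"(gas station|parking lot|restaurant)"),
}

def understand(utterance):
    """Produce a flat semantic frame (intent + slots) with no parse tree,
    mirroring the structure-free approach the abstract describes."""
    frame = {"intent": "navigate" if utterance.startswith("navigate") else "search"}
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            frame[slot] = match.group(1)
    return frame

print(understand("navigate to Daejeon"))
print(understand("find a gas station near here"))
```

Skipping structural analysis makes the system robust to the disfluencies and recognition errors typical of spoken input, at the cost of only partial understanding, the same trade-off noted in the concept-spotting entry above.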