• Title/Summary/Keyword: natural speech


Speech Emotion Recognition by Speech Signals on a Simulated Intelligent Robot (모의 지능로봇에서 음성신호에 의한 감정인식)

  • Jang, Kwang-Dong;Kwon, Oh-Wook
    • Proceedings of the KSPS conference / 2005.11a / pp.163-166 / 2005
  • We propose a speech emotion recognition method for a natural human-robot interface. In the proposed method, emotion is classified into six classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We record speech commands and dialogs uttered 2 m away from the microphones in 5 different directions. Experimental results show that the proposed method yields 59% classification accuracy while human classifiers give about 50% accuracy, which confirms that the proposed method achieves performance comparable to a human.

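The abstract above lists jitter and shimmer among its features but does not say how they are computed. As a rough illustration only, the sketch below uses the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean). The function names and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods):
    """Cycle-to-cycle variation of pitch periods (assumed 'local jitter' definition)."""
    return relative_perturbation(periods)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of peak amplitudes (assumed 'local shimmer' definition)."""
    return relative_perturbation(peak_amplitudes)


if __name__ == "__main__":
    # Hypothetical pitch periods in seconds and peak amplitudes for one voiced segment.
    periods = [0.0100, 0.0102, 0.0099, 0.0101]
    amplitudes = [0.80, 0.78, 0.82, 0.79]
    print(f"jitter:  {local_jitter(periods):.4f}")
    print(f"shimmer: {local_shimmer(amplitudes):.4f}")
```

Utterance-level statistics of such per-segment values could then feed a classifier, in line with the abstract's "statistics of phonetic and prosodic information".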

Conveyed Message in YouTube Product Review Videos: The discrepancy between sponsored and non-sponsored product review videos

  • Kim, Do Hun;Suh, Ji Hae
    • The Journal of Information Systems / v.32 no.4 / pp.29-50 / 2023
  • Purpose: The impact of online reviews is widely acknowledged, with extensive research focused on text-based reviews. However, research on reviews in video format is lacking. To address this gap, this study explores the connection between company-sponsored product review videos and the extent of directive speech within them. It also analyzes viewer sentiment expressed in video comments according to the level of directive speech used by the presenter. Design/methodology/approach: This study analyzed speech acts in review videos by sponsorship status and examined consumer reactions through sentiment analysis of comments. Speech Act theory was used to perform the analysis. Findings: YouTubers who receive company sponsorship for review videos tend to employ more directive speech. Furthermore, this increased use of directive speech is associated with a higher occurrence of negative consumer comments. The study's outcomes are valuable for the fields of user-generated content and natural language processing, offering practical insights for YouTube marketing strategies.

Occupational Performance of Hearing-Impaired and Normal-Hearing Workers in Korea

  • Kim, Jinsook;Shin, Yerim;Lee, Seungwan;Lee, Eunsung;Han, Woojae;Lee, Jihyeon
    • Journal of Audiology & Otology / v.25 no.4 / pp.189-198 / 2021
  • Background and Objectives: This study aimed to investigate the occupational performance of Korean workers with and without hearing loss and analyze the hearing-related difficulties in the working environment. Subjects and Methods: The Amsterdam checklist for hearing and work was used for the analyses and the occupational environments of the Korean workers were investigated. Out of 129 total participants, 86 workers had severe to profound hearing loss and 43 had normal hearing. The hearing-impaired workers were recruited from two leading vocational centers, and the normal-hearing workers were their colleagues. Results: The hearing-impaired workers took fewer sick leaves and had higher rates of permanent job status than the normal-hearing workers. Workers with hearing loss rarely detected background sound; however, they perceived reverberation more frequently. They felt more satisfied with their careers than the normal-hearing workers as they received social support and needed to put effort into hearing for most hearing activities. Furthermore, the effort in hearing increased with increases in job demand, job control, social support, and career satisfaction. The working hours per week increased with increases in age, education level, job demand, job control, and social support. Different trends were observed in 9 out of 12 variables when comparing the data from the present study with those obtained from hearing-impaired workers in the Netherlands, indicating a large difference between countries. Conclusions: Although the hearing-impaired Korean workers work diligently in good job positions, it is necessary to enhance their acoustic environment and provide them with social support. Considering the cultural background of the hearing-impaired workers, the development of suitable vocational rehabilitation programs and specific questionnaires is strongly recommended worldwide.

A Comparison Between the Korean Digits-in-Noise Test and the Korean Speech Perception-in-Noise Test in Normal-Hearing and Hearing-Impaired Listeners

  • Kim, Subin;You, Sungwha;Sohn, Myoung Eun;Han, Woojae;Seo, Jae-Hyun;Oh, Yonghee
    • Journal of Audiology & Otology / v.25 no.4 / pp.171-177 / 2021
  • Background and Objectives: The purpose of the present study was to validate the performance and diagnostic efficacy of the Korean digits-in-noise (K-DIN) test in comparison to the Korean speech perception-in-noise (K-SPIN) test, which is the representative speech-in-noise test in clinical practice. Subjects and Methods: Twenty-seven subjects (15 normal-hearing and 12 hearing-impaired listeners) participated. The recorded Korean digits 0-9 were used to form quasirandom digit triplets; 50 target digit triplets were presented at each subject's most comfortable level while speech-shaped background noise was presented at various signal-to-noise ratios (-12.5, -10, -5, or +5 dB). Subjects were then instructed to listen to both the target and the noise masker unilaterally and bilaterally through headphones. The K-SPIN test was also conducted using the same procedure as the K-DIN. After the percent correct responses were calculated, K-DIN and K-SPIN results were compared using a Pearson correlation test. Results: Results showed a statistically significant correlation between K-DIN and K-SPIN in all hearing conditions (left: r=0.814, p<0.001; right: r=0.788, p<0.001; bilateral: r=0.727, p<0.001). Moreover, compared with the K-SPIN test, the K-DIN test achieved better testing efficacy, a shorter average listening time (5 min vs. 30 min), and easier task performance according to participants' qualitative reports. Conclusions: In this study, the Korean version of the digit triplet test was validated in both normal-hearing and hearing-impaired listeners. The findings suggest that the K-DIN test can be used as a simple and time-efficient hearing-in-noise test in audiology clinics in Korea.
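The comparison described in this abstract rests on a Pearson correlation between per-subject percent-correct scores on two tests. A minimal sketch of that calculation follows. The function name `pearson_r` and the scores are hypothetical illustrations, not data from the study, and the study's p-values are not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up percent-correct scores for five listeners on two speech-in-noise tests.
    test_a = [96.0, 88.0, 72.0, 54.0, 40.0]
    test_b = [92.0, 80.0, 75.0, 50.0, 35.0]
    print(f"r = {pearson_r(test_a, test_b):.3f}")
```

A value of r near 1 would indicate that listeners who score well on one test also tend to score well on the other, which is the sense in which the abstract reports agreement between the two tests.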

Occupational Performance of Hearing-Impaired and Normal-Hearing Workers in Korea

  • Kim, Jinsook;Shin, Yerim;Lee, Seungwan;Lee, Eunsung;Han, Woojae;Lee, Jihyeon
    • Korean Journal of Audiology / v.25 no.4 / pp.189-199 / 2021

A Comparison Between the Korean Digits-in-Noise Test and the Korean Speech Perception-in-Noise Test in Normal-Hearing and Hearing-Impaired Listeners

  • Kim, Subin;You, Sungwha;Sohn, Myoung Eun;Han, Woojae;Seo, Jae-Hyun;Oh, Yonghee
    • Korean Journal of Audiology / v.25 no.4 / pp.171-177 / 2021

A Study on Image Recommendation System based on Speech Emotion Information

  • Kim, Tae Yeun;Bae, Sang Hyun
    • Journal of Integrative Natural Science / v.11 no.3 / pp.131-138 / 2018
  • In this paper, we implement an image matching and recommendation system that uses the emotion information in the user's speech. To classify the emotional information of the user's speech, it is extracted and classified using the PLP algorithm. After classification, an emotional speech DB is constructed. In addition, to classify the emotional information of images, emotional colors and emotional vocabulary obtained through factor analysis are matched in a single space. A standardized image recommendation system is then built that uses the BM-GA algorithm to match keywords between the emotional information of speech and the emotional information of images, so that images are recommended according to the emotion in the user's speech. In the performance evaluation, the recognition rate of standardized vocabulary in four stages according to speech was 80.48% on average, and system user satisfaction was 82.4%. Therefore, the classification of images according to the user's speech information is expected to be helpful for the study of emotional exchange between the user and the computer.

Speech Animation Synthesis based on a Korean Co-articulation Model (한국어 동시조음 모델에 기반한 스피치 애니메이션 생성)

  • Jang, Minjung;Jung, Sunjin;Noh, Junyong
    • Journal of the Korea Computer Graphics Society / v.26 no.3 / pp.49-59 / 2020
  • In this paper, we propose a speech animation synthesis method specialized for Korean through a rule-based co-articulation model. Speech animation has been widely used in the cultural industry, such as movies, animations, and games that require natural and realistic motion. However, because techniques for audio-driven speech animation have mainly been developed for English, the animation results for domestic content are often visually very unnatural. For example, a voice actor's dubbing is played with no mouth motion at all or, at best, with an unsynchronized loop of simple mouth shapes. Although there are language-independent speech animation models, which are not specialized for Korean, they have yet to reach the quality needed for domestic content production. Therefore, we propose a natural speech animation synthesis method that reflects the linguistic characteristics of Korean and is driven by input audio and text. Reflecting the fact that vowels mostly determine the mouth shape in Korean, we define a co-articulation model that separates the lips and the tongue to solve the previous problems of lip distortion and occasionally missing phoneme characteristics. Our model also reflects the differences in prosodic features for improved dynamics in speech animation. Through user studies, we verify that the proposed model can synthesize natural speech animation.

Speech Perception and Gap Detection Performance of Single-Sided Deafness under Noisy Conditions

  • Kwak, Chanbeom;Kim, Saea;Lee, Jihyeon;Seo, Youngjoon;Kong, Taehoon;Han, Woojae
    • Journal of Audiology & Otology / v.23 no.4 / pp.197-203 / 2019
  • Background and Objectives: Many studies have reported no benefit for sound localization but improved speech understanding in noise after treating patients with single-sided deafness (SSD). Furthermore, their performance showed large individual differences. The present study aimed to measure speech perception and gap detection in noise in SSD patients to better understand the nature of their hearing. Subjects and Methods: Nine SSD patients with different onset and period of hearing deprivation and 20 young adults with normal hearing and simulated conductive hearing loss as the control groups completed speech perception in noise (SPIN) and Gap-In-Noise (GIN) tests. The SPIN test measured how many presented sentences were understood at +5 and -5 dB signal-to-noise ratios. The GIN test asked participants to find the shortest gap in white noise containing gaps of different lengths. Results: Compared to the groups with normal hearing and simulated instant hearing loss, the SSD group showed much poorer performance in both SPIN and GIN tests while supporting central auditory plasticity of the SSD patients. Rather than a longer period of deafness, the large individual variance indicated that the congenital SSD patients showed better performance than the acquired SSD patients in the two measurements. Conclusions: The results suggest that comprehensive assessments should be implemented before any treatment of SSD patients, considering their onset time and etiology, although these findings need to be generalized with a larger sample size.

Speech Perception and Gap Detection Performance of Single-Sided Deafness under Noisy Conditions

  • Kwak, Chanbeom;Kim, Saea;Lee, Jihyeon;Seo, Youngjoon;Kong, Taehoon;Han, Woojae
    • Korean Journal of Audiology / v.23 no.4 / pp.197-203 / 2019