Development and validation of a Korean Affective Voice Database

Kim, Yeji;Song, Hyesun;Jeon, Yesol;Oh, Yoorim;Lee, Youngmee;

doi:10.13064/KSSS.2022.14.3.077

Phonetics and Speech Sciences (말소리와 음성과학)

Volume 14 Issue 3 / Pages 77-86 / 2022 / 2005-8063 (pISSN) / 2586-5854 (eISSN)

Korean Society of Speech Sciences (한국음성학회)

Korean title: 한국형 감정 음성 데이터베이스 구축을 위한 타당도 연구 (A validation study for the construction of a Korean affective voice database)

Kim, Yeji (Department of Communication Disorders, Ewha Womans University) ;
Song, Hyesun (Department of Communication Disorders, Ewha Womans University) ;
Jeon, Yesol (Department of Communication Disorders, Ewha Womans University) ;
Oh, Yoorim (Department of Communication Disorders, Ewha Womans University) ;
Lee, Youngmee (Department of Communication Disorders, Ewha Womans University)

Received : 2022.07.27
Accepted : 2022.08.15
Published : 2022.09.30

Abstract

This study reports the validation results for the Korean Affective Voice Database (KAV DB), an affective voice database available for scientific and clinical use that comprises 113 validated affective voice stimuli. The KAV DB contains audio recordings of two actors (one male and one female), each uttering 10 semantically neutral sentences with the intention of conveying six affective states (happiness, anger, fear, sadness, surprise, and neutral). To validate the KAV DB, the recordings were organized into three separate voice stimulus sets. Participants rated each stimulus on six rating scales, one per targeted affective state, using a 100-unit horizontal visual analog scale. The KAV DB showed high internal consistency for the voice stimuli (Cronbach's α=.847), as well as high sensitivity (mean=82.8%) and specificity (mean=83.8%). The KAV DB is expected to be useful for both academic research and clinical purposes in the field of communication disorders. It is available for download at https://kav-db.notion.site/KAV-DB-7539a36abe2e414ebf4a50d80436b41a.

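This study developed the Korean Affective Voice Database (KAV DB), which can be used to measure the ability to recognize emotions from prosody, and examined its validity by calculating reliability, sensitivity, and specificity in order to check whether the database can be used in clinical practice and research on communication disorders. Two actors (one male and one female) were recorded uttering 10 semantically neutral sentences in six affective states: happiness, anger, fear, sadness, surprise, and neutral. From the recordings, the sentences in which the target emotion was well expressed were selected to form voice file sets for the validity check, and 31 listeners (14 male and 17 female) used a visual analog scale to rate how strongly each of the six emotions was reflected in each voice. The listeners' internal consistency reliability for the KAV DB was .872, the overall sensitivity was 82.8%, and the overall specificity was 83.8%. The validated KAV DB is therefore expected to be useful for research on the recognition and production of affective voice and for the production of clinical content.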
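The three validation statistics named in the abstract (Cronbach's α, sensitivity, and specificity) can be illustrated with a minimal Python sketch. The abstract does not say how the authors computed them, so the following is only one common formulation under stated assumptions: α is computed from a respondents-by-items score matrix, and a listener's "choice" for a stimulus is assumed to be the emotion scale with the highest rating. The function names `cronbach_alpha`, `top_rated`, and `sensitivity_specificity`, and all data values, are hypothetical and do not come from the KAV DB.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per respondent
    and one column per item (assumed layout, not the authors')."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def top_rated(ratings):
    """Assumption: the emotion scale with the highest rating
    counts as the listener's choice for that stimulus."""
    return max(ratings, key=ratings.get)


def sensitivity_specificity(targets, choices, emotion):
    """One-vs-rest sensitivity and specificity for a single emotion.

    targets: intended emotion of each stimulus
    choices: emotion the listener chose for each stimulus
    Raises ZeroDivisionError if the emotion never (or always) is the target.
    """
    tp = fn = tn = fp = 0
    for target, choice in zip(targets, choices):
        if target == emotion:
            if choice == emotion:
                tp += 1
            else:
                fn += 1
        else:
            if choice == emotion:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example data, for illustration only.
example_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(example_scores)

example_targets = ["anger", "anger", "fear", "fear"]
example_choices = ["anger", "fear", "fear", "fear"]
sens, spec = sensitivity_specificity(example_targets, example_choices, "anger")
```

Averaging the per-emotion values over the six affective states would give overall figures comparable in form to the reported means, although whether the authors averaged this way is not stated in the abstract.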
