Development and validation of a Korean Affective Voice Database

A validity study for building a Korean affective voice database

  • Kim, Yeji (Department of Communication Disorders, Ewha Womans University) ;
  • Song, Hyesun (Department of Communication Disorders, Ewha Womans University) ;
  • Jeon, Yesol (Department of Communication Disorders, Ewha Womans University) ;
  • Oh, Yoorim (Department of Communication Disorders, Ewha Womans University) ;
  • Lee, Youngmee (Department of Communication Disorders, Ewha Womans University)
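  • 김예지 (Department of Communication Disorders, Ewha Womans University) ;
  • 송혜선 (Department of Communication Disorders, Ewha Womans University) ;
  • 전예솔 (Department of Communication Disorders, Ewha Womans University) ;
  • 오유림 (Department of Communication Disorders, Ewha Womans University) ;
  • 이영미 (Department of Communication Disorders, Ewha Womans University)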
  • Received : 2022.07.27
  • Accepted : 2022.08.15
  • Published : 2022.09.30


This study reports the validation results of the Korean Affective Voice Database (KAV DB), an affective voice database available for scientific and clinical use that comprises 113 validated affective voice stimuli. The KAV DB includes audio recordings of two actors (one male and one female), each uttering 10 semantically neutral sentences with the intention of conveying six affective states (happiness, anger, fear, sadness, surprise, and neutral). To validate the KAV DB, the recordings were organized into three separate voice stimulus sets. Participants rated each stimulus on six rating scales, one for each of the six targeted affective states, using a 100 horizontal visual analog scale. The KAV DB showed high internal consistency for the voice stimuli (Cronbach's α=.847), as well as high sensitivity (mean=82.8%) and specificity (mean=83.8%). The KAV DB is expected to be useful for both academic research and clinical purposes in the field of communication disorders. The KAV DB is available for download at 39a36abe2e414ebf4a50d80436b41a.
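As an illustration of the internal consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's α formula. It is not the authors' analysis script: the function name `cronbach_alpha` and the data layout (one row per voice stimulus, one column per listener) are assumptions made for this example, and the abstract does not state which units were treated as items.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Return Cronbach's alpha for a table of ratings.

    ratings: list of rows, one row per case (assumed here: one voice
    stimulus), each row holding one score per item (assumed here: one
    listener). This layout is an assumption for illustration.
    """
    n_items = len(ratings[0])
    if n_items < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two cases")

    # Sample variance of each item (column) across cases.
    item_vars = [variance([row[j] for row in ratings]) for j in range(n_items)]

    # Sample variance of the summed score of each case (row).
    total_var = variance([sum(row) for row in ratings])

    # alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)
```

When all items rank the cases identically and on the same scale, the item variances account for exactly 1/k of the total variance and α equals 1; values closer to 0 indicate that the items disagree.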

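This study developed the Korean Affective Voice Database (KAV DB), which can be used to measure the ability to recognize emotion from prosody, and examined its validity by calculating reliability, sensitivity, and specificity in order to determine whether the database can be used in clinical practice and research related to communication disorders. Two actors (one male and one female) were recorded expressing 10 semantically neutral sentences in six emotions: happiness, anger, fear, sadness, surprise, and neutral. From the recordings, the sentences in which the target emotion was well expressed were selected to form the voice file sets used for validation, and 31 listeners (14 male and 17 female) used a visual analog scale to rate how strongly each of the six emotions was reflected in each voice. As a result, the listeners' internal consistency reliability for the KAV DB was .872, the overall sensitivity was 82.8%, and the overall specificity was 83.8%. The validated KAV DB is therefore expected to be useful for research on the recognition and production of affective voice and for the development of clinical content.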
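The sensitivity and specificity figures above can be illustrated with a short sketch. The abstract does not describe the exact scoring procedure, so the following rests on an assumption: a listener's judged emotion for a stimulus is taken to be the scale with the highest visual analog rating. The names `judged_emotion` and `sensitivity_specificity` are hypothetical and are not taken from the paper.

```python
EMOTIONS = ["happiness", "anger", "fear", "sadness", "surprise", "neutral"]


def judged_emotion(scores):
    """Return the emotion whose rating scale received the highest score.

    scores: dict mapping each emotion name to one listener's rating.
    Assumption for this sketch: the top-rated scale counts as the
    listener's judgment. Ties go to the emotion listed first in EMOTIONS.
    """
    return max(EMOTIONS, key=lambda emotion: scores[emotion])


def sensitivity_specificity(trials, target):
    """Compute (sensitivity, specificity) for one target emotion.

    trials: list of (intended_emotion, scores) pairs, one per rating.
    """
    tp = fn = tn = fp = 0
    for intended, scores in trials:
        judged = judged_emotion(scores)
        if intended == target:
            if judged == target:
                tp += 1  # target stimulus recognized as the target
            else:
                fn += 1  # target stimulus judged as another emotion
        else:
            if judged == target:
                fp += 1  # non-target stimulus mistaken for the target
            else:
                tn += 1  # non-target stimulus correctly not judged as target
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both target and non-target trials")
    return tp / (tp + fn), tn / (tn + fp)
```

Under this scoring, sensitivity is the share of stimuli intended as the target emotion that listeners judged as that emotion, and specificity is the share of all other stimuli that listeners did not judge as the target. Averaging the per-emotion values would give overall figures comparable in form to those reported.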


