http://dx.doi.org/10.15207/JKCS.2020.11.9.091

Research on Emotional Factors and Voice Trend by Country to be Considered in Designing AI's Voice - An Analysis of Interviews with Experts in Finland and Norway

Namkung, Kiechan (Industry Academic Cooperation Foundation, Kookmin University)
Publication Information
Journal of the Korea Convergence Society / v.11, no.9, 2020, pp. 91-97
Abstract
The use of voice-based interfaces that can interact with users is increasing as AI technology develops. To date, however, most research on voice-based interfaces has been technical in nature, focused on areas such as improving the accuracy of speech recognition. As a result, the voices of most voice-based interfaces are uniform and do not provide users with differentiated sensibilities. The purpose of this study is to add an emotional factor suitable for the AI interface. To this end, we derived emotional factors that should be considered in designing a voice interface. In addition, we examined voice trends that differ from country to country. For this study, we conducted interviews with voice industry experts from Finland and Norway, countries that each use their own independent language.
Keywords
AI; Voice user interface; Emotional factor; Voice trend; Voice identity;
Citations & Related Records
연도 인용수 순위
  • Reference