http://dx.doi.org/10.15701/kcgs.2020.26.4.9

Avatar's Lip Synchronization in Talking Involved Virtual Reality  

Lee, Jae Hyun (Chung-Ang University)
Park, Kyoungju (Chung-Ang University)
Abstract
Having a virtual talking face along with a virtual body increases immersion in VR applications. As virtual reality (VR) techniques develop, applications that involve talking avatars, such as multi-user social networking and education, are becoming more common. Because consumer-grade VR lacks the sensory information needed for full face and body motion capture, most VR applications do not show a talking face synchronized with the body. We propose a novel method, targeted at VR applications, that synchronizes a talking face with audio and combines it with upper-body inverse kinematics. Our system presents a mirrored avatar of the user in a single-user environment and visualizes a synchronized conversational partner in a multi-user environment. We found that a realistic, audio-synced talking-face avatar is more influential than an un-synced talking avatar or an invisible avatar.
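The abstract describes lip motion driven directly by the audio of the user's speech. As a rough, illustrative sketch only (not the authors' implementation), the Python snippet below maps a time-stamped phoneme sequence, such as the output of a speech-recognition or forced-alignment step, onto viseme blend-shape weights with a short cross-fade between mouth shapes; the phoneme-to-viseme table and the avatar.set_blendshape call are hypothetical placeholders.

    # Illustrative sketch: drive viseme blend shapes from timed phonemes.
    from dataclasses import dataclass

    # Phoneme-to-viseme table is illustrative, not the paper's mapping.
    PHONEME_TO_VISEME = {
        "AA": "viseme_aa", "IY": "viseme_ih", "UW": "viseme_ou",
        "M": "viseme_mbp", "B": "viseme_mbp", "P": "viseme_mbp",
        "F": "viseme_fv",  "V": "viseme_fv",  "SIL": "viseme_rest",
    }

    @dataclass
    class PhonemeEvent:
        phoneme: str   # e.g. "AA"
        start: float   # seconds from the start of the audio clip
        end: float

    def viseme_weights(events, t, fade=0.06):
        """Blend-shape weights at playback time t; each viseme ramps in and
        out over `fade` seconds so neighbouring mouth shapes cross-fade."""
        weights = {}
        for ev in events:
            vis = PHONEME_TO_VISEME.get(ev.phoneme, "viseme_rest")
            if ev.start - fade <= t <= ev.end + fade:
                ramp_in = min(1.0, (t - (ev.start - fade)) / fade)
                ramp_out = min(1.0, ((ev.end + fade) - t) / fade)
                weights[vis] = max(weights.get(vis, 0.0), min(ramp_in, ramp_out))
        return weights

    # Per-frame usage: query weights at the current audio time and push them
    # to the avatar's face mesh (set_blendshape is a hypothetical avatar API).
    #   for name, w in viseme_weights(events, audio_time).items():
    #       avatar.set_blendshape(name, w)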
Keywords
Human-centered computing; virtual reality; face animation; avatar; lip synchronization;
Citations & Related Records
  • Reference
1 K. Kilteni, R. Groten, and M. Slater, "The Sense of Embodiment in Virtual Reality", Presence: Teleoperators and Virtual Environments, Vol. 21, No. 4, pp. 373-387, 2012.
2 M. Slater and A. Steed, "A virtual presence counter", Presence: Teleoperators and Virtual Environments, Vol. 9, No. 5, pp. 413-434, 2000.
3 M. Parger, J. H. Mueller, D. Schmalstieg, and M. Steinberger, "Human upper-body inverse kinematics for increased embodiment in consumer-grade virtual reality", Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology (VRST '18), pp. 1-10, 2018.
4 K. Olszewski, J.-J. Lim, S. Saito, and H. Li, "High-Fidelity Facial and Speech Animation for VR HMDs", ACM Transactions on Graphics, Vol. 35, No. 6, 2016.
5 K. Vougioukas, S. Petridis, and M. Pantic, "Realistic Speech-Driven Facial Animation with GANs", Computer Vision and Pattern Recognition, pp. 1-16, 2019.
6 T. Frank, M. Hoch, and G. Trogemann, "Automated lip-sync for 3d-character animation", 15th IMACS World Congress on Scientific Computation Modelling and Applied Mathematics, August 1997.
7 L. N. Hoon, K. A. A. A. Rahman, and W. Y. Chai, "Framework development of real-time lip sync animation on viseme based human speech", Jurnal Teknologi, Vol. 75, No. 4, pp. 43-48, 2015.
8 S. H. Park, S. H. Ji, D. S. Ryu, and H. G. Cho, "A new cognition-based chat system for avatar agents in virtual space", Proceedings of the 7th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and Its Applications in Industry (VRCAI '08), Article No. 13, 2008.
9 Y. Lee, D. Terzopoulos, and K. Waters, "Realistic Modeling for Facial Animation", Computer Graphics (SIGGRAPH Proceedings), pp. 55-62, 1995.
10 H. C. Yehia, T. Kuratate, and E. Vatikiotis-Bateson, "Linking facial animation, head motion and speech acoustics", Journal of Phonetics, 2002.
11 R. Kumar, J. Sotelo, K. Kumar, A. de Brebisson, and Y. Bengio, "ObamaNet: Photo-realistic lip-sync from text", NIPS 2017 Workshop on Machine Learning for Creativity and Design, 2017.
12 S. Suwajanakorn, S. M. Seitz, and I. Kemelmacher-Shlizerman, "Synthesizing Obama: Learning Lip Sync from Audio", ACM Transactions on Graphics, Vol. 36, No. 4, July 2017.
13 D. Roth, J. L. Lugrin, J. Buser, G. Bente, A. Fuhrmann, and M. E. Latoschik, "A simplified inverse kinematic approach for embodied VR applications", Proceedings of the 23rd IEEE Virtual Reality (IEEE VR) Conference, 2016.
14 D. Medeiros, R. K. dos Anjos, D. Mendes, J. M. Pereira, A. Raposo, and J. Jorge, "Keep my head on my shoulders!: why third-person is bad for navigation in VR", Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology (VRST '18), pp. 1-11, 2018.
15 D. Bijl and H. Hyde-Thomson, "Speech to Text Conversion", U.S. Patent No. 6,173,259, 9 January 2001.
16 M. A. Siegler, "Measuring and Compensating for the Effects of Speech Rate in Large Vocabulary Continuous Speech Recognition", Thesis, Carnegie Mellon University, 1995.
17 J. Busby, Z. Parrish, and J. V. Eenwyk, "Mastering Unreal Technology, Volume II: Advanced Level Design Concepts with Unreal Engine 3", Sams Publishing, 2010.
18 D. Roth, J. L. Lugrin, J. Buser, G. Bente, A. Fuhrmann, and M. E. Latoschik, "A simplified inverse kinematic approach for embodied VR applications", Proceedings of the 23rd IEEE Virtual Reality (IEEE VR) Conference, 2016.
19 B. G. Witmer and M. J. Singer, "Measuring presence in virtual environments: A presence questionnaire", Presence: Teleoperators and Virtual Environments, Vol. 7, No. 3, pp. 225-240, 1998.
20 E. Catmull, "A tutorial on compensation tables", Computer Graphics (SIGGRAPH '79), Vol. 13, pp. 1-7, 1979.
21 J. Adorf, "Web Speech API", KTH Royal Institute of Technology, 2013.