http://dx.doi.org/10.5909/JBE.2019.24.2.227

A Sound Interpolation Method Using Deep Neural Network for Virtual Reality Sound  

Choi, Jaegyu (Dept. of Electronic and IT Media Engineering, Seoul National University of Science and Technology)
Choi, Seung Ho (Dept. of Electronic and IT Media Engineering, Seoul National University of Science and Technology)
Publication Information
Journal of Broadcast Engineering, vol. 24, no. 2, 2019, pp. 227-233
Abstract
In this paper, we propose a deep neural network (DNN)-based sound interpolation method for realizing virtual reality sound. The method generates the sound at an intermediate point from the acoustic signals captured at two surrounding points. Sound interpolation can be performed with statistical methods such as the arithmetic or geometric mean, but these are insufficient to reflect the nonlinear acoustic characteristics of real environments. To address this, we train a deep neural network on the acoustic signals of the two points and the target point. Experimental results show that the DNN-based sound interpolation method outperforms the statistical methods.
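The idea described in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's actual system: the data are synthetic (a nonlinear mixture stands in for real room acoustics), the signed square root is only one assumed way to extend the geometric mean to negative-valued samples, and the one-hidden-layer network is a stand-in for the paper's unspecified deep architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for framed audio: signals captured at two positions (x1, x2)
# and at the target point between them (y). The nonlinear mixture below is
# synthetic, chosen so that a simple mean cannot reproduce the target.
n_frames, frame_len = 512, 16
x1 = rng.standard_normal((n_frames, frame_len))
x2 = rng.standard_normal((n_frames, frame_len))
y = np.tanh(0.7 * x1 + 0.3 * x2)

# Statistical baselines mentioned in the abstract. The signed square root
# is an assumed extension of the geometric mean to negative samples.
arith = 0.5 * (x1 + x2)
geo = np.sign(x1 * x2) * np.sqrt(np.abs(x1 * x2))

# One-hidden-layer ReLU network trained by full-batch gradient descent on
# MSE loss -- a minimal stand-in for the paper's deep network.
X = np.hstack([x1, x2])                       # concatenated two-point input
W1 = 0.1 * rng.standard_normal((2 * frame_len, 64)); b1 = np.zeros(64)
W2 = 0.1 * rng.standard_normal((64, frame_len)); b2 = np.zeros(frame_len)

lr = 0.1
for _ in range(3000):
    h = np.maximum(X @ W1 + b1, 0.0)          # hidden activations
    pred = h @ W2 + b2                        # interpolated frames
    err = pred - y
    dh = (err @ W2.T) * (h > 0)               # backprop through ReLU
    W2 -= lr * (h.T @ err) / n_frames; b2 -= lr * err.mean(axis=0)
    W1 -= lr * (X.T @ dh) / n_frames; b1 -= lr * dh.mean(axis=0)

mse = lambda p: float(np.mean((p - y) ** 2))
print(f"arith: {mse(arith):.4f}  geo: {mse(geo):.4f}  dnn: {mse(pred):.4f}")
```

On this synthetic task the trained network reaches a lower mean squared error than either mean, which mirrors the paper's claim that a learned nonlinear mapping captures what fixed statistical interpolators cannot.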
Keywords
VR Sound; Deep Neural Network; Sound Interpolation