Convolutional Neural Network based Audio Event Classification |
Lim, Minkyu (Dept. of Computer Science and Engineering, Sogang University)
Lee, Donghyun (Dept. of Computer Science and Engineering, Sogang University)
Park, Hosung (Dept. of Computer Science and Engineering, Sogang University)
Kang, Yoseb (Dept. of Computer Science and Engineering, Sogang University)
Oh, Junseok (Dept. of Computer Science and Engineering, Sogang University)
Park, Jeong-Sik (Dept. of English Linguistics & Language Technology, Hankuk University of Foreign Studies)
Jang, Gil-Jin (School of Electronics Engineering, Kyungpook National University)
Kim, Ji-Hwan (Dept. of Computer Science and Engineering, Sogang University) |
1 | K. Kim and H. Kim, "Storytelling Strategy of Visual-Image Contents base on Rhetoric Metaphors," Journal of Digital Content Society, vol. 14, no. 4, pp. 481-491, December, 2013. |
2 | L. Lu, H. Jiang and H. Zhang, "A robust audio classification and segmentation method," in Proc. of ACM International Conference on Multimedia, pp. 203-211, September 30-October 5, 2001. |
3 | M. Xu, N. Maddage, C. Xu, M. Kankanhalli and Q. Tian, "Creating audio keywords for event detection in soccer video," in Proc. of IEEE International Conference on Multimedia and Expo, pp. 281-284, July 6-9, 2003. |
4 | W. Cheng, W. Chu and J. Wu, "Semantic context detection based on hierarchical audio models," in Proc. of ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 109-115, November 7, 2003. |
5 | H. Lee, P. Pham, Y. Largman and A. Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Proc. of Advances in Neural Information Processing Systems, pp. 1096-1104, December 7-10, 2009. |
6 | Y. Bengio and Y. LeCun, "Large-scale Kernel Machines," MIT Press, 2007. |
7 | Z. Kons and O. Toledo-Ronen, "Audio event classification using deep neural networks," in Proc. of Interspeech, pp. 1482-1486, August 25-29, 2013. |
8 | J. Portelo, M. Bugalho, I. Trancoso, J. Neto, A. Abad and A. Serralheiro, "Non-speech audio event detection," in Proc. of International Conference on Acoustics, Speech and Signal Processing, pp. 1973-1976, April 19-24, 2009. |
9 | L. Ballan, A. Bazzica, M. Bertini, A. Del Bimbo and G. Serra, "Deep networks for audio event classification in soccer videos," in Proc. of International Conference on Multimedia and Expo, pp. 474-477, June 28-July 3, 2009. |
10 | T. Heittola, A. Mesaros, A. Eronen and T. Virtanen, "Context-dependent sound event detection," EURASIP Journal on Audio, Speech, and Music Processing, vol. 1, pp. 1-13, January, 2013. |
11 | S. Downie, et al., "The Music Information Retrieval Evaluation eXchange: Some observations and insights," Advances in Music Information Retrieval, pp. 93-115, 2010. |
12 | R. Malkin, "Multimodal Technologies for Perception of Humans," Springer, pp. 323-330, 2007. |
13 | M. Lim and J. Kim, "Audio Event Classification Using Deep Neural Networks," Phonetics and Speech Sciences, vol. 7, no. 4, pp. 27-33, January, 2015. |
14 | A. F. Smeaton, et al., "Evaluation campaigns and TRECVid," in Proc. of ACM International Workshop on Multimedia Information Retrieval, pp. 321-330, 2006. |
15 | E. Vincent, et al., "The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges," Signal Processing, vol. 92, no. 8, pp. 1928-1936, 2012. |
16 | H. Larochelle, et al., "An empirical evaluation of deep architectures on problems with many factors of variation," in Proc. of International Conference on Machine Learning, pp. 473-480, 2007. |
17 | J. Salamon, C. Jacoby and J. Bello, "A dataset and taxonomy for urban sound research," in Proc. of ACM International Conference on Multimedia, pp. 1041-1044, November 3-7, 2014. |
18 | M. Slaney, "Semantic-audio retrieval," in Proc. of International Conference on Acoustics, Speech and Signal Processing, pp. 1408-1411, May 13-17, 2002. |
19 | A. Mesaros, T. Heittola, and T. Virtanen, "TUT database for acoustic scene classification and sound event detection," in Proc. of 24th European Signal Processing Conference, pp. 1128-1132, 2016. |
20 | S. Young, G. Evermann, M. Gales and P. Woodland, "The HTK book (for HTK version 3.4)," Entropic Cambridge Research Laboratory, 2006. |
21 | M. Abadi, A. Agarwal, et al., "TensorFlow: Large-scale machine learning on heterogeneous distributed systems," preprint, 2016. |
22 | Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436-444, May, 2015. |