Audio Event Detection Using Deep Neural Networks
Lim, Minkyu (Department of Computer Science and Engineering, Sogang University)
Lee, Donghyun (Department of Computer Science and Engineering, Sogang University)
Park, Hosung (Department of Computer Science and Engineering, Sogang University)
Kim, Ji-Hwan (Department of Computer Science and Engineering, Sogang University)
1 | E. Dahl, N. Sainath, and E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in Proceeding of International Conference on Acoustics, Speech and Signal Processing, Vancouver, pp.8609-8613, 2013. |
2 | L. Bottou, Advanced Lectures on Machine Learning, Springer, pp. 146-168, 2004. |
3 | J. Salamon, C. Jacoby, and J. Bello, "A dataset and taxonomy for urban sound research," in Proceeding of ACM International Conference on Multimedia, Orlando: FL, pp.1041-1044, 2014. |
4 | M. Slaney, "Semantic-audio retrieval," in Proceeding of International Conference on Acoustics, Speech and Signal Processing, Orlando: FL, pp.1408-1411, 2002. |
5 | S. Young, G. Evermann, M. Gales, and P. Woodland, The HTK book (for HTK version 3.4), Cambridge, U.K.: Entropic, 2006. |
6 | M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, Tensorflow: Large-scale machine learning on heterogeneous distributed systems, Available: https://www.tensorflow.org/ |
7 | K. Kim and H. Kim, "Scaling learning algorithms towards AI," Journal of Digital Content Society, Vol. 14, No.4, pp.481-491, December, 2013. |
8 | L. Lu, H. Jiang, and H. Zhang, "A robust audio classification and segmentation method," in Proceeding of ACM International Conference on Multimedia, Ottawa, pp.203-211, 2001. |
9 | W. Cheng, W. Chu, and J. Wu, "Semantic context detection based on hierarchical audio models," in Proceeding of ACM SIGMM International Workshop on Multimedia Information Retrieval, Berkeley: CA, pp.109-115, 2003. |
10 | M. Xu, N. Maddage, C. Xu, M. Kankanhalli, and Q. Tian, "Creating audio keywords for event detection in soccer video," in Proceeding of IEEE International Conference on Multimedia and Expo, Baltimore: MD, pp.281-284, 2003. |
11 | H. Lee, P. Pham, Y. Largman, and Y. Ng, "Unsupervised feature learning for audio classification using convolutional deep belief networks," in Proceeding of Advances in Neural Information Processing Systems, Vancouver, pp.1096-1104, 2009. |
12 | Y. Bengio and Y. LeCun, "Scaling learning algorithms towards AI," Large-scale Kernel Machines, Vol. 34, No.5, pp.321-360, August, 2007. |
13 | J. Portelo, M. Bugalho, I. Trancoso, J. Neto, A. Abad, and A. Serralheiro, "Non-speech audio event detection," in Proceeding of International Conference on Acoustics, Speech and Signal Processing, Taipei, pp.1973-1976, 2009. |
14 | L. Ballan, A. Bazzica, M. Bertini, A. Bimbo, and G. Serra, "Deep networks for audio event classification in soccer videos," in Proceeding of International Conference on Multimedia and Expo, Cancun, pp.474-477, 2009. |
15 | T. Heittola, A. Mesaros, A. Eronen, and T. Virtanen, "Context-dependent sound event detection," EURASIP Journal on Audio, Speech, and Music Processing, Vol.1, pp.1-13, January, 2013. |
16 | H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, "An empirical evaluation of deep architectures on problems with many factors of variation," in Proceeding of International Conference on Machine Learning, Corvallis: OR, pp.473-480, 2007. |
17 | Z. Kons and O. Toledo-Ronen, "Audio event classification using deep neural networks," in Proceeding of INTERSPEECH, Lyon, pp.1482-1486, 2013. |
18 | M. Lim and J. Kim, "Audio Event Classification Using Deep Neural Networks," Phonetics and Speech Sciences, Vol. 7, No. 4, pp.27-33, January, 2015. |