1 |
Lu, L., Jiang, H., & Zhang, H. (2001). A robust audio classification and segmentation method, in Proc. ACM International Conference on Multimedia, 203-211.
|
2 |
Xu, M., et al. (2003). Creating audio keywords for event detection in soccer video, in Proc. IEEE International Conference on Multimedia and Expo, 281-284.
|
3 |
Cheng, W., Chu, W., & Wu, J. (2003). Semantic context detection based on hierarchical audio models, in Proc. ACM SIGMM International Workshop on Multimedia Information Retrieval, 109-115.
|
4 |
Elo, J. P., et al. (2009). Non-speech audio event detection, in Proc. International Conference on Acoustics, Speech and Signal Processing, 1973-1976.
|
5 |
Heittola, T., et al. (2013). Context-dependent sound event detection, EURASIP Journal on Audio, Speech, and Music Processing, 11-13.
|
6 |
Lee, H., Pham, P., Largman, Y., & Ng, A. Y. (2009). Unsupervised feature learning for audio classification using convolutional deep belief networks, in Proc. Advances in Neural Information Processing Systems, 1096-1104.
|
7 |
Kons, Z., & Toledo-Ronen, O. (2013). Audio event classification using deep neural networks, in Proc. INTERSPEECH, 1482-1486.
|
8 |
Ballan, L., et al. (2009). Deep networks for audio event classification in soccer videos, in Proc. International Conference on Multimedia and Expo, 474-477.
|
9 |
Bengio, Y., & LeCun, Y. (2007). Scaling learning algorithms towards AI, Large-scale Kernel Machines, Vol. 34, No. 5, 321-360.
|
10 |
Barker, J., et al. (2012). The PASCAL CHiME speech separation and recognition challenge, Computer Speech & Language, Vol. 27, No. 3, 621-633.
|
11 |
Downie, S., et al. (2010). The Music Information Retrieval Evaluation eXchange: Some observations and insights, Advances in Music Information Retrieval. Springer, 93-115.
|
12 |
Malkin, R. G. (2007). Multimodal Technologies for Perception of Humans. Springer, 323-330.
|
13 |
Smeaton, A. F., et al. (2006). Evaluation campaigns and TRECVid, in Proc. ACM International Workshop on Multimedia Information Retrieval, 321-330.
|
14 |
Vincent, E., et al. (2012). The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges, Signal Processing, Vol. 92, No. 8, 1928-1936.
|
15 |
Larochelle, H., et al. (2007). An empirical evaluation of deep architectures on problems with many factors of variation, in Proc. International Conference on Machine Learning, 473-480.
|
16 |
Young, S., et al. (1999). The HTK Book. Cambridge, U.K.: Entropic.
|
17 |
Dahl, G. E., Sainath, T. N., & Hinton, G. E. (2013). Improving deep neural networks for LVCSR using rectified linear units and dropout, in Proc. International Conference on Acoustics, Speech and Signal Processing, 8609-8613.
|
18 |
Bottou, L. (2004). Advanced Lectures on Machine Learning, Springer, 146-168.
|
19 |
Salamon, J., Jacoby, C., & Bello, J. P. (2014). A dataset and taxonomy for urban sound research, in Proc. ACM International Conference on Multimedia, 1041-1044.
|
20 |
Bergstra, J., et al. (2010). Theano: A CPU and GPU math expression compiler, in Proc. Python for Scientific Computing Conference, Vol. 4, p. 3.
|