References
- M. K. Nandwana, A. Ziaei, and J. H. L. Hansen, "Robust Unsupervised Detection of Human Screams In Noisy Acoustic Environments," Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Brisbane, Australia, Apr. 2015.
- M. Crocco, M. Cristani, A. Trucco, and V. Murino, "Audio Surveillance: A Systematic Review," ACM Computing Surveys, vol. 48, no. 4, Feb. 2016, pp. 52:1-52:46.
- Y. Lee and P. Moon, "A Comparison and Analysis of Deep Learning Framework," J. of the Korea Institute of Electronic Communication Sciences, vol. 12, no. 1, 2017, pp. 115-122. https://doi.org/10.13067/JKIECS.2017.12.1.115
- Y. Wang, L. Neves, and F. Metze, "Audio-based Multimedia Event Detection Using Deep Recurrent Neural Networks," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, Mar. 2016, pp. 2742-2746.
- A. Mesaros, T. Heittola, and T. Virtanen, "Metrics for polyphonic sound event detection," Applied Sciences, vol. 6, no. 6, 2016, pp. 321-337. https://doi.org/10.3390/app6110321
- S. Chung and Y. Chung, "Sound Event Detection based on Deep Neural Networks," J. of the Korea Institute of Electronic Communication Sciences, vol. 14, no. 2, 2019, pp. 389-396. https://doi.org/10.13067/JKIECS.2019.14.2.389
- S. Chung and Y. Chung, "Comparison of Audio Event Detection Performance using DNN," J. of the Korea Institute of Electronic Communication Sciences, vol. 13, no. 3, 2018, pp. 571-577. https://doi.org/10.13067/JKIECS.2018.13.3.571
- A. Graves, A. Mohamed, and G. Hinton, "Speech Recognition with Deep Recurrent Neural Networks," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 6645-6649.
- E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen, and T. Virtanen, "Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection," IEEE/ACM Trans. on Audio, Speech, and Language Process., vol. 26, no. 6, 2017, pp. 1291-1303.
- Y. Xu, Q. Kong, Q. Huang, W. Wang, and M. D. Plumbley, "Attention and Localization Based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging," in Proc. Interspeech, Aug. 2017, pp. 3083-3087.
- J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, "Attention-based models for speech recognition," in Advances in Neural Information Processing Systems, Dec. 2015, pp. 577-585.
- V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu, "Recurrent models of visual attention," in Advances in Neural Information Processing Systems, 2014, pp. 2204-2212.
- D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," in International Conference on Learning Representations (ICLR), May 2015.
- N. Turpault, R. Serizel, A. P. Shah, and J. Salamon, "Sound event detection in domestic environments with weakly labeled data and soundscape synthesis," Workshop on Detection and Classification of Acoustic Scenes and Events, Oct. 2019.