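Project Information
This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (Ministry of Science and ICT) in 2021 (No.2021-0-00014, Development of an edge-computing-based audiovisual cognitive intelligence solution for disaster response).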
References
- T. Virtanen, M. D. Plumbley, and D. Ellis, Computational Analysis of Sound Scenes and Events (Springer, Heidelberg, 2018), Chap. 1.
- J. P. Bello, C. Silva, O. Nov, R. L. Dubois, A. Arora, J. Salamon, C. Mydlarz, and H. Doraiswamy, "SONYC: A system for monitoring, analyzing and mitigating urban noise pollution," Commun. ACM. 62, 68-77 (2019).
- K. Drossos, S. Adavanne, and T. Virtanen, "Automated audio captioning with recurrent neural networks," Proc. IEEE WASPAA. 374-378 (2017).
- Y. Zigel, D. Litvak, and I. Gannot, "A method for automatic fall detection of elderly people using floor vibrations and sound - Proof of concept on human mimicking doll falls," IEEE Trans. Biomed. Eng. 56, 2858-2867 (2009). https://doi.org/10.1109/TBME.2009.2030171
- A. Temko and C. Nadeu, "Acoustic event detection in meeting-room environments," Pattern Recognit. Lett. 30, 1281-1288 (2009). https://doi.org/10.1016/j.patrec.2009.06.009
- A. Mesaros, T. Heittola, A. Eronen, and T. Virtanen, "Acoustic event detection in real life recordings," Proc. EUSIPCO. 1267-1271 (2010).
- T. Heittola, A. Mesaros, A. Eronen, and T. Virtanen, "Context-dependent sound event detection," EURASIP J. Audio, Speech, and Music Process. 2013, 1-13 (2013). https://doi.org/10.1186/1687-4722-2013-1
- E. Cakir, T. Heittola, H. Huttunen, and T. Virtanen, "Polyphonic sound event detection using multi label deep neural networks," Proc. IJCNN. 1-7 (2015).
- H. Zhang, I. McLoughlin, and Y. Song, "Robust sound event recognition using convolutional neural networks," Proc. IEEE ICASSP. 559-563 (2015).
- H. Phan, L. Hertel, M. Maass, and A. Mertins, "Robust audio event recognition with 1-max pooling convolutional neural networks," Proc. Interspeech. 3653-3657 (2016).
- G. Parascandolo, H. Huttunen, and T. Virtanen, "Recurrent neural networks for polyphonic sound event detection in real life recordings," Proc. IEEE ICASSP. 6440-6444 (2016).
- E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen, and T. Virtanen, "Convolutional recurrent neural networks for polyphonic sound event detection," IEEE/ACM Trans. on Audio, Speech, Lang. Process. 25, 1291-1303 (2017). https://doi.org/10.1109/TASLP.2017.2690575
- S. Adavanne, P. Pertila, and T. Virtanen, "Sound event detection using spatial features and convolutional recurrent neural network," Proc. IEEE ICASSP. 771-775 (2017).
- N. Turpault, R. Serizel, A. Shah, and J. Salamon, "Sound event detection in domestic environments with weakly labeled data and soundscape synthesis," Proc. Workshop on DCASE. 253-257 (2019).
- N. Turpault, R. Serizel, S. Wisdom, H. Erdogan, J. R. Hershey, E. Fonseca, P. Seetharaman, and J. Salamon, "Sound event detection and separation: A benchmark on DESED synthetic soundscapes," Proc. IEEE ICASSP. 840-844 (2021).
- D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange, and M. D. Plumbley, "Detection and classification of acoustic scenes and events," IEEE Trans. Multimedia 17, 1733-1746 (2015). https://doi.org/10.1109/TMM.2015.2428998
- N. K. Kim and H. K. Kim, "Polyphonic sound event detection based on residual convolutional recurrent neural network with semi-supervised loss function," IEEE Access 9, 7564-7575 (2021). https://doi.org/10.1109/ACCESS.2020.3048675
- Q. Xie, M.-T. Luong, E. Hovy, and Q. V. Le, "Self-training with noisy student improves ImageNet classification," Proc. IEEE/CVF CVPR. 10687-10698 (2020).
- D. S. Park, W. Chan, Y. Zhang, C.-C. Chiu, B. Zoph, E. D. Cubuk, and Q. V. Le, "SpecAugment: A simple data augmentation method for automatic speech recognition," arXiv preprint, arXiv:1904.08779 (2019).
- H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, "Mixup: Beyond empirical risk minimization," arXiv preprint, arXiv:1710.09412 (2017).
- L. Delphin-Poulat and C. Plapous, "Mean teacher with data augmentation for DCASE 2019 Task 4," DCASE 2019 Challenge, Tech. Rep., 2019.
- S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," Proc. ECCV. 3-19 (2018).
- A. Mesaros, T. Heittola, and T. Virtanen, "Metrics for polyphonic sound event detection," Appl. Sci. 6, 162-178 (2016). https://doi.org/10.3390/app6060162
- K. Miyazaki, T. Komatsu, T. Hayashi, S. Watanabe, T. Toda, and K. Takeda, "Convolution augmented transformer for semi-supervised sound event detection," Proc. Workshop on DCASE. 100-104 (2020).