http://dx.doi.org/10.9717/kmms.2021.24.12.1606

A Study on Hazardous Sound Detection Robust to Background Sound and Noise  

Ha, Taemin (Dept. of Electronic & Electrical Engineering, Hongik University)
Kang, Sanghoon (Dept. of Electronic & Electrical Engineering, Hongik University)
Cho, Seongwon (Dept. of Electronic & Electrical Engineering, Hongik University)
Abstract
Recently, various attempts have been made to control hardware by integrating sensors with artificial intelligence. This paper proposes a smart method for detecting hazardous sounds in the home. Previous sound recognition methods handle background sounds poorly and recognize high-frequency sounds with low accuracy. To overcome these problems, a new MFCC (Mel-Frequency Cepstral Coefficient) algorithm that uses a Wiener filter and a modified filterbank is proposed. Experiments were conducted to compare the performance of the proposed method with that of the original MFCC. A DNN (Deep Neural Network) was used to classify the feature vectors extracted with the proposed MFCC. The experimental results showed that the modified MFCC outperformed the conventional MFCC, with 1% higher training accuracy and a 6.6% higher recognition rate.
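For orientation, the conventional MFCC baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard triangular mel filterbank and cepstral step, not the authors' modified filterbank, whose details the abstract does not give. The function names and the parameter values (26 filters, 512-point FFT, 16 kHz sampling rate) are assumptions chosen for the example, and the Wiener filtering and DNN stages are omitted.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate, f_min=0.0, f_max=None):
    """Build n_filters triangular filters over the n_fft // 2 + 1 FFT bins.

    Filter edges are equally spaced on the mel scale, so filters become
    wider (in Hz) toward high frequencies.
    """
    if f_max is None:
        f_max = sample_rate / 2.0
    n_bins = n_fft // 2 + 1
    mel_lo, mel_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_filters + 2 edge points: each filter uses three consecutive edges.
    edges_hz = [
        mel_to_hz(mel_lo + (mel_hi - mel_lo) * i / (n_filters + 1))
        for i in range(n_filters + 2)
    ]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if left < f <= center:
                row.append((f - left) / (center - left))    # rising slope
            elif center < f < right:
                row.append((right - f) / (right - center))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_filterbank_energies(power_spectrum, bank):
    """Apply each filter to one frame's power spectrum and take the log."""
    return [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + 1e-10)
        for row in bank
    ]


def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II; maps log energies to cepstral coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


# Example: 13 cepstral coefficients from a flat (placeholder) power spectrum.
bank = mel_filterbank(n_filters=26, n_fft=512, sample_rate=16000)
frame_power = [1.0] * (512 // 2 + 1)
mfcc = dct_ii(log_filterbank_energies(frame_power, bank), n_coeffs=13)
```

In a full pipeline of the kind the abstract describes, a noise-reduction step such as a Wiener filter would run before the power spectrum is computed, and the resulting coefficient vectors would be passed to a classifier.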
Keywords
Background Sound; Noise; Robust; Hazardous; Sound Detection