[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7776/ASK.2014.33.4.248

Frequency-Cepstral Features for Bag of Words Based Acoustic Context Awareness

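Park, Sang-Wook (Department of Electrical, Electronic and Radio Engineering, Korea University)
Choi, Woo-Hyun (Department of Electrical, Electronic and Radio Engineering, Korea University)
Ko, Hanseok (Department of Electrical, Electronic and Radio Engineering, Korea University)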

Publication Information

The Journal of the Acoustical Society of Korea, v.33, no.4, 2014, pp. 248-254

Abstract

Among acoustic signal analysis tasks, acoustic context awareness is one of the most complex, since it requires a sophisticated understanding of individual acoustic events. Conventional context awareness methods detect or recognize individual acoustic events and use the results to decide on the context at hand. However, this approach may perform poorly in practical situations, because events can occur simultaneously and acoustically similar events are difficult to distinguish from each other. In particular, babble noise on a bus or subway can confuse the context awareness task, since babble sounds similar in any environment. This paper therefore proposes a frequency-cepstral feature vector to mitigate this confusion in a binary situation awareness task: bus or metro. With a Support Vector Machine (SVM) as the classifier, the proposed feature vector is shown to outperform the conventional scheme.
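The bag-of-words step named in the title can be illustrated with a short example. This is a minimal sketch and not the authors' implementation: the abstract does not specify the codebook, the frame features, or the distance measure, so the nearest-codeword assignment with Euclidean distance, the function names `nearest_codeword` and `bow_histogram`, and the toy 2-D data are all assumptions made for illustration.

```python
import math
from collections import Counter


def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to the frame (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))


def bow_histogram(frames, codebook):
    """Encode a clip (a list of frame feature vectors) as a normalized codeword histogram."""
    if not frames:
        # An empty clip has no codeword occurrences.
        return [0.0] * len(codebook)
    counts = Counter(nearest_codeword(f, codebook) for f in frames)
    total = len(frames)
    return [counts.get(k, 0) / total for k in range(len(codebook))]


# Hypothetical 2-D frame features and a 3-word codebook, for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
clip = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (3.9, 4.2)]
hist = bow_histogram(clip, codebook)
print(hist)
```

In a pipeline of the kind the abstract describes, one such fixed-length histogram per audio clip would be the input to a binary classifier such as an SVM (bus versus metro); the classifier itself is omitted here.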

Keywords

Acoustic context awareness; Acoustic event detection or recognition; Bag of Words (BOW); Support Vector Machine (SVM)
