http://dx.doi.org/10.5909/JBE.2014.19.3.296

Non-Dialog Section Detection for the Descriptive Video Service Contents Authoring  

Jang, Inseon (Realistic Broadcasting Media Research Department, ETRI)
Ahn, ChungHyun (Realistic Broadcasting Media Research Department, ETRI)
Jang, Younseon (Dept. of Electronic Engineering, Chungnam National University)
Publication Information
Journal of Broadcast Engineering / v.19, no.3, 2014, pp. 296-306
Abstract
This paper addresses non-dialog section detection for Descriptive Video Service (DVS) authoring, the goal of which is to find meaningful sections in broadcast audio where audio description can be inserted. Because broadcast audio contains a variety of sounds, the method first discriminates between speech and non-speech in each audio frame. The proposed method jointly exploits the inter-channel structure and the speech source characteristics of stereo broadcast audio. Rule-based post-processing is then applied to detect non-dialog sections whose length is appropriate for audio description. The proposed method provides more accurate detection than the conventional method. Experimental results on real broadcast content show the qualitative superiority of the proposed method.
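The rule-based post-processing step can be illustrated with a minimal sketch. It assumes that per-frame speech/non-speech decisions are already available and keeps only non-speech runs long enough to hold an audio description. The function name, the frame duration, and the minimum-length threshold are hypothetical; the abstract does not state the rules or parameter values the authors actually used.

```python
from typing import List, Sequence, Tuple


def non_dialog_sections(
    is_speech: Sequence[bool],
    frame_sec: float = 0.02,    # assumed frame duration, not from the paper
    min_len_sec: float = 2.0,   # assumed minimum usable section length
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans of consecutive non-speech frames
    that last at least min_len_sec."""
    sections: List[Tuple[float, float]] = []
    start = None
    # A trailing "speech" sentinel closes a non-speech run at the end of input.
    for i, speech in enumerate(list(is_speech) + [True]):
        if not speech and start is None:
            start = i  # a non-speech run begins here
        elif speech and start is not None:
            if (i - start) * frame_sec >= min_len_sec:
                sections.append((start * frame_sec, i * frame_sec))
            start = None
    return sections


# Example: 0.5 s frames; only the first non-speech run (3.0 s) is long enough.
labels = [True] * 2 + [False] * 6 + [True] + [False] * 2
sections = non_dialog_sections(labels, frame_sec=0.5, min_len_sec=2.0)
```

A single length threshold is the simplest possible rule. A practical system would likely also bridge very short speech detections inside a long non-speech run, but that is an assumption and not something the abstract confirms.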
Keywords
Speech/Non-speech Detection; Descriptive Video Service