
Audio-Visual Content Analysis Based Clustering for Unsupervised Debate Indexing  

Keum, Ji-Soo (Department of Computer Engineering, Kyung Hee University)
Lee, Hyon-Soo (Department of Computer Engineering, Kyung Hee University)
Abstract
In this research, we propose an unsupervised debate indexing method that uses both audio and visual information. The method combines two clustering results: speech clustered with the Bayesian Information Criterion (BIC) and visual information clustered with a distance function. Combining the two modalities reduces the problems that arise when speech or visual information is used alone, and it enables effective content-based analysis. We evaluated the proposed method on five types of debate data, varying how the audio and visual information was used. The experimental results show that audio-visual integration outperforms the individual use of speech or visual information for debate indexing.
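As an illustration of the BIC-based speech clustering mentioned in the abstract, the following is a minimal sketch of the widely used delta-BIC merge test with full-covariance Gaussians, in the formulation popularized by Chen and Gopalakrishnan. It is not taken from the paper: the function name `delta_bic`, the `penalty_weight` parameter, and the choice of a single Gaussian per segment are assumptions made for this example, and the authors' actual features and settings may differ.

```python
import numpy as np


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Delta-BIC for modelling two segments separately vs. as one Gaussian.

    seg_a, seg_b: arrays of shape (frames, dims), e.g. cepstral features.
    Positive values favour two separate models (likely different speakers);
    negative values favour merging the segments into one cluster.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    seg_a = np.asarray(seg_a, dtype=float)
    seg_b = np.asarray(seg_b, dtype=float)
    merged = np.vstack([seg_a, seg_b])
    n_a, n_b, n = len(seg_a), len(seg_b), len(merged)
    dims = merged.shape[1]

    def logdet_cov(x):
        # Maximum-likelihood (biased) full covariance of the segment.
        cov = np.cov(x, rowvar=False, bias=True)
        _sign, logdet = np.linalg.slogdet(np.atleast_2d(cov))
        return logdet

    # Log-likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (
        n * logdet_cov(merged)
        - n_a * logdet_cov(seg_a)
        - n_b * logdet_cov(seg_b)
    )
    # Penalty for the extra mean vector and covariance matrix.
    n_params = dims + dims * (dims + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty
```

In an agglomerative clustering loop, the pair of segments with the lowest delta-BIC would typically be merged first, and merging would stop once every remaining pair scores above zero.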
Keywords
Audio-Visual Content Analysis; Unsupervised Debate Indexing; Speaker Indexing