http://dx.doi.org/10.4275/KSLIS.2012.46.1.163

Investigating an Automatic Method in Summarizing a Video Speech Using User-Assigned Tags  

Kim, Hyun-Hee (Department of Library and Information Science, Myongji University)
Publication Information
Journal of the Korean Society for Library and Information Science, v.46, no.1, 2012, pp. 163-181
Abstract
We investigated how useful video tags are for summarizing video speech and how valuable sentence positional information is for speech summarization. We also examined the similarity among the sentences selected for a speech summary in order to reduce redundancy. Based on the results of these analyses, we designed and evaluated a method for automatically summarizing speech transcripts using a modified Maximum Marginal Relevance (MMR) model. This model not only reduces redundancy but also makes use of social tags, title words, and sentence positional information. Finally, we compared the proposed method with the Extractor system, which selects the key sentences of a video speech using the frequency and location information of speech content words. The precision and recall rates of the proposed method were higher than those of the Extractor system, although the difference in recall rates was not significant.
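To make the idea concrete, the following is a minimal sketch of MMR-style sentence selection followed by a precision/recall check. It is not the formula used in the paper, which the abstract does not give. It assumes that relevance is the cosine similarity between a sentence and a query built from the tags and title words, plus a bonus that decays linearly with sentence position. The function names and the weights `lam` and `w_position` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance(index, vectors, query, w_position):
    """Query similarity plus a position bonus (assumed form: earlier is better)."""
    position_bonus = 1.0 - index / len(vectors)
    return cosine(vectors[index], query) + w_position * position_bonus


def mmr_summarize(sentences, tags, title, k=2, lam=0.7, w_position=0.2):
    """Greedily pick k sentence indices that are relevant but mutually dissimilar."""
    vectors = [Counter(tokenize(s)) for s in sentences]
    # Assumption: tags and title words are pooled into a single query vector.
    query = Counter(tokenize(" ".join(tags) + " " + title))
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            rel = relevance(i, vectors, query, w_position)
            return lam * rel - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)


def precision_recall(selected, reference):
    """Precision and recall of selected sentence indices against a reference set."""
    hits = len(set(selected) & set(reference))
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Toy transcript (invented for illustration); sentence 1 duplicates sentence 0.
sentences = [
    "Social tags describe the video content",
    "Social tags describe the video content",
    "The speaker thanks the audience",
    "Summaries reduce viewing time for users",
]
summary = mmr_summarize(sentences, tags=["tags", "summaries"], title="video", k=2)
print(summary)  # [0, 3]
print(precision_recall(summary, reference=[0, 2, 3]))
```

In this toy input the duplicate sentence is skipped because its redundancy penalty outweighs its relevance. Setting `lam=1.0` turns the penalty off, and the duplicate is selected instead.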
Keywords
MMR Model; Social Summarization; Redundancy; Acoustic Features; Prosodic Features; Extractor; Transcripts