Toward a Structural and Semantic Metadata Framework for Efficient Browsing and Searching of Web Videos

  • Kim, Hyun-Hee (Department of Library and Information Science, Myongji University)
  • Received: 2017.01.23
  • Accepted: 2017.02.10
  • Published: 2017.02.28

Abstract

This study proposes a structural and semantic metadata framework for characterizing events and segments in Web videos, enabling content-based search and dynamic video summarization. Although MPEG-7 supports structural and semantic descriptions of multimedia, it is not currently well suited to describing multimedia content on the Web. The proposed framework, designed with Web environments in mind, therefore provides a thorough yet simple way to describe Web video content. Specifically, the framework is built on Chatman's narrative theory, three multimedia metadata formats (PBCore, MPEG-7, and TV-Anytime), and social metadata. It consists of four components: event information, eventGroup information, segment information, and video (program) information. The study also discusses how to automatically extract metadata elements, including structural and semantic ones, from Web videos.
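
As a rough illustration of how the four components named above might fit together, the following Python sketch models them as linked records. Only the four component names come from the abstract; every class name, field name, and the idea that events point to segments and eventGroups point to events are assumptions made for this example, not the schema defined in the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """Structural unit: a contiguous time span of the video (seconds)."""
    segment_id: str   # hypothetical identifier
    start: float
    end: float


@dataclass
class Event:
    """Semantic unit: something that happens, linked to its segments."""
    event_id: str     # hypothetical identifier
    description: str
    segment_ids: List[str] = field(default_factory=list)


@dataclass
class EventGroup:
    """A set of related events (the grouping criterion is assumed)."""
    group_id: str     # hypothetical identifier
    label: str
    event_ids: List[str] = field(default_factory=list)


@dataclass
class Video:
    """Video (program) level record that holds the other three levels."""
    title: str
    segments: List[Segment] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)
    event_groups: List[EventGroup] = field(default_factory=list)

    def segments_for_event(self, event_id: str) -> List[Segment]:
        """Content-based lookup: segments linked to one event, in playback order."""
        wanted = {sid for e in self.events if e.event_id == event_id
                  for sid in e.segment_ids}
        return sorted((s for s in self.segments if s.segment_id in wanted),
                      key=lambda s: s.start)

    def summary_for_group(self, group_id: str) -> List[Segment]:
        """Dynamic summary: distinct segments of all events in one group."""
        event_ids = {eid for g in self.event_groups if g.group_id == group_id
                     for eid in g.event_ids}
        wanted = {sid for e in self.events if e.event_id in event_ids
                  for sid in e.segment_ids}
        return sorted((s for s in self.segments if s.segment_id in wanted),
                      key=lambda s: s.start)


# Made-up example data, for illustration only.
demo = Video(
    title="Example Web video",
    segments=[Segment("s1", 0.0, 12.0), Segment("s2", 12.0, 30.0),
              Segment("s3", 30.0, 45.0)],
    events=[Event("e1", "opening remarks", ["s1"]),
            Event("e2", "product demo", ["s3", "s2"])],
    event_groups=[EventGroup("g1", "presentation", ["e1", "e2"])],
)
```

With records like these, a search for one event returns only its segments, and a summary for an eventGroup is the ordered union of the segments of its events, which is one simple way to read "dynamic video summarization" in this setting.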

References

  1. Kim, Yong-Ho. 2009. "A Structural Model of Mediated Visual Communication in Narrative Movies: Focusing on Chatman and Bordwell's Controversy." Korean Journal of Journalism & Communication Studies, 53(1): 209-232.
  2. Cho, Young-Joon. 2014. "The Study on improvement of Broadcast Metadata about Clip Video at Broadcast Content Managements." In Proceedings of 2014 Korean Society of Broadcast Engineers Summer Conference, June 30-July 2, 2014, Jeju: Jeju National University: 59-63.
  3. Agnew, G., Kniesner, D. and Weber, M. B. 2007. "Integrating MPEG-7 into the Moving Image Collections Portal." Journal of the American Society for Information Science and Technology, 58(9): 1357-1363. https://doi.org/10.1002/asi.20578
  4. Algur, S. P., Bhat, P. and Jain, S. 2014. "Metadata Construction Model for Web Videos: A Domain Specific Approach." International Journal of Engineering and Computer Science, 3(12): 9742-9748.
  5. Arndt, R. et al. 2007. "COMM: Designing a Well-Founded Multimedia Ontology for the Web." In Proceedings of the 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, November 11-15, 2007, Busan: BEXCO.
  6. Behroozi, M., Daliri, M. R. and Shekarchi, B. 2016. "EEG Phase Patterns Reflect the Representation of Semantic Categories of Objects." Medical & Biological Engineering & Computing, 54(1): 205-221. https://doi.org/10.1007/s11517-015-1391-7
  7. Benitez, A. B., Zhong, D. and Chang, S. F. 2007. "Enabling MPEG-7 Structural and Semantic Descriptions in Retrieval Applications." Journal of the American Society for Information Science and Technology, 58(9): 1377-1380. https://doi.org/10.1002/asi.20582
  8. Bouadjenek, M. R., Hacid, H. and Bouzeghoub, M. 2016. "Social Networks and Information Retrieval, How Are They Converging? A Survey, a Taxonomy and an Analysis of Social Information Retrieval Approaches and Platforms." Information Systems, 56: 1-18. https://doi.org/10.1016/j.is.2015.07.008
  9. Chatman, S. 1975. "Towards a Theory of Narrative." New Literary History, 6(2): 295-318. https://doi.org/10.2307/468421
  10. Chen, F., Delannay, D. and De Vleeschouwer, C. 2011. "An Autonomous Framework to Produce and Distribute Personalized Team-Sport Video Summaries: A Basketball Case Study." IEEE Transactions on Multimedia, 13(6): 1381-1394. https://doi.org/10.1109/TMM.2011.2166379
  11. Christel, M. G. 2009. Automated Metadata in Multimedia Information Systems: Creation, Refinement, Use in Surrogates, and Evaluation. Synthesis Lectures on Information Concepts, Retrieval, and Services, 2. San Rafael, CA: Morgan & Claypool Publishers.
  12. Cunningham, S. J. and Nichols, D. M. 2008. "How People Find Videos." In Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, June 16-20, 2008, Pittsburgh, PA: Omni William Penn Hotel: 201-210.
  13. Evain, J. P. and Martínez, J. M. 2007. "TV-Anytime Phase 1 and MPEG-7." Journal of the American Society for Information Science and Technology, 58(9): 1367-1373. https://doi.org/10.1002/asi.20580
  14. International Organization for Standardization/International Electrotechnical Commission (ISO/IEC). 2002-2004. ISO/IEC 15938: Part 1-8: Information Technology: Multimedia Content Description Interface (MPEG-7). Geneva: International Organization for Standardization.
  15. Huurnink, B. et al. 2010. "Search Behavior of Media Professionals at an Audiovisual Archive: A Transaction Log Analysis." Journal of the American Society for Information Science and Technology, 61(6): 1180-1197.
  16. Klix, F. 2001. "The Evolution of Cognition." Journal of Structural Learning and Intelligence Systems, 14: 415-431.
  17. Lee, H. K. et al. 2005. "Personalized TV Services and T-Learning Based on TV-Anytime Metadata." In Proceedings of the 6th Pacific-Rim Conference on Multimedia, November 13-16, 2005, Jeju: Ramada Plaza Jeju Hotel: 212-223.
  18. List, T. and Fisher, R. B. 2004. "CVML - An XML-based Computer Vision Markup Language." In Proceedings of the 17th International Conference on Pattern Recognition, August 23-26, 2004, Cambridge: 789-792.
  19. Lunn, B. K. 2009. Towards the Design of User based Metadata for Television Broadcasts. Saarbrücken: VDM Verlag.
  20. Makkonen, J. et al. 2010. "Detecting Events by Clustering Videos from Large Media Databases." In Proceedings of the 2nd ACM International Workshop on Events in Multimedia, October 25, 2010, Firenze: 9-14.
  21. Mehmood, I. et al. 2016. "Divide-and-Conquer based Summarization Framework for Extracting Affective Video Content." Neurocomputing, 174(A): 393-403. https://doi.org/10.1016/j.neucom.2015.05.126
  22. The Moving Picture Experts Group (MPEG). [n.d.]. MPEG. Villar Dora: The Moving Picture Experts Group. [online] [cited 2016. 9. 11.]
  23. Nowak, M. A., Plotkin, J. B. and Jansen, V. A. 2000. "The Evolution of Syntactic Communication." Nature, 404(6777): 495-498. https://doi.org/10.1038/35006635
  24. Park, J. R. and Lu, C. 2009. "Application of Semi-Automatic Metadata Generation in Libraries: Types, Tools, and Techniques." Library & Information Science Research, 31(4): 225-231. https://doi.org/10.1016/j.lisr.2009.05.002
  25. PBCore. [n.d.]. PBCore. [online] [cited 2016. 9. 3.]
  26. Reijnders, K. 2011. Suspense Tours: Narrative Generation in the Context of Tourism. Amsterdam: Universiteit van Amsterdam.
  27. Shotton, D. M. et al. 2002. "A Metadata Classification Schema for Semantic Content Analysis of Videos." Journal of Microscopy, 205(1): 33-42. https://doi.org/10.1046/j.0022-2720.2001.00966.x
  28. Smeaton, A. F., Over, P. and Doherty, A. R. 2010. "Video Shot Boundary Detection: Seven Years of TRECVid Activity." Computer Vision and Image Understanding, 114(4): 411-418. https://doi.org/10.1016/j.cviu.2009.03.011
  29. Teeter, P. and Sandberg, J. 2016. "Cracking the Enigma of Asset Bubbles with Narratives." Strategic Organization, 15(1): 91-99.
  30. Togawa, H. and Okuda, M. 2005. "Position-Based Keyframe Selection for Human Motion Animation." In Proceedings of 11th International Conference on Parallel and Distributed Systems, July 20-22, 2005, Fukuoka: 182-185.
  31. TV-Anytime Forum. 2005. TV Anytime Forum. [online] [cited 2016. 5. 16.]
  32. Wang, M. et al. 2012. "Event Driven Web Video Summarization by Tag Localization and Key-Shot Identification." IEEE Transactions on Multimedia, 14(4): 975-985. https://doi.org/10.1109/TMM.2012.2185041
  33. Yokoi, K., Nakai, H. and Sato, T. 2008. "Toshiba at TRECVID 2008: Surveillance Event Detection Task." In Proceedings of the TRECVID 2008 Workshop, November 17-18, 2008, Gaithersburg, MD: National Institute of Standards and Technology.