• Title/Summary/Keyword: Video understanding

A Practical Digital Video Database based on Language and Image Analysis

  • Liang, Yiqing
    • Proceedings of the Korea Database Society Conference
    • /
    • 1997.10a
    • /
    • pp.24-48
    • /
    • 1997
  • Supported by: DARPA's Image Understanding (IU) program under the "Video Retrieval Based on Language and Image Analysis" project; DARPA's Computer Assisted Education and Training Initiative (CAETI) program. Objective: Develop practical systems for automatic understanding and indexing of video sequences using both audio and video tracks. (omitted)

Trends in Video Visual Relationship Understanding (비디오 시각적 관계 이해 기술 동향)

  • Y.J. Kwon;D.H. Kim;J.H. Kim;S.C. Oh;J.S. Ham;J.Y. Moon
    • Electronics and Telecommunications Trends
    • /
    • v.38 no.6
    • /
    • pp.12-21
    • /
    • 2023
  • Visual relationship understanding in computer vision makes it possible to recognize meaningful relationships between objects in a scene. This technology enables the extraction of representative information from visual content. We discuss the technology of visual relationship understanding, focusing specifically on videos. We first introduce the concepts of visual relationship understanding in videos and then explore the latest techniques. Next, we present benchmark datasets commonly used in video visual relationship understanding. Finally, we discuss future research directions in video visual relationship understanding.

Object Motion Analysis and Interpretation in Video

  • Song, Dan;Cho, Mi-Young;Kim, Pan-Koo
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10b
    • /
    • pp.694-696
    • /
    • 2004
  • As video capabilities grow more sophisticated, object motion analysis and interpretation have become a fundamental task in computer vision. To this end, we first apply a sum of absolute differences (SAD) algorithm to scene-based motion detection. We then focus on representing the moving objects in the scene using spatio-temporal relations. The video can thus be explained comprehensively from two aspects: the relations between moving objects and the intervals of video events.
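
The sum-of-absolute-differences step mentioned in this abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so the block size, the threshold, and the names `block_sad` and `detect_motion` are assumptions made for illustration only, not the authors' method.

```python
from typing import List, Tuple

# A grayscale frame as rows of pixel intensities (0-255).
Frame = List[List[int]]


def block_sad(prev: Frame, curr: Frame, top: int, left: int, size: int) -> int:
    """Sum of absolute differences over one size x size block of two frames."""
    return sum(
        abs(curr[r][c] - prev[r][c])
        for r in range(top, top + size)
        for c in range(left, left + size)
    )


def detect_motion(
    prev: Frame, curr: Frame, size: int = 2, threshold: int = 40
) -> List[Tuple[int, int]]:
    """Return the (top, left) corner of every block whose SAD exceeds the threshold.

    The block size and threshold are illustrative values, not taken from the paper.
    """
    height, width = len(curr), len(curr[0])
    moving = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            if block_sad(prev, curr, top, left, size) > threshold:
                moving.append((top, left))
    return moving


if __name__ == "__main__":
    previous = [[10] * 4 for _ in range(4)]
    current = [row[:] for row in previous]
    # A bright object appears in the bottom-right 2 x 2 block.
    for r in (2, 3):
        for c in (2, 3):
            current[r][c] = 200
    print(detect_motion(previous, current))
```

In this toy input only the bottom-right block changes, so it is the only block reported as moving. A practical system would likely operate on real decoded frames and tune the threshold to sensor noise.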

Visual Verb and ActionNet Database for Semantic Visual Understanding (동영상 시맨틱 이해를 위한 시각 동사 도출 및 액션넷 데이터베이스 구축)

  • Bae, Changseok;Kim, Bo Kyeong
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.14 no.5
    • /
    • pp.19-30
    • /
    • 2018
  • Visual information understanding is known as one of the most difficult and challenging problems in the realization of machine intelligence. This paper proposes deriving visual verbs and constructing the ActionNet database as a video database for video semantic understanding. Although the development of AI (artificial intelligence) algorithms has contributed to a large part of modern advances in AI technologies, large databases for algorithm development and testing play a great role as well. As the performance of object recognition algorithms on still images surpasses human ability, research interest is shifting to semantic understanding of video content. This paper proposes candidate visual verbs required for constructing ActionNet as a training and test database for video understanding. To this end, we first investigate verb taxonomy in linguistics, and then propose candidate visual verbs based on a video description database and verb frequency. Based on the derived visual verb candidates, we have defined and constructed the ActionNet schema and database. By expanding the usability of the ActionNet database in an open environment, we expect to contribute to the development of video understanding technologies.

Extensible Hierarchical Method of Detecting Interactive Actions for Video Understanding

  • Moon, Jinyoung;Jin, Junho;Kwon, Yongjin;Kang, Kyuchang;Park, Jongyoul;Park, Kyoung
    • ETRI Journal
    • /
    • v.39 no.4
    • /
    • pp.502-513
    • /
    • 2017
  • For video understanding, namely analyzing who did what in a video, actions along with objects are primary elements. Most studies on actions have handled recognition problems for a well-trimmed video and focused on enhancing their classification performance. However, action detection, including localization as well as recognition, is required because, in general, actions intersect in time and space. In addition, most studies have not considered extensibility for a newly added action that has been previously trained. Therefore, proposed in this paper is an extensible hierarchical method for detecting generic actions, which combine object movements and spatial relations between two objects, and inherited actions, which are determined by the related objects through an ontology and rule based methodology. The hierarchical design of the method enables it to detect any interactive actions based on the spatial relations between two objects. The method using object information achieves an F-measure of 90.27%. Moreover, this paper describes the extensibility of the method for a new action contained in a video from a video domain that is different from the dataset used.
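
The two-level idea in this abstract (generic actions from spatial relations between two objects, then inherited actions selected by rules over the related objects) can be sketched as follows. Everything concrete here is hypothetical: the action names, the distance-change rule, the `eps` tolerance, and the `INHERITED_RULES` table standing in for the ontology are assumptions for illustration, not the method described in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Box:
    """A detected object, reduced to a label and the centre of its bounding box."""
    label: str
    x: float
    y: float


def distance(a: Box, b: Box) -> float:
    """Euclidean distance between two box centres."""
    return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5


def generic_action(before: Tuple[Box, Box], after: Tuple[Box, Box], eps: float = 1.0) -> str:
    """Classify a generic action from how the distance between two objects changes.

    The three action names and the eps tolerance are illustrative assumptions.
    """
    change = distance(*after) - distance(*before)
    if change < -eps:
        return "approach"
    if change > eps:
        return "leave"
    return "stay"


# Hypothetical rule table standing in for an ontology:
# (generic action, first object label, second object label) -> inherited action.
INHERITED_RULES: Dict[Tuple[str, str, str], str] = {
    ("approach", "person", "car"): "walk_to_car",
    ("leave", "person", "car"): "walk_away_from_car",
}


def inherited_action(generic: str, a: Box, b: Box) -> Optional[str]:
    """Look up the inherited action for a generic action and an object pair, if any."""
    return INHERITED_RULES.get((generic, a.label, b.label))
```

In a design like this, supporting a new interactive action means adding a rule rather than retraining a classifier, which is one plausible reading of the extensibility the abstract describes.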

Producing Radiotherapy Guidance Movie for patients (방사선치료 안내동영상 제작)

  • Wang, Chul-Hwan;Kang, Seung-Hee;Moon, Bong-Ki;Park, Dong-Wook;Won, Yeong-Jin;Park, Kwang-Hyeon;Kim, Joo-Hyeon;Bang, Seung-Mi
    • Quality Improvement in Health Care
    • /
    • v.19 no.1
    • /
    • pp.56-61
    • /
    • 2013
  • Objectives: This video was produced to give our patients a better awareness of radiotherapy treatment and to relieve their anxiety and stress. It gives inexperienced patients a better understanding of the processes and expectations of radiotherapy. We produced a radiotherapy guidance video covering the workflow and method of radiotherapy. Through this indirect experience, it also improves patient satisfaction and understanding of radiotherapy, helping us provide high-quality health care for radiotherapy patients. Methods: We evaluated the effectiveness of the video compared with our existing verbal method. The evaluation criteria were: 1) patient satisfaction with the guidance, 2) a comparison of understanding of radiotherapy, 3) a comparison of the time needed to educate patients, and 4) the incident rate during radiotherapy. Results: Compared with the verbal explanation, patients had an increased level of understanding of the radiotherapy treatment. The time needed to educate patients decreased, and the level of incidents during treatment decreased because patients had a better understanding of the whole process. Conclusion: The audiovisual education increased patients' understanding of radiotherapy compared with verbal education. The video also helped patients cooperate in the treatment room, so that we can provide premium radiotherapy treatment. By reducing the treatment time and the education process, we improved the patients' overall experience.

A Study on Flow-emotion-state for Analyzing Flow-situation of Video Content Viewers (영상콘텐츠 시청자의 몰입상황 분석을 위한 몰입감정상태 연구)

  • Kim, Seunghwan;Kim, Cheolki
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.3
    • /
    • pp.400-414
    • /
    • 2018
  • Today's video content is required to interact with viewers in order to provide a more personalized experience than before. To provide a friendly experience from the systemic perspective of the video content, understanding and analyzing the viewer's situation must be considered first. For this purpose, it is effective to analyze the viewer's situation by understanding the viewer's state from the viewer's behavior(s) while watching the video content, and by classifying those behavior(s) into the viewer's emotion and state during the flow. The term 'Flow-emotion-state' presented in this study is the state of the viewer that is assumed from the emotions that subsequently occur in relation to the target video content, in a situation in which the viewer is already engaged in the viewing behavior. This Flow-emotion-state can be expected to help identify characteristics of the viewer's Flow-situation by observing and analyzing the gestures and facial expressions that serve as the viewer's input modality to the video content.

Robot Vision to Audio Description Based on Deep Learning for Effective Human-Robot Interaction (효과적인 인간-로봇 상호작용을 위한 딥러닝 기반 로봇 비전 자연어 설명문 생성 및 발화 기술)

  • Park, Dongkeon;Kang, Kyeong-Min;Bae, Jin-Woo;Han, Ji-Hyeong
    • The Journal of Korea Robotics Society
    • /
    • v.14 no.1
    • /
    • pp.22-30
    • /
    • 2019
  • For effective human-robot interaction, robots need not only to understand the current situational context well but also to convey their understanding to the human participant in an efficient way. The most convenient way to deliver a robot's understanding to the human participant is for the robot to express it using voice and natural language. Recently, artificial intelligence for video understanding and natural language processing has developed very rapidly, especially based on deep learning. Thus, this paper proposes a robot vision to audio description method using deep learning. The applied model is a pipeline of two deep learning models: one generates a natural language sentence from robot vision, and the other generates voice from the generated sentence. We also conduct a real robot experiment to show the effectiveness of our method in human-robot interaction.

The Impact of Video Quality and Image Size on the Effectiveness of Online Video Advertising on YouTube

  • Moon, Jang Ho
    • International Journal of Contents
    • /
    • v.10 no.4
    • /
    • pp.23-29
    • /
    • 2014
  • Online video advertising is now an increasingly important tool for marketers to reach and connect with their consumers. The purpose of this study was to empirically investigate the impact of video format on online video advertising. More specifically, this study aimed to explore whether online video quality and image size influence viewer responses toward online video advertising. The results of an experimental study on YouTube suggested that enhanced video quality of online advertising may have an important impact on the effectiveness of the advertising, and that the concept of presence is a key to understanding the effects of enhanced video quality in online advertising.

Social Pedestrian Group Detection Based on Spatiotemporal-oriented Energy for Crowd Video Understanding

  • Huang, Shaonian;Huang, Dongjun;Khuhroa, Mansoor Ahmed
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.8
    • /
    • pp.3769-3789
    • /
    • 2018
  • Social pedestrian groups are the basic elements that constitute a crowd; therefore, detection of such groups is scientifically important for modeling social behavior, as well as practically useful for crowd video understanding. A social group refers to a cluster of members who tend to keep a similar motion state for a sustained period of time. One of the main challenges of social group detection arises from the complex dynamic variations of crowd patterns. Therefore, most works model dynamic groups to analyze crowd behavior, ignoring the existence of stationary groups in crowd scenes. In this paper, however, we propose a novel unified framework for detecting social pedestrian groups in crowd videos, including dynamic and stationary pedestrian groups, based on spatiotemporal-oriented energy measurements. Dynamic pedestrian groups are hierarchically clustered based on energy flow similarities and trajectory motion correlations between the atomic groups extracted from principal spatiotemporal-oriented energies. Furthermore, the probability distribution of static spatiotemporal-oriented energies is modeled to detect stationary pedestrian groups. Extensive experiments on challenging datasets demonstrate that our method can achieve superior results for social pedestrian group detection and crowd video classification.
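
As a rough illustration of grouping pedestrians by motion similarity, the sketch below merges tracks that are close together and move in a similar direction. It is a deliberately simplified stand-in: it uses threshold-based single-linkage merging with union-find over velocity vectors, not the spatiotemporal-oriented energy measurements of the paper, and it does not model stationary groups. The `Track` layout, the thresholds `max_dist` and `min_sim`, and the function names are assumptions.

```python
from math import hypot
from typing import Dict, List, Sequence, Tuple

# (x, y, vx, vy): a pedestrian's position and mean velocity (hypothetical layout).
Track = Tuple[float, float, float, float]


def motion_similarity(a: Track, b: Track) -> float:
    """Cosine similarity of two velocity vectors (0.0 if either track is stationary)."""
    norm_a, norm_b = hypot(a[2], a[3]), hypot(b[2], b[3])
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return (a[2] * b[2] + a[3] * b[3]) / (norm_a * norm_b)


def group_tracks(
    tracks: Sequence[Track], max_dist: float = 2.0, min_sim: float = 0.9
) -> List[List[int]]:
    """Merge tracks that are spatially close and moving alike; return groups of indices."""
    parent = list(range(len(tracks)))

    def find(i: int) -> int:
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(tracks)):
        for j in range(i + 1, len(tracks)):
            gap = hypot(tracks[i][0] - tracks[j][0], tracks[i][1] - tracks[j][1])
            if gap <= max_dist and motion_similarity(tracks[i], tracks[j]) >= min_sim:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(tracks)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For example, two nearby tracks walking in almost the same direction end up in one group, while a nearby track walking the opposite way and a distant track each stay in their own group. Repeating the merge with progressively looser thresholds would give a coarse hierarchy of groups.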