Search | Korea Science

Online Video Synopsis via Multiple Object Detection

Lee, JaeWon;Kim, DoHyeon;Kim, Yoon
- Journal of the Korea Society of Computer and Information / v.24 no.8 / pp.19-28 / 2019
In this paper, an online video summarization algorithm based on multiple object detection is proposed. As crime has risen with recent rapid urbanization, public demand for safety has grown, and the installation of surveillance cameras such as closed-circuit television (CCTV) has increased in many cities. However, retrieving and analyzing the huge amount of video data from numerous CCTVs takes a great deal of time and labor. As a result, there is increasing demand for intelligent video recognition systems that can automatically detect and summarize the various events captured on CCTV. Video summarization generates a synopsis video from a long original video so that users can watch it in a short time. The proposed video summarization method is divided into two stages. The object extraction stage detects objects in the video and extracts the specific objects desired by the user. The video summary stage creates the final synopsis video from the objects extracted in the object extraction stage. Whereas existing methods do not consider the interactions between objects in the original video when generating the synopsis video, the proposed method uses a new object clustering algorithm that effectively preserves those interactions in the synopsis video. This paper also proposes an online optimization method that can efficiently summarize the large number of objects appearing in long videos. Finally, experimental results show that the proposed method outperforms the existing video synopsis algorithm.
https://doi.org/10.9708/jksci.2019.24.08.019
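The abstract does not say how the object clustering works, so the following is only a rough illustration of the general idea of keeping interacting objects together. It assumes (hypothetically) that each extracted object is summarized by a start frame, an end frame, and a representative position, and that two objects "interact" when their time spans overlap and their positions are within `max_dist`. The function name, the tuple layout, and the interaction criterion are all assumptions, not the authors' algorithm.

```python
def group_interacting(tubes, max_dist):
    """Group object tubes that overlap in time and are spatially close.

    tubes: dict mapping id -> (start_frame, end_frame, (x, y)).
    Returns a sorted list of groups, each a sorted list of ids.
    The criterion is an illustrative assumption, not the paper's method.
    """
    parent = {k: k for k in tubes}

    def find(k):
        # Union-find lookup with path halving.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    ids = sorted(tubes)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            sa, ea, (xa, ya) = tubes[a]
            sb, eb, (xb, yb) = tubes[b]
            overlap = sa <= eb and sb <= ea
            close = (xa - xb) ** 2 + (ya - yb) ** 2 <= max_dist ** 2
            if overlap and close:
                parent[find(a)] = find(b)

    groups = {}
    for k in ids:
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())


# Hypothetical tubes: A and B walk together; C is far away; D appears much later.
tubes = {
    "A": (0, 50, (10, 10)),
    "B": (30, 80, (12, 11)),
    "C": (40, 90, (200, 200)),
    "D": (300, 350, (11, 10)),
}
groups = group_interacting(tubes, max_dist=20)
```

A synopsis generator built on such groups could shift each group in time as a unit, so that objects that appeared together in the original video still appear together in the summary.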

Depth Images-based Human Detection, Tracking and Activity Recognition Using Spatiotemporal Features and Modified HMM

Kamal, Shaharyar;Jalal, Ahmad;Kim, Daijin
- Journal of Electrical Engineering and Technology / v.11 no.6 / pp.1857-1862 / 2016
Human activity recognition using depth information is an emerging and challenging technology in computer vision that has attracted considerable attention from practical applications such as smart home/office systems, personal health care, and 3D video games. This paper presents a novel framework for 3D human body detection, tracking, and recognition from depth video sequences using spatiotemporal features and a modified HMM. To detect human silhouettes, raw depth data is examined while considering spatial continuity and the constraints of human motion information. Frame differencing is then used to track human movements. The feature extraction mechanism combines spatial depth shape features and temporal joint features to improve classification performance. Both kinds of features are fused to recognize different activities using the modified hidden Markov model (M-HMM). The proposed approach is evaluated on two challenging depth video datasets. Moreover, the system can handle rotation of the subject's body parts and missing body parts, which are its major contributions to human activity recognition.
https://doi.org/10.5370/JEET.2016.11.6.1857
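As a minimal sketch of frame differencing in general (not the authors' implementation), the snippet below marks pixels whose depth value changed by more than a threshold between two frames and reports the centroid of the changed region. The frames are tiny hand-made grids, and the function names and the threshold value are assumptions chosen for illustration.

```python
def frame_difference(prev, curr, threshold=10):
    """Return a binary mask: 1 where the value changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev, curr)
    ]


def mask_centroid(mask):
    """Return the (row, col) centroid of the nonzero pixels, or None if empty."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pts:
        return None
    n = len(pts)
    return (sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n)


# A single bright "object" moves one pixel to the right between frames.
prev_frame = [[0, 0, 0, 0], [0, 50, 0, 0], [0, 0, 0, 0]]
curr_frame = [[0, 0, 0, 0], [0, 0, 50, 0], [0, 0, 0, 0]]

motion = frame_difference(prev_frame, curr_frame)
centroid = mask_centroid(motion)
```

Both the vacated pixel and the newly occupied pixel are flagged, which is why the centroid falls between them.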

Character Recognition and Search for Media Editing (미디어 편집을 위한 인물 식별 및 검색 기법)

Park, Yong-Suk;Kim, Hyun-Sik
- Journal of Broadcast Engineering / v.27 no.4 / pp.519-526 / 2022
Identifying and searching for characters appearing in scenes during multimedia video editing is an arduous and time-consuming process. Applying artificial intelligence to labor-intensive media editing tasks can greatly reduce media production time and improve the efficiency of the creative process. In this paper, a method is proposed that combines existing artificial intelligence based techniques to automate character recognition and search tasks for video editing. Object detection, face detection, and pose estimation are used for character localization, while face recognition and color space analysis are used to extract unique representation information.
https://doi.org/10.5909/JBE.2022.27.4.519

New developmental direction of telecommunications for Disabilities Welfare (장애인복지를 위한 정보통신의 발전방향)

박민수
- Journal of the Korea Institute of Information and Communication Engineering / v.4 no.1 / pp.35-43 / 2000
This paper studies the developmental direction of telecommunications for the welfare of persons with disabilities. The study uses the Delphi method. Disabilities are classified as motor disability, visual impairment, hearing impairment, and language and speech disorders. Persons with motor disabilities need the following: speech recognition technology, video recognition technology, and breath capacity recognition technology. Persons with visual impairments need the following: display recognition technology, speech recognition technology, text recognition technology, intelligence conversion handling technology, and video recognition and speech synthesis technology. Persons with hearing impairments or language and speech disorders need the following: speech signal handling technology, speech recognition technology, intelligence conversion handling technology, video recognition technology, and speech synthesis technology. The results of this study are as follows. First, a telecommunications organization for persons with disabilities must be established. Second, persons with disabilities need universal service. Third, persons with disabilities need information education. Fourth, telecommunications research needs support. Fifth, small telecommunications companies need support. Sixth, the software industry needs new development. Seventh, persons with disabilities need standard guidelines for telecommunications.

Image Processing for Video Images of Buoy Motion

Kim, Baeck-Oon;Cho, Hong-Yeon
- Ocean Science Journal / v.40 no.4 / pp.213-220 / 2005
In this paper, an image processing technique that reduces video images of buoy motion to time series of the image coordinates of buoy objects is investigated. The buoy motion images are noisy due to time-varying brightness as well as non-uniform background illumination. Boats, wakes, and wind-induced white caps interfere significantly with the recognition of buoy objects. Thus, semi-automated procedures consisting of object recognition and image measurement are used. These offer more satisfactory results than a manual process. Spectral analysis shows that the image coordinates of buoy objects represent wave motion well, indicating their usefulness in the analysis of wave characteristics.
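The abstract does not state which spectral method was used. As one plausible illustration, the sketch below applies a plain discrete Fourier transform to a synthetic time series of vertical image coordinates and reports the frequency with the most power. The sampling rate, the synthetic 0.5 Hz signal, and the function name are assumptions made for the example.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC DFT component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sample_rate / n


# Synthetic buoy track: 20 s of video at 10 frames/s, bobbing at 0.5 Hz
# around image row 100 with an amplitude of 5 pixels.
sample_rate = 10.0
y_coords = [
    100 + 5 * math.sin(2 * math.pi * 0.5 * i / sample_rate) for i in range(200)
]
peak_hz = dominant_frequency(y_coords, sample_rate)
```

For real footage a fast Fourier transform and some windowing would normally replace this direct sum; the direct form is used here only to keep the example dependency-free.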

Decomposed "Spatial and Temporal" Convolution for Human Action Recognition in Videos

Sediqi, Khwaja Monib;Lee, Hyo Jong
- Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.455-457 / 2019
In this paper, we study the effect of decomposed spatiotemporal convolutions for action recognition in videos. Our motivation comes from the empirical observation that spatial convolutions applied to individual frames of a video provide good performance in action recognition. In this research, we empirically show the accuracy of factorized convolutions on individual frames of video for action classification. We take 3D ResNet-18 as the baseline model for our experiment and factorize its 3D convolutions into 2D (spatial) and 1D (temporal) convolutions. We train the model from scratch on the Kinetics video dataset. We then fine-tune the model on the UCF-101 dataset and evaluate its performance. Our results show good accuracy, similar to that of state-of-the-art algorithms on the Kinetics and UCF-101 datasets.
https://doi.org/10.3745/PKIPS.y2019m05a.455
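To make the factorization concrete, the sketch below compares the weight count of a full 3D convolution with that of a 2D spatial convolution followed by a 1D temporal convolution. The channel sizes, the intermediate channel count `c_mid`, and the function names are illustrative assumptions; the abstract does not give the layer dimensions used by the authors, and biases are ignored.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel."""
    return c_in * c_out * t * d * d


def factorized_params(c_in, c_mid, c_out, t, d):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * d * d   # 2D convolution applied frame by frame
    temporal = c_mid * c_out * t     # 1D convolution across frames
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 x 3 kernel.
full = conv3d_params(c_in=64, c_out=64, t=3, d=3)
split = factorized_params(c_in=64, c_mid=64, c_out=64, t=3, d=3)
```

With these assumed sizes the factorized layer uses fewer than half the weights of the full 3D layer. Choosing a larger `c_mid` can bring the two counts closer when a parameter-matched comparison is wanted.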

Video augmentation technique for human action recognition using genetic algorithm

Nida, Nudrat;Yousaf, Muhammad Haroon;Irtaza, Aun;Velastin, Sergio A.
- ETRI Journal / v.44 no.2 / pp.327-338 / 2022
Classification models for human action recognition require robust features and large training sets for good generalization. When training sets are imbalanced, data augmentation methods are employed to achieve higher accuracy. However, because samples generated using data augmentation only reflect existing samples within the training set, their feature representations are less diverse and hence contribute to less precise classification. This paper presents new data augmentation and action representation approaches to grow training sets. The proposed approach is based on two fundamental concepts: virtual video generation for augmentation and representation of the action videos through robust features. Virtual videos are generated from the motion history templates of action videos, which are convolved using a convolutional neural network to generate deep features. Furthermore, guided by the objective function of the genetic algorithm, the spatiotemporal features of different samples are combined to generate the representations of the virtual videos, which are then classified with an extreme learning machine classifier on the MuHAVi-Uncut, iXMAS, and IAVID-1 datasets.
https://doi.org/10.4218/etrij.2019-0510
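The abstract mentions motion history templates without defining them. Assuming this refers to the commonly used motion history image, in which recently moving pixels are bright and older motion fades, a minimal update rule can be sketched as follows. The decay of one unit per frame, the value of `tau`, and the toy masks are assumptions for illustration, not details taken from the paper.

```python
def update_mhi(mhi, motion_mask, tau):
    """One motion-history update: set moving pixels to tau, decay the rest by 1."""
    return [
        [tau if moving else max(0, h - 1) for h, moving in zip(hrow, mrow)]
        for hrow, mrow in zip(mhi, motion_mask)
    ]


# A one-row "video" in which motion sweeps from left to right over three frames.
masks = [
    [[1, 0, 0]],
    [[0, 1, 0]],
    [[0, 0, 1]],
]

mhi = [[0, 0, 0]]
for mask in masks:
    mhi = update_mhi(mhi, mask, tau=3)
```

The final template encodes both where motion occurred and how recently, with the most recent position holding the largest value.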

Recognition of Human Facial Expression in a Video Image using the Active Appearance Model

Jo, Gyeong-Sic;Kim, Yong-Guk
- Journal of Information Processing Systems / v.6 no.2 / pp.261-268 / 2010
Tracking human facial expression within a video image has many useful applications, such as surveillance and teleconferencing. The Active Appearance Model (AAM) was initially proposed for facial recognition; however, it turns out that the AAM also has many advantages for continuous facial expression recognition. We have implemented a continuous facial expression recognition system using the AAM. In this study, we adopt an independent AAM using the Inverse Compositional Image Alignment method. The system was evaluated using the standard Cohn-Kanade facial expression database, and the results show that it could have numerous potential applications.
https://doi.org/10.3745/JIPS.2010.6.2.261

Temporal matching prior network for vehicle license plate detection and recognition in videos

Yoo, Seok Bong;Han, Mikyong
- ETRI Journal / v.42 no.3 / pp.411-419 / 2020
In real-world intelligent transportation systems, accuracy in vehicle license plate detection and recognition is considered quite critical. Many algorithms have been proposed for still images, but their accuracy on actual videos is not satisfactory. This stems from several problematic conditions in videos, such as vehicle motion blur, variety in viewpoints, outliers, and the lack of publicly available video datasets. In this study, we focus on these challenges and propose a license plate detection and recognition scheme for videos based on a temporal matching prior network. Specifically, to improve the robustness of detection and recognition accuracy in the presence of motion blur and outliers, forward and bidirectional matching priors between consecutive frames are properly combined with layer structures specifically designed for plate detection. We also built our own video dataset for the deep training of the proposed network. During network training, we perform data augmentation based on image rotation to increase robustness regarding the various viewpoints in videos.
https://doi.org/10.4218/etrij.2019-0245

Collaborative Place and Object Recognition in Video using Bidirectional Context Information (비디오에서 양방향 문맥 정보를 이용한 상호 협력적인 위치 및 물체 인식)

Kim, Sung-Ho;Kweon, In-So
- The Journal of Korea Robotics Society / v.1 no.2 / pp.172-179 / 2006
In this paper, we present a practical place and object recognition method for guiding visitors in building environments. Recognizing places or objects in the real world can be a difficult problem due to motion blur and camera noise. In this work, we present a modeling method based on the bidirectional interaction between places and objects, in which each reinforces the other for robust recognition. The unification of visual context including scene context, object context, and temporal context is also. The proposed system has been tested to guide visitors in a large-scale building environment (10 topological places, 80 3D objects).
