Title/Summary/Keyword: Video sequence


Caption Extraction in News Video Sequence using Frequency Characteristic

  • Youglae Bae;Chun, Byung-Tae;Seyoon Jeong
    • Proceedings of the IEEK Conference / 2000.07b / pp.835-838 / 2000
  • Popular methods for extracting text regions from video images are generally based on whole-image analysis, such as merge-and-split methods or the comparison of two frames, and therefore require long computing times. This paper suggests a faster method of extracting text regions that does not process the whole image. The proposed method uses line sampling, the FFT, and neural networks to extract text in real time. Text areas generally lie in the higher-frequency domain and can therefore be characterized with the FFT; candidate text areas are found by feeding these high-frequency characteristics to a neural network, and the final text area is extracted by verifying the candidates. Experimental results show a perfect candidate-extraction rate and a text-extraction rate of about 92%. The strengths of the proposed algorithm are its simplicity, real-time processing achieved by not processing the entire image, and fast skipping of images that contain no text.

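A minimal sketch of the line-sampling/FFT idea described in the abstract above, not the authors' implementation: rows are sampled at intervals and flagged as text candidates when a large fraction of their spectral energy sits above a cutoff frequency. The function names, the cutoff, the sampling step, and the threshold are all illustrative, and the neural-network verification stage is omitted.

```python
import numpy as np

def high_freq_energy(scanline, cutoff=0.25):
    """Fraction of spectral energy above a cutoff frequency for one sampled row."""
    spectrum = np.abs(np.fft.rfft(scanline - scanline.mean()))
    k = int(len(spectrum) * cutoff)
    total = spectrum.sum() + 1e-9
    return spectrum[k:].sum() / total

def candidate_text_rows(gray, step=8, threshold=0.4):
    """Sample every `step`-th row and keep rows whose high-frequency ratio is large."""
    return [y for y in range(0, gray.shape[0], step)
            if high_freq_energy(gray[y].astype(float)) > threshold]
```

Because only sampled lines are transformed, the cost scales with the number of sampled rows rather than the full image area, which is the source of the speed-up the abstract claims.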

A Segmentation Method for a Moving Object on A Static Complex Background Scene. (복잡한 배경에서 움직이는 물체의 영역분할에 관한 연구)

  • Park, Sang-Min;Kwon, Hui-Ung;Kim, Dong-Sung;Jeong, Kyu-Sik
    • The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.3 / pp.321-329 / 1999
  • Moving-object segmentation extracts an object of interest from consecutive image frames and has been used for factory automation, autonomous navigation, video surveillance, and VOP (Video Object Plane) detection in MPEG-4. This paper proposes a new segmentation method in which difference images, computed from three consecutive input frames, are used to calculate both a coarse object area (AI) and its movement area (OI). The AI is extracted by removing the background with background area projection (BAP). Missing parts of the AI are recovered with the help of the OI: the boundary information of the OI confines the missing parts of the object and provides initial curves for active-contour optimization. The optimized contours, together with the AI, form the boundaries of the moving object. Experimental results for a fast-moving object on a complex background scene are included.

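The three-frame difference at the core of the method above can be sketched as follows; this is the classic double frame difference, not the paper's full pipeline (BAP and active-contour refinement are omitted), and the threshold is illustrative.

```python
import numpy as np

def motion_mask(prev, curr, nxt, thresh=20):
    """Coarse moving-object mask from three consecutive frames:
    a pixel counts as 'moving' only if it differs from BOTH the
    previous and the next frame (double frame difference)."""
    d1 = np.abs(curr.astype(int) - prev.astype(int)) > thresh
    d2 = np.abs(nxt.astype(int) - curr.astype(int)) > thresh
    return d1 & d2
```

Intersecting the two difference images suppresses the "ghost" region an object leaves behind in a single frame difference, isolating its position in the current frame.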

Trends in Online Action Detection in Streaming Videos (온라인 행동 탐지 기술 동향)

  • Moon, J.Y.;Kim, H.I.;Lee, Y.J.
    • Electronics and Telecommunications Trends / v.36 no.2 / pp.75-82 / 2021
  • Online action detection (OAD) in streaming video is an attractive research area that has aroused interest lately. Although most studies of action understanding have considered action recognition in well-trimmed videos and offline temporal action detection in untrimmed videos, online action detection methods are required to monitor action occurrences in streaming videos. OAD predicts action probabilities for the current frame or frame sequence using a fixed-size video segment that includes past and current frames. In this article, we discuss deep learning-based OAD models, investigate OAD evaluation methodologies, including benchmark datasets and performance measures, and compare the performances of the presented OAD models.
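The causal, fixed-size-segment formulation of OAD described above can be sketched as a sliding window; the scoring function stands in for whatever deep model is used, and the window size is illustrative.

```python
import numpy as np

def online_detect(frame_features, score_fn, window=8):
    """Emit a per-frame action score causally: each prediction may use
    only the fixed-size window of past and current frame features."""
    scores = []
    for t in range(len(frame_features)):
        seg = frame_features[max(0, t - window + 1): t + 1]
        scores.append(score_fn(np.stack(seg)))
    return scores
```

The key contrast with offline temporal action detection is visible in the slicing: frame `t`'s score never depends on frames after `t`.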

Intelligent Activity Recognition based on Improved Convolutional Neural Network

  • Park, Jin-Ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.25 no.6 / pp.807-818 / 2022
  • To further improve the accuracy and time efficiency of behavior recognition in intelligent monitoring scenarios, a human behavior recognition algorithm based on YOLO combined with LSTM and CNN is proposed. Exploiting the real-time nature of YOLO target detection, the specific behavior in the surveillance video is first detected in real time, and deep feature extraction is performed after obtaining the target's size, location, and other information. Noise data from irrelevant areas of the image are then removed. Finally, an LSTM models the resulting time series, and the final behavior discrimination is made on the action sequence in the surveillance video. Experiments on the MSR and KTH datasets show that the average recognition rate reaches 98.42% and 96.6%, respectively, with average recognition times of 210 ms and 220 ms. The proposed method performs well for intelligent behavior recognition.
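The "remove noise data from irrelevant areas" step above can be sketched as masking everything outside the detector's person boxes before feature extraction; the box format and function name are illustrative, and the YOLO and LSTM stages themselves are not reproduced here.

```python
import numpy as np

def crop_to_detections(frame, boxes):
    """Zero out everything outside the detected boxes so that later
    feature extraction ignores irrelevant background.
    `boxes` holds (x0, y0, x1, y1) tuples in pixel coordinates."""
    mask = np.zeros(frame.shape[:2], dtype=bool)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = True
    return np.where(mask[..., None], frame, 0)
```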

Interframe interpolation technique based on variable skip rate (가변 스킵율 기반의 프레임간 보간 기법)

  • Kim, Dong-wook;Choi, Yeon-sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.3B / pp.510-518 / 2000
  • A new video interpolation technique based on a variable skip rate for the video sequence is proposed in this paper. In the proposed technique, whether a frame is skipped is determined by the motion complexity of that frame: if the motion complexity is low, the frame is skipped; otherwise, it is coded and transmitted. To determine the motion complexity of a frame, a new measure called MEF (moving edge in frame), the set of pixels considered to be moving edges in a frame, is introduced. During decoding and interpolation at the receiver, the motion field is segmented; morphological filtering is applied to divide the vector field and also to smooth the boundaries between the changed and unchanged regions. Simulation results show that, at the same bit rate, the proposed technique yields higher picture quality and lower fluctuation of picture quality than conventional techniques.

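A minimal sketch of an MEF-style skip decision, assuming a simple horizontal-gradient edge test; the paper's exact edge detector, thresholds, and the decoder-side morphological interpolation are not reproduced, and all numeric thresholds here are illustrative.

```python
import numpy as np

def moving_edge_count(prev, curr, diff_thresh=15, edge_thresh=30):
    """MEF-style measure: count pixels that are both edges (strong
    horizontal gradient) and changed since the previous frame."""
    grad = np.abs(np.diff(curr.astype(int), axis=1))
    edges = np.pad(grad, ((0, 0), (0, 1))) > edge_thresh  # pad back to full width
    changed = np.abs(curr.astype(int) - prev.astype(int)) > diff_thresh
    return int((edges & changed).sum())

def should_skip(prev, curr, complexity_thresh=50):
    """Skip (leave for decoder-side interpolation) low-complexity frames."""
    return moving_edge_count(prev, curr) < complexity_thresh
```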

A Depth Mapping Method for 3DoF+ Video Coding (3DoF+ 비디오 부호화를 위한 깊이 매핑 기법)

  • Park, Ji-Hun;Lee, Jun-Sung;Park, Dohyeon;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.295-296 / 2020
  • The MPEG-I Visual group, which is developing the 3DoF+ video coding standard, is also developing TMIV (Test Model for Immersive Video), the reference software codec used in the standardization process. TMIV efficiently compresses the texture and depth videos of views acquired simultaneously at multiple positions within a restricted space, and provides view rendering at arbitrary viewpoints. The bit-depth scaling and compression of the depth video performed in TMIV cause a loss of depth information, which degrades the quality of the rendered arbitrary-viewpoint video. This paper proposes a piece-wise depth mapping method based on histogram equalization for more efficient depth video compression. Experimental results confirm that the proposed method achieves an average bit-rate saving of 3.1% in end-to-end coding performance on natural sequences.

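A simplified sketch of histogram-equalization depth mapping: output codes are reallocated toward densely populated depth ranges before quantization. This is a global equalization, a simplification of the paper's piece-wise scheme, and the 8-bit level count is an assumption.

```python
import numpy as np

def equalize_depth(depth, levels=256):
    """Map depth values through the normalized CDF of their histogram,
    so heavily populated depth ranges receive more output codes."""
    hist, _ = np.histogram(depth, bins=levels, range=(0, levels))
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min() + 1e-9) * (levels - 1)
    return cdf[depth.astype(int)]
```

After the mapping, nearby depths in a crowded range are pushed further apart, so the loss from subsequent bit-depth scaling and coding falls where most of the scene's depth values actually lie.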

Human Activity Recognition Using Spatiotemporal 3-D Body Joint Features with Hidden Markov Models

  • Uddin, Md. Zia;Kim, Jaehyoun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2767-2780 / 2016
  • Video-based human-activity recognition has become increasingly popular due to prominent applications in a variety of fields such as computer vision, image processing, smart-home healthcare, and human-computer interaction. The essential goal of a video-based activity-recognition system is to provide behavior-based information that proactively assists a person with his or her tasks. The target of this work is a novel approach to human-activity recognition using body-joint features extracted from depth videos. From the depth silhouette images, direction and magnitude features are first obtained for each connected body-joint pair and later augmented with the motion direction and magnitude features of each joint in the next frame. Generalized discriminant analysis (GDA) is applied to make the spatiotemporal features more robust, and the time-sequence features are then fed into a hidden Markov model (HMM) to train each activity. Lastly, all trained activity HMMs are used for depth-video activity recognition.
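The per-frame joint-pair features described above can be sketched as a direction (unit vector) plus magnitude for each connected pair; the two-pair skeleton here is hypothetical, and the GDA and HMM stages are not reproduced.

```python
import numpy as np

def joint_pair_features(joints):
    """Direction and magnitude features for each connected body-joint
    pair in one frame. `joints` is an (N, 3) array of 3-D positions;
    `pairs` is an illustrative stand-in for the skeleton connectivity."""
    pairs = [(0, 1), (1, 2)]  # e.g. head-neck, neck-torso (hypothetical)
    feats = []
    for a, b in pairs:
        v = joints[b] - joints[a]
        mag = np.linalg.norm(v)
        feats.append(np.concatenate([v / (mag + 1e-9), [mag]]))
    return np.concatenate(feats)
```

Stacking these vectors over frames yields the time-sequence features that, after GDA, would be fed to one HMM per activity.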

Energy Minimization Based Semantic Video Object Extraction

  • Kim, Dong-Hyun;Choi, Sung-Hwan;Kim, Bong-Joe;Shin, Hyung-Chul;Sohn, Kwang-Hoon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.07a / pp.138-141 / 2010
  • In this paper, we propose a semi-automatic method for semantic video object extraction which extracts meaningful objects from an input sequence with one correctly segmented training image. Given one correctly segmented image acquired by the user's interaction in the first frame, the proposed method automatically segments and tracks the objects in the following frames. We formulate the semantic object extraction procedure as an energy minimization problem at the fragment level instead of pixel level. The proposed energy function consists of two terms: data term and smoothness term. The data term is computed by considering patch similarity, color, and motion information. Then, the smoothness term is introduced to enforce the spatial continuity. Finally, iterated conditional modes (ICM) optimization is used to minimize energy function in a globally optimal manner. The proposed semantic video object extraction method provides faithful results for various types of image sequences.

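The ICM optimization of a data-plus-smoothness energy described above can be sketched on a 1-D chain of fragments; the paper's actual data term (patch similarity, color, motion) and fragment graph are replaced here by a given cost matrix and chain neighbors, and `beta` is illustrative.

```python
import numpy as np

def icm_segment(data_cost, beta=1.0, iters=5):
    """Iterated conditional modes on a chain of fragments: each fragment
    greedily picks the label minimizing its data cost plus a smoothness
    penalty `beta` for each neighbor holding a different label."""
    labels = data_cost.argmin(axis=1)  # initialize from the data term alone
    n, k = data_cost.shape
    for _ in range(iters):
        for i in range(n):
            costs = data_cost[i].copy()
            for j in (i - 1, i + 1):
                if 0 <= j < n:
                    costs += beta * (np.arange(k) != labels[j])
            labels[i] = costs.argmin()
    return labels
```

The smoothness term is what repairs isolated fragments whose data term is weakly wrong, as the test below shows.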

A New Coding Method for Improving the Performance of MPEG-4 Part 10 Video Coding Standard (MPEG-4 Part 10 동영상 압축 표준 성능 개선을 위한 새로운 부호화 방식)

  • Moon, Yong-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.11C / pp.1058-1065 / 2006
  • In this paper, we propose a new motion vector coding algorithm suited to the MPEG-4 Part 10 video coding standard. In the proposed algorithm, the amount of motion for a given video sequence is predicted using a characteristic of the motion vector distribution of the neighboring blocks as well as the MB_type, a syntax element of the standard. Either the independent coding method or the combined coding method is then adaptively employed to compress the motion vector difference. Simulation results show that the proposed algorithm outperforms conventional methods without additional memory or computation.
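The abstract does not specify the exact decision rule, so the following is only an invented stand-in illustrating the shape of such an adaptive mode choice: the spread of the neighboring blocks' motion vectors drives the selection between the two coding modes. The threshold and the direction of the mapping are assumptions, not the paper's rule.

```python
def choose_mv_coding(neighbor_mvs, spread_thresh=4):
    """Illustrative adaptive choice between 'independent' and 'combined'
    coding of the motion-vector difference, driven by how scattered the
    neighboring blocks' motion vectors (x, y) are."""
    xs = [mv[0] for mv in neighbor_mvs]
    ys = [mv[1] for mv in neighbor_mvs]
    spread = (max(xs) - min(xs)) + (max(ys) - min(ys))
    return "combined" if spread <= spread_thresh else "independent"
```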

Real-time Face Localization for Video Monitoring (무인 영상 감시 시스템을 위한 실시간 얼굴 영역 추출 알고리즘)

  • 주영현;이정훈;문영식
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.11 / pp.48-56 / 1998
  • In this paper, a moving object detection and face region extraction algorithm which can be used in video monitoring systems is presented. The proposed algorithm is composed of two stages. In the first stage, each frame of an input video sequence is analyzed using three measures which are based on image pixel difference. If the current frame contains moving objects, their skin regions are extracted using color and frame difference information in the second stage. Since the proposed algorithm does not rely on computationally expensive features like optical flow, it is well suited for real-time applications. Experimental results tested on various sequences have shown the robustness of the proposed algorithm.

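The second stage above, combining skin color with frame-difference information, can be sketched as follows; the normalized-RGB skin thresholds and the difference threshold are illustrative, not the paper's values, and the first-stage motion measures are omitted.

```python
import numpy as np

def skin_mask(rgb):
    """Very rough skin-color rule in normalized RGB (illustrative thresholds)."""
    rgb = rgb.astype(float)
    s = rgb.sum(axis=2) + 1e-9
    r, g = rgb[..., 0] / s, rgb[..., 1] / s
    return (r > 0.35) & (r < 0.55) & (g > 0.25) & (g < 0.40)

def moving_skin(prev_gray, curr_gray, rgb, diff_thresh=15):
    """Candidate face pixels: skin-colored AND changed between frames."""
    moved = np.abs(curr_gray.astype(int) - prev_gray.astype(int)) > diff_thresh
    return skin_mask(rgb) & moved
```

Requiring both cues keeps the per-pixel cost far below optical flow, which is the real-time argument the abstract makes.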