• Title/Summary/Keyword: spatio-temporal pyramid

On the Hybrid Prediction Pyramid Compatible Coding Technique (혼성 예측 피라미드 호환 부호화 기법)

  • 이준서;이상욱
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.1 / pp.33-46 / 1996
  • In this paper, we investigate compatible coding techniques, which have received much interest since the introduction of HDTV. First, we analyze the theoretical transform coding gains of various hierarchical decomposition techniques, namely subband, pyramid, and DCT-based decomposition. The spatial-domain techniques are shown to provide higher transform coding gains than the DCT-based coding technique. Second, we compare the performance of these spatial-domain techniques in terms of PSNR under various rate allocations to each layer. Based on these analyses, we believe that pyramid decomposition is more appropriate for compatible coding. We also propose a hybrid prediction pyramid coding technique that combines the spatio-temporal prediction of MPEG-2 [3] with adaptive MC (motion compensation) [1]. The proposed technique additionally employs adaptive DCT coefficient scanning to exploit the directional information of the 2nd-layer signal. In computer simulations, the proposed hybrid prediction with adaptive scanning improves PSNR by about 0.46-1.78 dB over adaptive MC [1] at a low 1st-layer rate (about 0.1 bpp), and by about 0.33-0.63 dB over spatio-temporal prediction [3] at high 1st-layer rates (about 0.32-0.43 bpp).

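The pyramid decomposition behind compatible coding can be illustrated with a minimal two-layer sketch. This is not the paper's codec: the 2x2 block-average downsampler, the nearest-neighbour upsampler, and all function names are assumptions chosen for brevity, and prediction, quantization, and entropy coding are omitted. It only shows that the low-resolution 1st layer is usable on its own, while the 2nd layer carries the residual needed to restore the full-resolution frame.

```python
import numpy as np

def downsample(img):
    # Assumed filter: average each 2x2 block (height and width must be even).
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # Assumed interpolator: nearest-neighbour replication by a factor of 2.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def pyramid_encode(frame):
    # 1st layer: low-resolution image that a low-resolution decoder could use alone.
    base = downsample(frame)
    # 2nd layer: what the interpolated 1st layer fails to predict.
    residual = frame - upsample(base)
    return base, residual

def pyramid_decode(base, residual):
    # A full-resolution decoder adds the residual back to the interpolated base.
    return upsample(base) + residual

rng = np.random.default_rng(0)
frame = rng.random((8, 8))
base, residual = pyramid_encode(frame)
restored = pyramid_decode(base, residual)
```

Without quantization the two layers reconstruct the frame exactly; in a real coder the rate allocated to each layer determines how much distortion each one contributes.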

Object-Based Video Segmentation Using Spatio-temporal Entropic Thresholding and Camera Panning Compensation (시공간 엔트로피 임계법과 카메라 패닝 보상을 이용한 객체 기반 동영상 분할)

  • 백경환;곽노윤
    • Journal of the Korea Academia-Industrial cooperation Society / v.4 no.3 / pp.126-133 / 2003
  • This paper presents a morphological segmentation method for extracting the moving object from a video sequence using global motion compensation and two-dimensional spatio-temporal entropic thresholding. First, global motion compensation is performed with a camera panning vector estimated in a hierarchical pyramid structure constructed by the wavelet transform. Second, the regions likely to contain the moving object between two consecutive frames are extracted block by block from the global-motion-compensated image using two-dimensional spatio-temporal entropic thresholding. A LUT is then built that classifies each block as a changed, uncertain, or stationary block according to the thresholding results. Next, by adaptively selecting the initial search layer and the search range with reference to the LUT, the proposed HBMA carries out fast motion estimation and extracts the object-included region in the hierarchical pyramid structure. Finally, we define the thresholded gradient image in the object-included region, apply the morphological segmentation method to that region pixel by pixel, and extract the moving object from the video sequence. Computer simulation results show that the proposed method provides relatively good segmentation of the moving object and, in particular, yields reasonable segmentation in edge areas with lower contrast.

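As a rough illustration of entropic thresholding, the sketch below applies a one-dimensional maximum-entropy (Kapur-style) threshold to the absolute difference of two frames. This is a simpler stand-in for the paper's two-dimensional spatio-temporal variant, which the abstract does not specify in detail. The synthetic frames and the function name `kapur_threshold` are assumptions made for the example.

```python
import numpy as np

def kapur_threshold(values, bins=256):
    # Histogram of integer-valued differences; bin i holds the value i.
    hist, _ = np.histogram(values, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    best_t, best_entropy = 0, -np.inf
    for t in range(bins - 1):
        w0 = p[: t + 1].sum()   # probability mass at or below the threshold
        w1 = p[t + 1 :].sum()   # probability mass above the threshold
        if w0 <= 0 or w1 <= 0:
            continue
        p0 = p[: t + 1] / w0
        p1 = p[t + 1 :] / w1
        h0 = -np.sum(p0[p0 > 0] * np.log(p0[p0 > 0]))
        h1 = -np.sum(p1[p1 > 0] * np.log(p1[p1 > 0]))
        # Keep the threshold that maximizes the sum of the two class entropies.
        if h0 + h1 > best_entropy:
            best_t, best_entropy = t, h0 + h1
    return best_t

# Synthetic frame pair (assumed data): small background changes plus one moving block.
prev = np.zeros((32, 32))
curr = (np.arange(32 * 32) % 6).reshape(32, 32).astype(float)
curr[8:16, 8:16] = 100 + (np.arange(64) % 16).reshape(8, 8)

diff = np.abs(curr - prev)
t = kapur_threshold(diff)
changed = diff > t   # candidate moving-object pixels
```

In a block-based scheme like the one described above, the per-pixel mask `changed` would be aggregated per block before the blocks are labeled as changed, uncertain, or stationary.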

A Tree Regularized Classifier-Exploiting Hierarchical Structure Information in Feature Vector for Human Action Recognition

  • Luo, Huiwu;Zhao, Fei;Chen, Shangfeng;Lu, Huanzhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.3 / pp.1614-1632 / 2017
  • Bag of visual words is a popular model in human action recognition, but it usually suffers from the loss of spatial and temporal configuration information of local features and from large quantization error in its feature coding procedure. To overcome these two deficiencies, we combine sparse coding with a spatio-temporal pyramid for human action recognition and regard this method as the baseline. More importantly, and as the focus of this paper, we find that the feature vector constructed by the baseline method has a hierarchical structure. To exploit this hierarchical structure information for better recognition accuracy, we propose a tree regularized classifier. The main contributions of this paper are as follows. First, we introduce a tree regularized classifier that encodes the hierarchical structure information in the feature vector for human action recognition. Second, we present an optimization algorithm to learn the parameters of the proposed classifier. Third, we evaluate the proposed classifier on the YouTube, Hollywood2, and UCF50 datasets; the experimental results show that it obtains better performance than SVM and other popular classifiers and achieves promising results on all three datasets.
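
The hierarchical structure in a spatio-temporal pyramid feature vector can be seen in a small sketch. It assumes local features with normalized (x, y, t) positions and non-negative code vectors, and pools them by element-wise maximum over a 1x1x1 grid and a 2x2x2 grid. The pooling operator, the number of levels, and the name `st_pyramid_pool` are assumptions; the sparse coding step and the tree regularized classifier itself are not shown.

```python
import numpy as np

def st_pyramid_pool(positions, codes, levels=2):
    # positions: (n, 3) array of (x, y, t) in [0, 1]; codes: (n, k) non-negative codes.
    n, k = codes.shape
    pooled = []
    for level in range(levels):
        cells = 2 ** level   # cells per axis at this pyramid level
        idx = np.minimum((positions * cells).astype(int), cells - 1)
        flat = (idx[:, 0] * cells + idx[:, 1]) * cells + idx[:, 2]
        level_feat = np.zeros((cells ** 3, k))
        # Unbuffered element-wise max of all codes that fall into the same cell.
        np.maximum.at(level_feat, flat, codes)
        pooled.append(level_feat.ravel())
    # Final vector: the whole-video cell first, then the 8 sub-volume cells.
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
positions = rng.random((50, 3))
codes = np.abs(rng.standard_normal((50, 4)))
feature = st_pyramid_pool(positions, codes)

root = feature[:4]                    # level-0 cell
children = feature[4:].reshape(8, 4)  # level-1 cells
```

Each level-0 entry is the maximum of the matching entries in its eight level-1 children, so the feature dimensions form a tree. That parent-child grouping is the kind of structure a tree-structured regularizer could exploit.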