• Title/Summary/Keyword: Video Images

A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems
    • /
    • v.17 no.3
    • /
    • pp.556-570
    • /
    • 2021
  • Existing video expression recognition methods mainly focus on spatial feature extraction from video expression images but tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolutional neural network method is proposed to effectively improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolutional neural networks are used to extract spatiotemporal expression features: a spatial convolutional neural network extracts the spatial information features of each static expression image, while a temporal convolutional neural network extracts dynamic information features from the optical flow of multiple expression images. The spatiotemporal features learned by the two networks are then fused by multiplication. Finally, the fused features are input into a support vector machine to perform the facial expression classification. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, outperforming the other methods compared.
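
A minimal sketch of the multiplicative fusion and SVM stages described above, assuming the two networks emit feature vectors of equal length; the array shapes, random stand-in features, and RBF kernel are illustrative assumptions, not details from the paper:

```python
import numpy as np
from sklearn.svm import SVC

def fuse_multiplicative(spatial, temporal):
    """Element-wise product of the two CNN feature matrices (equal shape assumed)."""
    return spatial * temporal

rng = np.random.default_rng(0)
spatial = rng.random((100, 512))       # stand-in spatial-CNN features, one row per clip
temporal = rng.random((100, 512))      # stand-in temporal-CNN (optical-flow) features
labels = rng.integers(0, 6, size=100)  # six basic expression classes

fused = fuse_multiplicative(spatial, temporal)
clf = SVC(kernel="rbf").fit(fused, labels)  # final expression classifier
```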

Inter-view Balanced Disparity Estimation for Multiview Video Coding (다시점 영상에서 시점간 균형을 맞추는 변이 추정 알고리듬)

  • Yoon, Jae-Won;Kim, Yong-Tae;Sohn, Kwang-Hoon
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.435-436
    • /
    • 2006
  • When working with multi-view images, imbalances between the views pose a serious problem in multi-view video coding because they degrade the performance of disparity estimation. To overcome this problem, we propose an inter-view balanced disparity estimation method for multi-view video coding. In general, the imbalance problem can be solved by a preprocessing step that transforms the reference images linearly. However, such preprocessing has drawbacks, including the alteration of the original images. To obtain a balancing effect among the views without preprocessing, we perform block-based disparity estimation that incorporates several balancing parameters.
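
The abstract does not define its balancing parameters; one plausible reading, sketched below, folds a per-block linear (gain/offset) correction into the matching cost so the original images never need to be transformed. All names and the least-squares fit are assumptions:

```python
import numpy as np

def balanced_block_disparity(cur_block, ref_strip, max_disp):
    """For each candidate disparity d, fit a gain/offset pair (a, b) minimizing
    ||a * ref + b - cur||^2, then keep the d with the lowest balanced cost, so
    inter-view luminance imbalance is compensated inside the matching itself."""
    h, w = cur_block.shape
    c = cur_block.astype(float).ravel()
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        r = ref_strip[:, d:d + w].astype(float).ravel()  # shifted reference block
        a, b = np.polyfit(r, c, 1)                       # least-squares balancing params
        cost = np.sum((a * r + b - c) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```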

Scramble and Descramble Scheme on Multiple Images (다수의 영상에 대한 스크램블 및 디스크램블 방법)

  • Kim, Seung-Youl;You, Young-Gap
    • The Journal of the Korea Contents Association
    • /
    • v.6 no.6
    • /
    • pp.50-55
    • /
    • 2006
  • This paper presents a scheme that scrambles and descrambles images from multiple video channels. A combined image frame is formed by concatenating the incoming frames from the channels in a two-dimensional array. The algorithm applies an encryption scheme to the row and column numbers of the combined image frame, thereby yielding an encrypted combined image. The proposed algorithm encrypts multiple images at a time: it recomposes the images from multiple video channels into a single composite image and encrypts that composite image, resulting in higher security.
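
A toy sketch of the overall flow, assuming four equally sized channel frames tiled 2×2; the key-seeded row/column permutations stand in for the paper's encryption of row and column numbers, whose exact cipher is not specified here:

```python
import numpy as np

def scramble(frames, key):
    """Tile four equally sized channel frames into a 2x2 composite, then
    permute its rows and columns with key-seeded permutations (a stand-in
    for the paper's encryption of row/column numbers)."""
    composite = np.vstack([np.hstack(frames[:2]), np.hstack(frames[2:])])
    rng = np.random.default_rng(key)
    rp = rng.permutation(composite.shape[0])
    cp = rng.permutation(composite.shape[1])
    return composite[rp][:, cp], (rp, cp)

def descramble(scrambled, perms):
    rp, cp = perms
    undo_rows = np.empty_like(scrambled)
    undo_rows[rp] = scrambled          # invert the row permutation
    composite = np.empty_like(undo_rows)
    composite[:, cp] = undo_rows       # invert the column permutation
    return composite
```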

Video Sequence Matching Using Normalized Dominant Singular Values

  • Jeong, Kwang-Min;Lee, Joon-Jae
    • Journal of Korea Multimedia Society
    • /
    • v.12 no.6
    • /
    • pp.785-793
    • /
    • 2009
  • This paper proposes a signature based on dominant singular values for video sequence matching. Considering the input image as a matrix A, a partition procedure first separates the matrix into non-overlapping sub-images of a fixed size. Singular value decomposition (SVD) then factorizes each sub-image into singular values and singular vectors. From the resulting singular values of each sub-image, the k dominant values, which are sufficient to discriminate between different images and are robust to image size variation, are chosen and normalized as the signature of each block in an image frame, for matching between the reference video clip and the query clip. Experimental results show that the proposed video signature performs better than the ordinal signature in terms of ROC curves.
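
The signature itself is straightforward to sketch, for instance with fixed 32×32 blocks and k = 4 (both values are assumptions; the paper chooses them experimentally):

```python
import numpy as np

def block_signature(frame, block=32, k=4):
    """Split the frame into non-overlapping blocks, take the k largest
    singular values of each block, and L2-normalize them per block; the
    concatenation serves as the frame signature for sequence matching."""
    h, w = frame.shape
    sig = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            sv = np.linalg.svd(frame[y:y + block, x:x + block].astype(float),
                               compute_uv=False)  # singular values, descending
            top = sv[:k]
            sig.append(top / (np.linalg.norm(top) + 1e-12))
    return np.concatenate(sig)

# matching: e.g., Euclidean distance between reference and query frame signatures
```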

Low-Light Invariant Video Enhancement Scheme Using Zero Reference Deep Curve Estimation (Zero Deep Curve 추정방식을 이용한 저조도에 강인한 비디오 개선 방법)

  • Choi, Hyeong-Seok;Yang, Yoon Gi
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.8
    • /
    • pp.991-998
    • /
    • 2022
  • Recently, object recognition using image/video signals has been spreading rapidly in autonomous driving and mobile phones. However, the actual input image/video signals are easily exposed to poor illumination environments. Recent research on illumination enhancement estimates and compensates for illumination parameters. In this study, we propose VE-DCE (video enhancement zero-reference deep curve estimation) to improve the illumination of low-light images. The proposed VE-DCE uses an unsupervised, zero-reference deep curve, one of the latest learning-based estimation techniques. Experimental results show that the proposed method improves the quality of low-light video as well as still images compared to the previous method, while reducing the computational complexity with respect to the existing method.
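
The zero-reference deep curve referenced here follows the Zero-DCE formulation, in which a low-light pixel x is brightened by iteratively applying a per-pixel quadratic curve LE(x) = x + α·x·(1 − x). A minimal sketch, assuming the parameter maps α have already been predicted (the constant map below is a placeholder for the CNN output):

```python
import numpy as np

def enhance(x, alphas):
    """Iteratively apply the quadratic curve LE(x) = x + a * x * (1 - x);
    each `a` is a per-pixel parameter map in [-1, 1] (predicted by a small
    CNN in Zero-DCE, replaced by a constant map in this toy example)."""
    for a in alphas:
        x = x + a * x * (1.0 - x)
    return np.clip(x, 0.0, 1.0)

frame = np.random.rand(64, 64, 3)         # stand-in low-light frame in [0, 1]
alphas = [np.full(frame.shape, 0.6)] * 8  # 8 curve iterations, as in Zero-DCE
enhanced = enhance(frame, alphas)
```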

Efficient Image Size Selection for MPEG Video-based Point Cloud Compression

  • Jia, Qiong;Lee, M.K.;Dong, Tianyu;Kim, Kyu Tae;Jang, Euee S.
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.825-828
    • /
    • 2022
  • In this paper, we propose an efficient image size selection method for video-based point cloud compression. The current MPEG video-based point cloud compression reference encoding process sets a threshold on the size of the images generated while converting point cloud data into images. Because the converted images are compressed and restored by a legacy video codec, image size is one of the main factors influencing compression efficiency. If the image size can be made smaller than the size determined by the threshold, compression efficiency can be improved. Here, we studied how to improve compression efficiency by selecting the best-fit image size generated during video-based point cloud compression. Experimental results show that the proposed method reduces encoding time by 6 percent without loss of coding performance compared to test model version 15.0 of the video-based point cloud encoder.
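
The packing step that links image size to compression efficiency can be illustrated with a toy shelf-packing routine; the patch sizes and candidate widths below are invented, and the real V-PCC patch packer is considerably more elaborate:

```python
def min_canvas_height(patches, width):
    """Greedy shelf packing: place each (w, h) patch left-to-right on shelves
    and return the resulting canvas height (a simple stand-in for the V-PCC
    patch packing step that determines the converted image size)."""
    x = shelf_y = shelf_h = 0
    for w, h in sorted(patches, key=lambda p: -p[1]):  # tallest first
        if x + w > width:                              # start a new shelf
            shelf_y += shelf_h
            x, shelf_h = 0, 0
        x += w
        shelf_h = max(shelf_h, h)
    return shelf_y + shelf_h

# pick the smallest candidate size whose packed height still fits
patches = [(64, 128), (32, 64), (128, 96), (48, 48)]   # hypothetical patch boxes
for width in (128, 192, 256):
    print(width, min_canvas_height(patches, width))
```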

View Synthesis Error Removal for Comfortable 3D Video Systems (편안한 3차원 비디오 시스템을 위한 영상 합성 오류 제거)

  • Lee, Cheon;Ho, Yo-Sung
    • Smart Media Journal
    • /
    • v.1 no.3
    • /
    • pp.36-42
    • /
    • 2012
  • Recently, smart applications such as smart phones and smart TVs have become a hot issue in IT consumer markets. In particular, smart TVs provide 3D video services, so efficient coding methods for 3D video data are required. Three-dimensional (3D) video uses stereoscopic or multi-view images to provide a depth experience through 3D display systems. Binocular cues are perceived by rendering proper viewpoint images obtained at slightly different view angles. Since the number of viewpoints of a multi-view video is limited, 3D display devices must generate arbitrary viewpoint images using the available adjacent-view images. In this paper, after briefly explaining a view synthesis method, we propose a new algorithm to compensate for view synthesis errors around object boundaries. We describe a 3D warping technique that exploits the depth map for viewpoint shifting, and a hole-filling method that uses multi-view images. We then propose an algorithm to remove the boundary noise generated by mismatches between object edges in the color and depth images. The proposed method reduces annoying boundary noise near object edges by replacing erroneous textures with alternative textures from the other reference image. Using the proposed method, we can generate perceptually improved images for 3D video systems.
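
A heavily simplified sketch of the 3D-warping step for rectified views, where disparity is inversely proportional to depth; occlusion ordering and the paper's boundary-noise replacement are omitted:

```python
import numpy as np

def warp_view(color, depth, bf):
    """Shift each pixel horizontally by disparity = bf / depth (rectified
    cameras, bf = baseline * focal length). Unfilled pixels stay at -1 and
    are later filled from the other reference view. `color` is float RGB."""
    h, w, _ = color.shape
    out = np.full_like(color, -1.0, dtype=float)
    disp = (bf / np.maximum(depth, 1e-6)).round().astype(int)
    for y in range(h):
        for x in range(w):
            nx = x - disp[y, x]
            if 0 <= nx < w:
                out[y, nx] = color[y, x]  # later pixels overwrite; a real
    return out                            # warper resolves occlusion by depth
```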

Fast Extraction of Objects of Interest from Images with Low Depth of Field

  • Kim, Chang-Ick;Park, Jung-Woo;Lee, Jae-Ho;Hwang, Jenq-Neng
    • ETRI Journal
    • /
    • v.29 no.3
    • /
    • pp.353-362
    • /
    • 2007
  • In this paper, we propose a novel unsupervised video object extraction algorithm for individual images or image sequences with low depth of field (DOF). Low DOF is a popular photographic technique that conveys the photographer's intention by keeping only an object of interest (OOI) in clear focus. We first describe a fast and efficient scheme for extracting OOIs from individual low-DOF images and then extend it to image sequences with low DOF. The basic algorithm unfolds into three modules. In the first module, a higher-order statistics (HOS) map, which represents the spatial distribution of the high-frequency components, is obtained from the input low-DOF image. The second module locates the block-based OOI for further processing. Using the block-based OOI, the final OOI is obtained with pixel-level accuracy. We also present an algorithm that extends the extraction scheme to image sequences with low DOF. The proposed system requires no user assistance to determine the initial OOI, which is possible because low-DOF images are used. The experimental results indicate that the proposed algorithm can serve as an effective tool for applications such as 2D-to-3D conversion and photo-realistic video scene generation.
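
The first module can be approximated as below, using a Laplacian as the high-frequency filter and a fourth-order local moment as the statistic; the paper's exact HOS definition and window size may differ:

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def hos_map(gray):
    """Fourth-order central moment of the high-frequency component within a
    local window; focused (in-focus) regions of a low-DOF image score high.
    A sketch of the paper's HOS map, not its exact definition."""
    hf = laplace(gray.astype(float))      # high-frequency component
    mean = uniform_filter(hf, size=5)     # local mean over a 5x5 window
    return uniform_filter((hf - mean) ** 4, size=5)

# thresholding this map block-wise yields the block-based OOI of module two
```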

A Study on Super Resolution Image Reconstruction for Effective Spatial Identification

  • Park, Jae-Min;Jung, Jae-Seung;Kim, Byung-Guk
    • Spatial Information Research
    • /
    • v.13 no.4 s.35
    • /
    • pp.345-354
    • /
    • 2005
  • Super-resolution image reconstruction refers to image processing algorithms that produce a high-resolution (HR) image from several observed low-resolution (LR) images of the same scene. The method has proven useful in many practical cases where multiple frames of the same scene can be obtained, such as satellite imaging, video surveillance, video enhancement and restoration, digital mosaicking, and medical imaging. In this paper, we apply spatial-domain super-resolution reconstruction to video sequences. The test images are adjacently sampled images from continuous video sequences and overlap at a high rate. We construct an observation model between the HR and LR images and apply the maximum a posteriori (MAP) reconstruction method, one of the major approaches to super-resolution reconstruction. Based on the MAP method, we reconstruct high-resolution images from low-resolution images and compare the results with those of other well-known interpolation methods.
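
A compact sketch of MAP reconstruction under strong simplifying assumptions (block-averaging downsampling, no motion or blur model, Tikhonov-style smoothness prior); the real observation model registers the overlapping video frames first:

```python
import numpy as np

def map_sr(lr_frames, s, iters=100, lam=0.05, step=0.5):
    """Gradient-descent MAP super-resolution: minimize
    sum_k ||D(x) - y_k||^2 + lam * ||grad x||^2, with D = s x s block averaging."""
    down = lambda x: x.reshape(x.shape[0] // s, s, x.shape[1] // s, s).mean(axis=(1, 3))
    up = lambda e: np.kron(e, np.ones((s, s))) / s**2   # adjoint of `down`
    lap = lambda x: (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
                     np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4 * x)
    x = np.kron(lr_frames[0], np.ones((s, s)))          # initial HR guess
    for _ in range(iters):
        data_grad = sum(up(down(x) - y) for y in lr_frames)  # data-fidelity term
        x = x - step * (data_grad - lam * lap(x))            # prior pulls toward smoothness
    return x
```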

Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

  • Zhou, Xuan
    • Journal of Information Processing Systems
    • /
    • v.17 no.2
    • /
    • pp.337-351
    • /
    • 2021
  • Automatically recognizing facial expressions in video sequences is a challenging task because there is little direct correlation between facial features and subjective emotions in video. To overcome this problem, a video facial expression recognition method using a spatiotemporal recurrent neural network and feature fusion is proposed. First, the video is preprocessed. Then, a double-layer cascade structure is used to detect the face in each video image, and two deep convolutional neural networks are used to extract the temporal-domain and spatial-domain facial features in the video: a spatial convolutional neural network extracts the spatial information features from each frame of the static expression images, while a temporal convolutional neural network extracts the dynamic information features from the optical flow across multiple frames of expression images. A multiplicative fusion is performed on the spatiotemporal features learned by the two deep convolutional neural networks. Finally, the fused features are input to a support vector machine to perform the facial expression classification task. Experimental results on the eNTERFACE, RML, and AFEW6.0 datasets show that the recognition rates obtained by the proposed method reach 88.67%, 70.32%, and 63.84%, respectively. Comparative experiments show that the proposed method achieves higher recognition accuracy than other recently reported methods.
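
The temporal stream here consumes optical flow, which could be obtained as below; the paper does not name a flow algorithm, so dense Farneback flow is used purely as a stand-in:

```python
import cv2
import numpy as np

def flow_stack(gray_frames):
    """Dense Farneback optical flow between consecutive face frames, stacked
    as the input of the temporal-stream network. `gray_frames` is a list of
    uint8 grayscale face crops of identical size."""
    flows = []
    for prev, nxt in zip(gray_frames, gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)                 # (H, W, 2): x and y displacement
    return np.stack(flows)
```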