• Title/Summary/Keyword: Video Synthesis


Style Synthesis of Speech Videos Through Generative Adversarial Neural Networks (적대적 생성 신경망을 통한 얼굴 비디오 스타일 합성 연구)

  • Choi, Hee Jo;Park, Goo Man
  • KIPS Transactions on Software and Data Engineering / v.11 no.11 / pp.465-472 / 2022
  • In this paper, a style synthesis network based on StyleGAN and a video synthesis network are trained together to generate style-synthesized video. To address the problem that gaze and expression do not transfer stably, 3D face reconstruction is applied so that key attributes of the head, such as pose, gaze, and expression, can be controlled using 3D face information. In addition, by training the Head2Head discriminators for dynamics, mouth shape, image, and gaze, the method produces a stable style-synthesized video with greater plausibility and consistency. Using the FaceForensics and MetFaces datasets, we confirmed improved performance in converting one video into another while maintaining consistent motion of the target face and generating natural results through video synthesis using 3D face information from the source video's face.
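
  As a rough illustration of the multi-discriminator setup this abstract describes, the sketch below (PyTorch) sums a non-saturating adversarial loss over image, dynamics, mouth, and gaze discriminators. The module dictionary, the precomputed mouth/gaze crops, and the frame-stacking choices are illustrative assumptions, not the authors' implementation.

      import torch
      import torch.nn.functional as F

      def generator_adversarial_loss(discriminators, fake_clip, crops):
          # discriminators: dict name -> nn.Module producing real/fake logits
          # fake_clip: (B, T, C, H, W) generated frames
          # crops: dict with precomputed "mouth" and "gaze" crops of the clip
          b, t, c, h, w = fake_clip.shape
          inputs = {
              "image":    fake_clip.reshape(b * t, c, h, w),  # per-frame realism
              "dynamics": fake_clip.reshape(b, t * c, h, w),  # stacked frames -> motion
              "mouth":    crops["mouth"],
              "gaze":     crops["gaze"],
          }
          loss = fake_clip.new_zeros(())
          for name, disc in discriminators.items():
              logits = disc(inputs[name])
              # the generator wants every discriminator to label its output as real
              loss = loss + F.binary_cross_entropy_with_logits(
                  logits, torch.ones_like(logits))
          return loss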

3D View Synthesis with Feature-Based Warping

  • Hu, Ningning;Zhao, Yao;Bai, Huihui
  • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.11 / pp.5506-5521 / 2017
  • Three-dimensional video (3DV), as the new generation of video format standard, can provide viewers with a vivid sense of the scene and a realistic stereo impression, and view synthesis has become an important issue for 3DV applications. Unlike conventional depth-based methods, the view synthesis algorithm proposed in this paper exploits the correlation among views and warps in the image domain only. There are two main contributions. One is the incorporation of Sobel edge points into feature extraction and matching, which yields a more stable homography, and hence a more visually comfortable synthesized view, than SIFT points alone. The other is a novel image blending method that produces a better synthesized image. Experimental results demonstrate that the proposed method improves synthesis quality both subjectively and objectively.
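
  The pipeline in this abstract maps onto standard OpenCV primitives. The sketch below is a minimal reconstruction under stated assumptions: the edge threshold, sampling step, ratio-test value, and the simple weighted blend are illustrative choices, not the authors' parameters.

      import cv2
      import numpy as np

      def edge_keypoints(gray, thresh=200.0, step=8):
          # Sobel edge magnitude; strong edge pixels become extra keypoints
          gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
          gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
          ys, xs = np.where(cv2.magnitude(gx, gy) > thresh)
          return [cv2.KeyPoint(float(x), float(y), 8)
                  for x, y in zip(xs[::step], ys[::step])]

      def edge_augmented_homography(gray_src, gray_dst):
          sift = cv2.SIFT_create()
          kp_s = list(sift.detect(gray_src, None)) + edge_keypoints(gray_src)
          kp_d = list(sift.detect(gray_dst, None)) + edge_keypoints(gray_dst)
          kp_s, des_s = sift.compute(gray_src, kp_s)   # describe the merged sets
          kp_d, des_d = sift.compute(gray_dst, kp_d)
          matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_s, des_d, k=2)
          good = [m for m, n in matches if m.distance < 0.7 * n.distance]  # ratio test
          pts_s = np.float32([kp_s[m.queryIdx].pt for m in good])
          pts_d = np.float32([kp_d[m.trainIdx].pt for m in good])
          H, _ = cv2.findHomography(pts_s, pts_d, cv2.RANSAC, 3.0)
          return H

      def synthesize_view(src, dst, alpha=0.5):
          gray = lambda im: cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
          H = edge_augmented_homography(gray(src), gray(dst))
          warped = cv2.warpPerspective(src, H, (dst.shape[1], dst.shape[0]))
          return cv2.addWeighted(warped, alpha, dst, 1 - alpha, 0)  # naive blend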

Flowing Water Editing and Synthesis Based on a Dynamic Texture Model

  • Zhang, Qian;Lee, Ki-Jung;WhangBo, Taeg-Keun
  • Journal of Korea Multimedia Society / v.11 no.6 / pp.729-736 / 2008
  • Video synthesis that depicts flowing water is useful in virtual reality, computer games, digital movies, and scientific computing. This paper presents a novel algorithm for synthesizing dynamic water scenes from a sample video based on a dynamic texture model. We treat the video sample as a 2-D texture image and analyze it automatically under the dynamic texture model to obtain textons, then use a linear dynamic system (LDS) to describe the characteristics of each texton. Using these textons, we synthesize a new video of dynamic flowing water that is temporally extended and visually sharp. Tests on several video samples demonstrate the effectiveness and efficiency of our method compared with classical approaches.
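
  For readers unfamiliar with LDS-based dynamic textures, the sketch below shows the standard SVD-based fit and synthesis loop. The state dimension and noise handling are illustrative assumptions, and the paper applies the model per texton rather than to whole frames as done here.

      import numpy as np

      def fit_lds(frames, n_states=20):
          # frames: (T, H, W) grayscale sample
          # model: x[t+1] = A x[t] + B v[t], y[t] = C x[t]
          T = frames.shape[0]
          Y = frames.reshape(T, -1).T.astype(np.float64)   # (pixels, T) data matrix
          U, s, Vt = np.linalg.svd(Y, full_matrices=False)
          C = U[:, :n_states]                              # observation matrix
          X = np.diag(s[:n_states]) @ Vt[:n_states]        # hidden states, (n, T)
          A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])         # least-squares transition
          resid = X[:, 1:] - A @ X[:, :-1]
          Uv, sv, _ = np.linalg.svd(resid / np.sqrt(T - 1), full_matrices=False)
          B = Uv @ np.diag(sv)                             # driving-noise matrix
          return C, A, B, X[:, 0]

      def synthesize(C, A, B, x0, shape, n_frames=100, rng=None):
          rng = np.random.default_rng(rng)
          x, frames = x0.copy(), []
          for _ in range(n_frames):
              frames.append((C @ x).reshape(shape))        # observe current state
              x = A @ x + B @ rng.standard_normal(B.shape[1])  # iterate the LDS
          return np.stack(frames)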


Virtual Contamination Lane Image and Video Generation Method for the Performance Evaluation of the Lane Departure Warning System (차선 이탈 경고 시스템의 성능 검증을 위한 가상의 오염 차선 이미지 및 비디오 생성 방법)

  • Kwak, Jae-Ho;Kim, Whoi-Yul
  • Transactions of the Korean Society of Automotive Engineers / v.24 no.6 / pp.627-634 / 2016
  • In this paper, an augmented video generation method for evaluating the performance of lane departure warning systems is proposed. The input to our system is a video of a road scene with ordinary clean lanes, and the output video has the same content except that contamination is synthesized onto the lanes. Two approaches are used to synthesize the contaminated lane image: example-based image synthesis, which models the situation where contamination is deposited on top of the lane, and background-based image synthesis, which models lanes that have been worn away by aging. A new contamination pattern generation method using Gaussian functions is also proposed to produce contamination of various shapes and sizes. The contaminated lane video is generated by shifting the synthesized image by an empirically measured lane movement. Our experiments show that the similarity between the generated contaminated lane image and a real one is over 90 %. Furthermore, the reliability of the generated video is verified through an analysis of the change in lane recognition rate: the recognition rate on videos generated by the proposed method is very similar to that on real contaminated lane videos.
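
  A minimal sketch of the Gaussian pattern idea: a sum of random anisotropic 2D Gaussians forms an alpha mask that composites a dirt color over a clean lane patch. Blob counts, sigma ranges, and the dirt color are illustrative assumptions, not the paper's calibrated values.

      import numpy as np

      def gaussian_contamination_mask(h, w, n_blobs=5, rng=None):
          # sum of random anisotropic 2D Gaussians -> alpha mask in [0, 1]
          rng = np.random.default_rng(rng)
          ys, xs = np.mgrid[0:h, 0:w]
          mask = np.zeros((h, w), dtype=np.float32)
          for _ in range(n_blobs):
              cy, cx = rng.uniform(0, h), rng.uniform(0, w)
              sy, sx = rng.uniform(h / 8, h / 3), rng.uniform(w / 8, w / 3)
              mask += np.exp(-0.5 * (((ys - cy) / sy) ** 2 + ((xs - cx) / sx) ** 2))
          return np.clip(mask, 0.0, 1.0)

      def contaminate_lane(lane_patch, dirt_color=(90, 80, 70), rng=None):
          # example-based synthesis: alpha-composite dirt over a clean lane patch
          h, w = lane_patch.shape[:2]
          alpha = gaussian_contamination_mask(h, w, rng=rng)[..., None]
          dirt = np.empty_like(lane_patch)
          dirt[...] = dirt_color
          return (alpha * dirt + (1.0 - alpha) * lane_patch).astype(np.uint8)

  Per the abstract, each subsequent video frame would reuse the synthesized patch shifted by the empirically measured lane movement.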

Study on the Video Stabilizer based on a Triplet CNN and Training Dataset Synthesis (Triplet CNN과 학습 데이터 합성 기반 비디오 안정화기 연구)

  • Yang, Byongho;Lee, Myeong-jin
  • Journal of Broadcast Engineering / v.25 no.3 / pp.428-438 / 2020
  • Jitter in digital video lowers visibility and degrades the efficiency of image processing and compression. In this paper, we propose a video stabilizer architecture based on a triplet CNN together with a method for synthesizing training datasets through video synthesis. Compared with a conventional deep-learning video stabilization method, the proposed stabilizer reduces wobbling distortion.
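
  A hedged sketch of the training-data synthesis idea: take a stable clip, apply a random smooth jitter trajectory per frame, and keep the (jittered, stable) pair as supervision for the stabilizer. The random-walk jitter model and its magnitudes are illustrative assumptions, not the paper's procedure.

      import cv2
      import numpy as np

      def synthesize_jittered_pair(stable_frames, max_shift=2.0, max_rot=0.2, rng=None):
          # random-walk camera shake in translation and rotation
          rng = np.random.default_rng(rng)
          h, w = stable_frames[0].shape[:2]
          tx = ty = angle = 0.0
          jittered = []
          for frame in stable_frames:
              tx += rng.uniform(-max_shift, max_shift)
              ty += rng.uniform(-max_shift, max_shift)
              angle += rng.uniform(-max_rot, max_rot)
              M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
              M[:, 2] += (tx, ty)                  # add the accumulated shift
              jittered.append(cv2.warpAffine(frame, M, (w, h)))
          return jittered, stable_frames           # (network input, ground truth)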

5D Light Field Synthesis from a Monocular Video (단안 비디오로부터의 5차원 라이트필드 비디오 합성)

  • Bae, Kyuho;Ivan, Andre;Park, In Kyu
  • Journal of Broadcast Engineering / v.24 no.5 / pp.755-764 / 2019
  • Commercially available light field cameras make it difficult to acquire 5D light field video, since they either capture only still images or are prohibitively expensive. To address this, we propose a deep learning based method for synthesizing light field video from monocular video. To overcome the difficulty of obtaining light field video training data, we use UnrealCV to acquire synthetic light field data through realistic rendering of 3D graphics scenes and use it for training. The proposed deep learning framework synthesizes light field video with a 9×9 array of sub-aperture images (SAIs) from the input monocular video. The network consists of one subnetwork that predicts the appearance flow from the input image converted to luminance, and another that predicts the optical flow between adjacent light field video frames obtained from the appearance flow.
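
  A minimal PyTorch sketch of appearance-flow warping for SAI synthesis: a network predicts a per-view sampling offset field over the input frame, and grid_sample resamples the frame into that view. The flow network is stubbed out and its assumed output layout (2 channels per view) is an illustrative assumption.

      import torch
      import torch.nn.functional as F

      def warp_with_appearance_flow(frame, flow):
          # frame: (B, C, H, W); flow: (B, 2, H, W) sampling offsets in pixels
          b, _, h, w = frame.shape
          ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                  torch.linspace(-1, 1, w), indexing="ij")
          base = torch.stack((xs, ys), dim=-1).to(frame).expand(b, h, w, 2)
          offs = torch.stack((flow[:, 0] / (w / 2), flow[:, 1] / (h / 2)), dim=-1)
          return F.grid_sample(frame, base + offs, align_corners=True)

      def synthesize_light_field(frame, flow_net, n_views=81):
          # flow_net is assumed to emit (B, 2 * n_views, H, W) appearance flows
          flows = flow_net(frame)
          views = [warp_with_appearance_flow(frame, flows[:, 2 * i:2 * i + 2])
                   for i in range(n_views)]
          return torch.stack(views, dim=1)         # (B, 81, C, H, W) = 9x9 SAIs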

Multicontents Integrated Image Animation within Synthesis for High Quality Multimodal Video (고화질 멀티 모달 영상 합성을 통한 다중 콘텐츠 통합 애니메이션 방법)

  • Jae Seung Roh;Jinbeom Kang
  • Journal of Intelligence and Information Systems / v.29 no.4 / pp.257-269 / 2023
  • There is currently a burgeoning demand for image synthesis from photos and videos using deep learning models. Existing video synthesis models only extract motion information from the provided video to generate animation effects on photos, and they struggle to achieve accurate lip synchronization with the audio and to maintain the image quality of the synthesized output. To tackle these issues, this paper introduces a novel framework based on an image animation approach. Given a photo, a video, and audio as input, the framework produces an output that retains the unique characteristics of the individual in the photo, synchronizes their movements with the provided video, and achieves lip synchronization with the audio. Furthermore, a super-resolution model is employed to enhance the quality and resolution of the synthesized output.
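
  A sketch of the three-stage composition the abstract outlines. The stage functions are placeholders for the models the framework chains together; all names here are illustrative assumptions, not the authors' API.

      def animate(photo, driving_frames, audio, motion_model, lipsync_model, sr_model):
          # 1) transfer motion from each driving frame onto the source photo
          animated = [motion_model(photo, frame) for frame in driving_frames]
          # 2) re-render the mouth region so the lips follow the input audio
          lip_synced = lipsync_model(animated, audio)
          # 3) upscale every frame to restore resolution lost during synthesis
          return [sr_model(frame) for frame in lip_synced]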

View synthesis with sparse light field for 6DoF immersive video

  • Kwak, Sangwoon;Yun, Joungil;Jeong, Jun-Young;Kim, Youngwook;Ihm, Insung;Cheong, Won-Sik;Seo, Jeongil
  • ETRI Journal / v.44 no.1 / pp.24-37 / 2022
  • Virtual view synthesis, which generates novel views with the characteristics of actually acquired images, is an essential technical component for delivering immersive video with realistic binocular disparity and smooth motion parallax. It is typically performed in sequence by warping the given images to the designated viewing position, blending the warped images, and filling the remaining holes. For 6DoF use cases with large motion, patch-based warping is preferable to conventional pixel-based methods, and in that case the quality of the synthesized image depends heavily on how the warped images are blended. Based on this observation, we propose a novel blending architecture that exploits the similarity of ray directions and the distribution of depth values. Results show that the proposed method synthesizes better views than the well-engineered synthesizers used within the MPEG immersive video activity (MPEG-I). Moreover, we describe a GPU-based implementation that synthesizes and renders views in real time, demonstrating the method's applicability to immersive video services.
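
  A hedged sketch of the blending idea: per-pixel weights for each warped candidate view combine (a) how well the source ray direction agrees with the target ray and (b) how close the warped depth is to the nearest candidate depth. The exponential depth falloff and its temperature are illustrative assumptions, not the paper's exact weighting.

      import numpy as np

      def blend_views(warped_rgb, warped_depth, ray_cos, depth_tau=0.1):
          # warped_rgb: (N, H, W, 3); warped_depth: (N, H, W), NaN marks holes
          # ray_cos: (N, H, W) cosine between each source ray and the target ray
          z_near = np.nanmin(warped_depth, axis=0)               # closest surface
          depth_w = np.exp(-(warped_depth - z_near) / depth_tau)  # favor the front
          w = np.nan_to_num(np.clip(ray_cos, 0.0, 1.0) * depth_w)
          w /= np.maximum(w.sum(axis=0, keepdims=True), 1e-8)     # normalize
          return (w[..., None] * np.nan_to_num(warped_rgb)).sum(axis=0)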

View Synthesis Error Removal for Comfortable 3D Video Systems (편안한 3차원 비디오 시스템을 위한 영상 합성 오류 제거)

  • Lee, Cheon;Ho, Yo-Sung
  • Smart Media Journal / v.1 no.3 / pp.36-42 / 2012
  • Recently, smart applications such as smart phones and smart TVs have become a hot issue in IT consumer markets. In particular, smart TVs provide 3D video services, so efficient coding methods for 3D video data are required. Three-dimensional (3D) video uses stereoscopic or multi-view images to provide a depth experience through 3D display systems, and binocular cues are perceived by rendering appropriate viewpoint images obtained at slightly different view angles. Since the number of viewpoints in multi-view video is limited, 3D display devices must generate arbitrary viewpoint images from the available adjacent views. In this paper, after briefly explaining a view synthesis method, we propose a new algorithm that compensates for view synthesis errors around object boundaries. We describe a 3D warping technique that exploits the depth map for viewpoint shifting and a hole filling method that uses multi-view images, and we then propose an algorithm that removes the boundary noise caused by mismatches between object edges in the color and depth images. The proposed method reduces annoying boundary noise near object edges by replacing erroneous textures with alternative textures from the other reference image, generating perceptually improved images for 3D video systems.
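
  A minimal sketch of the boundary-replacement step: pixels in a band around depth discontinuities of one warped reference are substituted with texture from the other warped reference, since color/depth edge mismatches concentrate there. The edge threshold and band width are illustrative assumptions.

      import cv2
      import numpy as np

      def remove_boundary_noise(warp_a, warp_b, depth_a, edge_thresh=8.0, band_px=3):
          # depth discontinuities mark object boundaries where warping noise appears
          gx = cv2.Sobel(depth_a, cv2.CV_32F, 1, 0)
          gy = cv2.Sobel(depth_a, cv2.CV_32F, 0, 1)
          edges = (cv2.magnitude(gx, gy) > edge_thresh).astype(np.uint8)
          band = cv2.dilate(edges, np.ones((band_px, band_px), np.uint8)) > 0
          out = warp_a.copy()
          out[band] = warp_b[band]   # substitute texture from the other reference
          return out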


Robust HDR Video Synthesis Using Illumination Invariant Descriptor (밝기 변화에 강인한 특징 기술자를 이용한 고품질 HDR 동영상 합성)

  • Vo Van, Tu;Lee, Chul
  • Proceedings of the Korean Society of Broadcast Engineers Conference / 2017.06a / pp.83-84 / 2017
  • We propose a novel algorithm for synthesizing high dynamic range (HDR) video from alternately exposed low dynamic range (LDR) videos. We first estimate correspondences between input frames using an illumination invariant descriptor. Then, we synthesize each HDR frame with weights computed to maximize detail preservation in the output. Experimental results demonstrate that the proposed algorithm produces high-quality HDR videos without noticeable artifacts.
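
  A hedged sketch of the weighted merge step: once frames from alternating exposures are registered to the reference, each pixel contributes according to how well exposed it is. The hat weighting function is a common choice and an assumption here, not necessarily the authors' detail-preserving weights.

      import numpy as np

      def merge_aligned_ldr(aligned_frames, exposure_times):
          # aligned_frames: list of (H, W, 3) floats in [0, 1], registered to the
          # reference frame; exposure_times: matching list of exposure durations
          num = np.zeros_like(aligned_frames[0])
          den = np.zeros_like(aligned_frames[0])
          for img, t in zip(aligned_frames, exposure_times):
              w = 1.0 - (2.0 * img - 1.0) ** 2     # hat weight: trust mid-tones
              num += w * img / t                   # exposure-normalized radiance
              den += w
          return num / np.maximum(den, 1e-6)       # weighted-average HDR frame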
