• Title/Summary/Keyword: Patch image

ISFRNet: A Deep Three-stage Identity and Structure Feature Refinement Network for Facial Image Inpainting

  • Yan Wang;Jitae Shin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.881-895 / 2023
  • Modern image inpainting techniques based on deep learning have achieved remarkable performance, and increasing effort is being devoted to repairing larger and more complex missing areas, which remains challenging, especially for facial image inpainting. For a face image with a very large missing area there are few valid pixels available, yet people can imagine the complete face in their mind. It is important to simulate this capability while maintaining the identity features of the face as much as possible. To achieve this goal, we propose a three-stage network model, which we refer to as the identity and structure feature refinement network (ISFRNet). ISFRNet is based on 1) a pre-trained pSp-styleGAN model that generates an extremely realistic face image with rich structural features; 2) a shallow structured network with a small receptive field; and 3) a modified U-net with two encoders and a decoder, which has a large receptive field. We use peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), L1 loss and learned perceptual image patch similarity (LPIPS) to evaluate our model. When the missing region covers 20%-40% of the image, the four metric scores of our model are 28.12, 0.942, 0.015 and 0.090, respectively; when it covers 40%-60%, the scores are 23.31, 0.840, 0.053 and 0.177. Our inpainting network not only guarantees excellent recovery of face identity features but also exhibits state-of-the-art performance compared with other multi-stage refinement models.
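
  For reference, the metrics reported above are commonly computed as in the following minimal sketch (scikit-image 0.19+ and NumPy; LPIPS would additionally need the separate lpips package). The function name and the [0, 1] scaling are illustrative assumptions, not taken from the paper.

      # Sketch: computing PSNR, SSIM and L1 between a ground-truth and an inpainted image.
      import numpy as np
      from skimage.metrics import peak_signal_noise_ratio, structural_similarity

      def inpainting_metrics(ground_truth, restored):
          """Both inputs are HxWx3 float arrays scaled to [0, 1]."""
          psnr = peak_signal_noise_ratio(ground_truth, restored, data_range=1.0)
          ssim = structural_similarity(ground_truth, restored, channel_axis=-1, data_range=1.0)
          l1 = float(np.mean(np.abs(ground_truth - restored)))
          return {"PSNR": psnr, "SSIM": ssim, "L1": l1}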

Accelerating Self-Similarity-Based Image Super-Resolution Using OpenCL

  • Jun, Jae-Hee;Choi, Ji-Hoon;Lee, Dae-Yeol;Jeong, Seyoon;Cho, Suk-Hee;Kim, Hui-Yong;Kim, Jong-Ok
    • IEIE Transactions on Smart Processing and Computing / v.4 no.1 / pp.10-15 / 2015
  • This paper proposes a parallel implementation of a self-similarity-based image SR (super-resolution) algorithm using OpenCL. The SR algorithm requires tremendous computation to search for similar patches, which becomes a bottleneck for real-time conversion from an FHD image to UHD. It is therefore imperative to accelerate the processing speed of SR algorithms. For parallelization, the SR process is divided into several kernels, and memory optimization is performed. In addition, two GPUs are used for further acceleration. The experimental results show that the GPGPU implementation achieves a speedup of more than 140 times over a single-core CPU. Furthermore, it was confirmed experimentally that utilizing two GPUs reduces the execution time roughly proportionally, for a speedup of up to 277 times.
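
  The computational bottleneck named above is the exhaustive patch search. The sketch below shows that search in plain NumPy, purely to illustrate why it parallelizes well: every query patch can be matched independently, which is what maps naturally onto GPU work-items. The SSD similarity measure and the function name are assumptions, not details from the paper.

      # Exhaustive self-similarity search: find the patch in `image` closest to `query` (SSD).
      import numpy as np

      def best_matching_patch(image, query, stride=1):
          ph, pw = query.shape
          best_score, best_pos = np.inf, (0, 0)
          for y in range(0, image.shape[0] - ph + 1, stride):
              for x in range(0, image.shape[1] - pw + 1, stride):
                  candidate = image[y:y + ph, x:x + pw]
                  score = np.sum((candidate - query) ** 2)  # sum of squared differences
                  if score < best_score:
                      best_score, best_pos = score, (y, x)
          return best_pos, best_score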

Energy Minimization Based Semantic Video Object Extraction

  • Kim, Dong-Hyun;Choi, Sung-Hwan;Kim, Bong-Joe;Shin, Hyung-Chul;Sohn, Kwang-Hoon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.07a / pp.138-141 / 2010
  • In this paper, we propose a semi-automatic method for semantic video object extraction which extracts meaningful objects from an input sequence given one correctly segmented training image. Given one correctly segmented image acquired through the user's interaction in the first frame, the proposed method automatically segments and tracks the objects in the following frames. We formulate semantic object extraction as an energy minimization problem at the fragment level instead of the pixel level. The proposed energy function consists of two terms: a data term and a smoothness term. The data term is computed from patch similarity, color, and motion information, and the smoothness term is introduced to enforce spatial continuity. Finally, iterated conditional modes (ICM) optimization is used to minimize the energy function. The proposed semantic video object extraction method provides faithful results for various types of image sequences.
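
  To make the optimization step concrete, the following is a minimal, generic ICM loop over fragment labels: each fragment is visited in turn and assigned the label that minimizes its local data-plus-smoothness energy. The Potts-style smoothness term, the argument names, and the cost structures are illustrative assumptions, not the paper's exact formulation.

      # Generic iterated conditional modes (ICM) over fragment labels.
      def icm(labels, data_cost, neighbors, smoothness_weight=1.0, num_labels=2, max_iters=20):
          """labels: current label per fragment; data_cost[f][l]: data term of fragment f with label l;
          neighbors[f]: indices of fragments adjacent to f."""
          for _ in range(max_iters):
              changed = False
              for f in range(len(labels)):
                  best_label, best_energy = labels[f], float("inf")
                  for l in range(num_labels):
                      # data term plus a Potts smoothness penalty for disagreeing neighbours
                      energy = data_cost[f][l] + smoothness_weight * sum(
                          1 for n in neighbors[f] if labels[n] != l)
                      if energy < best_energy:
                          best_label, best_energy = l, energy
                  if best_label != labels[f]:
                      labels[f], changed = best_label, True
              if not changed:
                  break
          return labels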

A Study on the Optimization of color in Digital Printing (디지털 인쇄에 있어서 컬러의 최적화에 관한 연구)

  • Kim, Jae-Hae;Lee, Sung-Hyung;Cho, Ga-Ram;Koo, Chul-Whoi
    • Journal of the Korean Graphic Arts Communication Society / v.26 no.1 / pp.51-64 / 2008
  • In this paper, an experiment was performed in which the input devices (scanner, digital still camera) and the monitors (CRT, LCD) were characterized with linear multiple regression and the GOG (gain-offset-gamma) model, respectively, to perform color transformation. For the color conversion of the digital printer, a LUT (look-up table) with three-dimensional linear interpolation and tetrahedral interpolation was used. The results are as follows. For color reproduction on the monitor, the XYZ values obtained from the input device by linear multiple regression were multiplied by the inverse matrix and passed through the inverse GOG model, and most of the converted patches showed a color difference below 5 at the resulting monitor RGB values. For the printer, the XYZ values delivered from the input device were converted to LAB, and when the CMY values were computed from the converted LAB values using the LUT and tetrahedral interpolation, the color conversion that took the black quantity into account was more accurate.
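
  As background for the monitor characterization mentioned above, the GOG model linearizes each normalized RGB channel with a gain, an offset and a gamma, and then maps the linearized values to XYZ through a 3x3 matrix of the display primaries. The sketch below is a generic version of that forward model; the parameter names are placeholders, not the paper's fitted values.

      # Generic GOG (gain-offset-gamma) forward model: normalized RGB counts -> XYZ.
      import numpy as np

      def gog_forward(rgb, gain, offset, gamma, primaries_matrix):
          """rgb: digital counts normalized to [0, 1]; gain/offset/gamma: per-channel parameters;
          primaries_matrix: 3x3 matrix mapping linearized RGB to XYZ."""
          linear = np.clip(gain * np.asarray(rgb, dtype=float) + offset, 0.0, None) ** gamma
          return primaries_matrix @ linear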

Remote Sensing of Nearshore Currents using Coastal Optical Imagery (해안 광학영상 자료를 이용한 쇄파지역 연안류 측정기술)

  • Yoo, Jeseon;Kim, Sun-Sin
    • Ocean and Polar Research / v.37 no.1 / pp.11-22 / 2015
  • In-situ measurements are labor-intensive, time-consuming, and limited in their ability to observe currents with spatial variation in the surf zone. This paper proposes an optical image-based method for measuring currents in the surf zone. The method measures nearshore currents by tracking wave-breaking-induced foam patches over time in sequential images. Foam patches in images tend to show irregular pixel-intensity patterns that remain consistent for a short period of time. This irregular intensity feature of a foam patch is characterized and represented as a keypoint using an image-based object recognition method, the Scale Invariant Feature Transform (SIFT). The keypoints identified by the SIFT method are traced across time-sequential images to produce instantaneous velocity fields. To remove erroneous velocities, the instantaneous velocity fields are filtered by bounding them between upper and lower limits and by averaging the velocity data in time and space over a certain interval. The measurements obtained by this method are comparable to the results estimated by an existing image-based method of observing currents, the Optical Current Meter (OCM).
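
  A minimal sketch of the keypoint-tracking idea follows, using OpenCV's SIFT implementation (cv2.SIFT_create, available in opencv-python 4.4 and later): keypoints detected in one frame are matched to the next frame, and each accepted match yields a pixel velocity. The ratio-test threshold and the function name are illustrative assumptions; the paper's filtering and averaging steps are not reproduced here.

      # Match SIFT keypoints between two frames and convert displacements to velocities.
      import cv2
      import numpy as np

      def keypoint_velocities(frame_a, frame_b, dt_seconds):
          sift = cv2.SIFT_create()
          kp_a, des_a = sift.detectAndCompute(frame_a, None)
          kp_b, des_b = sift.detectAndCompute(frame_b, None)
          matches = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
          velocities = []
          for m, n in matches:
              if m.distance < 0.75 * n.distance:  # Lowe's ratio test to reject ambiguous matches
                  pa = np.array(kp_a[m.queryIdx].pt)
                  pb = np.array(kp_b[m.trainIdx].pt)
                  velocities.append((pb - pa) / dt_seconds)  # pixels per second
          return np.array(velocities)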

Hole-Filling Methods Using Depth and Color Information for Generating Multiview Images

  • Nam, Seung-Woo;Jang, Kyung-Ho;Ban, Yun-Ji;Kim, Hye-Sun;Chien, Sung-Il
    • ETRI Journal / v.38 no.5 / pp.996-1007 / 2016
  • This paper presents new hole-filling methods for generating multiview images using depth-image-based rendering (DIBR). Holes appear in a depth image captured from 3D sensors and in the multiview images rendered by DIBR. The holes are often found around the background regions of the images because the background is prone to occlusion by foreground objects. Background-oriented priority and gradient-oriented priority are introduced to determine the order of hole-filling after the DIBR process. In addition, to obtain a sample to fill the hole region, we propose fusing depth and color information into a weighted sum of two patches for the depth (or rendered depth) images and a new distance measure to find the best-matched patch for the rendered color images. The conventional method produces jagged edges and blurring in the final results, whereas the proposed method minimizes them, which is quite important for high fidelity in stereo imaging. The experimental results show that, by reducing these errors, the proposed methods significantly improve the hole-filling quality of the generated multiview images.
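
  The fusion of depth and color can be illustrated with a toy patch distance in which the two modalities contribute a weighted sum; the linear weighting, the SSD terms, and the parameter alpha below are assumptions, not the paper's actual distance measure.

      # Toy patch distance combining color and depth dissimilarity.
      import numpy as np

      def fused_patch_distance(color_patch, depth_patch, color_cand, depth_cand, alpha=0.7):
          color_term = np.mean((color_patch - color_cand) ** 2)
          depth_term = np.mean((depth_patch - depth_cand) ** 2)
          return alpha * color_term + (1.0 - alpha) * depth_term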

Hole Filling Algorithm for a Virtual-viewpoint Image by Using a Modified Exemplar Based In-painting

  • Ko, Min Soo;Yoo, Jisang
    • Journal of Electrical Engineering and Technology / v.11 no.4 / pp.1003-1011 / 2016
  • In this paper, a new algorithm that uses a 3D warping technique to effectively fill the holes produced when creating a virtual-viewpoint image is proposed. A hole is defined as a region that cannot be seen in the reference view when a virtual view is created. In the proposed algorithm, an exemplar-based inpainting algorithm is used to reduce the blurring that conventional algorithms cause in the filled hole regions and to enhance the texture quality of the generated virtual view. The boundary noise that occurs in the initial virtual view obtained by 3D warping is also removed. After 3D warping, we estimate the location of the background relative to the holes and fill the pixels adjacent to the background first, so that the result does not rely only on the adjacent object's information. In addition, the temporal inconsistency between frames is reduced by expanding the search region to the previous frame when searching for the most similar patch. The superiority of the proposed algorithm over existing algorithms is shown through the experimental results.
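
  The temporally extended patch search can be sketched as follows: candidate patches are collected from both the current and the previous frame, and the one with the lowest cost over the known (non-hole) pixels is chosen. The SSD cost and the grayscale, equally sized patches are simplifying assumptions for illustration.

      # Pick the best source patch among candidates gathered from the current and previous frames.
      import numpy as np

      def best_patch(target, known_mask, candidates):
          """target: patch containing holes; known_mask: True where pixels are valid;
          candidates: list of same-sized candidate patches."""
          best, best_cost = None, np.inf
          for cand in candidates:
              cost = np.sum(((target - cand) ** 2)[known_mask])
              if cost < best_cost:
                  best, best_cost = cand, cost
          return best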

Object Contour Tracking using Snake in Stereo Image Sequences (스테레오 영상 시퀀스에서 스네이크를 이용한 객체 윤곽 추적 알고리즘)

  • Shin-Hyoung Kim;Jong-Whan Jang
    • The Journal of Engineering Research / v.6 no.2 / pp.109-117 / 2004
  • In this paper, we propose an object contour tracking algorithm using snakes in stereo image sequences. The proposed technique consists of two steps. In the first step, the candidate snake points are determined from the motion information in 3-D disparity space. In the second step, the snake energy function is evaluated to check whether the candidate snake points converge to the edges of the objects of interest. The snake energy is calculated from the candidate snake points using the disparity information obtained by patch matching. The performance of the proposed technique is evaluated by applying it to various sample images. The results show that the proposed technique can track the edges of objects of interest in stereo image sequences even with complicated backgrounds or additive components.
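
  A toy version of a snake energy evaluated on a closed contour is sketched below: internal continuity and curvature terms plus an external term that rewards points lying on strong disparity gradients (object edges). The specific terms, weights, and the use of the disparity gradient are illustrative assumptions rather than the paper's formulation.

      # Toy snake energy: internal smoothness terms + an external disparity-gradient term.
      import numpy as np

      def snake_energy(points, disparity, alpha=1.0, beta=1.0, gamma=1.0):
          """points: Nx2 array of (x, y) contour coordinates inside the disparity map."""
          points = np.asarray(points, dtype=float)
          prev_pts = np.roll(points, 1, axis=0)
          next_pts = np.roll(points, -1, axis=0)
          continuity = np.sum(np.linalg.norm(points - prev_pts, axis=1) ** 2)
          curvature = np.sum(np.linalg.norm(prev_pts - 2 * points + next_pts, axis=1) ** 2)
          gy, gx = np.gradient(disparity.astype(float))
          grad_mag = np.hypot(gx, gy)
          idx = np.round(points).astype(int)
          external = -np.sum(grad_mag[idx[:, 1], idx[:, 0]])  # lower energy on strong disparity edges
          return alpha * continuity + beta * curvature + gamma * external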

View synthesis with sparse light field for 6DoF immersive video

  • Kwak, Sangwoon;Yun, Joungil;Jeong, Jun-Young;Kim, Youngwook;Ihm, Insung;Cheong, Won-Sik;Seo, Jeongil
    • ETRI Journal / v.44 no.1 / pp.24-37 / 2022
  • Virtual view synthesis, which generates novel views that share the characteristics of actually acquired images, is an essential technical component for delivering immersive video with realistic binocular disparity and smooth motion parallax. It is typically achieved by warping the given images to the designated viewing position, blending the warped images, and filling the remaining holes. For 6DoF use cases with large motion, patch-wise warping is preferable to conventional methods that operate per pixel. In that case, the quality of the synthesized image depends heavily on how the warped images are blended. Based on this observation, we propose a novel blending architecture that exploits the similarity of ray directions and the distribution of depth values. Results showed that the proposed method synthesizes better views than the well-designed synthesizers used within the Moving Picture Experts Group immersive video (MPEG-I) activity. Moreover, we describe a GPU-based implementation that synthesizes and renders views in real time, considering the applicability to immersive video services.
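
  As a rough illustration of the blending idea, the sketch below assigns each warped source sample a weight that grows when its ray direction agrees with the target ray and when its depth is close to the nearest depth observed at that pixel. The exponential forms and the parameters sigma_angle and sigma_depth are assumptions; the paper's actual weighting is not reproduced here.

      # Toy per-sample blending weight from ray-direction agreement and relative depth.
      import numpy as np

      def blend_weight(ray_src, ray_tgt, depth, depth_min, sigma_angle=0.1, sigma_depth=0.5):
          cos_sim = np.dot(ray_src, ray_tgt) / (np.linalg.norm(ray_src) * np.linalg.norm(ray_tgt))
          angle_term = np.exp((cos_sim - 1.0) / sigma_angle)       # 1.0 when the rays coincide
          depth_term = np.exp(-(depth - depth_min) / sigma_depth)  # favour samples near the closest depth
          return angle_term * depth_term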

Chest CT Image Patch-Based CNN Classification and Visualization for Predicting Recurrence of Non-Small Cell Lung Cancer Patients (비소세포폐암 환자의 재발 예측을 위한 흉부 CT 영상 패치 기반 CNN 분류 및 시각화)

  • Ma, Serie;Ahn, Gahee;Hong, Helen
    • Journal of the Korea Computer Graphics Society / v.28 no.1 / pp.1-9 / 2022
  • Non-small cell lung cancer (NSCLC) accounts for a high proportion, 85%, of all lung cancers and has a significantly higher mortality rate (22.7%) than other cancers. It is therefore very important to predict the postoperative prognosis of patients with non-small cell lung cancer. In this study, preoperative chest CT image patches with the tumor as the region of interest are diversified into five types according to tumor-related information, and the performance of a single classifier model, an ensemble classifier with soft voting, and an ensemble classifier that combines three different patches through three input channels, all built on pre-trained ResNet and EfficientNet CNNs, is analyzed through misclassification cases and Grad-CAM visualization. In the experiments, the ResNet152 single model and the EfficientNet-b7 single model trained on the peritumoral patch showed accuracies of 87.93% and 81.03%, respectively. The ResNet152 ensemble model that places the image, peritumoral, and shape-focused intratumoral patches in separate input channels showed stable performance with an accuracy of 87.93%, and the EfficientNet-b7 soft-voting ensemble classifier using the image and peritumoral patches showed an accuracy of 84.48%.
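
  Soft voting, as used above, averages the class-probability outputs of several classifiers and takes the argmax. The PyTorch sketch below shows that step in isolation; the models and patch batches are placeholders, and the paper's training procedure and patch definitions are not reproduced.

      # Soft-voting ensemble: average per-model softmax probabilities, then take the argmax class.
      import torch

      def soft_vote(models, patch_batches):
          """models[i] is evaluated on patch_batches[i]; all batches share the same sample order."""
          probs = []
          with torch.no_grad():
              for model, batch in zip(models, patch_batches):
                  model.eval()
                  probs.append(torch.softmax(model(batch), dim=1))
          return torch.stack(probs).mean(dim=0).argmax(dim=1)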