• Title/Summary/Keyword: Multi-images

Multi-resolution Fusion Network for Human Pose Estimation in Low-resolution Images

  • Kim, Boeun;Choo, YeonSeung;Jeong, Hea In;Kim, Chung-Il;Shin, Saim;Kim, Jungho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2328-2344 / 2022
  • 2D human pose estimation still faces difficulty in low-resolution images. Most existing top-down approaches scale up the bounding box image of the target human to a large size and feed the scaled image into the network. Up-sampling introduces artifacts into the low-resolution target images, and the degraded images impair accurate estimation of the joint positions. To address this issue, we propose a multi-resolution input feature fusion network for human pose estimation. Specifically, the bounding box image of the target human is rescaled to multiple input images of various sizes, and the features extracted from the multiple images are fused in the network. Moreover, we introduce a guiding channel which induces the multi-resolution input features to alternatively affect the network according to the resolution of the target image. We conduct experiments on the MS COCO dataset, a representative dataset for 2D human pose estimation, where our method achieves superior performance compared to the strong baseline HRNet and the previous state-of-the-art methods.
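
The rescaling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-neighbor interpolation, the function names `resize_nearest` and `multi_resolution_inputs`, and the three target sizes are all assumptions chosen for illustration, and the feature fusion and guiding channel are not modeled.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize of an (H, W, C) array (assumed interpolation)."""
    in_h, in_w = image.shape[:2]
    # Map every output row/column back to a source row/column index.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]

def multi_resolution_inputs(crop, sizes=((64, 48), (128, 96), (256, 192))):
    """Rescale one bounding-box crop to several input sizes (hypothetical sizes)."""
    return [resize_nearest(crop, h, w) for h, w in sizes]

# Usage: a synthetic 32x24 RGB crop rescaled to three input resolutions.
crop = np.arange(32 * 24 * 3).reshape(32, 24, 3)
inputs = multi_resolution_inputs(crop)
```

Each array in `inputs` would then be passed to its own branch of a network; how the branches are fused is specific to the paper and is left out here.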

A Method for Surface Reconstruction and Synthesizing Intermediate Images for Multi-viewpoint 3-D Displays

  • Fujii, Mahito;Ito, Takayuki;Miyake, Sei
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1996.06b / pp.35-40 / 1996
  • In this paper, a method for 3-D surface reconstruction with two real cameras is presented. The method, which combines the extraction of binocular disparity and its interpolation, can be applied to the synthesis of images from virtual viewpoints. The synthesized virtual images are as natural as the real images even when observed as stereoscopic images. The method opens up many applications, such as synthesizing input images for multi-viewpoint 3-D displays and enhancing the depth impression in 2-D images. We have also developed a video-rate stereo machine able to obtain binocular disparity in 1/30 sec with two cameras, and we show the performance of the machine.

Synthesis of Multi-View Images Based on a Convergence Camera Model

  • Choi, Hyun-Jun
    • Journal of Information and Communication Convergence Engineering / v.9 no.2 / pp.197-200 / 2011
  • In this paper, we propose a multi-view stereoscopic image synthesis algorithm for 3DTV systems using depth information with an RGB texture from a depth camera. The proposed algorithm synthesizes the multi-view images that a virtual convergence camera model would generate. Experimental results showed that the performance of the proposed algorithm is better than that of conventional methods.

Multi-Detector Row CT of the Central Airway Disease (Multi-Detector Row CT를 이용한 중심부 기도 질환의 평가)

  • Kang, Eun-Young
    • Tuberculosis and Respiratory Diseases / v.55 no.3 / pp.239-249 / 2003
  • Multi-detector row CT (MDCT) provides faster speed, longer coverage in conjunction with thin slices, improved spatial resolution, and the ability to produce high-quality multiplanar and three-dimensional (3D) images. MDCT has revolutionized the non-invasive evaluation of the central airways. Simultaneous display of axial, multiplanar, and 3D images improves the precision and accuracy of the radiologic diagnosis of central airway disease. This article introduces central airway imaging with MDCT, emphasizing the emerging role of multiplanar and 3D reconstruction.

Digital Watermarking for Multi-Level Data Hiding to Color Images (컬러 영상에서 다중-레벨 데이터 은닉을 위한 디지털 워터마킹)

  • Seo, Jung-Hee;Park, Hung-Bog
    • The KIPS Transactions:PartB / v.14B no.5 / pp.337-342 / 2007
  • A multi-level representation has the advantage of expressing an image at every level as a different image. This paper proposes a digital watermark embedding technique that transforms a color image into the YCbCr color space, to guarantee the robustness and imperceptibility of the watermark across various representations of color images, and hides multi-level data with a spread spectrum from low resolution to full resolution in the multi-level Y signal. A watermark embedded in the Y signal at low resolution risks being visible, but it can guarantee the robustness of the watermark across various color and transformed images. In the experiments, wavelet-compressed images with the embedded watermark showed both robustness and imperceptibility of the watermark.
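
The color-space step mentioned in this abstract can be sketched as follows. The abstract does not say which YCbCr variant is used; the full-range BT.601 coefficients (as in JPEG/JFIF) are an assumption here, and the function name `rgb_to_ycbcr` is hypothetical. The watermark embedding itself is not shown.

```python
import numpy as np

# Full-range BT.601 (JFIF) coefficients: an assumed variant, not confirmed by the abstract.
_MATRIX = np.array([
    [0.299, 0.587, 0.114],          # Y  (luma, where the watermark would be embedded)
    [-0.168736, -0.331264, 0.5],    # Cb
    [0.5, -0.418688, -0.081312],    # Cr
])
_OFFSET = np.array([0.0, 128.0, 128.0])

def rgb_to_ycbcr(rgb):
    """Convert an (..., 3) array of 8-bit RGB values to YCbCr as floats."""
    rgb = np.asarray(rgb, dtype=float)
    return rgb @ _MATRIX.T + _OFFSET

# Usage: extract the Y channel of a small synthetic image.
image = np.zeros((4, 4, 3))
y_channel = rgb_to_ycbcr(image)[..., 0]
```

Working on the Y channel alone keeps the chroma untouched, which is consistent with the abstract's focus on the Y signal.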

Bilayer Segmentation of Consistent Scene Images by Propagation of Multi-level Cues with Adaptive Confidence (다중 단계 신호의 적응적 전파를 통한 동일 장면 영상의 이원 영역화)

  • Lee, Soo-Chahn;Yun, Il-Dong;Lee, Sang-Uk
    • Journal of Broadcast Engineering / v.14 no.4 / pp.450-462 / 2009
  • So far, many methods for segmenting single images or video have been proposed, but few methods have dealt with multiple images with analogous content. These images, which we term consistent scene images, include concurrent images of a scene and gathered images of a similar foreground, and may be collectively utilized to describe a scene or as input images for multi-view stereo. In this paper, we present a method to segment these images with minimum user input, specifically, manual segmentation of one image, by iteratively propagating information via multi-level cues with adaptive confidence depending on the nature of the images. Propagated cues are used as the bases to compute multi-level potentials in an MRF framework, and segmentation is done by energy minimization. Both cues and potentials are classified as low-, mid-, or high-level based on whether they pertain to pixels, patches, or shapes. A major aspect of our approach is utilizing mid-level cues to compute low- and mid-level potentials, and high-level cues to compute low-, mid-, and high-level potentials, thereby making use of inherent information. Through this process, the proposed method attempts to maximize the amount of both extracted and utilized information in order to maximize the consistency of the segmentation. We demonstrate the effectiveness of the proposed method on several sets of consistent scene images and provide a comparison with results based only on mid-level cues [1].

Evaluation of a multi-stage convolutional neural network-based fully automated landmark identification system using cone-beam computed tomography-synthesized posteroanterior cephalometric images

  • Kim, Min-Jung;Liu, Yi;Oh, Song Hee;Ahn, Hyo-Won;Kim, Seong-Hun;Nelson, Gerald
    • The Korean Journal of Orthodontics / v.51 no.2 / pp.77-85 / 2021
  • Objective: To evaluate the accuracy of a multi-stage convolutional neural network (CNN) model-based automated identification system for posteroanterior (PA) cephalometric landmarks. Methods: The multi-stage CNN model was implemented on a personal computer. A total of 430 PA cephalograms synthesized from cone-beam computed tomography scans (CBCT-PA) were selected as samples. Twenty-three landmarks used for Tweemac analysis were manually identified on all CBCT-PA images by a single examiner. Intra-examiner reproducibility was confirmed by repeating the identification on 85 randomly selected images, which were subsequently set as test data, with a two-week interval before training. For the initial learning stage of the multi-stage CNN model, the data from 345 of the 430 CBCT-PA images were used, after which the model was tested with the previous 85 images. The first manual identification on these 85 images was set as the ground truth. The mean radial error (MRE) and successful detection rate (SDR) were calculated to evaluate the errors in manual identification and artificial intelligence (AI) prediction. Results: The AI showed an average MRE of 2.23 ± 2.02 mm with an SDR of 60.88% for errors of 2 mm or lower. However, in a comparison of the repetitive task, the AI predicted landmarks at the same position, while the MRE for the repeated manual identification was 1.31 ± 0.94 mm. Conclusions: Automated identification for CBCT-synthesized PA cephalometric landmarks did not sufficiently achieve the clinically favorable error range of less than 2 mm. However, AI landmark identification on PA cephalograms showed better consistency than manual identification.
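
The two metrics named in this abstract can be computed as in the sketch below. It assumes the usual definitions (radial error as the Euclidean distance between predicted and reference landmark positions, SDR as the percentage of landmarks within a threshold); the array layout and the function names are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def radial_errors(pred, truth):
    """Euclidean distance per landmark.

    Both inputs are assumed to be (n_images, n_landmarks, 2) arrays in mm.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.linalg.norm(pred - truth, axis=-1)

def mean_radial_error(pred, truth):
    """Return the mean and standard deviation of the radial errors (mm)."""
    errors = radial_errors(pred, truth)
    return errors.mean(), errors.std()

def successful_detection_rate(pred, truth, threshold_mm=2.0):
    """Percentage of landmarks whose radial error is at most threshold_mm."""
    errors = radial_errors(pred, truth)
    return 100.0 * np.mean(errors <= threshold_mm)

# Usage: one image with two landmarks, off by 5 mm and 1 mm respectively.
truth = np.zeros((1, 2, 2))
pred = np.array([[[3.0, 4.0], [0.0, 1.0]]])
mre, sd = mean_radial_error(pred, truth)
sdr = successful_detection_rate(pred, truth)
```

The default threshold of 2 mm mirrors the "errors of 2 mm or lower" criterion quoted in the abstract; whether the reported standard deviation is a population or sample estimate is not stated, so the population form is used here.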

Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion

  • Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
    • Journal of Information Processing Systems / v.19 no.4 / pp.427-438 / 2023
  • Many works have indicated that single image super-resolution (SISR) models relying on synthetic datasets are difficult to apply to real scene text image super-resolution (STISR) because of its more complex degradation. The most recent dataset for realistic STISR is TextZoom, but the current methods trained on this dataset have not considered the effect of multi-scale features of text images. In this paper, a multi-scale and attention fusion model for realistic STISR is proposed. A multi-scale learning mechanism is introduced to acquire sophisticated feature representations of text images. Spatial and channel attention are introduced to capture the local information and inter-channel interaction information of text images. Finally, this paper designs a multi-scale residual attention module by fusing the multi-scale learning and attention mechanisms. Experiments on TextZoom demonstrate that the proposed model increases the average recognition accuracy of the scene text recognizer ASTER by 1.2% compared to the text super-resolution network.

Evaluation of Pulmonary Nodules Finer on Energy Subtraction X-ray Images (에너지 차분 흉부 X선 화상으로부터 폐종류 음영 검출 필터의 평가)

  • 김응규;이충호;권영도
    • Proceedings of the IEEK Conference / 2000.11d / pp.61-64 / 2000
  • The purpose of this study is to prove the effectiveness of an energy subtraction image for the detection of pulmonary nodules and the effectiveness of a multi-resolutional filter applied to an energy subtraction image for that detection. We also examine the factors that influence the accuracy of pulmonary nodule detection from the viewpoints of image type and evaluation method. As the image type, we select energy subtraction X-ray images, to which a $\nabla^2 G$ filter and a multi-resolutional filter are applied. We then select two evaluation methods and clarify the effectiveness of the multi-resolutional filter on an energy subtraction image.
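
The abstract does not define the $\nabla^2 G$ filter. Assuming it is the conventional Laplacian of a Gaussian, with $\sigma$ an assumed scale parameter that the abstract does not specify, the kernel would take the standard form:

```latex
G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right),
\qquad
\nabla^{2} G(x, y) = \frac{x^{2} + y^{2} - 2\sigma^{2}}{\sigma^{4}}\, G(x, y)
```

Under this reading, a multi-resolutional filter would plausibly combine responses at several values of $\sigma$, although the abstract gives no detail on how the scales are chosen or combined.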

LAND COVER CLASSIFICATION BY USING SAR COHERENCE IMAGES

  • Yoon, Bo-Yeol;Kim, Youn-Soo
    • Proceedings of the KSRS Conference / 2008.10a / pp.76-79 / 2008
  • This study presents the use of multi-temporal JERS-1 SAR images for land cover classification. So far, land cover has been classified using high-resolution aerial photos, field surveys, and so on. The study site was located in the Non-san area. This study develops multi-temporal land cover status monitoring and coherence information mapping that can be processed from L-band SAR images. Coherence values of nine JERS SAR scenes acquired from July 1997 to October 1998 are analyzed and then used to classify land cover. This technique, which forms the basis of what is called SAR interferometry (InSAR for short), has also been employed in spaceborne systems. In such systems the separation of the antennas, called the baseline, is obtained by utilizing a single antenna in a repeat pass