Search | Korea Science

A Feature Map Generation Method for MSFC-Based Feature Compression without Min-Max Signaling in VCM (VCM 의 MSFC 기반 특징 압축을 위한 Min-Max 시그널링을 제외한 특징맵 생성 기법)

Dong-Ha Kim;Yong-Uk Yoon;Jae-Gon Kim
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.11a / pp.79-81 / 2022
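MPEG-VCM (Video Coding for Machines) is standardizing the compression of image/video features extracted from the backbone of machine vision networks. The MSFC (Multi-Scale Feature Compression)-based compression network model, which currently shows the best compression performance in the VCM exploration of candidate standard technologies, converts the extracted multi-scale features into a single-scale feature, arranges it as a feature map, and compresses it with VVC. This paper presents an improved feature map generation method for the MSFC-based compression model that retains min-max normalization but omits signaling of the Min-Max values. That is, the proposed method generates the Min-Max values needed to restore the feature map at the VCM decoder through learning, which not only saves the bit overhead of Min-Max signaling but also allows a simpler transmitted bitstream with no separate signaling mechanism. In experiments, the proposed method achieves a BD-rate gain of 83.24% in BPP-mAP performance over the image anchor; this is about 1.74% lower than the existing MSFC, but it shows that the existing performance can be largely maintained without separate Min-Max signaling.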
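To make the role of the Min-Max values concrete, the following is a minimal sketch of min-max normalization applied to a flattened feature map before video coding, and of the inverse step at the decoder. The function names and the 10-bit depth are assumptions for illustration and are not taken from the paper. The sketch shows the conventional case, in which `lo` and `hi` must reach the decoder somehow; the abstract's method instead generates these values through learning, which is not reproduced here.

```python
def minmax_quantize(values, bit_depth=10):
    """Map floats to integer codes in [0, 2**bit_depth - 1] (hypothetical helper)."""
    lo, hi = min(values), max(values)
    levels = 2 ** bit_depth - 1
    # Guard against a constant feature map, where hi == lo.
    scale = levels / (hi - lo) if hi > lo else 0.0
    codes = [round((v - lo) * scale) for v in values]
    return codes, lo, hi


def minmax_restore(codes, lo, hi, bit_depth=10):
    """Invert the normalization; this is the step that needs lo and hi."""
    step = (hi - lo) / (2 ** bit_depth - 1)
    return [lo + c * step for c in codes]


feature = [-1.5, 0.0, 0.25, 2.5]           # toy stand-in for a feature map
codes, lo, hi = minmax_quantize(feature)   # codes would be packed and coded
restored = minmax_restore(codes, lo, hi)   # decoder side
```

Without `lo` and `hi` the decoder cannot undo the scaling, which is why removing their signaling requires another source for these values.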

3D Human Shape Estimation from a Silhouette Image by using Statistical Human Shape Spaces (통계적 신체 외형 데이터베이스를 활용한 실루엣으로부터의 3차원 인체 외형 예측)

Dasol Ahn;Sang Il Park
- Journal of the Korea Computer Graphics Society / v.29 no.1 / pp.13-22 / 2023
In this paper, we present a method for estimating full 3D shapes from given 2D silhouette images of human bodies. Because a silhouette contains only partial information about the true shape, this is an ill-posed problem. To address it, we use the statistical human shape space obtained from an existing large 3D human shape database. The method consists of three steps. First, we extract the boundary pixels and their corresponding normal vectors from the input silhouette images. Then, we initialize the correspondence between each pixel and a vertex of the statistically deformable 3D human model. Finally, we numerically optimize the parameters of the statistical model to best fit the given silhouettes. The viability and robustness of the method are demonstrated with various experiments.
https://doi.org/10.15701/kcgs.2022.29.1.13
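The final step, fitting the parameters of a statistical model to observed data, can be illustrated with a toy linear shape model. This is only a sketch under strong assumptions: the shape is a short list of scalars rather than 3D vertices, correspondences are taken as already known, the basis vectors are orthonormal, and plain gradient descent stands in for whatever optimizer the authors use. All names and numbers are hypothetical.

```python
def synthesize(mean, bases, params):
    """Linear shape model: shape = mean + sum_k params[k] * bases[k]."""
    return [m + sum(p * b[i] for p, b in zip(params, bases))
            for i, m in enumerate(mean)]


def fit_params(mean, bases, target, steps=500, lr=0.1):
    """Minimize the squared distance to target by gradient descent."""
    params = [0.0] * len(bases)
    for _ in range(steps):
        shape = synthesize(mean, bases, params)
        residual = [s - t for s, t in zip(shape, target)]
        # Gradient of sum(residual**2) with respect to each parameter.
        grads = [2.0 * sum(r * b[i] for i, r in enumerate(residual))
                 for b in bases]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


mean = [1.0, 2.0, 3.0, 4.0]
bases = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]  # orthonormal toy basis
target = synthesize(mean, bases, [0.8, -0.4])            # "observed" shape
params = fit_params(mean, bases, target)
```

In the silhouette setting the residual would be measured between projected model vertices and boundary pixels, so the objective is no longer this simple quadratic.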

Multicontents Integrated Image Animation within Synthesis for High Quality Multimodal Video (고화질 멀티 모달 영상 합성을 통한 다중 콘텐츠 통합 애니메이션 방법)

Jae Seung Roh;Jinbeom Kang
- Journal of Intelligence and Information Systems / v.29 no.4 / pp.257-269 / 2023
There is currently a burgeoning demand for image synthesis from photos and videos using deep learning models. Existing video synthesis models solely extract motion information from the provided video to generate animation effects on photos. However, these synthesis models encounter challenges in achieving accurate lip synchronization with the audio and maintaining the image quality of the synthesized output. To tackle these issues, this paper introduces a novel framework based on an image animation approach. Within this framework, upon receiving a photo, a video, and audio input, it produces an output that not only retains the unique characteristics of the individuals in the photo but also synchronizes their movements with the provided video, achieving lip synchronization with the audio. Furthermore, a super-resolution model is employed to enhance the quality and resolution of the synthesized output.
https://doi.org/10.13088/jiis.2023.29.4.257

3DentAI: U-Nets for 3D Oral Structure Reconstruction from Panoramic X-rays (3DentAI: 파노라마 X-ray로부터 3차원 구강구조 복원을 위한 U-Nets)

Anusree P.Sunilkumar;Seong Yong Moon;Wonsang You
- The Transactions of the Korea Information Processing Society / v.13 no.7 / pp.326-334 / 2024
Extra-oral imaging techniques such as panoramic X-rays (PXs) and cone beam computed tomography (CBCT) are the most preferred imaging modalities in dental clinics owing to their convenience for patients during imaging as well as their ability to visualize the entire set of teeth. PXs are preferred for routine clinical treatments and CBCTs for complex surgeries and implant treatments. However, PXs are limited by the lack of third-dimensional spatial information, whereas CBCTs expose the patient to high radiation. When a PX is already available, it is beneficial to reconstruct the 3D oral structure from the PX to avoid further expense and radiation dose. In this paper, we propose 3DentAI, a U-Net-based deep learning framework for 3D reconstruction of oral structure from a PX image. Our framework consists of three modules: a reconstruction module based on attention U-Net for estimating depth from a PX image; a realignment module for aligning the predicted flattened volume to the shape of the jaw using a predefined focal trough and ray data; and a refinement module based on 3D U-Net for interpolating the missing information to obtain a smooth representation of the oral cavity. Synthetic PXs obtained from CBCT by ray tracing and rendering were used to train the networks without the need for paired PX and CBCT datasets. Our method, trained and tested on a diverse dataset of 600 patients, achieved superior performance to GAN-based models even with low computational complexity.
https://doi.org/10.3745/TKIPS.2024.13.7.326

Speckle Noise Reduction and Image Quality Improvement in U-net-based Phase Holograms in BL-ASM (BL-ASM에서 U-net 기반 위상 홀로그램의 스펙클 노이즈 감소와 이미지 품질 향상)

Oh-Seung Nam;Ki-Chul Kwon;Jong-Rae Jeong;Kwon-Yeon Lee;Nam Kim
- Korean Journal of Optics and Photonics / v.34 no.5 / pp.192-201 / 2023
The band-limited angular spectrum method (BL-ASM) causes aliasing errors due to spatial frequency control problems. This paper proposes a sampling interval adjustment technique for phase holograms, together with a technique that uses a deep-learning-based U-net model to reduce speckle noise and improve image quality. The proposed technique first reduces speckle noise by calculating the sampling factor and adjusting the sampling interval to control the spatial frequency, so that aliasing errors are removed over a wide range of propagation. It then improves the quality of the reconstructed image by training the deep learning model on the resulting phase holograms. In software simulations on various sample images, the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) improved by 5% and 0.14% on average, respectively, compared with the existing BL-ASM.
https://doi.org/10.3807/KJOP.2023.34.5.192
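For readers unfamiliar with the PSNR metric reported above, here is a minimal sketch of its standard definition, computed over flattened lists of pixel values. The function name and the default peak of 255 (8-bit images) are assumptions for illustration; the abstract does not state the bit depth used.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


same = psnr([0, 128, 255], [0, 128, 255])
worst = psnr([0, 0], [255, 255])
```

Higher values indicate a reconstruction closer to the reference; SSIM, the other metric mentioned, compares local structure and is not sketched here.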

Patch Information based Linear Interpolation for Generating Super-Resolution Images in a Single Image (단일이미지에서의 초해상도 영상 생성을 위한 패치 정보 기반의 선형 보간 연구)

Han, Hyun-Ho;Lee, Jong-Yong;Jung, Kye-Dong;Lee, Sang-Hun
- Journal of the Korea Convergence Society / v.9 no.6 / pp.45-52 / 2018
In this paper, we propose a linear interpolation method based on patch information generated from a low-resolution image, for generating a super-resolution image from a single image. The conventional approach to super-resolution, a regression model over the global space, generally yields poor quality because it lacks information to reference for a specific region. To compensate, we propose a method that divides the region into patches during super-resolution image generation to extract meaningful information and analyzes the constituents of the image matrix region extended for super-resolution. We then apply linear interpolation based on the optimal patch information, which is found by correlating the patch information gathered before the interpolation process. In the experiments, the original image was compared with the reconstructed image using PSNR and SSIM.
https://doi.org/10.15207/JKCS.2018.9.6.045
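As background for the interpolation step, the sketch below shows plain bilinear (two-axis linear) upscaling of a small grayscale image stored as nested lists. It is a generic baseline, not the patch-based method of the paper: it uses no patch search or correlation, and the function name and the coordinate mapping are assumptions chosen for simplicity.

```python
def bilinear_upscale(img, factor):
    """Upscale a 2D list of pixel values by an integer factor."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * factor):
        # Map the output row to a source coordinate, clamped to the image.
        sy = min(y / factor, h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(x / factor, w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Interpolate along x on the two nearest rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


small = [[0, 10], [20, 30]]
large = bilinear_upscale(small, 2)  # 4 x 4 result
```

A patch-based variant would choose the values entering the interpolation from the best-matching patch rather than only from the immediate neighbors.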

A Study on the Stereo Image Matching using MRF model and segmented image (MRF 모델과 분할 영상을 이용한 영상정합에 관한 연구)

변영기;한동엽;김용일
- Proceedings of the Korean Association of Geographic Information Studies Conference / 2004.03a / pp.511-516 / 2004
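Building spatial image information such as digital elevation models and orthoimages requires image matching using stereo images, and reconstructing and restoring the 3D information of an object from a single image or stereo images is one of the major research topics in photogrammetry and computer vision. In this study, image matching was performed using an MRF model that considers both the similarity of pixel values and their interrelationships. MRF models are a branch of probability theory for spatial analysis and for analyzing the contextual dependencies of physical phenomena, and they provide a way to integrate diverse spatial information. This study proposes an image matching technique based on the Markov random field (MRF) model, a type of probabilistic model, as an approach that assigns a disparity to each pixel of the reference image; because it accounts for the interrelationships among pixels in space, it improves matching accuracy at object boundaries. The basic MRF assumption in the image matching problem is that the disparity of a given pixel can be determined from the partial information provided by the disparities of its neighboring pixels. The posterior probability was derived using the Gibbs distribution, and an energy function was formulated from it using maximum a posteriori (MAP) estimation. To optimize this energy function, the study used the multiway cut technique, a global optimization method, to find the disparity label for each image pixel that minimizes the energy function, and thereby performed image matching.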
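The energy function described above can be sketched for a single scanline as a data term (pixel similarity) plus a smoothness term (agreement between neighboring disparities). The specific choices here are assumptions, since the abstract does not give them: absolute intensity difference as the data cost, a Potts penalty for smoothness, a fixed cost for matches falling outside the image, and all names and weights. Only the evaluation of the energy is shown; the multiway cut optimization is not implemented.

```python
def mrf_energy(left, right, disparity, smooth_weight=5.0, occlusion_cost=20.0):
    """Energy of a disparity labeling for one scanline (lower is better)."""
    data = 0.0
    for x, d in enumerate(disparity):
        xr = x - d  # left pixel x is matched to right pixel x - d
        if 0 <= xr < len(right):
            data += abs(left[x] - right[xr])
        else:
            data += occlusion_cost  # match falls outside the right scanline
    # Potts smoothness: one unit of penalty per pair of unequal neighbors.
    smooth = sum(1 for a, b in zip(disparity, disparity[1:]) if a != b)
    return data + smooth_weight * smooth


left = [10, 20, 30, 40]
right = [20, 30, 40, 50]
e_zero = mrf_energy(left, right, [0, 0, 0, 0])
e_one = mrf_energy(left, right, [1, 1, 1, 1])
e_mixed = mrf_energy(left, right, [0, 1, 1, 1])
```

An optimizer such as multiway cut searches over labelings for the one with minimum energy; in this toy case the mixed labeling scores lowest of the three evaluated.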

Costume Images of the Chosun Period's Po for Men (Part I) - Constituent factors, Type, Reflection of the Period - (조선시대 남자 포제에 나타난 복식이미지(제1보) -남자포제 이미지구성 요인 및 유형별, 시기별 복식이미지-)

Ju-Yeun Do;Young-Suk Kwon
- Journal of the Korean Society of Clothing and Textiles / v.25 no.10 / pp.1695-1706 / 2001
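This study identifies the constituent factors of the costume images conveyed by men's po (robes) of the Chosun period and examines those images by robe type (cheollik, dapho, jingnyeong, dopo, changui, juui) and by period (early, middle, late), in order to clarify the costume images of Chosun men's po and provide basic data applicable to modern traditional costume design. The clothing stimuli centered on men's everyday wear: three robes for early Chosun (1477-1543), namely cheollik, dapho, and jingnyeong; two for mid-Chosun (18th century), dopo and changui; and one for late Chosun (late 17th century to early 20th century), juui. To examine the costume images of each era accurately, relics were reconstructed and used. These were worn by a model and made into slides, which were presented as stimuli. For the semantic differential scale, adjectives were collected by free word association and 23 adjective pairs were constructed. The panel consisted of 600 male and female university students, and the data were analyzed with SAS using factor analysis, analysis of variance, and other methods.
1. The factor structure of Chosun men's po consisted of a dignity factor (25.2%), an activity factor (14.2%), a spaciousness factor (37.9%), a display factor (6.7%), and a hardness-softness factor (5.7%). Of the 62.7% of total variance explained by these five factors, the dignity, activity, and spaciousness factors account for more than 50% of total variance, indicating that these three are the fundamental factors perceived in men's po.
2. Comparing costume images by robe type, the cheollik was rated as the most unnatural, pleated, curved, soft, and peculiar; the dapho as the most restrained, straight-lined, stiff, and peculiar; the jingnyeong as the most inactive, stifling, and traditional; the dopo as the most dignified and spacious; the changui as plainer and simpler than the other robes; and the juui as the least dignified and the most everyday, active, simple, and pure. All the robes were rated as traditional and pure, and all except the cheollik as simple, indicating that the image common to Chosun men's po is one of simplicity and purity.
3. By period, the robes of early Chosun (cheollik, dapho, jingnyeong) were rated as ceremonial, spacious, and peculiar, with a high spaciousness factor; those of mid-Chosun (dopo, changui) as dignified, restrained, and voluminous; and that of late Chosun (juui) as active, simple, and straight-lined. The images of men's po thus changed with the conditions and circumstances of the times, and the image pursued differed from era to era.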

Off-line Handwritten Digit Recognition Using A Dynamic 3-D Neuro System (동적 3-D 뉴로 시스템을 이용한 오프라인 필기체 숫자 인식)

Kim Ki Taek;Kwon Young Chul;Lee Soo Dong
- Proceedings of the Korea Information Processing Society Conference / 2004.11a / pp.505-508 / 2004
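This paper reports off-line handwritten digit recognition experiments using a dynamic 3-D neuro system model. Using the 3-D neuro system model made incremental learning possible, in which new information is added while previously trained information is retained, as well as repeated training, in which the degree of training for information in the same category accumulates with the number of repetitions. A generalized pattern could be derived from the information accumulated during training and used for recognition. The pattern recognizer restores an unknown input image to a prototype image through a feedback routine and then recognizes the character from the resulting data. Experiments on NIST's MNIST database yielded a correct recognition rate of 99.0%.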

An Efficient Walkthrough from Two Images using Spidery Mesh Interface and View Morphing (Spidery 매쉬 인터페이스와 뷰 모핑을 이용한 두 이미지로부터의 효율적인 3차원 애니메이션)

Cho, Hang-Shin;Kim, Chang-Hun
- Journal of KIISE:Computing Practices and Letters / v.7 no.2 / pp.132-140 / 2001
This paper proposes an efficient walkthrough animation from two images of the same scene. Among techniques for making animation quickly and easily, Tour Into the Picture (TIP) enables walkthrough animation from a single image, but its foreground objects lack realism when the viewpoint moves from side to side, while view morphing uses only a 2D transition between two images but restricts the camera path to the line between the two views. By combining the advantages of these two image-based techniques, this paper suggests a new virtual navigation technique that enables natural scene transformation when the viewpoint changes in the side-to-side direction as well as in the depth direction. In our method, view morphing is applied only to foreground objects, and the background scene, which viewers perceive less attentively, is mapped onto a cube-like 3D model as in TIP, so as to avoid laborious 3D reconstruction costs while improving visual realism. To do this, we define a new camera transformation between the two images from the relationship between the spidery mesh transformation and its corresponding 3D view change. The resulting animation shows that our method creates a realistic 3D virtual navigation using a simple interface.

Search Result 80, Processing Time 0.024 seconds
