• Title/Summary/Keyword: Immersive video


Efficient Pruning Cluster Graph Strategy for MPEG Immersive Video Compression (프루닝 클러스터 그래프 구성 전략에 따른 몰입형 비디오 압축 성능 분석)

  • Lee, Soonbin; Jeong, Jong-Beom; Ryu, Eun-Seok
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.101-104 / 2022
  • The MPEG Immersive Video (MIV) standardization technology is based on a processing scheme that represents only the differential information of each source view, in order to minimize the burden on the video codec when encoding multi-view video. This paper describes the pruning cluster graph, which is organized to facilitate parallel processing and thereby reduce complexity during pruning, the stage that removes inter-view redundancy, and analyzes the performance of each cluster-graph construction strategy. The fewer basic views (views whose information is fully preserved without redundancy removal) a cluster graph contains, the lower the total pixel rate to be processed, but the reconstruction quality also drops and the pruning complexity grows. Through experimental results, we explore the trade-offs of pruning cluster graph construction and show that an optimized graph construction strategy enables efficient transmission of immersive video.
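
The abstract stays high level, so the following is a toy Python sketch of the trade-off it describes, not the MIV reference software (TMIV): every structure, field, and number below is an illustrative assumption. More basic views mean more whole views to encode, hence a higher pixel rate.

```python
# Toy model of the basic-view / pixel-rate trade-off; NOT TMIV.
from dataclasses import dataclass

@dataclass
class View:
    name: str
    width: int
    height: int
    is_basic: bool      # basic views are kept whole and never pruned
    redundancy: float   # assumed fraction of pixels overlapping other views

def pixel_rate(views, fps=30):
    """Pixels per second left to encode after pruning: basic views keep
    everything, additional views keep only their non-redundant patches."""
    total = 0.0
    for v in views:
        kept = 1.0 if v.is_basic else (1.0 - v.redundancy)
        total += v.width * v.height * kept
    return total * fps

views = [View(f"v{i}", 1920, 1080, is_basic=(i < 2), redundancy=0.7)
         for i in range(8)]
print(f"2 basic views: {pixel_rate(views):.3e} px/s")

for v in views[2:4]:
    v.is_basic = True   # promote two additional views to basic views
print(f"4 basic views: {pixel_rate(views):.3e} px/s")
```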


Parallax Distortion Detection and Correction Method for Video Stitching by using LDPM Image Assessment (LDPM 영상 평가를 활용한 동영상 스티칭의 시차 왜곡 검출 및 정정 방법)

  • Rhee, Seongbae; Kang, Jeonho; Kim, Kyuheon
    • Journal of Broadcast Engineering / v.25 no.5 / pp.685-697 / 2020
  • Immersive media such as panorama and 360-degree videos must provide a sense of realism, as if the user had visited the space in the video, so they should be able to represent the reality of the real world. However, in panorama and 360-degree videos, objects can appear to overlap or disappear because of parallax between cameras, and such parallax distortion can break the user's immersion in the content. Although many video stitching algorithms have been proposed to overcome parallax distortion, it still occurs owing to the low performance of object detection modules and the limitations of seam generation methods. Therefore, this paper analyzes the limitations of existing video stitching technology and proposes a method for detecting and correcting the parallax distortion of video stitching using the LDPM (Local Differential Pixel Mean) image assessment method, which overcomes those limitations.
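
The abstract names the LDPM (Local Differential Pixel Mean) assessment but does not give its formula. The sketch below is therefore only a plausible reading: it assumes LDPM is a block-wise mean absolute difference over the aligned overlap region of two views, with blocks above a threshold flagged as parallax distortion; the window size and threshold are invented for illustration.

```python
import numpy as np

def ldpm_map(overlap_a, overlap_b, win=16):
    """Assumed LDPM: mean absolute pixel difference per win x win block
    of two aligned overlap regions from adjacent stitched views."""
    diff = np.abs(overlap_a.astype(np.float32) - overlap_b.astype(np.float32))
    h, w = diff.shape[:2]
    scores = np.zeros((h // win, w // win), dtype=np.float32)
    for by in range(h // win):
        for bx in range(w // win):
            scores[by, bx] = diff[by*win:(by+1)*win, bx*win:(bx+1)*win].mean()
    return scores

def flag_parallax_blocks(scores, threshold=20.0):
    """Block coordinates whose local differential mean exceeds the
    (illustrative) threshold, i.e. likely parallax distortion."""
    return np.argwhere(scores > threshold)
```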

Performance Analysis on View Synthesis of 360 Videos for Omnidirectional 6DoF in MPEG-I (MPEG-I의 6DoF를 위한 360 비디오 가상시점 합성 성능 분석)

  • Kim, Hyun-Ho; Kim, Jae-Gon
    • Journal of Broadcast Engineering / v.24 no.2 / pp.273-280 / 2019
  • 360 video is attracting attention as immersive media with the spread of VR applications, and the MPEG-I (Immersive) Visual group is actively working on standardization to support immersive media experiences with up to six degrees of freedom (6DoF). In the virtual space of omnidirectional 6DoF, which is defined as the case that provides 6DoF within a restricted area, viewing the scene from an arbitrary viewpoint at an arbitrary position requires rendering that view by synthesizing additional viewpoints, called virtual omnidirectional viewpoints. This paper presents view synthesis results and their analysis, carried out as exploration experiments (EEs) on omnidirectional 6DoF in MPEG-I. Specifically, it reports synthesis results under various conditions, such as the distance between the input views and the virtual view to be synthesized, and the number of input views selected from the given set of 360 videos providing omnidirectional 6DoF.
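
One of the EE conditions above, the choice of input views for a given virtual viewpoint, is easy to make concrete. The sketch below picks the k source cameras nearest to the virtual position; it is only an illustration of that selection step, not the MPEG-I reference view synthesizer, and the camera coordinates are made up.

```python
import numpy as np

def select_input_views(cam_positions, virtual_pos, k=2):
    """Indices and distances of the k source views closest to the
    virtual viewpoint (simple Euclidean distance on camera centers)."""
    cams = np.asarray(cam_positions, dtype=np.float32)
    dists = np.linalg.norm(cams - np.asarray(virtual_pos, np.float32), axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]

cams = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0)]
idx, d = select_input_views(cams, (0.4, 0.2, 0.0), k=2)
print(idx, d)  # the two nearest views and their distances
```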

Trends in the Telepresence Technologies and Services Beyond 5G (5G를 향한 영상회의 기술 및 서비스 동향)

  • Lee, H.K.; Han, M.K.; Jang, J.H.
    • Electronics and Telecommunications Trends / v.32 no.5 / pp.20-29 / 2017
  • In this paper, a video conferencing system, which has been attracting significant attention as an immersive telepresence service owing to the recent emergence of 5G networks, is described. We propose a service platform for Giga Media-based video conferencing for 5G convergence services. The video conferencing service comprises a traditional structure that exchanges the video and voice of remote participants through a Multipoint Control Unit (MCU), browser-based video conferencing built on WebRTC, multi-viewpoint video conferencing, and holographic telepresence centered on mixed reality. The paper introduces the trends and detailed structures of the various technologies used in video conferencing systems, and compares the system technologies and service characteristics for the integrated Giga Media platform.

Performance Analysis of VVC In-Loop Filters for Immersive Video Coding (몰입형 입체영상 부호화를 위한 VVC 인루프 필터 성능 분석)

  • Yongho Choi; Gun Bang; Jinho Lee; Jin Young Lee
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.11a / pp.151-153 / 2022
  • Following the standardization of Versatile Video Coding (VVC), the 2D video compression standard, the Moving Picture Experts Group (MPEG) has recently been standardizing compression methods for various video formats. In particular, Six Degrees of Freedom (6DoF) immersive video content supporting virtual, augmented, and mixed reality has recently been used in a variety of fields; 6DoF immersive video generally consists of multi-view high-resolution color and depth videos. Aiming at a complete service for such high-resolution 6DoF immersive video over limited network environments, MPEG is actively standardizing MPEG Immersive Video (MIV), an immersive video compression technology. In MIV, the video composed of basic views and the video composed of atlas patches, from which the highly redundant pixels of the additional views have been removed, are each compressed with VVC. However, because the atlas-patch video has different characteristics from ordinary 2D color video, the VVC in-loop filter tools may be inefficient on it. Therefore, this paper analyzes the performance of the VVC in-loop filters within the MIV standard.
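
An analysis like the one described needs a quality measurement for each filter configuration. A minimal sketch of that measurement is below: frame-wise PSNR between the original and a decoded sequence, applied once to a filters-on reconstruction and once to a filters-off one. How the two bitstreams are produced (encoder, configuration flags) is left out, since the paper's actual experiments rely on the VVC and MIV reference software.

```python
import numpy as np

def psnr(ref, rec, max_val=255.0):
    """PSNR in dB between two frames of identical shape."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return float("inf") if mse == 0.0 else 10.0 * np.log10(max_val ** 2 / mse)

def mean_psnr(ref_frames, rec_frames):
    """Average PSNR over a sequence, e.g. filters-on vs. filters-off."""
    return float(np.mean([psnr(r, d) for r, d in zip(ref_frames, rec_frames)]))
```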


Spatial Audio Technologies for Immersive Media Services (체감형 미디어 서비스를 위한 공간음향 기술 동향)

  • Lee, Y.J.; Yoo, J.; Jang, D.; Lee, M.; Lee, T.
    • Electronics and Telecommunications Trends / v.34 no.3 / pp.13-22 / 2019
  • Although virtual reality technology may not be deemed as having a satisfactory quality for all users, it tends to incite interest because of the expectation that the technology can allow one to experience something that they may never experience in real life. The most important aspect of this indirect experience is the provision of immersive 3D audio and video, which interacts naturally with every action of the user. The immersive audio faithfully reproduces an acoustic scene in a space corresponding to the position and movement of the listener, and this technology is also called spatial audio. In this paper, we briefly introduce the trend of spatial audio technology in view of acquisition, analysis, reproduction, and the concept of MPEG-I audio standard technology, which is being promoted for spatial audio services.

Understanding the User Preferences in the Types of Video Censorship

  • Park, Sohyeon; Kim, Kyulee; Oh, Uran
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.147-161 / 2022
  • Video on demand (VOD) platforms provide immersive, inspiring, and commercial-free binge-watching experiences. Recently, the number of users of these platforms increased dramatically, as users could enjoy various contents without physical or time constraints during COVID-19. However, such platforms do not provide sufficient video censorship services despite a strong need for them. In this study, we investigated users' desire for video censorship when choosing and watching movies on VOD platforms, and how video censorship can be applied to different types of scenes to increase the censoring effect without diminishing enjoyment. We first conducted an online survey with 98 respondents to identify the types of discomfort felt while watching sexual, violent, or drug-related scenes. We then conducted in-depth online interviews with 18 participants to identify effective video filtering types and regions for each of the three scene types. Based on the findings, we suggest implications for designing a censoring application for videos that contain uncomfortable scenes.

Visual Object Tracking Fusing CNN and Color Histogram based Tracker and Depth Estimation for Automatic Immersive Audio Mixing

  • Park, Sung-Jun; Islam, Md. Mahbubul; Baek, Joong-Hwan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.3 / pp.1121-1141 / 2020
  • We propose a robust visual object tracking algorithm that fuses a convolutional neural network (CNN) tracker, trained offline on a large number of video repositories, with a color histogram based tracker, in order to track objects for immersive audio mixing. Our algorithm addresses the occlusion and large-movement problems of the CNN-based GOTURN generic object tracker. The key idea is to train, offline, a binary classifier on the color histogram similarity values estimated by both trackers, use it to select the appropriate tracker for the target, and update both trackers with the predicted bounding box position so that tracking continues. Furthermore, a histogram similarity constraint is applied before updating the trackers to maximize tracking accuracy. Finally, we compute the depth (z) of the target object with a prominent unsupervised monocular depth estimation algorithm to obtain the 3D position needed to mix the immersive audio onto that object. Our proposed algorithm demonstrates about 2% higher accuracy than the GOTURN algorithm on the VOT2014 tracking benchmark. Additionally, our tracker can follow multiple objects by applying the single-object tracking concept to each target, although it has not been demonstrated on any MOT benchmark.
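
The fusion rule the abstract sketches, using color-histogram similarity to decide which tracker's box to trust, can be illustrated briefly. The OpenCV calls below are real, but the decision rule stands in for the paper's trained binary classifier, and the margin value is invented.

```python
import cv2

def hist_similarity(frame, box, template_hist):
    """Correlation between a stored template histogram and the RGB
    histogram of the frame patch inside box = (x, y, w, h)."""
    x, y, w, h = box
    patch = frame[y:y + h, x:x + w]
    hist = cv2.calcHist([patch], [0, 1, 2], None, [8, 8, 8],
                        [0, 256, 0, 256, 0, 256])
    cv2.normalize(hist, hist)
    return cv2.compareHist(template_hist, hist, cv2.HISTCMP_CORREL)

def select_box(frame, cnn_box, hist_box, template_hist, margin=0.1):
    """Stand-in for the paper's binary classifier: keep the CNN tracker's
    box unless the histogram tracker's box matches the template clearly
    better; the chosen box would then update both trackers."""
    s_cnn = hist_similarity(frame, cnn_box, template_hist)
    s_hist = hist_similarity(frame, hist_box, template_hist)
    return hist_box if s_hist > s_cnn + margin else cnn_box
```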

Interactive Virtual Studio & Immersive Viewer Environment (인터렉티브 가상 스튜디오와 몰입형 시청자 환경)

  • 김래현; 박문호; 고희동; 변혜란
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1999.06b / pp.87-93 / 1999
  • In this paper, we introduce a novel virtual studio environment where a broadcaster in the virtual set interacts with tele-viewers as if they were sharing the same environment as participants. A tele-viewer participates physically in the virtual studio through a dummy head, equipped with video "eyes" and microphone "ears", that is physically located in the studio. The dummy head, as a surrogate of the tele-viewer, follows the tele-viewer's head movements, and the tele-viewer sees and hears through it like a tele-operated robot. By introducing tele-presence technology into the virtual studio setting, the broadcaster can not only interact with the virtual set elements, as in a regular virtual studio, but also share the physical studio with the surrogates of the tele-viewers as participants. A tele-viewer may see the real broadcaster in the virtual set environment and the other participants as avatars in place of their respective dummy heads. With an immersive display such as an HMD, the tele-viewer may look around the studio and interact with other avatars. The new interactive virtual studio with an immersive viewer environment may be applied to immersive tele-conferencing, tele-teaching, and interactive TV program productions.


Evaluation of Video Codec for AI-based Multiple Tasks (인공지능 기반 멀티태스크를 위한 비디오 코덱의 성능평가 방법)

  • Kim, Shin; Lee, Yegi; Yoon, Kyoungro; Choo, Hyon-Gon; Lim, Hanshin; Seo, Jeongil
    • Journal of Broadcast Engineering / v.27 no.3 / pp.273-282 / 2022
  • MPEG-VCM (Video Coding for Machines) aims to standardize video codecs for machines. VCM provides data sets and anchors, which serve as reference data for comparison, for several machine vision tasks, including object detection, object segmentation, and object tracking. An evaluation template can be used to compare the compression and machine vision task performance of anchor data against various proposed video codecs. However, performance comparison is currently carried out separately for each machine vision task, and no method is provided for evaluating multiple machine vision tasks on a single bitstream. In this paper, we propose a performance evaluation method for a video codec serving AI-based multi-tasks. Based on bits per pixel (BPP), which measures the size of a single bitstream, and mean average precision (mAP), which measures the accuracy of each task, we define three criteria for multi-task performance evaluation, namely the arithmetic average, the weighted average, and the harmonic average, and compute multi-task performance results from the mAP values. In addition, since the dynamic range of mAP may differ greatly from task to task, the multi-task results are calculated and evaluated on normalized mAP values to avoid the problems that differing dynamic ranges would otherwise cause.
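
The three criteria named above are ordinary means, so a worked sketch is straightforward. The task list, mAP values, normalization bounds, and weights below are all made-up examples; only the formulas (min-max normalization, then arithmetic, weighted, and harmonic averaging) follow the abstract.

```python
def normalize(map_scores, lo, hi):
    """Min-max normalize each task's mAP with per-task (assumed) bounds,
    so tasks with different mAP dynamic ranges become comparable."""
    return [(m - l) / (h - l) for m, l, h in zip(map_scores, lo, hi)]

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def weighted_mean(xs, ws):
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)

def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)

# mAP for detection, segmentation, tracking at one bitstream's BPP (made up):
maps = normalize([0.42, 0.35, 0.55], lo=[0.2, 0.2, 0.3], hi=[0.6, 0.5, 0.8])
print(arithmetic_mean(maps))
print(weighted_mean(maps, ws=[0.5, 0.25, 0.25]))
print(harmonic_mean(maps))
```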