• Title/Summary/Keyword: CODEC

Performance Analysis of Super-Resolution based Video Coding for HEVC (HEVC 기반 초해상화를 이용한 비디오 부호화 효율 성능 분석)

  • Ki, Sehwan;Kim, Dae-Eun;Jun, Ki Nam;Baek, Seung Ho;Choi, Jeung Won;Kim, Dong Hyun;Kim, Munchurl
    • Journal of Broadcast Engineering / v.24 no.2 / pp.306-314 / 2019
  • As video resolutions increase rapidly, there is a continuing need for effective video compression methods despite increases in transmission bandwidth. To meet this demand, a reconstructive video coding (RVC) method using super-resolution has been proposed. Because RVC reduces the resolution of the input video, when frames are compressed to the same size the number of bits per pixel increases, which reduces the coding artifacts caused by video compression. However, the RVC method using super-resolution is not effective at all target bitrates. RVC improves on direct video coding only when the loss caused by video compression is larger than the loss incurred by downsizing the resolution. In particular, since HEVC has considerably higher compression performance than previous standard video codecs, it can be experimentally confirmed that compression distortions exceed downsizing distortions only under very low-bitrate conditions. In this paper, we apply RVC-based HEVC to various video types and measure the target bitrates at which the RVC method can be applied effectively.
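
The bits-per-pixel argument in this abstract can be illustrated with a small calculation. The following is a minimal sketch, not the paper's method: the bitrate, frame rate, resolutions, and the 2x downscaling factor per axis are all hypothetical values chosen for illustration.

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average number of coded bits available per pixel per frame."""
    return bitrate_bps / (width * height * fps)

# Hypothetical operating point (not from the paper):
# 1 Mbps target bitrate, 30 fps, 1920x1080 source, downsized 2x per axis.
full = bits_per_pixel(1_000_000, 1920, 1080, 30)
half = bits_per_pixel(1_000_000, 960, 540, 30)

print(f"full resolution: {full:.4f} bits/pixel")
print(f"downsized:       {half:.4f} bits/pixel")
print(f"ratio:           {half / full:.1f}x")
```

At a fixed bitrate the downsized video gets four times as many bits per pixel. Whether that outweighs the detail lost in downsizing depends on the bitrate, which is the trade-off the paper measures.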

Deep Learning Based Group Synchronization for Networked Immersive Interactions (네트워크 환경에서의 몰입형 상호작용을 위한 딥러닝 기반 그룹 동기화 기법)

  • Lee, Joong-Jae
    • KIPS Transactions on Computer and Communication Systems / v.11 no.10 / pp.373-380 / 2022
  • This paper presents a deep learning based group synchronization method that supports networked immersive interactions between remote users. The goal of group synchronization is to enable all participants to interact synchronously with one another, increasing user presence. Most previous methods focus on NTP-based clock synchronization to enhance time accuracy. Moving average filters are used to control media playout time on the synchronization server. For example, the exponentially weighted moving average (EWMA) can track and estimate an accurate playout time if the changes in input data are not significant. However, it needs more time to stabilize after changes caused by codec and system loads or fluctuations in network status. To tackle this problem, this work proposes Deep Group Synchronization (DeepGroupSync), a group synchronization method based on deep learning that models important features from the data. The model consists of two Gated Recurrent Unit (GRU) layers and one fully connected layer, and predicts an optimal playout time from the sequence of playout delays. Experiments compare an existing method that uses the EWMA with the proposed method that uses DeepGroupSync. The results show that the proposed method is more robust against unpredictable or rapid changes in network conditions than the existing method.
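
The slow settling of the EWMA baseline described in this abstract can be seen in a short simulation. This is a sketch under stated assumptions: the smoothing factor `alpha`, the delay values, and the 5 ms tolerance are hypothetical and are not taken from the paper. It illustrates only the EWMA baseline, not DeepGroupSync.

```python
def ewma(samples, alpha=0.1):
    """Exponentially weighted moving average.

    Returns the running estimate after each sample. `alpha` is a
    hypothetical smoothing factor; smaller values smooth more but react slower.
    """
    estimate = samples[0]
    out = []
    for x in samples:
        estimate = alpha * x + (1 - alpha) * estimate
        out.append(estimate)
    return out

# Hypothetical playout delays in ms: stable at 50, then an abrupt jump to 150.
delays = [50.0] * 20 + [150.0] * 40
est = ewma(delays, alpha=0.1)

# Number of samples after the jump until the estimate is within 5 ms of the new level.
lag = next(i for i, e in enumerate(est[20:], start=1) if abs(e - 150.0) <= 5.0)
print(f"samples needed to settle within 5 ms: {lag}")
```

With these values the estimate needs dozens of samples to catch up after the jump. A larger `alpha` would settle faster but would also pass more jitter through to the playout time.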

SHVC-based V-PCC Content ISOBMFF Encapsulation and DASH Configuration Method (SHVC 기반 V-PCC 콘텐츠 ISOBMFF 캡슐화 및 DASH 구성 방안)

  • Nam, Kwijung;Kim, Junsik;Kim, Kyuheon
    • Journal of Broadcast Engineering / v.27 no.4 / pp.548-560 / 2022
  • Video-based Point Cloud Compression (V-PCC) is a method for compressing point clouds that achieves high efficiency on dynamic (moving) point clouds because it compresses point cloud data with an existing video codec. Accordingly, V-PCC is drawing attention as a core technology for immersive content services such as AR/VR. To serve V-PCC content effectively through a media streaming platform, it must be encapsulated in the existing media file format, the ISO Base Media File Format (ISOBMFF). However, serving it through an adaptive streaming platform such as Dynamic Adaptive Streaming over HTTP (DASH) requires encoding V-PCC content at various qualities and storing all versions on the server. Because of the size of V-PCC content, this places a much greater burden on the encoder and the server than conventional 2D media does. One way to address this problem is to build the streaming platform on content obtained by encoding V-PCC content with SHVC. This paper therefore encapsulates the SHVC-based V-PCC bitstream into ISOBMFF in a form suitable for DASH service, proposes a DASH configuration method for serving it, and confirms both through verification experiments.

An Embedding/Extracting Method of Audio Watermark Information for High Quality Stereo Music (고품질 스테레오 음악을 위한 오디오 워터마크 정보 삽입/추출 기술)

  • Bae, Kyungyul
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.21-35 / 2018
  • Since the introduction of MP3 players, CD recordings have gradually been disappearing, and the music consumption environment is shifting to mobile devices. Smart devices have increased the use of music through the playback, mass storage, and search functions integrated into smartphones and tablets. When MP3 players were first introduced, compressed music content generally had a bitrate of 128 Kbps. As demand for high-quality music grew, content at 384 Kbps appeared. Recently, music in the FLAC (Free Lossless Audio Codec) format, which uses lossless compression, has become popular. The download services of many music sites in Korea are divided into unlimited download with technical protection and limited download without technical protection. Digital Rights Management (DRM) technology is used as the technical protection measure for unlimited download, but DRM-protected music can only be played on authenticated devices that have DRM installed. Even music that the user has purchased cannot be used on other devices. Conversely, for music that is limited in quantity but not technically protected, there is no way to take action against anyone who redistributes it, and for high-quality music such as FLAC the loss is greater. In this paper, the author proposes an audio watermarking technology for copyright protection of high-quality stereo music. Two kinds of information, "Copyright" and "Copy_free", are generated using the turbo code. Each of the two watermarks is 9 bytes (72 bits) long. When turbo code is applied for error correction, the amount of information to be inserted increases to 222 bits. The 222-bit watermark is then expanded to 1024 bits for robustness against additional errors, and this is the watermark finally inserted into the stereo music.
Turbo code can recover the original data when less than 15% of the code is damaged, for example by an attack on the watermarked content. Because the watermark is expanded to 1024 bits, and the 222 bits can be recovered with higher probability from partially damaged content, the watermark itself is more resistant to attack. The proposed algorithm uses quantization in the DCT domain so that the watermark can be detected efficiently and the SNR is improved when stereo music is converted into mono. As a result, the average SNR exceeded 40 dB, an improvement in sound quality of more than 10 dB over traditional quantization methods. This is a significant result, because a 10 dB gain corresponds to a tenfold improvement in the signal-to-noise power ratio. In addition, a sample shorter than one second is sufficient for extraction: the watermark was completely extracted from music samples of less than one second in all cases of MP3 compression at a bit rate of 128 Kbps. By contrast, the conventional quantization method largely fails to extract the watermark even from 10-second samples, so the proposed method needs only about one tenth of the sample length. Since the watermark embedded into the music is 72 bits long, it provides sufficient capacity to embed the necessary information, and enough bits to identify music distributed all over the world: $2^{72}$ can distinguish more than $4 \times 10^{21}$ items, so it can serve as an identifier and be used for copyright protection in high-quality music services. The proposed algorithm can be used not only for high-quality audio but also for developing watermarking algorithms for multimedia such as UHD (Ultra High Definition) TV and high-resolution images. In addition, with the development of digital devices, users are demanding high-quality music, and artificial intelligence assistants are arriving together with high-quality music and streaming services.
The results of this study can be used to protect the rights of copyright holders in these industries.
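
The numeric claims in this abstract (identifier capacity, payload expansion, and the meaning of a 10 dB gain) can be checked with a few lines of arithmetic. This is only a sanity-check sketch: the bit counts come from the abstract, while the variable names and the interpretation of the ratios are assumptions made here.

```python
payload_bits = 72      # 9-byte watermark (from the abstract)
coded_bits = 222       # length after turbo coding (from the abstract)
embedded_bits = 1024   # length after expansion (from the abstract)

# Number of distinct values a 72-bit identifier can take.
identifiers = 2 ** payload_bits
print(f"distinct identifiers: {identifiers:.3e}")

# Ratios between the stages; naming these "rate" and "expansion" is an assumption.
print(f"payload / coded bits: {payload_bits / coded_bits:.3f}")
print(f"overall expansion:    {embedded_bits / payload_bits:.1f}x")

# A 10 dB SNR gain expressed as a power ratio and as an amplitude ratio.
gain_db = 10.0
power_ratio = 10 ** (gain_db / 10)
amplitude_ratio = 10 ** (gain_db / 20)
print(f"power ratio:     {power_ratio:.1f}")
print(f"amplitude ratio: {amplitude_ratio:.2f}")
```

The tenfold figure refers to signal-to-noise power. The corresponding amplitude ratio is about 3.16, so "10 times" should be read in the power sense.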