• Title/Summary/Keyword: Bitrate

Search Result 246, Processing Time 0.02 seconds

An Intra Prediction Method and Fast Intra Prediction Method in Inter Frames using Block Content and Dependency Probabilities on neighboring Block Modes in H.264|AVC (영상 내용 특성과 주위 블록 모드 상관성을 이용한 H.264|AVC 화면 간 프레임에서의 화면 내 예측 부호화 결정 방법과 화면 내 예측 고속화 방법)

  • Na, Tae-Young;Lee, Bum-Shik;Hahm, Sang-Jin;Park, Chang-Seob;Park, Keun-Soo;Kim, Mun-Churl
    • Journal of Broadcast Engineering
    • /
    • v.12 no.6
    • /
    • pp.611-623
    • /
    • 2007
  • The H.264|AVC standard incorporates an intra prediction tool into inter frame coding. However, this leads to excessive amount of increase in encoding time, thus resulting in the difficulty in real-time implementation of software encoders. In this paper, we first propose an early decision on intra prediction coding and a fast intra prediction method using the characteristics of block contents and the context of neighboring block modes for the intra prediction in the inter frame coding of H.264/AVC. Basically, the proposed methods determine a skip condition on whether the $4{\times}4$ intra prediction is to be used in the inter frame coding by considering the content characteristics of each block to be encoded and the context of its neighboring blocks. The performance of our proposed methods is compared with the Joint Model reference software version 11.0 of H.264|AVC. The experimental results show that our proposed methods allow for 41.63% reduction in the total encoding time with negligible amounts of PSNR drops and bitrate increases, compared to the original Joint Model reference software version 11.0.

A H.264 based Selective Fine Granular Scalable Coding Scheme (H.264 기반 선택적인 미세입자 스케일러블 코딩 방법)

  • 박광훈;유원혁;김규헌
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.10 no.4
    • /
    • pp.309-318
    • /
    • 2004
  • This paper proposes the H.264-based selective fine granular scalable (FGS) coding scheme that selectively uses the temporal prediction data in the enhancement layer. The base layer of the proposed scheme is basically coded by the H.264 (MPEG-4 Part 10 AVC) visual coding scheme that is the state-of-art in codig efficiency. The enhancement layer is basically coded by the same bitplane-based algorithm of the MPEG-4 (Part 2) fine granular scalable coding scheme. In this paper, we introduce a new algorithm that uses the temproal prediction mechanism inside the enhancement layer and the effective selection mechanism to decide whether the temporally-predicted data would be sent to the decoder or not. Whenever applying the temporal prediction inside the enhancement layer, the temporal redundancies may be effectively reduced, however the drift problem would be severly occurred especially at the low bitrate transmission, due to the mismatch bewteen the encoder's and decoder's reference frame images. Proposed algorithm selectively uses the temporal-prediction data inside the enhancement layer only in case those data could siginificantly reduce the temporal redundancies, to minimize the drift error and thus to improve the overall coding efficiency. Simulation results, based on several test image sequences, show that the proposed scheme has 1∼3 dB higher coding efficiency than the H.264-based FGS coding scheme, even 3∼5 dB higher coding efficiency than the MPEG-4 FGS international standard.

Performance Analysis of Super-Resolution based Video Coding for HEVC (HEVC 기반 초해상화를 이용한 비디오 부호화 효율 성능 분석)

  • Ki, Sehwan;Kim, Dae-Eun;Jun, Ki Nam;Baek, Seung Ho;Choi, Jeung Won;Kim, Dong Hyun;Kim, Munchurl
    • Journal of Broadcast Engineering
    • /
    • v.24 no.2
    • /
    • pp.306-314
    • /
    • 2019
  • Since the resolutions of videos increase rapidly, there are continuing needs for effective video compression methods despite an increase in the transmission bandwidth. In order to satisfy such a demand, a reconstructive video coding (RVC) method by using a super resolution has been proposed. Since RVC reduces the resolution of the input video, when frames are compressed to the same size, the number of bits per pixel increases, thereby reducing coding artifacts caused by video coding. However, RVC method using super resolution is not effective in all target bitrates. Comparing the size of the loss generated while downsizing the resolution and the size of the loss caused by the video compression, only when the size of loss generated in the video compression is larger, RVC method can perform the improved compression performance compared to direct video coding. In particular, since HEVC has considerably higher compression performance than the previous standard video codec, it can be experimentally confirmed that the compression distortions become larger than the distortions of downsizing the resolution only in the very low-bitrate conditions. In this paper, we applied RVC based HEVC in various video types and measured the target bitrates that RVC method can be effectively applied.

Latent Shifting and Compensation for Learned Video Compression (신경망 기반 비디오 압축을 위한 레이턴트 정보의 방향 이동 및 보상)

  • Kim, Yeongwoong;Kim, Donghyun;Jeong, Se Yoon;Choi, Jin Soo;Kim, Hui Yong
    • Journal of Broadcast Engineering
    • /
    • v.27 no.1
    • /
    • pp.31-43
    • /
    • 2022
  • Traditional video compression has developed so far based on hybrid compression methods through motion prediction, residual coding, and quantization. With the rapid development of technology through artificial neural networks in recent years, research on image compression and video compression based on artificial neural networks is also progressing rapidly, showing competitiveness compared to the performance of traditional video compression codecs. In this paper, a new method capable of improving the performance of such an artificial neural network-based video compression model is presented. Basically, we take the rate-distortion optimization method using the auto-encoder and entropy model adopted by the existing learned video compression model and shifts some components of the latent information that are difficult for entropy model to estimate when transmitting compressed latent representation to the decoder side from the encoder side, and finally compensates the distortion of lost information. In this way, the existing neural network based video compression framework, MFVC (Motion Free Video Compression) is improved and the BDBR (Bjøntegaard Delta-Rate) calculated based on H.264 is nearly twice the amount of bits (-27%) of MFVC (-14%). The proposed method has the advantage of being widely applicable to neural network based image or video compression technologies, not only to MFVC, but also to models using latent information and entropy model.

An Embedding /Extracting Method of Audio Watermark Information for High Quality Stereo Music (고품질 스테레오 음악을 위한 오디오 워터마크 정보 삽입/추출 기술)

  • Bae, Kyungyul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.21-35
    • /
    • 2018
  • Since the introduction of MP3 players, CD recordings have gradually been vanishing, and the music consuming environment of music users is shifting to mobile devices. The introduction of smart devices has increased the utilization of music through music playback, mass storage, and search functions that are integrated into smartphones and tablets. At the time of initial MP3 player supply, the bitrate of the compressed music contents generally was 128 Kbps. However, as increasing of the demand for high quality music, sound quality of 384 Kbps appeared. Recently, music content of FLAC (Free License Audio Codec) format using lossless compression method is becoming popular. The download service of many music sites in Korea has classified by unlimited download with technical protection and limited download without technical protection. Digital Rights Management (DRM) technology is used as a technical protection measure for unlimited download, but it can only be used with authenticated devices that have DRM installed. Even if music purchased by the user, it cannot be used by other devices. On the contrary, in the case of music that is limited in quantity but not technically protected, there is no way to enforce anyone who distributes it, and in the case of high quality music such as FLAC, the loss is greater. In this paper, the author proposes an audio watermarking technology for copyright protection of high quality stereo music. Two kinds of information, "Copyright" and "Copy_free", are generated by using the turbo code. The two watermarks are composed of 9 bytes (72 bits). If turbo code is applied for error correction, the amount of information to be inserted as 222 bits increases. The 222-bit watermark was expanded to 1024 bits to be robust against additional errors and finally used as a watermark to insert into stereo music. Turbo code is a way to recover raw data if the damaged amount is less than 15% even if part of the code is damaged due to attack of watermarked content. It can be extended to 1024 bits or it can find 222 bits from some damaged contents by increasing the probability, the watermark itself has made it more resistant to attack. The proposed algorithm uses quantization in DCT so that watermark can be detected efficiently and SNR can be improved when stereo music is converted into mono. As a result, on average SNR exceeded 40dB, resulting in sound quality improvements of over 10dB over traditional quantization methods. This is a very significant result because it means relatively 10 times improvement in sound quality. In addition, the sample length required for extracting the watermark can be extracted sufficiently if the length is shorter than 1 second, and the watermark can be completely extracted from music samples of less than one second in all of the MP3 compression having a bit rate of 128 Kbps. The conventional quantization method can extract the watermark with a length of only 1/10 compared to the case where the sampling of the 10-second length largely fails to extract the watermark. In this study, since the length of the watermark embedded into music is 72 bits, it provides sufficient capacity to embed necessary information for music. It is enough bits to identify the music distributed all over the world. 272 can identify $4*10^{21}$, so it can be used as an identifier and it can be used for copyright protection of high quality music service. The proposed algorithm can be used not only for high quality audio but also for development of watermarking algorithm in multimedia such as UHD (Ultra High Definition) TV and high-resolution image. In addition, with the development of digital devices, users are demanding high quality music in the music industry, and artificial intelligence assistant is coming along with high quality music and streaming service. The results of this study can be used to protect the rights of copyright holders in these industries.

A Fast 4X4 Intra Prediction Method using Motion Vector Information and Statistical Mode Correlation between 16X16 and 4X4 Intra Prediction In H.264|MPEG-4 AVC (H.264|MPEG-4 AVC 비디오 부호화에서 움직임 벡터 정보와 16~16 및 4X4 화면 내 예측 최종 모드간 통계적 연관성을 이용한 화면 간 프레임에서의 4X4 화면 내 예측 고속화 방법)

  • Na, Tae-Young;Jung, Yun-Sik;Kim, Mun-Churl;Hahm, Sang-Jin;Park, Chang-Seob;Park, Keun-Soo
    • Journal of Broadcast Engineering
    • /
    • v.13 no.2
    • /
    • pp.200-213
    • /
    • 2008
  • H.264| MPEG-4 AVC is a new video codingstandard defined by JVT (Joint Video Team) which consists of ITU-T and ISO/IEC. Many techniques are adopted fur the compression efficiency: Especially, an intra prediction in an inter frame is one example but it leads to excessive amount of encoding time due to the decision of a candidate mode and a RDcost calculation. For this reason, a fast determination of the best intra prediction mode is the main issue for saving the encoding time. In this paper, by using the result of statistical relation between intra $16{\times}16$ and $4{\times}4$ intra predictions, the number of candidate modes for $4{\times}4$ intra prediction is reduced. Firstly, utilizing motion vector obtained after inter prediction, prediction of a block mode for each macroblock is made. If an intra prediction is needed, the correlation table between $16{\times}16$ and $4{\times}4$ intra predicted modes is created using the probability during each I frame-coding process. Secondly, using this result, the candidate modes for a $4{\times}4$ intra prediction that reaches a predefined specific probability value are only considered in the same GOP For the experiments, JM11.0, the reference software of H.264|MPEG-4 AVC is used and the experimental results show that the encoding time could be reduced by 51.24% in maximum with negligible amounts of PSNR drop and bitrate increase.