Ⅰ. Introduction
The Joint Collaborative Team on Video Coding (JCT-VC), formed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), developed the newest video compression standard, High Efficiency Video Coding (HEVC)/H.265[1]. HEVC aims to reduce the bit rate by 50% compared with the previous video coding standard, H.264/AVC, in order to support high-quality video services such as ultra-high-definition television (UHDTV).
In HEVC, a picture is divided into square large coding units (LCUs), each of which corresponds to a coding unit (CU) at depth 0. Using a quad-tree structure, each LCU is recursively split into four CUs, with the depth increasing by one at each split, as shown in Fig. 1. A CU can range in size from 8x8 to 64x64, and its maximum depth (for luma) is 3. Prediction units (PUs) are the units of the prediction process, i.e., intra and inter prediction. There are eight ways of splitting a CU into PUs: a PU can be the same size as the CU, or the CU can be partitioned into two or four square or rectangular PUs. For intra prediction, an 8x8 CU can be split in two ways, as a single 8x8 PU or as four 4x4 PUs; any larger CU has only one option, a single PU of the same size as the CU[2]. HEVC intra prediction has 35 prediction modes of three types: 33 angular prediction modes, the planar prediction mode, and the DC prediction mode.
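The relation between quad-tree depth and CU size described above can be illustrated with a minimal sketch. The constant and function names are illustrative assumptions, not identifiers from the HEVC reference software.

```python
LCU_SIZE = 64   # luma samples per side at depth 0
MAX_DEPTH = 3   # deepest quad-tree level for luma


def cu_size(depth: int) -> int:
    """Side length (in luma samples) of a CU at the given quad-tree depth."""
    if not 0 <= depth <= MAX_DEPTH:
        raise ValueError("depth must be between 0 and 3")
    # Each split halves the width and the height.
    return LCU_SIZE >> depth


def cus_per_lcu(depth: int) -> int:
    """Number of CUs covering one LCU if every CU is split down to `depth`."""
    return 4 ** depth


if __name__ == "__main__":
    for d in range(MAX_DEPTH + 1):
        print(f"depth {d}: {cu_size(d)}x{cu_size(d)}, {cus_per_lcu(d)} CU(s) per LCU")
```

Each additional depth level quadruples the number of CUs in the LCU, which is why the choice of depth has such a direct effect on both bit cost and search time.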
Fig. 1. Splitting a 64x64 LCU into CUs of 32x32 to 8x8 luma samples using a quad-tree structure
HEVC selects the best prediction mode by evaluating all possible CU sizes and all 35 intra prediction modes to find the lowest rate-distortion cost (RD cost). This exhaustive search entails high computational complexity. Many fast intra prediction methods have recently been proposed to achieve significant time savings with little loss in coding performance. Some of them are fast algorithms for HEVC intra prediction in general[3]-[10]: a fast intra mode decision that uses information from previous PUs and transform units (TUs) based on the hierarchical structure[3]; a method that uses a gradient-based histogram to narrow the search for the minimum-cost intra prediction mode in the rate-distortion optimization (RDO) process[4]; a method that measures the total gradient of the coding tree unit (CTU) in four directions and tests only the modes adjacent to the optimum direction, excluding modes in implausible directions[5]; an intra prediction mode decision using edges in the CU[6]; a CU splitting and pruning method using Bayesian selection[7]; an early skip mode decision that, depending on the RD cost of Merge mode, omits checking the remaining modes[8]; a fast quantization method that chooses only the optimum quantization level in the RDO process[9]; and a fast intra mode decision that uses edge directions and the modes of neighboring CUs[10]. Other fast algorithms for HEVC intra prediction focus specifically on fast CU size decision[11]-[13]: a fast CU size decision based on texture homogeneity and a bypass strategy[11]; a fast CU size decision that skips specific depths based on the RD cost and the intra prediction mode correlations among spatially nearby CUs[12]; and a fast CU depth decision using keypoints based on blob detection[13].
Selecting the right depth for a CU is important because it directly affects the number of bits: each time the depth increases by one, a CU is split into four smaller CUs, each half the width and height, and more bits are needed as the number of CUs grows. From the point of view of bit cost, a smaller CU depth is therefore more advantageous. However, the distortion between the original and reconstructed sequences also depends on CU depth: if only the smallest CU depth is selected, it is hard to predict a block accurately from the reference samples. This paper proposes a fast CU mode decision algorithm that uses the FAST (Features from Accelerated Segment Test) corner detector to estimate the homogeneity of the current CU and decide the minimum CU depth.
The rest of this paper is organized as follows. Section Ⅱ introduces FAST corner detection and the proposed fast CU depth decision method with its related thresholds. Section Ⅲ presents the simulation results and the performance of the proposed method. Finally, a brief conclusion is given in Section Ⅳ.
Ⅱ. Proposed Fast CU Depth Decision
1. FAST Corner Detection
FAST (Features from Accelerated Segment Test) is a corner detection method that extracts features at high speed[14]. In FAST, whether a pixel is a corner is decided by examining a circle of 16 pixels surrounding it. Specifically, in Fig. 2, a pixel p is classified as a corner if a sufficiently long run of contiguous pixels in the circle (12 in the original detector[14]) are all brighter than Ip + Th or all darker than Ip - Th, where Ip is the intensity of the selected pixel and Th is a threshold value. A high-speed test, used to exclude a large number of non-corners, examines only the four pixels at positions 1, 9, 5 and 13: pixels 1 and 9 are tested first to see whether they are brighter than Ip + Th or darker than Ip - Th; if so, pixels 5 and 13 are then checked. p can be a corner only if at least three of these four pixels are all brighter than Ip + Th or all darker than Ip - Th; otherwise, p cannot be a corner. FAST thus detects feature points rapidly and with high performance. As shown in Fig. 3, a CU tends to have a larger depth if it contains many feature points and a smaller depth otherwise. Therefore, if the amount of high-frequency content in an LCU is measured using FAST corner detection, its depth can be decided.
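The segment test and its high-speed pre-check can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: the run length `n = 12` follows the original FAST detector, the high-speed test checks all four compass pixels at once instead of in two steps, and the function names and the list-of-lists image layout are hypothetical.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, numbered 1..16
# clockwise from the top, so positions 1, 5, 9, 13 are top, right, bottom, left.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, th, n=12):
    """Segment test for pixel (x, y) of `img`, a 2-D list of intensities.

    `n` is the minimum run length (an assumption; 12 in the original FAST).
    The caller must keep (x, y) at least 3 pixels away from the border.
    """
    ip = img[y][x]
    # Classify each circle pixel: +1 brighter, -1 darker, 0 similar.
    state = []
    for dx, dy in CIRCLE:
        v = img[y + dy][x + dx]
        state.append(1 if v > ip + th else (-1 if v < ip - th else 0))

    # High-speed test on positions 1, 9, 5, 13; only valid when n >= 12.
    if n >= 12:
        quick = [state[i] for i in (0, 8, 4, 12)]
        if quick.count(1) < 3 and quick.count(-1) < 3:
            return False

    # Full test: look for n contiguous pixels of one kind, wrapping around.
    for sign in (1, -1):
        run = 0
        for s in state + state[:n - 1]:
            run = run + 1 if s == sign else 0
            if run >= n:
                return True
    return False


def count_feature_points(img, th, n=12):
    """Count corners in `img`, skipping the 3-pixel border."""
    h, w = len(img), len(img[0])
    return sum(is_fast_corner(img, x, y, th, n)
               for y in range(3, h - 3) for x in range(3, w - 3))
```

In this sketch a flat block yields zero feature points, while a block with many sharp intensity transitions yields many, which is the property the depth decision relies on.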
Fig. 2. FAST corner detection (the white dashed line indicates the contiguous pixels that meet the specified conditions)
Fig. 3. The relationship between the number of feature points and CU splitting (colored circles indicate feature points)
2. Fast CU Depth Decision
In HEVC, intra prediction evaluates all CU sizes from 8x8 to 64x64 and 35 intra prediction modes, which has a critical impact on the encoding time. To reduce the encoding time, the proposed fast CU depth decision (FCDD) targets the prediction process. As shown in Fig. 4, the proposed algorithm has two stages: the counting feature points stage (CFS) and the depth decision stage (DDS). First, in the CFS, all feature points in an LCU are detected using FAST corner detection, whose threshold value plays an important role in determining whether a pixel is a feature point. However, FAST operates on the original pixel values when deciding CU depths, so its output is independent of the QP value. In HEVC, on the other hand, the QP value determines the quality of the reconstructed picture, so the number of feature points in the same LCU of the reconstructed picture changes when the QP value changes. We therefore make the threshold value adaptive so that it correlates with the quality of the reconstructed picture. To do so, we counted the feature points in reconstructed pictures for different QP values and different threshold values, checked how many feature points of the original picture disappear in the reconstructed picture, and experimentally identified the trend relating the threshold value to the QP value. The resulting relation between the threshold and the QP value is as follows:
Fig. 4. Flowchart of the proposed FCDD algorithm
The CFS counts the pixels classified as feature points in order to assess texture homogeneity. A large number of feature points means that the current LCU contains a large amount of high-frequency energy. The homogeneity of an LCU is related to its depth: if the texture of the current LCU is complex, a larger depth (smaller CUs) is needed to maintain intra prediction performance; if it is homogeneous, a smaller depth (larger CUs) suffices. Therefore, the degree of partitioning, i.e., the depth, can be decided by counting feature points. We examined not only the number of feature points but also how they are dispersed within the current CU. The distribution of feature points is characterized by the variance of the distances between the feature points and their centre point, which is computed as the average of the feature point positions, as follows:
where Cenx and Ceny are the x and y positions of the centre point, px and py are those of a feature point, N is the number of feature points, and σ² is the variance of the distances between the feature points and their centre point. Table 1 shows two data sets for 32x32 CUs with QP=22 in HEVC. In the first case (first and second rows), two CUs with the same number of feature points and similar variances differ in whether they are split. In the second case (third to fifth rows), three CUs with the same number of feature points but different variances also differ in whether they are split; that is, there is no clear relation between the variance and the split decision. We therefore consider only the number of feature points when determining the CU depth.
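The spread measure defined by these symbols can be sketched as follows. The sketch takes the verbal definition literally: the centre is the mean of the feature point positions, and σ² is the population variance of the Euclidean distances to that centre. The paper's exact formula may differ (for example, it may use the mean squared distance instead), and the function name is an illustrative assumption.

```python
from math import hypot


def feature_point_spread(points):
    """Return ((cen_x, cen_y), variance) for a list of (x, y) feature points.

    Assumption: the variance is the population variance of the Euclidean
    distances between each feature point and the centre point.
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one feature point is required")
    # Centre point: average of the feature point positions.
    cen_x = sum(x for x, _ in points) / n
    cen_y = sum(y for _, y in points) / n
    # Distance of each feature point from the centre.
    dists = [hypot(x - cen_x, y - cen_y) for x, y in points]
    mean_d = sum(dists) / n
    var = sum((d - mean_d) ** 2 for d in dists) / n
    return (cen_x, cen_y), var
```

Under this reading, points arranged symmetrically around their centre give a variance near zero regardless of how far apart they are, which may be one reason the measure does not separate split from non-split CUs well.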
Table 1. Examples of differing split decisions for CUs with the same number of feature points and similar variance values
The depth decision stage (DDS) determines the depth of the current LCU using three depth thresholds: the first (TH01) decides whether a 64x64 CU (depth 0) is split; the second (TH12) decides whether a 32x32 CU (depth 1) is split; and the last (TH23) decides whether a 16x16 CU (depth 2) is split. To set them, we examined the relation between the number of feature points in the current CU and its likelihood of being split.
As shown in Fig. 5, the depth 0-to-1 and depth 1-to-2 decisions show similar tendencies in the graph: the current CU is not split when the number of feature points is zero, or near zero at high QP values, and is almost always split otherwise. Where the non-split and split cases overlap by less than 30%, we select the more frequent case. For the depth 2-to-3 decision, it is hard to determine TH23 because the two cases overlap by more than 30% except when the number of feature points is zero. Therefore, when the number of feature points exceeds TH23, we fall back on the conventional method, the rate-distortion optimization (RDO) process, to decide the depth of the current CU. To find the three depth thresholds, we used two Class C sequences, 10 frames each of “BQMall” and “BasketballDrill”. The resulting depth threshold values are given in Table 2.
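The decision logic of the DDS can be sketched as below. The threshold values in `EXAMPLE_THRESHOLDS` are placeholders invented for illustration (the actual QP-dependent values are those of Table 2), the function name is hypothetical, and the sketch assumes that a count at or below the threshold means "do not split".

```python
from typing import Dict

# Hypothetical threshold values for illustration only; the real ones
# depend on QP and are listed in Table 2.
EXAMPLE_THRESHOLDS: Dict[str, int] = {"TH01": 4, "TH12": 2, "TH23": 0}


def split_decision(depth: int, num_points: int, th: Dict[str, int]) -> str:
    """Decide what to do with a CU at `depth` (0, 1 or 2).

    Returns "no_split", "split", or "rdo" (fall back to the regular
    rate-distortion search). Assumes a CU is kept whole when its feature
    point count does not exceed the threshold for its depth.
    """
    key = {0: "TH01", 1: "TH12", 2: "TH23"}[depth]
    if num_points <= th[key]:
        return "no_split"
    # Depth 2 to 3: split and non-split cases overlap too much to decide
    # from the count alone, so the encoder's RDO process makes the call.
    return "rdo" if depth == 2 else "split"
```

For example, with the placeholder thresholds, `split_decision(0, 10, EXAMPLE_THRESHOLDS)` returns `"split"`, while `split_decision(2, 5, EXAMPLE_THRESHOLDS)` returns `"rdo"`, so only 16x16 CUs with feature points still pay for the full RDO search.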
Fig. 5. The relation between the number of feature points in the current CU and whether it is split
Table 2. Depth thresholds according to QP
Ⅲ. Experimental Results
The proposed FCDD algorithm was implemented in the HEVC reference software (HM 16.7), and its performance was compared with that of the original reference encoder as the anchor. The simulation platform was an Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz with four cores and 8.00 GB of RAM. The All-Intra Main configuration was used for encoding. The simulation conditions were as follows: 100 frames, with QPs set to 22, 27, 32 and 37. Coding efficiency is measured using BDPSNR and BDBR[15], and the reduction in computational complexity is measured by the encoding time saving, as follows:
Table 3 shows the performance of the proposed FCDD algorithm, which reduces the encoding time by 53.73% on average with a coding efficiency loss of 0.7% in BDBR. This is because the proposed method uses the number of feature points obtained by corner detection to skip the RDO process required for the LCU splitting decision. The loss of coding efficiency is small for a fast intra prediction method, with a minimum BDBR of 0.0% on “ParkScene” and a maximum BDBR of 1.7% on “BasketballDrive”. However, although FCDD does not require the full RDO process, it still needs additional computation for FAST corner detection, which limits the gain; the minimum time saving achieved was 47.6%.
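As a minimal sketch of the time-saving measure, assuming the commonly used definition in which the saving is the anchor's encoding time minus the proposed encoder's time, expressed as a percentage of the anchor's time. The function and parameter names are illustrative.

```python
def time_saving(t_anchor: float, t_proposed: float) -> float:
    """Encoding time saving in percent relative to the anchor encoder.

    Assumed definition: (T_anchor - T_proposed) / T_anchor * 100.
    """
    if t_anchor <= 0:
        raise ValueError("anchor time must be positive")
    return (t_anchor - t_proposed) / t_anchor * 100.0
```

With this definition, an encoder that takes half as long as the anchor has a time saving of 50%, and a negative value would indicate that the proposed encoder is slower.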
Table 3. Performance of the proposed FCDD algorithm compared with the HM 16.0 encoder (anchor)
Ⅳ. Conclusion
In this paper, we proposed a fast CU mode decision algorithm for HEVC intra prediction that uses FAST corner detection to determine the range of CU depths within each LCU. Experimental results showed that the proposed method reduces the encoder's computational time by 53.73% on average while maintaining coding performance, with a BDBR loss of 0.7%. Such fast algorithms are needed to overcome high computational complexity with only a small loss of coding efficiency. The proposed algorithm can therefore help retain coding performance and alleviate the growth in computation required to support higher-definition video.
References
- G. J. Sullivan and J. R. Ohm, “Recent developments in standardization of high efficiency video coding (HEVC),” Proc. SPIE, vol. 7798, p. 77980V, Sep. 2010.
- I.-K. Kim, J. Min, T. Lee, W.-J. Han, and J. Park, “Block partitioning structure in the HEVC standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1697–1706, Dec. 2012. https://doi.org/10.1109/TCSVT.2012.2223011
- J. Kim, J. Y. Yang, H. Y. Lee, and B. Jeon, “Fast Intra Mode Decision of HEVC based on Hierarchical Structure,” 8th International Conference on Information, Communication and Signal Processing, pp. 1-4, Dec. 2011.
- W. Jiang, H. J. Ma, and Y. W. Chen, “Gradient Based Fast Mode Decision Algorithm for Intra Prediction in HEVC,” Proc. 2nd International Conference on Consumer Electronics, Communications and Networks, pp. 1836-1840, Apr. 2012.
- Y. Zhang, Z. Li, and B. Li, “Gradient-based fast decision for intra prediction in HEVC,” Proc. IEEE Vis. Commun. Image Process. (VCIP), pp. 1–6, Nov. 2012.
- G. Chen, Z. Liu, T. Ikenaga, and D. Wang, “Fast HEVC intra mode decision using matching edge detector and kernel density estimation alike histogram generation,” Proc. IEEE ISCAS, pp. 53–56, May 2013.
- S. Cho and M. Kim, “Fast CU splitting and pruning for suboptimal CU partitioning in HEVC intra coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 23, no. 9, pp. 1555–1563, Sep. 2013. https://doi.org/10.1109/TCSVT.2013.2249017
- H. Lee, H. J. Shim, Y. Park, and B. Jeon, “Early Skip Mode Decision for Fast HEVC Encoder,” IEEE Trans. Broadcasting, vol. 61, no. 3, pp. 388–397, Sep. 2015. https://doi.org/10.1109/TBC.2015.2419172
- H. Lee, S. Yang, Y. Park, and B. Jeon, “Fast Quantization Method with Simplified Rate-Distortion Optimized Quantization for HEVC Encoder,” IEEE Trans. Circuits Syst. Video Technol. (Special issue on HEVC Implementation), vol. 26, no. 1, pp. 107–116, Jan. 2016. https://doi.org/10.1109/TCSVT.2015.2450151
- F. Pan et al., “Fast mode decision algorithm for intra prediction in H.264/AVC video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 813–822, Jul. 2005. https://doi.org/10.1109/TCSVT.2005.848356
- L. Shen, Z. Zhang, and Z. Liu, “Effective CU size decision for HEVC intracoding,” IEEE Trans. Image Process., vol. 23, no. 10, pp. 4232–4241, 2014. https://doi.org/10.1109/TIP.2014.2341927
- L. Shen, Z. Zhang, and P. An, “Fast CU size decision and mode decision algorithm for HEVC intra coding,” IEEE Trans. Consum. Electron., vol. 59, no. 1, pp. 207–213, Feb. 2013. https://doi.org/10.1109/TCE.2013.6490261
- N. Kim, S.-C. Kim, H. Ko, and B. Jeon, “Keypoint-based Fast CU Depth Decision for HEVC Intra Coding,” Journal of The Institute of Electronics and Information Engineers, vol. 53, no. 2, Feb. 2016.
- E. Rosten and T. Drummond, “Machine learning for high-speed corner detection,” Computer Vision-ECCV, vol. 3951, pp. 430–443, 2006.
- G. Bjontegaard, “Calculation of average PSNR difference between RD-curves,” 13th VCEG Meeting, VCEG-M33, Austin, TX, USA, Apr. 2001.