Abstract
In this paper, we propose a parallel intra-operation unit and a memory architecture for improving the performance of intra-prediction, which utilizes spatial correlation in an image to predict the blocks and contains 17 prediction modes in total. The design is targeted for portable devices applying H.264/AVC decoders. For boosting the performance of the proposed design, we adopt a parallel intra-operation unit that can achieve the prediction of 16 neighboring pixels at the same time. In the best case, it can achieve the computation of one luma $16{\times}16$ block within 16 cycles. For one luma $4{\times}4$ block, a mere one cycle is needed to finish the process of computation. Compared with the previous designs, the average cycle reduction rate is 78.01%, and the gate count is slightly reduced. The design is synthesized with the MagnaChip $0.18{mu}m$ library and can run at 125 MHz.