An arry circuit is designed for parallel computation of vector-radix 2-D discrete cosine transform (VR-FCT) which is a fast algorithm of DCT. By using a 2-D array of processing elements (PEs), the butterfly structure of the VR-FCT can be efficiently implemented with high condurrency and local communication geometry. The proposed implementation features architectural medularity, regularity and locality, so that it is very suitable for VLSI realization. Also, no transposition memory is required. The array core for (8$\times$8) 2-D DCT, which is designed usign ISRC 1.5.mu.m N-Well CMOS technology, consists of 64 PEs arranged in (8$\times$8) 2-D array and contains about 98,000 transistors on an area of 138mm$^{2}$. From simulation results, it is estimated that (8$\times$8) 2-D DCT can be computed in about 0.88 .mu.sec at 50 MHz clock frequency, resulting in the throughput rate of about 72${\times}10^[6}$ pixels per second.