http://dx.doi.org/10.3745/KIPSTA.2004.11A.7.489

Design and Performance Evaluation of Expansion Buffer Cache  

Hong Won-Kee (School of Information and Communication Engineering, Daegu University)
Abstract
The VLIW processor is considered well suited to embedded systems because its simple hardware structure provides high performance at low power consumption. However, a VLIW processor often suffers from high memory access latency caused by the variable length of I-packets, each of which consists of independent instructions to be issued in parallel. Because I-packet length varies, some I-packets, called straddle I-packets, are placed across two cache blocks, so two cache accesses are required to fetch them. This paper proposes an expansion buffer cache that improves both the instruction fetch bandwidth and the power consumption of the I-cache at moderate hardware cost. Alongside the main cache, the expansion buffer cache has a small expansion buffer that holds a fraction of a straddle packet, which reduces the additional cache accesses caused by straddle I-packets. By greatly reducing these accesses, the expansion buffer cache achieves a $5 \sim 9\%$ improvement over conventional I-caches in the $\text{Delay} \cdot \text{Power} \cdot \text{Area}$ metric.
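The straddle effect described above can be illustrated with a minimal sketch. It is not the paper's simulator: the function names, the 32-byte block size, and the packet lengths are assumptions chosen for illustration, and packets are assumed to be laid out back to back and to be no longer than one cache block.

```python
def blocks_touched(start: int, length: int, block_size: int) -> int:
    """Number of cache blocks spanned by a packet of `length` bytes at byte address `start`."""
    first_block = start // block_size
    last_block = (start + length - 1) // block_size
    return last_block - first_block + 1


def count_accesses(packet_lengths, block_size: int, start: int = 0):
    """Return (cache accesses, straddle packets) for packets stored contiguously.

    Baseline model (assumption): one cache access per block a packet touches,
    so a straddle I-packet costs two accesses.
    """
    addr = start
    accesses = 0
    straddles = 0
    for length in packet_lengths:
        n = blocks_touched(addr, length, block_size)
        accesses += n
        if n > 1:
            straddles += 1
        addr += length
    return accesses, straddles


# Hypothetical I-packet lengths in bytes, fetched from 32-byte cache blocks.
accesses, straddles = count_accesses([8, 16, 12, 4, 24], block_size=32)
print(accesses, straddles)
```

In this model the number of straddle packets is an upper bound on the accesses that a buffer holding the spilled part of each straddle packet could save; how close a real expansion buffer gets to that bound depends on its size and hit rate, which this sketch does not model.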
Keywords
VLIW; Embedded System; Instruction Cache; Instruction Fetch Bandwidth; Power Consumption