http://dx.doi.org/10.9708/jksci.2016.21.9.001

Instruction Flow based Early Way Determination Technique for Low-power L1 Instruction Cache  

Kim, Gwang Bok (School of Electronics and Computer Engineering, Chonnam National University)
Kim, Jong Myon (School of Electrical Engineering, University of Ulsan)
Kim, Cheol Hong (School of Electronics and Computer Engineering, Chonnam National University)
Abstract
Recent embedded processors employ a set-associative L1 instruction cache to improve performance. The energy consumed by the set-associative L1 instruction cache accounts for a considerable portion of the embedded processor's energy consumption. When the processor requests an instruction, all ways in the set-associative instruction cache are accessed in parallel. In this paper, we propose a technique that reduces the energy consumption of the set-associative L1 instruction cache by accessing only one way. A gshare branch predictor is employed to predict the instruction flow and to determine the way from which the instruction is fetched. When a branch is predicted not taken, the next sequential instruction can be fetched from the instruction cache by accessing only one way. According to our simulations with SPEC2006 benchmarks, the proposed technique requires negligible hardware overhead and reduces energy consumption by 20% on average in a 4-way L1 instruction cache.
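The idea summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the class `Gshare`, the function `ways_to_access`, the table size, the 32-byte line, the 4-byte instruction width, and the rule that a single way is probed only when the next sequential instruction stays in the same cache line are all assumptions made for illustration.

```python
class Gshare:
    """Textbook gshare: PC XOR global history indexes a table of 2-bit counters."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.history = 0                          # global branch history register
        self.counters = [1] * (1 << index_bits)   # start at weakly not-taken

    def _index(self, pc):
        # Drop the byte offset of a 4-byte instruction (assumed), then XOR.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        # True means the branch at pc is predicted taken.
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def ways_to_access(predicted_taken, pc, line_bytes=32, num_ways=4, inst_bytes=4):
    """Number of ways probed for the fetch that follows the instruction at pc.

    Assumption: one way suffices only when the flow is predicted sequential
    and the next instruction lies in the same cache line as the current one.
    """
    same_line = (pc // line_bytes) == ((pc + inst_bytes) // line_bytes)
    if not predicted_taken and same_line:
        return 1          # reuse the way of the current line
    return num_ways       # conventional parallel access to every way
```

Under these assumptions, a fetch at `0x40` that is predicted not taken probes one way, while a predicted-taken branch or a fetch at the last slot of a line (for example `0x5C`) falls back to probing all four ways.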
Keywords
Instruction Cache; Low-power; Embedded Processor; Gshare Predictor