http://dx.doi.org/10.7840/kics.2013.38A.12.1094

An Improved Dynamic Branch Predictor by Selective Access of a Specific Element in 4-Way Cache  

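Hwang, In-Sung (CAD & ES Laboratory, Department of Electronic Engineering, Sogang University)
Hwang, Sun-Young (CAD & ES Laboratory, Department of Electronic Engineering, Sogang University)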
Abstract
This paper proposes an improved branch predictor that reduces the number of execution cycles of applications by selectively accessing a specific element in a 4-way associative cache. When a branch instruction is fetched, the proposed branch predictor obtains the branch target address from the cache element selected by referring to an MRU buffer. By increasing the number of BTAC entries under a restricted power budget, the branch prediction rate and application execution speed are considerably improved compared with a previous branch predictor that accesses all elements. The effectiveness of the proposed dynamic branch predictor is verified by executing benchmark applications on a core simulator. Experimental results show that the number of execution cycles decreases by an average of 10.1%, while power consumption increases by an average of 7.4%, compared with a core without a dynamic branch predictor. Execution cycles are reduced by 4.1% compared with a core that employs the previous dynamic branch predictor.
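The selective-access idea in the abstract can be illustrated with a small software model. The following is a minimal sketch, not the authors' design: it assumes that each "element" is one way of a 4-way set-associative BTAC, that the MRU buffer stores one way index per set, and that instructions are 4-byte aligned. The class name `SelectiveBTAC`, the set count, the indexing scheme, and the replacement policy are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

NUM_WAYS = 4  # 4-way cache, as stated in the abstract


@dataclass
class Entry:
    valid: bool = False
    tag: int = 0
    target: int = 0


class SelectiveBTAC:
    """Toy BTAC that reads only the way named by a per-set MRU buffer."""

    def __init__(self, num_sets: int = 64):  # set count is an assumed value
        self.num_sets = num_sets
        self.sets = [[Entry() for _ in range(NUM_WAYS)] for _ in range(num_sets)]
        self.mru = [0] * num_sets  # assumed MRU buffer: one way index per set
        self.way_reads = 0         # number of ways read, a rough proxy for access power

    def _index_tag(self, pc: int):
        word = pc >> 2  # assumption: 4-byte aligned instructions
        return word % self.num_sets, word // self.num_sets

    def lookup(self, pc: int) -> Optional[int]:
        """Return the predicted target, or None on a miss. Reads exactly one way."""
        idx, tag = self._index_tag(pc)
        entry = self.sets[idx][self.mru[idx]]
        self.way_reads += 1
        if entry.valid and entry.tag == tag:
            return entry.target
        return None

    def _pick_way(self, idx: int, tag: int) -> int:
        ways = self.sets[idx]
        for w, e in enumerate(ways):      # reuse a way that already holds this branch
            if e.valid and e.tag == tag:
                return w
        for w, e in enumerate(ways):      # otherwise fill an empty way
            if not e.valid:
                return w
        return (self.mru[idx] + 1) % NUM_WAYS  # assumed policy: evict the way after MRU

    def update(self, pc: int, target: int) -> None:
        """Record a resolved taken branch and mark its way as most recently used."""
        idx, tag = self._index_tag(pc)
        way = self._pick_way(idx, tag)
        self.sets[idx][way] = Entry(True, tag, target)
        self.mru[idx] = way


btac = SelectiveBTAC(num_sets=4)
btac.update(0x1000, 0x2000)
print(btac.lookup(0x1000))   # hit: the MRU way holds this branch
btac.update(0x1010, 0x3000)  # maps to the same set, becomes the MRU way
print(btac.lookup(0x1000))   # miss: entry is resident, but not in the MRU way
print(btac.way_reads)        # one way read per lookup
```

In this model each lookup reads one way instead of four, which is the kind of saving that could be spent on more BTAC entries. The cost is that a branch stored in a non-MRU way is reported as a miss, so the actual benefit depends on how often the MRU way is the right one.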
Keywords
Embedded System; Branch Prediction; MRU Buffer; BTAC; MDL