Performance Analyses of Instruction Fetch Models Considering Cache Miss and Branch Misprediction

Kim, Seon-Mo; Jeong, Jin-Ha; Choe, Sang-Bang

Journal of KIISE:Computer Systems and Theory (한국정보과학회논문지:시스템및이론)

Volume 28, Issue 12 / pp. 685-697 / 2001 / pISSN 1229-683X

Korean Institute of Information Scientists and Engineers (한국정보과학회)

캐쉬 미스와 분기예측 실패를 고려한 명령어 페치 모델의 성능분석

Kim, Seon-Mo (김선모; Researcher, LG Electronics Inc.) ;
Jeong, Jin-Ha (정진하; Dept. of Electronics Engineering, Inha University) ;
Choe, Sang-Bang (최상방; School of Electrical and Computer Engineering, Inha University)

Published : 2001.12.01

Abstract

Cache memories are small, fast memories that temporarily hold the contents of main memory most likely to be referenced by the processor, reducing instruction and data access time. In this paper, we present analytical models of the instruction fetch process for four types of instruction cache structures that can be used in superscalar processors. The models define a range of architectural parameters and take cache misses and branch mispredictions into account. To validate the proposed models, we performed extensive simulations and compared the results with the analytical predictions. The comparison shows that the proposed models estimate the instruction fetch rate to within 10% error in most cases. Both the analytical models and the simulations show that an increase in cache misses reduces the instruction fetch rate more severely than an increase in branch mispredictions. Unlike simulation alone, however, the analytical models can explain the causes of the performance degradation. They also provide the exact relationship between cache misses and branch mispredictions for instruction fetch analysis.

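Cache memory is a small, fast memory that temporarily stores the main-memory contents most likely to be referenced by the processor, in order to reduce the access time for instructions and data. In this paper, we propose analytical models that account for cache misses and branch mispredictions for four instruction cache structures applicable to superscalar processors, and we analyze their performance. We model instruction fetch by defining various parameters of the superscalar architecture, and we verify the validity of the analytical models by comparing them with results obtained from simulation. For the instruction fetch rate, the performance degradation caused by cache misses proved larger than the effect of branch mispredictions. The analytical models obtained in this study make it possible to identify clearly the causes of performance limits that simulation does not reveal, and to analyze precisely the relationship between cache misses and branch mispredictions in cache performance.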
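
The abstract does not reproduce the equations of the proposed models, so the following is only a toy first-order sketch of how cache misses and branch mispredictions can both enter an instruction fetch rate estimate. It is not the authors' model. The class `FetchParams`, the function `fetch_rate`, every parameter name, and all numeric values are hypothetical and chosen purely for illustration. The sketch assumes each fetch attempt costs one cycle plus an average stall contribution from each event type.

```python
from dataclasses import dataclass


@dataclass
class FetchParams:
    """Hypothetical parameters for a toy fetch-rate estimate (not from the paper)."""
    fetch_width: int          # instructions delivered by one successful fetch
    miss_rate: float          # instruction cache misses per fetch attempt
    miss_penalty: int         # stall cycles charged per cache miss
    mispredict_rate: float    # branch mispredictions per fetch attempt
    mispredict_penalty: int   # stall cycles charged per misprediction


def fetch_rate(p: FetchParams) -> float:
    """Return the average number of instructions fetched per cycle.

    Assumption: stalls from the two event types simply add, so the
    average cost of one fetch attempt is 1 cycle plus the expected stalls.
    """
    cycles_per_fetch = (
        1.0
        + p.miss_rate * p.miss_penalty
        + p.mispredict_rate * p.mispredict_penalty
    )
    return p.fetch_width / cycles_per_fetch


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from the paper.
    ideal = FetchParams(4, 0.0, 10, 0.0, 3)
    misses_only = FetchParams(4, 0.10, 10, 0.0, 3)
    mispredicts_only = FetchParams(4, 0.0, 10, 0.10, 3)
    for name, params in [("ideal", ideal),
                         ("cache misses only", misses_only),
                         ("mispredictions only", mispredicts_only)]:
        print(f"{name}: {fetch_rate(params):.3f} instructions/cycle")
```

In this sketch, which event hurts more depends entirely on the rates and penalties assumed, so it illustrates the cycle accounting rather than reproducing the paper's finding.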
