References
- J.L. Hennessy and D.A. Patterson, 'Computer architecture: A quantitative approach,' Morgan Kaufmann Publishers, 2nd Ed. 1996
- M. Johnson, 'Superscalar microprocessor design,' Englewood Cliffs, N.J.: Prentice Hall, 1991
- F. Bodin and A. Seznec, 'Skewed associativity improves program performance and enhances predictability,' IEEE Trans. Computers, vol. 46, no. 5, pp. 530-544, May 1997 https://doi.org/10.1109/12.589219
- O. Temam, C. Fricker, and W. Jalby, 'Cache interference phenomena,' Proc. ACM SIGMETRICS, pp. 261-271, 1994 https://doi.org/10.1145/183019.183047
- R.A. Uhlig and T.N. Mudge, 'Trace-driven memory simulation: A survey,' ACM Computing Surveys, vol. 29, no. 2, pp. 129-170, June 1997 https://doi.org/10.1145/254180.254184
- H.J. Kim, S.M. Kim, and S.B. Choi, 'System performance analyses of out-of-order superscalar processors using analytical method,' IEICE Trans. Fundamentals of Electronics Communications and Computer Sciences, vol. E82-A, no. 6, pp. 927-938, June 1999
- A. Agarwal, M. Horowitz, and J. Hennessy, 'An analytical cache model,' ACM Trans. Computer Systems, vol. 7, no. 2, pp. 184-215, May 1989 https://doi.org/10.1145/63404.63407
- S. Coleman and K.S. McKinley, 'Tile size selection using cache organization and data layout,' Proc. SIGPLAN '95 Conf. Programming Language Design and Implementation, vol. 30, pp. 279-289, June 1995 https://doi.org/10.1145/207110.207162
- T. Fahringer, 'Automatic cache performance prediction in a parallelizing compiler,' Proc. AICA '93-International Section, Sept. 1993
- C. Fricker, O. Temam, and W. Jalby, 'Influence of cross interferences on blocked loops: A case study with matrix-vector multiply,' ACM Trans. Programming Languages and Systems, vol. 17, no. 4, pp. 561-575, July 1995 https://doi.org/10.1145/210184.210185
- S. Ghosh, M. Martonosi, and S. Malik, 'Cache miss equations: An analytical representation of cache misses,' Proc. 11th ACM Int'l Conf. Supercomputing, Vienna, Austria, July 1997
- M.S. Lam, E.E. Rothberg, and M.E. Wolf, 'The cache performance and optimizations of blocked algorithms,' Proc. Fourth Int'l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 63-74, Santa Clara, Calif., 1991 https://doi.org/10.1145/106972.106981
- K.S. McKinley and O. Temam, 'A quantitative analysis of loop nest locality,' Proc. Seventh Conf. Architectural Support for Programming Languages and Operating Systems, vol. 7, Oct. 1996
- M.E. Wolf and M.S. Lam, 'A data locality optimizing algorithm,' Proc. SIGPLAN '91 Conf. Programming Language Design and Implementation, vol. 26, pp. 30-44, June 1991 https://doi.org/10.1145/113445.113449
- J.S. Harper, D.J. Kerbyson, and G.R. Nudd, 'Analytical modeling of set-associative cache behavior,' IEEE Trans. on Computers, vol. 48, no. 10, pp. 1009-1023, Oct. 1999 https://doi.org/10.1109/12.805152
- T.Y. Yeh, D.T. Marr, and Y.N. Patt, 'Increasing the instruction fetch rate via multiple branch prediction and a branch address cache,' Proc. Seventh ACM Int'l Conf. Supercomputing, pp. 67-76, Tokyo, July 1993 https://doi.org/10.1145/165939.165956
- S. Wallace and N. Bagherzadeh, 'Modeled and measured instruction fetching performance for superscalar microprocessors,' IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 6, pp. 570-578, June 1998 https://doi.org/10.1109/71.689444
- M.D. Smith, M. Johnson, and M.A. Horowitz, 'Limits on multiple instruction issue,' Proc. Third Int'l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 290-302, Apr. 1989 https://doi.org/10.1145/68182.68209
- T.M. Conte, K.N. Menezes, P.M. Mills, and B.A. Patel, 'Optimization of instruction fetch mechanisms for high issue rates,' Proc. 22nd Ann. Int'l Symp. Computer Architecture, pp. 333-344, June 1995 https://doi.org/10.1145/223982.224444
- G. Irlam, 'Spa' Personal Communication http://www.base.com/gordoni/spa/cat1/spy.1, 1995
- Standard Performance Evaluation Corporation, 'SPEC CPU95 benchmark,' http://www.specbench.org/osg/cpu95/, Mar. 1998