Optimizing Instruction Prefetching to Improve Worst-Case Performance for Real-Time Applications

  • Ding, Yiqiang (Department of Electrical and Computer Engineering, Southern Illinois University Carbondale)
  • Yan, Jun (Department of Electrical and Computer Engineering, Southern Illinois University Carbondale)
  • Zhang, Wei (Department of Electrical and Computer Engineering, Southern Illinois University Carbondale)
  • Published : 2009.03.31

Abstract

While average-case performance is important for general-purpose applications, worst-case performance is crucial for real-time systems to ensure schedulability and reliability. Recent work has shown that simple prefetching techniques such as Next-N-Line prefetching can benefit both average-case and worst-case performance; however, the resulting improvement in worst-case execution time (WCET) is rather limited and inefficient. This paper presents two instruction prefetching approaches specifically designed to enhance worst-case performance: loop-based prefetching and WCET-oriented prefetching. Our experiments indicate that both techniques achieve fewer worst-case execution cycles than Next-N-Line prefetching, while their impact on average-case performance varies.
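To make the baseline concrete, the following is a minimal sketch of Next-N-Line prefetching on a toy direct-mapped instruction cache. It is an illustration only, not the simulator or the analysis used in the paper. The function name `simulate_icache`, its parameters, and the cache geometry are hypothetical, and prefetches are assumed to complete instantly, which real hardware does not guarantee.

```python
def simulate_icache(trace, num_lines=8, line_size=16, prefetch_n=0):
    """Count misses for a toy direct-mapped instruction cache.

    Assumptions (not from the paper): prefetched blocks arrive instantly,
    and every access triggers a prefetch of the next `prefetch_n` blocks.
    """
    cache = [None] * num_lines  # cache[i] holds the block number stored in line i
    misses = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_lines
        if cache[index] != block:
            misses += 1
            cache[index] = block  # demand fetch on a miss
        # Next-N-Line: bring in the N sequentially following blocks.
        for k in range(1, prefetch_n + 1):
            nxt = block + k
            cache[nxt % num_lines] = nxt
    return misses


# Straight-line code: 32 four-byte instructions spanning 8 cache blocks.
straight = list(range(0, 128, 4))

# A two-block loop body executed three times.
loop = list(range(0, 32, 4)) * 3

if __name__ == "__main__":
    print(simulate_icache(straight, prefetch_n=0))
    print(simulate_icache(straight, prefetch_n=1))
    print(simulate_icache(loop, num_lines=2, prefetch_n=0))
    print(simulate_icache(loop, num_lines=2, prefetch_n=1))
```

In this toy model, sequential prefetching removes all but the first miss on straight-line code. On the small loop, however, it adds a miss, because the block prefetched past the loop end evicts the loop header in a two-line cache. This is only one simplified way a purely sequential scheme can interact poorly with loops; it is not a reproduction of the paper's results.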
