
Preprocessing Methods for Effective Modulo Scheduling on High Performance DSPs  

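Cho, Doo-San (Department of Electrical Engineering, Seoul National University)
Paek, Yun-Heung (School of Electrical and Computer Engineering, Seoul National University)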
Abstract
To achieve high resource utilization on multi-issue DSPs, production compilers commonly include variants of the iterative modulo scheduling algorithm. However, excessive cyclic data dependences, which occur in communication and media processing loops, unduly restrict modulo scheduling freedom. As a result, the replicated functional units in multi-issue DSPs are often under-utilized. To address this under-utilization problem, this paper describes a novel compiler preprocessing strategy for effective modulo scheduling. The proposed strategy capitalizes on two new transformations, referred to as cloning and dismantling. The strategy has been validated by an implementation for the StarCore SC140 DSP compiler.
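The effect of cyclic dependences on modulo scheduling can be illustrated with the standard lower bound on the initiation interval (II), which is the larger of a resource bound (ResMII) and a recurrence bound (RecMII). The sketch below is only an illustration under stated assumptions: the function names, the unit counts, and the example loop are hypothetical and are not taken from the paper, and the code does not implement the cloning or dismantling transformations.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive b."""
    return -(-a // b)

def res_mii(op_counts, unit_counts):
    """Resource bound: each unit type must fit its operations within II cycles."""
    return max(ceil_div(op_counts[r], unit_counts[r]) for r in op_counts)

def rec_mii(cycles):
    """Recurrence bound: each dependence cycle is a list of (latency, distance)
    edges, and II must be at least total latency / total iteration distance."""
    bounds = [
        ceil_div(sum(lat for lat, _ in c), sum(dist for _, dist in c))
        for c in cycles
    ]
    return max(bounds, default=1)

def min_ii(op_counts, unit_counts, cycles):
    """Lower bound on the initiation interval: max(ResMII, RecMII)."""
    return max(res_mii(op_counts, unit_counts), rec_mii(cycles))

# Hypothetical machine and loop body (assumed numbers, for illustration only).
units = {"alu": 4, "agu": 2}
ops = {"alu": 4, "agu": 2}
# One loop-carried dependence cycle: three edges of latency 1,
# spanning one iteration in total.
cycles = [[(1, 0), (1, 0), (1, 1)]]

ii = min_ii(ops, units, cycles)
alu_utilization = ops["alu"] / (units["alu"] * ii)
print(res_mii(ops, units), rec_mii(cycles), ii, round(alu_utilization, 2))
```

In this assumed example the resources alone would allow a new iteration every cycle, but the recurrence forces an interval of three cycles, so the ALUs are busy only about a third of the time. This is the kind of under-utilization the abstract attributes to cyclic data dependences.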
Keywords
compiler; software pipelining; high performance multi-issue DSP; iterative modulo scheduling