References
- Yunheung Paek, Jay Hoeflinger, and David Padua, "Simplification of array access patterns for compiler optimizations", In PLDI'98, pages 60-71.
- Jean-Francois Collard and Daniel Lavery, "Optimizations to prevent cache penalties for the Intel Itanium 2 processor", In Proceedings of the CGO'03, 105-114.
- P. Grun, N. Dutt, and A. Nicolau, "Access pattern based local memory customization for low power embedded systems", In Proceedings of the conference on DATE, 778-784.
- M. Gupta and P. Banerjee, "Demonstration of automatic data partitioning techniques for parallelizing compilers on multicomputers", IEEE Trans. Parallel Distrib. Syst., 3(2):179-193, 1992. https://doi.org/10.1109/71.127259
- Hartej Singh, Guangming Lu, Eliseu Filho, Rafael Maestre, Ming-Hau Lee, Fadi Kurdahi, and Nader Bagherzadeh, "MorphoSys: case study of a reconfigurable computing system targeting multimedia applications", In Proceedings of DAC, 573-578, 2000.
- M. Wolfe, "More iteration space tiling", In Proceedings of the ACM/IEEE conference on Supercomputing'89, 655-664.
- Nainesh Agarwal and Nikitas Dimopoulos, "DSPstone benchmark of CoDeL's automated clock gating platform", In Proceedings of the IEEE VLSI, 508-509, 2007.
- M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: A free, commercially representative embedded benchmark suite", In Proceedings of the WWC-4, 2001.
- ICD-C compiler framework, University of Dortmund, http://www.icd.de/es/icd-c/
- Yoonjin Kim, Mary Kiemb, Chulsoo Park, Jinyong Jung, and Kiyoung Choi, "Resource sharing and pipelining in coarse-grained reconfigurable architecture for domain-specific optimization", In Proceedings of DATE'05, 12-17.
- A. Hatanaka and N. Bagherzadeh, "A modulo scheduling algorithm for a coarse-grain reconfigurable array template", In Proceedings of the IPDPS'07, 1-8, 2007.
- Hyunchul Park, Kevin Fan, Manjunath Kudlur, and Scott Mahlke, "Modulo graph embedding: mapping applications onto coarse-grained reconfigurable architectures", In Proceedings of CASES'06, 136-146.
- Kathryn McKinley and Steve Carr, "Improving data locality with loop transformations", ACM Transactions on Programming Languages and Systems, 18:424-453, 1996. https://doi.org/10.1145/233561.233564
- B. Mei, S. Vernalde, D. Verkest, H. De Man, and R. Lauwereins, "ADRES: An architecture with tightly coupled VLIW processor and coarse grained reconfigurable matrix", In Proceedings of Field Programmable Logic, FPL'03, 61-70.
- Michael Joseph Wolfe, "High Performance Compilers for Parallel Computing", Addison-Wesley Longman Publishing Co., USA, 1995.
- Wei Li, "Compiling for NUMA parallel machines", PhD thesis, Ithaca, NY, USA, 1993.
- Michael E. Wolf and Monica S. Lam, "A data locality optimizing algorithm", In Proceedings of the ACM SIGPLAN 1991, 30-44.
- Michael E. Wolf, Dror E. Maydan, and Ding-Kai Chen, "Combining loop transformations considering caches and scheduling", In MICRO-29, 274-286, 1996.
- Daniel Edward Lenoski, "The design and analysis of DASH: a scalable directory-based multiprocessor", PhD thesis, Stanford, CA, USA, 1992.
- Kai Li, "Shared virtual memory on loosely coupled multiprocessors", PhD thesis, 1986.
- S. Lumetta, L. Murphy, X. Li, D. Culler, and I. Khalil, "Decentralized optimal power pricing: The development of a parallel program", In IEEE Parallel and Distributed Technology, 240-249, 1993.
- V. Balasundaram and K. Kennedy, "A technique for summarizing data access and its use in parallelism enhancing transformations", In Proceedings of the ACM SIGPLAN 1989, 41-53.
- Chau-Wen Tseng, "Compiler optimizations for eliminating barrier synchronization", ACM SIGPLAN, 144-155, 1995.