Low-power Scheduling Framework for Heterogeneous Architecture under Performance Constraint |
Li, Junke (College of Computer Science, Sichuan University)
Guo, Bing (College of Computer Science, Sichuan University)
Shen, Yan (School of Control Engineering, University of Information Technology)
Li, Deguang (College of Computer Science, Sichuan University) |
1 | Augonnet, Cedric, et al., "StarPU: a unified platform for task scheduling on heterogeneous multicore architectures," Concurrency and Computation: Practice and Experience, 23(2), 187-198, 2011. |
2 | Chi-Keung Luk, Sunpyo Hong, and Hyesoon Kim, "Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping," in Proc. of the 42nd International Symposium on Microarchitecture (MICRO). ACM, New York, NY, 45-55, 2009. |
3 | Belviranli, Mehmet E., Laxmi N. Bhuyan, and Rajiv Gupta, "A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures," ACM Transactions on Architecture and Code Optimization (TACO), 9(4), 57, 2013. |
4 | Grewe, Dominik, and Michael FP O'Boyle, "A static task partitioning approach for heterogeneous systems using OpenCL," in Proc. of International Conference on Compiler Construction. Springer Berlin Heidelberg, pp. 286-305, 2011. |
5 | Murilo Boratto, Pedro Alonso, Carla Ramiro, and Marcos Barreto, "Heterogeneous computational model for landform attributes representation on multicore and multi-GPU systems," Procedia Computer Science, 9, 47-56, 2012. |
6 | Bernabe, Gregorio, Javier Cuenca, and Domingo Gimenez, "Optimization techniques for 3D-FWT on systems with manycore GPUs and multicore CPUs," Procedia Computer Science, 18, 319-328, 2013. |
7 | Jimenez, Victor J., et al., "Predictive runtime code scheduling for heterogeneous architectures," International Conference on High-Performance Embedded Architectures and Compilers. Springer Berlin Heidelberg, pp. 19-33, 2009. |
8 | Hong, Sunpyo, and Hyesoon Kim, "An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness," ACM SIGARCH Computer Architecture News, Vol. 37. No. 3, 2009. |
9 | Wang, Haifeng, and Yunpeng Cao, "Predicting power consumption of GPUs with fuzzy wavelet neural networks," Parallel Computing, 44, 18-36, 2015. |
10 | Hamano, Tomoaki, Toshio Endo, and Satoshi Matsuoka, "Power-aware dynamic task scheduling for heterogeneous accelerated clusters," Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on. IEEE, 2009. |
11 | Mark Silberstein and Naoya Maruyama, "An exact algorithm for energy-efficient acceleration of task trees on CPU/GPU architectures," in Proc. of the 4th Annual International Conference on Systems and Storage (SYSTOR'11). ACM, New York, NY, Article 7, pp. 1-7, 2011. |
12 | Jang, Jae Young, et al., "Workload-aware optimal power allocation on single-chip heterogeneous processors," IEEE Transactions on Parallel and Distributed Systems, 27(6), 1838-1851, 2016. |
13 | Junke Li, Bing Guo, et al., "A Modeling Approach for Energy Saving Based on GA-BP Neural Network," Journal of Electrical Engineering and Technology, 11(5), 1289-1298, 2016. |
14 | Paul, Indrani, et al., "Coordinated energy management in heterogeneous processors," Scientific Programming, 22(2), 93-108, 2014. |
15 | Guohui Wang, Yong Guan, Yi Wang, and Zili Shao, "Energy-Aware Assignment and Scheduling for Hybrid Main Memory in Embedded Systems," Computing (Springer), Vol. 98, No. 3, pp. 279-301, 2016. |
16 | Yi Wang, Hui Liu, Duo Liu, Zhiwei Qin, Zili Shao, and Edwin H.-M. Sha, "Overhead-Aware Energy Optimization for Real-Time Streaming Applications on Multiprocessor System-on-Chip," ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol. 16, No. 2, pp. 14:1-14:32, March 2011. |
17 | Wenjie Liu, et al., "A waterfall model to achieve energy efficient tasks mapping for large scale GPU clusters," in Proc. of Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on. IEEE, 2011. |
18 | Khemka, Bhavesh, et al., "Utility functions and resource management in an oversubscribed heterogeneous computing environment," IEEE Transactions on Computers, 64(8), 2394-2407, 2015. |
19 | Oxley, Mark A., et al., "Makespan and energy robust stochastic static resource allocation of a Bag-of-tasks to a heterogeneous computing system," IEEE Transactions on Parallel and Distributed Systems, 26(10), 2791-2805, 2015. |
20 | Machovec, Dylan, et al., "Dynamic resource management for parallel tasks in an oversubscribed energy-constrained heterogeneous environment," in Proc. of Parallel and Distributed Processing Symposium Workshops, 2016 IEEE International. IEEE, 2016. |
21 | TOP500 List - November 2016, https://www.top500.org/list/2016/11/ |
22 | Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, Ronak Singhal, and Pradeep Dubey, "Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU," ACM SIGARCH Computer Architecture News, 38(3), 2010. |
23 | Chris Gregg and Kim Hazelwood, "Where is the data? Why you cannot debate CPU vs. GPU performance without the answer," in Proc. of the 2011 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'11), 134-144, 2011. |
24 | Sparsh Mittal and Jeffrey S. Vetter, "A survey of methods for analyzing and improving GPU energy efficiency," ACM Computing Surveys, 47(2), 2014. |
25 | Isaac Gelado, John E. Stone, Javier Cabezas, Sanjay Patel, Nacho Navarro, and Wen-mei W. Hwu, "An asymmetric distributed shared memory model for heterogeneous parallel systems," ACM SIGARCH Computer Architecture News, 38(1), 347-358, March 2010. |
26 | Qi Hu, Nail A. Gumerov, and Ramani Duraiswami, "Scalable fast multipole methods on distributed heterogeneous architectures," in Proc. of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, New York, NY, Article 36, pp. 1-12, 2011. |
27 | Green500 List - November 2016, https://www.top500.org/green500/lists/2016/11/ |
28 | R. Kaleem, R. Barik, T. Shpeisman, B. Lewis, C. Hu, and K. Pingali, "Adaptive Heterogeneous Scheduling on Integrated GPUs," in Proc. of the 23rd International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 151-162, 2014. |
29 | Qiang Liu and Wayne Luk, "Heterogeneous systems for energy efficient scientific computing," in Proc. of International Symposium on Applied Reconfigurable Computing. Springer Berlin Heidelberg, 2012. |
30 | Kai Ma, et al., "GreenGPU: A holistic approach to energy efficiency in GPU-CPU heterogeneous architectures," in Proc. of 2012 41st International Conference on Parallel Processing. IEEE, 2012. |
31 | Barik, Rajkishore, et al., "A black-box approach to energy-aware scheduling on integrated CPU-GPU systems," in Proc. of the 2016 International Symposium on Code Generation and Optimization. ACM, pp. 70-81, 2016. |
32 | Guibin Wang, and Xiaoguang Ren, "Power-efficient work distribution method for CPU-GPU heterogeneous system," in Proc. of International Symposium on Parallel and Distributed Processing with Applications. IEEE, 2010. |
33 | Kai Ma, et al., "Energy conservation for GPU-CPU architectures with dynamic workload division and frequency scaling," Sustainable Computing: Informatics and Systems, 12, 21-33, 2016. |
34 | Totoni, Ehsan, Mert Dikmen, and Maria Jesus Garzaran, "Easy, fast, and energy-efficient object detection on heterogeneous on-chip architectures," ACM Transactions on Architecture and Code Optimization (TACO), 10(4), 45, 2013. |
35 | Chandramohan, Kiran, and Michael FP O'Boyle, "Partitioning data-parallel programs for heterogeneous MPSoCs: time and energy design space exploration," ACM SIGPLAN Notices, Vol. 49. No. 5, 2014. |
36 | Choi, Hong Jun, et al., "An efficient scheduling scheme using estimated execution time for heterogeneous computing systems," The Journal of Supercomputing, 65(2), 886-902, 2013. |