
http://dx.doi.org/10.3837/tiis.2020.05.008

Low-power Scheduling Framework for Heterogeneous Architecture under Performance Constraint

Li, Junke (College of Computer Science, Sichuan University)
Guo, Bing (College of Computer Science, Sichuan University)
Shen, Yan (School of Control Engineering, University of Information Technology)
Li, Deguang (College of Computer Science, Sichuan University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.5, 2020, pp. 2003-2021

Abstract

Today's computer systems widely integrate CPUs and GPUs to achieve considerable performance, but the energy consumption of such systems directly affects operational cost, maintainability, and environmental impact, which has aroused wide concern among researchers, computer architects, and developers. To address this energy problem, we propose a task-scheduling framework that reduces energy consumption under a performance constraint by rationally allocating tasks across the CPU and GPU. The framework first collects the estimated energy consumption and performance information of programs. Next, we use this information to formalize the scheduling problem as a 0-1 knapsack problem. We then conduct experiments on a typical platform to verify the proposed scheduling framework. The experimental results show that, on average, the proposed algorithm saves 14.97% energy compared with the time-oriented policy and yields a 37.23% performance improvement over the energy-oriented scheme.
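The abstract does not state the exact formulation, so the following is only a minimal sketch of how a per-task CPU-or-GPU choice can be solved as a knapsack-style dynamic program. It assumes that tasks run one after another, that each task has a known integer execution time and an estimated energy on each device, and that the performance constraint is a total time budget. The names `Task` and `schedule`, and all sample numbers, are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical task record: (cpu_time, cpu_energy, gpu_time, gpu_energy).
# Times are integer units so they can index the DP table.
Task = Tuple[int, float, int, float]


def schedule(tasks: List[Task], time_budget: int) -> Optional[Tuple[float, List[str]]]:
    """Pick CPU or GPU for each task to minimize total energy while the
    summed execution time stays within time_budget (sequential execution
    is assumed). Returns (energy, plan), or None if no plan fits."""
    inf = float("inf")
    # best[t] = (minimum energy, device plan) with total time exactly t
    best: List[Tuple[float, List[str]]] = [(inf, [])] * (time_budget + 1)
    best[0] = (0.0, [])
    for cpu_t, cpu_e, gpu_t, gpu_e in tasks:
        nxt: List[Tuple[float, List[str]]] = [(inf, [])] * (time_budget + 1)
        for t, (energy, plan) in enumerate(best):
            if energy == inf:
                continue  # total time t is not reachable
            # Binary decision for this task: run it on the CPU or on the GPU.
            for device, dt, de in (("CPU", cpu_t, cpu_e), ("GPU", gpu_t, gpu_e)):
                nt = t + dt
                if nt <= time_budget and energy + de < nxt[nt][0]:
                    nxt[nt] = (energy + de, plan + [device])
        best = nxt
    energy, plan = min(best, key=lambda entry: entry[0])
    return None if energy == inf else (energy, plan)


# Illustrative numbers only: the GPU is faster but costs more energy.
tasks = [(4, 10.0, 2, 14.0), (3, 6.0, 1, 9.0), (5, 12.0, 2, 20.0)]
print(schedule(tasks, 12))  # loose budget: every task can stay on the CPU
print(schedule(tasks, 8))   # tighter budget: some tasks must move to the GPU
```

With a loose budget the sketch keeps every task on the lower-energy device, and as the budget tightens it moves only as many tasks as needed to the faster device, which mirrors the trade-off between the time-oriented and energy-oriented policies mentioned in the abstract.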

Keywords

Energy saving; heterogeneous architecture; integer programming; resource allocation; scheduling framework
