http://dx.doi.org/10.3837/tiis.2017.02.004

An Adaptively Speculative Execution Strategy Based on Real-Time Resource Awareness in a Multi-Job Heterogeneous Environment  

Liu, Qi (Nanjing University of Information Science and Technology)
Cai, Weidong (Nanjing University of Information Science and Technology)
Liu, Qiang (School of Computer, Hunan University of Technology)
Shen, Jian (Nanjing University of Information Science and Technology)
Fu, Zhangjie (Nanjing University of Information Science and Technology)
Liu, Xiaodong (School of Computing, Edinburgh Napier University)
Linge, Nigel (School of Computing, Science and Engineering, The University of Salford)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.11, no.2, 2017, pp. 670-686
Abstract
MapReduce (MRV1), a popular programming model proposed by Google, has been widely used to process large datasets in Hadoop, an open-source cloud platform. Its new version, MapReduce 2.0 (MRV2), developed alongside the emergence of YARN, achieves a clear improvement over MRV1. However, MRV2 still suffers from long completion times on certain types of jobs. Speculative Execution (SE) has been proposed as an approach to this problem: it backs up delayed tasks from low-performance machines onto higher-performance ones. In this paper, an adaptive SE strategy (ASE) is presented for Hadoop-2.6.0. Experimental results show that ASE duplicates tasks according to real-time resource usage among the worker nodes in a cloud. In addition, the ASE strategy substantially improves the performance of MRV2 in terms of job execution time and resource consumption, including in a multi-job environment.
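The general idea of speculative execution described in the abstract can be illustrated with a minimal sketch. This is not the ASE algorithm from the paper, whose details are not given here; it is a generic, simplified heuristic under stated assumptions. The names `Task`, `Node`, `remaining_time`, and `pick_backup`, the thresholds `slow_factor` and `min_free`, and the progress-rate estimate are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Tuple


@dataclass
class Task:
    task_id: str
    node: str        # name of the node currently running the task
    progress: float  # fraction complete, between 0.0 and 1.0
    elapsed: float   # seconds since the task started


@dataclass
class Node:
    name: str
    cpu_free: float  # fraction of CPU currently idle (assumed metric)
    mem_free: float  # fraction of memory currently free (assumed metric)


def remaining_time(task: Task) -> float:
    """Estimate remaining seconds from the task's average progress rate."""
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def pick_backup(tasks: List[Task], nodes: List[Node],
                slow_factor: float = 1.5,
                min_free: float = 0.3) -> Optional[Tuple[str, str]]:
    """Return (task_id, node_name) for one backup task, or None.

    Hypothetical policy: the task with the largest estimated remaining
    time is a straggler if it exceeds slow_factor times the median
    estimate; its copy goes to the other node with the most free
    resources, provided that node has at least min_free of CPU and memory.
    """
    # Tasks without measurable progress have no estimate and are skipped.
    measurable = [t for t in tasks if t.progress > 0 and t.elapsed > 0]
    if not measurable:
        return None
    estimates = {t.task_id: remaining_time(t) for t in measurable}
    typical = median(estimates.values())
    straggler = max(measurable, key=lambda t: estimates[t.task_id])
    if estimates[straggler.task_id] <= slow_factor * typical:
        return None
    # Resource awareness: only nodes with enough spare capacity qualify.
    candidates = [n for n in nodes
                  if n.name != straggler.node
                  and min(n.cpu_free, n.mem_free) >= min_free]
    if not candidates:
        return None
    target = max(candidates, key=lambda n: min(n.cpu_free, n.mem_free))
    return straggler.task_id, target.name


if __name__ == "__main__":
    tasks = [Task("t1", "A", 0.9, 90.0),
             Task("t2", "B", 0.8, 80.0),
             Task("t3", "C", 0.2, 100.0)]
    nodes = [Node("A", 0.2, 0.5), Node("B", 0.6, 0.7),
             Node("C", 0.1, 0.1), Node("D", 0.9, 0.4)]
    print(pick_backup(tasks, nodes))
```

In this example, task `t3` is far behind the others, so it is selected for backup, and node `B` is chosen because its scarcest resource has the most headroom. A real scheduler would also need to weigh data locality, limit the number of concurrent backups, and refresh the resource readings continuously.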
Keywords
MapReduce; Speculative Execution; Real-Time Resources Awareness; Multi-Job Distribution;