[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.6109/jicce.2018.16.4.264

Workflow Scheduling Using Heuristic Scheduling in Hadoop

Thingom, Chintureena (Centre for Advanced Research & Training, CHRIST (Deemed to be University))
Kumar R, Ganesh (Faculty of Engineering, CHRIST (Deemed to be University))
Yeon, Guydeuk (Innovation Centre, CHRIST (Deemed to be University))

Publication Information

Journal of Information and Communication Convergence Engineering / v.16, no.4, 2018, pp. 264-270

Abstract

In this study, we aim to optimize multiple workloads in the cloud, allocate resources effectively, and reduce the response time of assigned jobs. Running Hadoop in a data center provides an efficient analytical service for corporations. To give clients an effective and reliable analytical computing interface, various cloud service providers host Hadoop clusters. Previous work by many scholars targeted the execution of workflows on the Hadoop platform while also minimizing the cost of virtual machines and other computing resources. Earlier, a stochastic hill climbing technique was applied to a single parameter; we now optimize multiple parameters in cloud data centers with the proposed heuristic hill climbing. Because many users try to prioritize their jobs simultaneously in the cluster, a resource-optimized workflow scheduling technique must be reliable enough to complete the assigned tasks before their deadlines while also optimizing resource usage in the cloud.
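
The abstract does not give the details of the proposed heuristic hill climbing, so the following is only a minimal sketch of the general idea it builds on: a first-choice variant of stochastic hill climbing that assigns tasks to virtual machines while scoring two parameters at once. The weighted-sum objective, the `runtimes` and `prices` inputs, and the function names `cost` and `hill_climb` are all assumptions made for illustration and are not taken from the paper.

```python
import random


def cost(assignment, runtimes, prices, w_time=0.5, w_price=0.5):
    """Weighted sum of makespan and total price (an assumed objective).

    assignment[t] is the VM index chosen for task t,
    runtimes[t][v] is the assumed runtime of task t on VM v,
    prices[v] is the assumed price per time unit of VM v.
    """
    load = [0.0] * len(prices)
    for task, vm in enumerate(assignment):
        load[vm] += runtimes[task][vm]
    makespan = max(load)
    total_price = sum(load[vm] * prices[vm] for vm in range(len(prices)))
    return w_time * makespan + w_price * total_price


def hill_climb(runtimes, prices, steps=2000, seed=0):
    """Move one random task to a random VM; keep the move only if cost drops."""
    rng = random.Random(seed)
    n_tasks, n_vms = len(runtimes), len(prices)
    current = [rng.randrange(n_vms) for _ in range(n_tasks)]
    current_cost = cost(current, runtimes, prices)
    for _ in range(steps):
        task = rng.randrange(n_tasks)
        vm = rng.randrange(n_vms)
        if vm == current[task]:
            continue  # same assignment, not a real neighbour
        candidate = current[:]
        candidate[task] = vm
        candidate_cost = cost(candidate, runtimes, prices)
        if candidate_cost < current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost


if __name__ == "__main__":
    # Hypothetical data: 3 tasks, 2 VMs.
    runtimes = [[4.0, 2.0], [3.0, 6.0], [5.0, 5.0]]
    prices = [1.0, 2.0]
    print(hill_climb(runtimes, prices))
```

Like any hill climbing search, this sketch can stop at a local optimum. Changing `w_time` and `w_price` shifts the balance between finishing early and spending less, which is one simple way to treat several parameters together.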

Keywords

Cloud; Data Centers; Hadoop; Heuristic hill climbing; Workflow scheduling
