http://dx.doi.org/10.7472/jksii.2019.20.1.87

A MapReduce-Based Workflow BIG-Log Clustering Technique  

Jin, Min-Hyuck (Dept. of Computer Science, Graduate School, Kyonggi Univ.)
Kim, Kwanghoon Pio (Division of Computer Science and Engineering, Kyonggi Univ.)
Publication Information
Journal of Internet Computing and Services / v.20, no.1, 2019, pp. 87-96
Abstract
In this paper, we propose a MapReduce-supported clustering technique that serves as a preprocessing tool for collecting and classifying distributed workflow enactment event logs. We call these logs Workflow BIG-Logs because they satisfy the 5V properties of big data: Volume, Velocity, Variety, Veracity, and Value. The clustering technique is devised specifically for the preprocessing phase of a particular workflow process mining and analysis algorithm that operates on Workflow BIG-Logs. In other words, it uses the MapReduce framework as the Workflow BIG-Logs processing platform, supports the IEEE XES standard data format, and is dedicated to the preprocessing phase of the $\rho$-Algorithm, a typical workflow process mining algorithm based on structured information control nets. More precisely, the clustering of Workflow BIG-Logs can be classified into two types, activity-based clustering patterns and performer-based clustering patterns, and we implement an activity-based clustering pattern algorithm on the MapReduce framework. Finally, we verify the proposed clustering technique through an experimental study on the workflow enactment event log dataset released by the BPI Challenges.
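As a rough illustration of what activity-based clustering in a map/reduce style could look like, the following minimal sketch groups events by activity name in plain Python. It is not the algorithm from the paper, whose details the abstract does not give. The flattened event layout, the field names (`case`, `activity`, `performer`, `timestamp`), the sample events, and all function names are assumptions made for this example; a real XES log nests events inside traces, and a real deployment would run on Hadoop rather than in a single process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical flattened event record (assumption): one dict per logged event.
Event = Dict[str, str]
Value = Tuple[str, str, str]   # (timestamp, case, performer)
Pair = Tuple[str, Value]       # (activity, value)


def map_event(event: Event) -> Iterator[Pair]:
    """Map step: emit one (activity, (timestamp, case, performer)) pair."""
    yield event["activity"], (event["timestamp"], event["case"], event["performer"])


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[Value]]:
    """Shuffle step: collect all values that share the same activity key."""
    groups: Dict[str, List[Value]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_activity(activity: str, values: List[Value]) -> Tuple[str, List[Value]]:
    """Reduce step: order one activity's events by timestamp.

    ISO-8601 timestamps in a single uniform format sort correctly as strings.
    """
    return activity, sorted(values)


def cluster_by_activity(events: Iterable[Event]) -> Dict[str, List[Value]]:
    """Run map, shuffle, and reduce in-process to build activity clusters."""
    pairs = (pair for event in events for pair in map_event(event))
    return dict(reduce_activity(k, v) for k, v in shuffle(pairs).items())


# Made-up sample events, for illustration only.
events = [
    {"case": "c1", "activity": "register", "performer": "p1",
     "timestamp": "2019-01-01T09:00:00"},
    {"case": "c2", "activity": "register", "performer": "p2",
     "timestamp": "2019-01-01T08:30:00"},
    {"case": "c1", "activity": "review", "performer": "p2",
     "timestamp": "2019-01-01T10:00:00"},
]
clusters = cluster_by_activity(events)
```

Under the same assumptions, a performer-based variant would only change the key emitted in `map_event` from `event["activity"]` to `event["performer"]`.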
Keywords
workflow process mining; structured information control nets; workflow process enactment event logs; temporal workcase; temporal worktransference; XES event stream data format; Hadoop MapReduce Framework