Defining and Discovering Cardinalities of the Temporal Workcases from XES-based Workflow Logs

  • Yun, Jaeyoung (Div. of Computer Science and Engineering, Kyonggi University) ;
  • Ahn, Hyun (Div. of Computer Science and Engineering, Kyonggi University) ;
  • Kim, Kwanghoon Pio (Div. of Computer Science and Engineering, Kyonggi University)
  • Received : 2018.12.29
  • Accepted : 2019.04.30
  • Published : 2019.06.30

Abstract

A workflow management system manages workflow models, which define how work is actually carried out. A workflow process is defined as a sequence of jobs carried out by performers. With a workflow management system, we can also analyze the flow of a process and revise it to be more efficient. Much research has focused on how to build workflow process models more efficiently and manage them more easily. Recently, many studies have used workflow log files, which are the execution history of the workflow process models enacted by a workflow management system. Our research group is interested in deriving useful knowledge from workflow event logs. In this paper we use XES log files because a large amount of data is available in this format. This paper proposes the cardinalities of temporal workcases and shows how to obtain them from workflow event logs. Cardinalities of temporal workcases are the occurrence patterns of critical elements in a workflow process. We discover instance cardinalities, activity cardinalities, and organizational resource cardinalities from several XES-based workflow event logs and visualize them. The instance cardinality describes the occurrence of workflow process instances, the activity cardinality describes the occurrence of activities, and the organizational resource cardinality describes the occurrence of organizational resources. From them, we expect to obtain useful knowledge such as patterns in the control flow of the process, frequently executed events, and frequently working performers. Furthermore, we expect to be able to predict the original process model using only the workflow event logs.

1. Introduction

A workflow management system manages workflows, which define how work is actually carried out. A workflow process is defined as a sequence of jobs that are related to and influence one another. With a workflow management system, we can also analyze the flow of a process and revise it to be more efficient. Much research has focused on how to build workflow process models more efficiently and manage them more easily. Recently, workflow process models have been growing larger as processes become more complex. Workflow management systems (WFMS) have also grown larger, owing to increases in computing power and advances in the techniques used to design them [1]. Analyzing the new kinds of requirements and demands concerning workflow intelligence and quality has therefore become an issue.

Our research group focuses on knowledge discovered from the execution logs of workflow process models, with the aim of improving the quality of those models with high consistency and efficiency. We call this activity, discovering new knowledge from workflow execution logs, process mining. Workflow process mining techniques exist for various perspectives, such as control-flow, data-flow, performance, and organizational resources.

In this paper, we introduce the instance cardinality, the activity cardinality, and the organizational resource cardinality. Instance cardinality is the occurrence of process instances in the event log file. Activity cardinality is the occurrence of activities in the process instances. Organizational resource cardinality is the occurrence of organizational resources in the process instances. Each cardinality offers its own viewpoint, which is useful to a process manager who has to analyze a large and highly complex process. These cardinalities are described in more detail in Section 3. In Section 4, we show examples of the cardinalities discovered from real event logs. We used XES-based event logs in our experiments.

2. Related Works

2.1 XES

XES [3][4][5] is a standard format for workflow event logs, which record the execution history of a process model. An XES file is an XML-based text file built from three main components, each expressed as an XML tag: log, trace, and event. The log is the top-level component; it represents the whole process that was executed before the XES file was produced. The XES file begins with ‘<log>’ and ends with ‘</log>’. A log contains many traces, each of which represents one instance, that is, one execution of the process. A trace contains many events, each of which represents the execution of an activity, the unit of work in the process. As an example, consider a process for handling insurance claims. The log represents the process (handling insurance claims). A trace represents one specific instance of the process (one specific insurance claim). An event represents the execution of an activity (for example, recording the client's personal information in the database has been completed). Logs, traces, and events each have their own attributes that describe them. Each attribute holds its information as a key with a value, where the key identifies the kind of attribute. The XES format therefore makes it easy to define and represent an execution history.
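The log, trace, and event nesting described above can be sketched in a few lines of Python. This is only an illustrative reading of the structure, not the tooling used in the paper: the sample document, its values, and the helper names (`local`, `attributes`, `read_log`) are assumptions made for this example. Real XES files normally also declare an XML namespace, extensions, and typed attributes such as dates, which this sketch handles only by ignoring the namespace prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; "concept:name" and "org:resource" are
# standard XES attribute keys, but the values are invented for illustration.
SAMPLE_XES = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="claim-001"/>
    <event>
      <string key="concept:name" value="register claim"/>
      <string key="org:resource" value="alice"/>
    </event>
    <event>
      <string key="concept:name" value="record client information"/>
      <string key="org:resource" value="bob"/>
    </event>
  </trace>
</log>"""


def local(tag):
    """Drop an XML namespace prefix such as '{http://...}trace'."""
    return tag.rsplit("}", 1)[-1]


def attributes(element):
    """Collect key/value attributes that are direct children of element."""
    return {
        child.get("key"): child.get("value")
        for child in element
        if child.get("key") is not None  # nested <event> elements have no key
    }


def read_log(xes_text):
    """Return a list of (trace_attributes, [event_attributes, ...]) pairs."""
    root = ET.fromstring(xes_text)
    traces = []
    for trace in root:
        if local(trace.tag) != "trace":
            continue
        events = [attributes(e) for e in trace if local(e.tag) == "event"]
        traces.append((attributes(trace), events))
    return traces


traces = read_log(SAMPLE_XES)
for trace_attrs, events in traces:
    print(trace_attrs["concept:name"], [e["concept:name"] for e in events])
```

Each trace becomes one instance and each event one activity execution, which is the representation the cardinalities in Section 3 are counted over.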

2.2 Process Mining

Process mining aims to find useful information about workflow models. In general, workflow management offers two sources of valuable information: the first is the workflow model [6][7][8][9][10][11][12][13][14][15] and the second is the workflow log [16][17][18][19]. Recently there has been growing interest in using workflow logs, which we call process mining; for example, some studies try to rediscover workflow models from workflow logs, and others mine workflow logs to extract information. Process mining has become important because it uses real data, namely the execution data of workflow process models. In this paper we also use workflow logs, and we define and present information obtained from XES workflow logs; that is, our research is a form of process mining.

3. Types of the Cardinalities

In this section, we define the temporal workcase cardinalities that can be derived from the information recorded in XES logs.

Three temporal workcase cardinalities can be discovered from XES-based workflow event logs, and each cardinality has three types. Instance cardinality describes the occurrence of workflow process instances, that is, executions of the workflow process model. It is divided into three types: instance cardinality per workflow model, instance class cardinality per workflow model, and instance cardinality per instance class. Instance cardinality per workflow model is the number of workflow process instances that occur in the execution of a workflow process model. Instance class cardinality per workflow model is the number of distinct types of workflow process instance that occur in the execution of a workflow process model; it equals the instance cardinality per workflow model with redundancy removed. Instance cardinality per instance class is the number of times each workflow control-path occurs in the execution of a workflow process model, where a control-path is a distinct sequence of activities followed by the workflow process instances. The activity cardinality and the organizational resource cardinality describe the occurrence of activities and of organizational resources, respectively. Each also has three types, which have the same meaning as the corresponding types of the instance cardinality. The temporal workcase cardinalities are defined in [Definition 1].

[Definition 1] Temporal workcases cardinalities. The cardinalities of temporal workcases that can be discovered from XES-based workflow process event logs are defined below.

(Fig. 1) A Workflow Warehouse of 50 Workflow Instances corresponding to 10 Workflow Processes

Instance Cardinality

  • Instance Cardinality per Workflow Model: The number of process instances of each workflow process model in the workflow execution log.
  • Instance Class Cardinality per Workflow Model: The number of distinct control-paths taken by the process instances of each workflow process model in the workflow execution log.
  • Instance Cardinality per Instance Class: The frequency of each control-path, that is, the number of process instances that follow it, for each workflow process model in the workflow execution log.

Activity Cardinality

  • Activity Instance Cardinality per Workflow Model: The frequency of each activity that makes up each workflow process model in the workflow execution log.
  • Activity Class Cardinality per Workflow Model: The number of distinct activities that make up each workflow process model in the workflow execution log.
  • Activity Instance Cardinality per Instance per Workflow Model: The frequency of each activity within each workflow instance of each workflow process model in the workflow execution log.

Organizational Resource Cardinality

  • Organizational Resource Instance Cardinality per Workflow Model: The frequency of each organizational resource involved in each workflow process model in the workflow execution log.
  • Organizational Resource Class Cardinality per Workflow Model: The number of distinct organizational resources involved in each workflow process model in the workflow execution log.
  • Organizational Resource Instance Cardinality per Instance per Workflow Model: The frequency of each organizational resource within each workflow instance of each workflow process model in the workflow execution log.
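The nine cardinalities of [Definition 1] can be illustrated with a short Python sketch over a toy log. This is one possible reading of the definitions rather than the authors' implementation. It assumes that a control-path is identified by the exact ordered sequence of activity names in a trace, which the text does not spell out, and the function name, dictionary keys, and toy data are all hypothetical.

```python
from collections import Counter


def temporal_workcase_cardinalities(log):
    """Compute the cardinalities for the log of one workflow model.

    `log` is a list of traces; each trace is a list of
    (activity, resource) pairs in execution order.
    """
    # Assumption: a control-path is the ordered tuple of activity names.
    paths = [tuple(activity for activity, _ in trace) for trace in log]
    path_counts = Counter(paths)
    activity_counts = Counter(a for path in paths for a in path)
    resource_counts = Counter(r for trace in log for _, r in trace)
    return {
        # Instance cardinality
        "instance_cardinality": len(log),
        "instance_class_cardinality": len(path_counts),
        "instance_cardinality_per_class": path_counts,
        # Activity cardinality
        "activity_instance_cardinality": activity_counts,
        "activity_class_cardinality": len(activity_counts),
        "activity_instance_cardinality_per_instance": [Counter(p) for p in paths],
        # Organizational resource cardinality
        "resource_instance_cardinality": resource_counts,
        "resource_class_cardinality": len(resource_counts),
        "resource_instance_cardinality_per_instance": [
            Counter(r for _, r in trace) for trace in log
        ],
    }


# Invented toy log: three instances, two of which share a control-path.
toy_log = [
    [("A", "alice"), ("B", "bob"), ("C", "alice")],
    [("A", "alice"), ("B", "bob"), ("C", "alice")],
    [("A", "carol"), ("B", "bob"), ("B", "bob"), ("C", "alice")],
]

result = temporal_workcase_cardinalities(toy_log)
print(result["instance_cardinality"], result["instance_class_cardinality"])
```

In the toy log the instance cardinality is 3 while the instance class cardinality is 2, because the first two traces follow the same control-path; the third trace repeats activity B and therefore forms its own class.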

4. Experimental Results

We derived temporal workcase cardinalities from 10 XES event logs. The datasets are BPI Challenge 2012, BPI Challenge 2015, Hospital Log, Large Bank Transaction Process (LBTP), Receipt Phase of an Environmental Permit Application Process (RP-EPAP), and Review Example Large. BPI Challenge 2012 is a log of a loan application process, and BPI Challenge 2015 is a log of building permit applications over a period of approximately four years. The other XES logs record the processes indicated by their names. Each log has its own attributes that hold information about the process, and each contains a large amount of execution history.

(Table 1) Instance Cardinality and Instances Class Cardinality in 10 Workflow Process Logs

(Fig. 2) 50 Instance Cardinality per Instance Class in BPI Challenge 2012

Fig. 1 shows a workflow warehouse of 50 workflow instances discovered from the 10 workflow process log files. It shows how many events (executions of activities) occur in each of the 50 workflow instances in each workflow log. In Fig. 1, we can see that the Large Bank Transaction Process has about 60 events per instance, and that in the Hospital Log the number of events varies among the instances.

Table 1 shows the instance cardinality and the instance class cardinality for each of the 10 workflow processes. The instance cardinality is the number of traces in the log file, and the instance class cardinality is the number of distinct control-paths in the log file. In Table 1, we can see that BPI Challenge 2012, RP-EPAP, and Review Example Large have many duplicated control-paths, whereas LBTP has no duplicated control-path; in other words, every control-path in LBTP is different. Fig. 2 shows the instance cardinality per instance class for 50 instance classes in BPI Challenge 2012. In Fig. 2, we can see that instance classes number 4 and number 14 have many duplicated instances in BPI Challenge 2012; together they account for about 38% of all instances. With this kind of information, the instance cardinality, we can determine the proportion of loop cases or disjunctive cases.
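A share such as the 38% reported above follows directly from the instance cardinality per instance class. The following minimal Python sketch shows the arithmetic on an invented toy log; the function name and data are hypothetical, and control-paths are again assumed to be compared as exact activity sequences.

```python
from collections import Counter


def top_class_share(traces, k):
    """Fraction of instances that follow the k most frequent control-paths.

    `traces` is a list of activity-name sequences, one per instance.
    """
    if not traces:
        return 0.0
    class_counts = Counter(tuple(trace) for trace in traces)
    covered = sum(count for _, count in class_counts.most_common(k))
    return covered / len(traces)


# Invented toy data: 5 instances in 3 instance classes.
toy_traces = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "C"],
    ["A", "B", "B", "C"],
    ["A", "B", "C"],
]

print(top_class_share(toy_traces, 1))  # share of the single largest class
```

A log in which every control-path is different, as reported for LBTP, would give a share of exactly `k / len(traces)` for any `k`, since each class then holds a single instance.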

Table 2 shows the activity cardinality discovered from the 10 workflow process log files. Tables 1 and 2 together show that logs with a small number of activities tend to have a large proportion of duplicated instance classes. LBTP is an exception: although it has a small number of activities, it has no duplicated instance class. Fig. 3 shows the activity instance cardinality in the BPI Challenge 2012 log file, that is, the frequency of each activity. It is clear that the two activities named ‘W_Completeren aanvraag’ and ‘W_Nabellen offertes’ were performed many more times than the other activities in BPI Challenge 2012. Fig. 4 shows the activity instance cardinality per instance for the BPI Challenge 2012 log file. In Fig. 4, we can see that three activities are repeated many times among the 30 instances: ‘3 = W_Completeren aanvraag’ (yellow), ‘9 = W_Nabellen offertes’ (brown), and ‘22 = W_Nabellen incomplete dossiers’ (dark blue). Instance No. 25 even executes activity 22 almost 50 times. With the activity cardinality, we can identify which activity is executed most often and how many times it occurs; such an activity is probably among the most important in the process.

(Table 2) Activity Cardinality in 10 Workflow Process Logs

(Fig. 3) Activity Instance Cardinality in the BPI Challenge 2012

(Fig. 4) 30 Activity Instance Cardinality per Instance of BPI Challenge 2012

(Fig. 5) Organizational Resource Frequency of 50 Workflow Instances Corresponding to 9 Workflow Processes

Fig. 5 shows the number of organizational resource classes in each event trace. Fig. 5 covers 9 workflow process log files, a slightly different set from the one used for the instance cardinality and the activity cardinality. This is because the Large Bank Transaction Process is a simulated log whose organizational resource information is set to ‘null’. We therefore excluded the Large Bank Transaction Process log and used the 9 other workflow process log files. Using the organizational resource cardinality, we can see how many organizational resources are involved in each event trace and which one participates most in the process. In Fig. 5, there is a clear variance in organizational resources among the workflow instances in BPI Challenge 2012 and Hospital Log, whereas in RP-EPAP very few organizational resources participate in the process. Furthermore, with this information, we can analyze who is in charge of a job and who is consuming the time in it.
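The per-trace count of organizational resource classes, and the exclusion of a log that carries no resource information, can be sketched as follows. This is a minimal illustration under stated assumptions: the text says only that the resource field of the simulated log is set to ‘null’, so treating `None`, an empty string, and the literal string "null" as missing is a choice made for this example, and the function names and toy data are hypothetical.

```python
# Assumption: these values all mean "no resource recorded".
MISSING = (None, "", "null")


def resource_classes_per_trace(log):
    """Number of distinct organizational resources in each trace.

    `log` is a list of traces; each trace is a list of
    (activity, resource) pairs.
    """
    return [
        len({resource for _, resource in trace if resource not in MISSING})
        for trace in log
    ]


def has_resource_info(log):
    """True if at least one event in the log names a resource."""
    return any(count > 0 for count in resource_classes_per_trace(log))


# Invented toy logs.
normal_log = [
    [("A", "u1"), ("B", "u2"), ("C", "u1")],
    [("A", "u3")],
]
simulated_log = [
    [("A", "null"), ("B", "null")],
    [("A", None)],
]

usable = {
    name: log
    for name, log in {"normal": normal_log, "simulated": simulated_log}.items()
    if has_resource_info(log)
}
print(resource_classes_per_trace(normal_log), sorted(usable))
```

Filtering on `has_resource_info` before plotting mirrors the step described above, where one of the 10 logs is dropped and the remaining 9 are compared.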

5. Conclusions and Future Work

In this paper, we defined the instance cardinality, the activity cardinality, and the organizational resource cardinality, which together form the temporal workcase cardinalities. Instance cardinality describes the occurrence of workflow process instances, that is, executions of the workflow process model. The activity cardinality and the organizational resource cardinality describe the occurrence of activities and of organizational resources, respectively. We visualized the temporal workcase cardinalities discovered from workflow event logs. With these cardinalities, we expect to obtain useful knowledge such as patterns in the control flow of the process, frequently executed events, and frequently working performers. Furthermore, we expect to be able to predict the original process model using only the workflow event logs.

References

  1. Hyung-Jin Ahn, and Kwang-Hoon Kim, "Design and Implementation of a Very Large-Scale Workflow Management System," Journal of Internet Computing and Services, vol. 10, no. 6, pp. 205-218, 2009.
  2. WfMC, "Business Process Analytics Format (BPAF)", Workflow Management Coalition Workflow Standard Document Number WfMC-TC-1015, 2008.
  3. IEEE, Draft Standard for XES - eXtensible Event Stream - for achieving interoperability in event logs and event streams.
  4. Christian W. Gunther and Eric Verbeek, "XES Standard Definition", Technische Universiteit Eindhoven University of Technology, Netherlands, 2014.
  5. Acampora, Giovanni, et al., "IEEE 1849: The XES Standard: The Second IEEE Standard Sponsored by IEEE Computational Intelligence Society [Society Briefs]", IEEE Computational Intelligence Magazine, 12.2: 4-8, 2017. https://doi.org/10.1109/MCI.2017.2670420
  6. Daniela Grigori, Fabio Casati, Malu Castellanos, Umeshwar Dayal, Mehmet Sayal and Ming-Chien Shan, "Business Process Intelligence," Journal of Computers in Industry, Vol. 53, Issue 3, 2004.
  7. Fabio Casati, et al, "Business Process Intelligence," Technical Report, HPL-2002-119, HP Laboratories Palo Alto, 2002.
  8. Kwang-Hoon Kim and Clarence A. Ellis, "Workflow Reduction for Reachable-path Rediscovery in Workflow Mining," Series of Studies in Computational Intelligence: the Foundations and Novel Approaches in Data Mining, Vol. 9, pp. 289-310, Springer, 2006.
  9. Aalst, W.P.M., Alves de Medeiros, A.K., and Weijters, A.J.M.M., "Process Equivalence: Comparing Two Process Models Based on Observed Behavior", BPM2006, Lecture Notes in Computer Science, Vol. 4012, pp. 129-144, 2006.
  10. Ahn, H. and Kim, K., "A Stochastic Activity-to-Performer Affiliation Binding Formalism in ICN-based Workflow Models," ICICI Express Letters, Vol. 9, No. 12, 2015.
  11. Min-Joon Kim, Hyun Ahn and Minjae Park, "A GraphML-based Visualization Framework for Workflow-Performers' Closeness Centrality Measurements," KSII Transactions on Internet and Information Systems, vol. 9, no. 8, pp. 3216-3230, 2015. https://doi.org/10.3837/tiis.2015.08.028.
  12. Min-Joon Kim, Hyun Ahn and Min-Jae Park, "A Theoretical Framework for Closeness Centralization Measurements in a Workflow-Supported Organization," KSII Transactions on Internet and Information Systems, vol. 9, no. 9, pp. 3611-3634, 2015. https://doi.org/10.3837/tiis.2015.09.018.
  13. Haksung Kim, Hyun Ahn and Kwanghoon Pio Kim, "Modeling, Discovering, and Visualizing Workflow Performer-Role Affiliation Networking Knowledge," KSII Transactions on Internet and Information Systems, vol. 8, no. 2, pp. 689-706, 2014. https://doi.org/10.3837/tiis.2014.02.022.
  14. Jawon Kim, Hyun Ahn, Minjae Park, Sangguen Kim and Kwanghoon Pio Kim, "An Estimated Closeness Centrality Ranking Algorithm and Its Performance Analysis in Large-Scale Workflow-supported Social Networks," KSII Transactions on Internet and Information Systems, vol. 10, no. 3, pp. 1454-1466, 2016. https://doi.org/10.3837/tiis.2016.03.031.
  15. Daeyong Jung, Taeweon Suh, Heonchang Yu and JoonMin Gil, "A Workflow Scheduling Technique Using Genetic Algorithm in Spot Instance-Based Cloud," KSII Transactions on Internet and Information Systems, vol. 8, no. 9, pp. 3126-3145, 2014. https://doi.org/10.3837/tiis.2014.09.010.
  16. Pham Dinhlam, Hyun Ahn, and Kwanghoon Pio Kim, "A Temporal Work Transference Discovery Algorithm and Experimental Results on XES-Formatted Workflow Logs", 2018.
  17. Kim Sang-Bae, Kim Hak-Seong, and Paik Su-Ki, "A Workcase Mining Mechanism using Activity Dependency," Journal of Internet Computing and Services, vol. 4, no. 6, pp. 43-56, 2003.
  18. Min Jun-Ki, Kim Kwang-Hoon, and Chung Jung-Su, "A Control Path Analysis Mechanism for Workflow Mining," Journal of Internet Computing and Services, vol. 7, no. 1, pp. 91-100, 2006.
  19. Kyoungsook Kim, Moonsuk Yeon, Byeongsoo Jeong and Kwanghoon Kim, "A Conceptual Approach for Discovering Proportions of Disjunctive Routing Patterns in a Business Process Model," KSII Transactions on Internet and Information Systems, vol. 11, no. 2, pp. 1148-1161, 2017. https://doi.org/10.3837/tiis.2017.02.030.