Support describes the number of independent observations stored in the event log, i.e., the support of a log is the number of cases logged. From the point of view of the mining algorithms, a high level of support is desirable. In real situations, however, this is not always the case. The influence that low support has on the quality and correctness of the results of a mining algorithm should, thus, be taken into consideration.
The level of detail of an event log captures the average number of distinct events per trace. Event logs characterized by a high level of detail increase the complexity of the process model discovered by the mining algorithms, which subsequently has a negative impact on the readability of the diagram.
Table 6.1 presents the values of the three selected structural log metrics for the hospital and the municipality event logs. Compared to the municipality log, the hospital event log is characterized by large values for the magnitude and the level of detail. The municipality log contains 374 cases, while the hospital log contains 1,143, as given by the support values.
Table 6.1: Structural log metrics
Event Log       Magnitude   Support   Level of Detail
Hospital          150,291     1,143            113.12
Municipality        9,174       374              3.59
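The three metrics in Table 6.1 can be illustrated with a minimal sketch. It assumes that the magnitude is the total number of events in the log (its definition lies outside this passage) and that "distinct events per trace" means distinct activity names per case. The function name `structural_metrics` and the toy log are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def structural_metrics(events):
    """Compute (magnitude, support, level of detail) for an event log.

    events: iterable of (case_id, activity) pairs, one pair per logged event.
    Assumption: magnitude = total number of events in the log.
    Assumption: level of detail = average number of distinct activity
    names per case.
    """
    activities_per_case = defaultdict(set)
    magnitude = 0
    for case_id, activity in events:
        magnitude += 1
        activities_per_case[case_id].add(activity)

    support = len(activities_per_case)  # number of cases
    if support == 0:
        return 0, 0, 0.0
    distinct_total = sum(len(acts) for acts in activities_per_case.values())
    return magnitude, support, distinct_total / support


# Hypothetical toy log: two cases, seven events.
toy_log = [
    ("c1", "register"), ("c1", "consult"), ("c1", "consult"),
    ("c2", "register"), ("c2", "lab test"), ("c2", "consult"),
    ("c2", "discharge"),
]
metrics = structural_metrics(toy_log)
print(metrics)
```

Note that the level of detail is not simply the magnitude divided by the support: repeated activities within a case increase the magnitude but not the number of distinct activities.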
6.2 Hospital Event Log
The hospital event log originates from a Dutch academic hospital. Like other healthcare processes, the underlying process is highly unstructured and flexible, and it exhibits a large amount of different behavior, as suggested by the 981 possible paths. In this situation, discovering a process model that covers all the observed behavior and then analyzing it is not feasible. However, the process can be analyzed from different perspectives, and insights can be gained by applying various types of process mining techniques. Our purpose is, therefore, to discover what information about the process can be obtained using ARIS PPM and Futura Reflect.
As previously stated, analyzing the complete behavior observed in the event log is not realistic, since the resulting process would be a “spaghetti process”. However, one can get a grasp of the control flow perspective of the process by analyzing the process model corresponding to the most frequent behavior.
ARIS PPM makes it possible to filter out the less frequent arcs connecting the activities in the activity sequence representation, based on a threshold set by the user. Figure 6.1 depicts the activity sequence obtained by filtering out all the arcs with a frequency of less than 1,000. If all arcs connected to an activity are removed, the activity is removed from the diagram as well, which reduces the model's complexity. If, however, a loose activity is kept in the model (e.g. “vervolgconsult poliklinisch”), it is because the activity is involved in a self loop. The approach used for reducing the complexity of the process is, thus, to remove individual arcs based on their frequency. This implies that the complete flow of cases is not preserved, as only the most dominant following relations remain visible.
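The arc-based simplification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ARIS PPM implementation: arcs are assumed to be directly-follows relations counted over all traces, and the function names `directly_follows` and `filter_arcs` as well as the toy traces are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def filter_arcs(arc_counts, threshold):
    """Drop arcs with a frequency below the threshold.

    An activity survives only if at least one of its arcs survives;
    a self loop (a, a) counts, which is why an otherwise loose
    activity can remain in the simplified model.
    """
    kept = {arc: n for arc, n in arc_counts.items() if n >= threshold}
    activities = {activity for arc in kept for activity in arc}
    return kept, activities


# Hypothetical toy traces (activity names are placeholders).
traces = [["A", "B", "C"]] * 3 + [["A", "D", "E", "E", "E", "E"]]
kept, activities = filter_arcs(directly_follows(traces), threshold=3)
print(kept)
print(sorted(activities))
```

In this toy example, activity D disappears because both of its arcs occur only once, while E stays as a loose activity thanks to its frequent self loop. The single case that visits D and E is no longer represented as a complete path, which mirrors the loss of case flow noted above.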
60 CHAPTER 6. DISCUSSION
Figure 6.1: ARIS PPM - Simplified model of the hospital process containing only the arcs with frequencies greater than 1,000
In Futura Reflect, infrequent behavior can be filtered out by using the “Unique paths” filter, with the activity name as the attribute used to compute the paths. Figure 6.2 represents the model discovered by the Explore miner based on the 20% most frequent cases. The approach used by Futura Reflect is, therefore, a case-based one. However, in situations in which all cases are unique, the cases included in the analysis would be selected randomly. The quality of the simplified model in such complex situations cannot, thus, be trusted.
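A case-based filter of this kind can be sketched as follows. The internals of the “Unique paths” filter are not documented here, so the sketch rests on assumptions: cases are grouped by their complete path, paths are ranked by the number of cases that follow them, and whole groups are selected until the requested fraction of cases is reached. The function name `most_frequent_cases` and the toy logs are hypothetical.

```python
from collections import defaultdict


def most_frequent_cases(traces, fraction):
    """Select the cases that follow the most frequent paths.

    traces: dict mapping case_id -> list of attribute values
            (e.g. activity names) in event order.
    fraction: share of cases to keep, e.g. 0.2 for 20%.
    Assumption: whole path groups are taken, so the selection may
    slightly exceed the requested share.
    """
    variants = defaultdict(list)
    for case_id, trace in traces.items():
        variants[tuple(trace)].append(case_id)

    quota = max(1, round(fraction * len(traces)))
    # Ties keep their order of first appearance (the sort is stable),
    # so the choice among equally frequent paths is arbitrary.
    ranked = sorted(variants.values(), key=len, reverse=True)

    selected = []
    for case_ids in ranked:
        if len(selected) >= quota:
            break
        selected.extend(case_ids)
    return selected


# Hypothetical log: two cases share a path, eight cases are unique.
log = {
    "c1": ["A", "B", "C"], "c2": ["A", "B", "C"],
    "c3": ["A", "C"], "c4": ["A", "D"], "c5": ["B", "C"],
    "c6": ["B", "D"], "c7": ["C", "D"], "c8": ["A", "B", "D"],
    "c9": ["A", "C", "D"], "c10": ["B", "C", "D"],
}
selected = most_frequent_cases(log, 0.2)
print(selected)

# Hypothetical log in which every case follows its own path.
all_unique = {"u1": ["A"], "u2": ["B"], "u3": ["C"], "u4": ["D"], "u5": ["E"]}
arbitrary = most_frequent_cases(all_unique, 0.2)
print(arbitrary)
```

The second call illustrates the concern raised above: when every path occurs exactly once, frequency provides no ranking, and which cases end up in the selection depends on an arbitrary tie-breaking rule.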
The different approaches employed by ARIS PPM and Futura Reflect for reducing the complexity of a process model are also reflected in the resulting models. The two simplified processes (Figures 6.1 and 6.2) consist of completely different sets of activities.
Both systems are capable of discovering the social network of resources based on the handovers of work. In this healthcare process there are 43 different organizational units involved. Its complete social network diagram, with all the units and all the connections between them, would be unreadable and impossible to use for any analysis. This is why there is the need for mechanisms that provide less complex versions, based on a filtered event log or only on a subset of resources.
Figure 6.3 represents the social network of organizational units displayed in ARIS PPM, obtained by applying the same principle of filtering out the connections with a frequency of less than 1,000. Unlike the activity sequence representation of the process model, where an activity was also deleted from the diagram if all arcs connected to it were removed, all organizational units are kept in the network. By default, the nodes of the social network are laid out on a circle, but the user can rearrange them or aggregate multiple nodes based on the communication structure. Additional insights into the resource perspective of the process can be gained by looking at the matrix of activities and organizational units. Here, too, the less frequent mappings between activities and organizational units can be left out by setting a threshold value. The social network depicted in Figure 6.3 suggests that the clinical chemistry department frequently interacts with the other departments involved in the process.
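The thresholded handover-of-work network can be sketched in the same spirit. This is an illustration under assumptions, not the ARIS PPM implementation: a handover is assumed to occur whenever two consecutive events of a case are performed by different organizational units, and consecutive events within the same unit are not counted. The function name `handover_network` and all unit names other than clinical chemistry are hypothetical.

```python
from collections import Counter


def handover_network(traces, threshold):
    """Build a handover-of-work network between organizational units.

    traces: list of lists, each holding the organizational unit of
            every event of one case, in event order.
    Returns (nodes, edges). Every unit stays in nodes, even if all of
    its connections fall below the threshold.
    """
    nodes = set()
    handovers = Counter()
    for units in traces:
        nodes.update(units)
        for src, dst in zip(units, units[1:]):
            if src != dst:  # assumption: work staying in a unit is no handover
                handovers[(src, dst)] += 1

    edges = {pair: n for pair, n in handovers.items() if n >= threshold}
    return nodes, edges


# Hypothetical toy data.
traces = (
    [["Radiology", "Clinical chemistry", "Radiology"]] * 2
    + [["Pathology", "Clinical chemistry"]]
)
nodes, edges = handover_network(traces, threshold=2)
print(sorted(nodes))
print(edges)
```

In the toy data, Pathology remains a node although its only connection is filtered out, which corresponds to the behavior described for Figure 6.3.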
Figure 6.2: Futura Reflect - Simplified model of the hospital process built based on the 20% most frequent cases
Figure 6.3: ARIS PPM - Simplified social network of the hospital process containing only the relations with frequencies greater than 1,000
built only on the 20% most frequent cases. The result is depicted in Figure 6.4. The second option is to generate the social network based on the handovers of work between the resources that perform activities most frequently. This is also done by using the “Unique paths” filter, but the name of the originator needs to be set as the attribute used for computing the paths. Figure 6.5 shows the resulting network. The two social networks provide different types of insights into the process (the first one by having the frequent control flow behavior as starting point and the second one by considering the most frequent behavior based on the resource perspective), which is also suggested by the fact that their diagrams have few similarities.
Figure 6.4: Futura Reflect - Simplified social network of the hospital process built based on the 20% most frequent cases
Figure 6.5: Futura Reflect - Simplified social network of the hospital process built based on the resources that perform activities most frequently
The hospital process could not be analyzed using the other two evaluated tools. ProcessAnalyzer produced a “System out of memory” exception when mining the event log, and we were not able to import the event log into Flow due to the 65,536-row limit of Microsoft Office Excel 2003 files.