Chapter 1: Understanding ageing
1.2. Ageing: definitions and characteristics
Several techniques and approaches have been implemented for process mining, such as Alpha miner, Heuristic miner and Inductive miner. According to [9], the differences between process mining techniques lie mainly in the adopted process discovery method and in how this method addresses two major aspects: representational bias, and noise and incompleteness.
The first aspect is representational bias, which refers to the capability of a modelling representation language to represent various process structures. A number of modelling languages can express and model processes, such as Petri nets (the most prominent language in the ProM tool), BPMN (Business Process Modelling Notation), heuristic nets, also known as causal nets (C-nets), and process trees. For more detail about these languages see [51]. Each of these languages has its own strengths and limitations; however, they all aim for high expressiveness across various process structures. Process structures, sometimes called constructs, represent the type of relation between events. A sample of fundamental process structures is outlined in Table 2.1.
Table 2.1: Examples of fundamental process structures
Process structure | Definition | Synonym
Sequence | event x is followed by event y | -
Parallel | event x is followed by some events, for instance event y and event z, regardless of their order | AND, fork
Choice | event x is followed by one or more events: event y, event z, or both y and z | OR
Exclusive choice | event x is followed by either event y or event z | XOR
Loop | event x is followed by event x at least one time | iteration, cycle
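The definitions in Table 2.1 can be made concrete by listing the event sequences each structure allows. The following is a minimal sketch only: the function names and the tuple-based trace representation are illustrative assumptions, not part of any modelling language or tool discussed here.

```python
from itertools import permutations

# Each function returns the set of traces (tuples of event names) that the
# structure permits. All names below are hypothetical, for illustration only.

def sequence(x, y):
    """Sequence: x is followed by y."""
    return {(x, y)}

def parallel(x, ys):
    """Parallel (AND): x is followed by all events in ys, in any order."""
    return {(x, *p) for p in permutations(ys)}

def choice(x, ys):
    """Choice (OR): x is followed by one or more of the events in ys."""
    traces = set()
    for size in range(1, len(ys) + 1):
        traces |= {(x, *p) for p in permutations(ys, size)}
    return traces

def exclusive_choice(x, ys):
    """Exclusive choice (XOR): x is followed by exactly one event in ys."""
    return {(x, y) for y in ys}

def loop(x, max_repeats):
    """Loop: x is followed by x at least once (bounded here for listing)."""
    return {(x,) * (n + 1) for n in range(1, max_repeats + 1)}

and_traces = parallel("x", ("y", "z"))
or_traces = choice("x", ("y", "z"))
xor_traces = exclusive_choice("x", ("y", "z"))
loop_traces = loop("x", 2)
```

Under these assumptions, the exclusive choice allows only `x y` and `x z`, whereas the choice additionally allows both orderings of `y` and `z` after `x`.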
It should be noted that there is a workflow pattern initiative¹, supported by Eindhoven University of Technology and Queensland University of Technology, which aims to identify all possible process topologies or structures.
These topologies are drawn from process mining and workflow management research, and the initiative has categorised them by their relevant mining perspective, for instance process structures that are captured when mining the control-flow, data or resource perspectives. This initiative helps in understanding the representational bias of various modelling languages.
The second aspect relates to noise and assumptions of incompleteness. Methods designed for process discovery are presumed to have mechanisms for coping with noise and incompleteness in the event log. Noise in this context, as defined by Wil van der Aalst in his book [9], means infrequent process instances in the event log that are very dissimilar to the mainstream process. Incompleteness relates to the ability of a process discovery method to discover a generalisable model that reflects all process instances present in the event log as well as other possible, very similar process instances that might be missing from the log.
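One simple way to picture noise handling is to drop process instances whose exact event sequence (variant) is rare. The sketch below is an illustration under assumptions, not the mechanism of any particular miner: the list-of-lists log format, the function name and the `min_share` threshold are all hypothetical.

```python
from collections import Counter

def filter_infrequent_variants(log, min_share=0.1):
    """Keep traces whose variant accounts for at least min_share of the log.

    `log` is assumed to be a list of traces, each a list of event names.
    """
    variants = Counter(tuple(trace) for trace in log)
    total = sum(variants.values())
    kept = {v for v, n in variants.items() if n / total >= min_share}
    return [list(trace) for trace in log if tuple(trace) in kept]

# Toy log: one mainstream variant and two rare, dissimilar ones.
log = [["a", "b", "c"]] * 18 + [["a", "c", "b"]] + [["c", "a", "d"]]
filtered = filter_infrequent_variants(log, min_share=0.1)
```

The choice of threshold is a trade-off: a higher value removes more noise but may also discard legitimate yet rare behaviour.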
2.5.1 Alpha miner
Alpha miner [13] was the first process mining algorithm that attempted to discover a process model and to bridge the knowledge gap between event logs and business modelling. The main idea of Alpha miner is to scan all process instances of the event log to find possible relations between events and use them to build what is called a “footprint” matrix. The basic Alpha miner algorithm was able to detect only four basic relations: the direct follow relation (event ‘a’ is followed directly by event ‘b’); the no direct follow relation (event ‘a’ is never followed directly by event ‘b’); the dependency relation, a special type of direct follow (event ‘a’ is followed directly by event ‘b’ and event ‘b’ is never followed directly by event ‘a’, hence there is a dependency between ‘a’ and ‘b’); and the parallel relation discussed above.
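The footprint idea can be sketched in a few lines. This is a simplified illustration that assumes the log is a list of traces (lists of event names) and uses hypothetical symbols for the relations: `->` for dependency, `<-` for its reverse, `||` for parallel and `#` for no direct follow. It covers only the footprint step, not the construction of the Petri net.

```python
from itertools import product

def directly_follows(log):
    """Set of pairs (a, b) where a is directly followed by b in some trace."""
    return {(x, y) for trace in log for x, y in zip(trace, trace[1:])}

def footprint(log):
    """Build the footprint matrix as a dict keyed by (row, column) events."""
    follows = directly_follows(log)
    activities = sorted({event for trace in log for event in trace})
    matrix = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        if ab and ba:
            matrix[(a, b)] = "||"   # observed in both orders: parallel
        elif ab:
            matrix[(a, b)] = "->"   # a before b only: dependency
        elif ba:
            matrix[(a, b)] = "<-"   # b before a only: reverse dependency
        else:
            matrix[(a, b)] = "#"    # never directly follow each other
    return matrix

log = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "e", "d"]]
fp = footprint(log)
```

Because every trace contributes equally to `follows`, a single rare trace can change a cell of the matrix, which illustrates why this approach is sensitive to infrequent process instances.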
Alpha miner uses the Petri net modelling language, which supports simple and understandable notations. It has been improved several times and extended to include more advanced process structures, for instance loop and XOR.
Although Alpha miner can produce a simple and understandable model in the form of a Petri net, model quality is low in terms of the fitness and precision metrics, which will be discussed later in this chapter. Also, Alpha miner cannot handle infrequent process instances, since all process instances are used to build the “footprint” matrix from which the process model is built.
2.5.2 Heuristic miner
Heuristic miner [52] was designed to deal with noise and incompleteness in event logs. It focuses on extracting the relations between events, for instance finding the dependency between two events.
Constructing a process model using Heuristic miner can be achieved in three main steps. The first step involves extracting dependency and frequency information between events (for example, finding how often event ‘a’ is followed by event ‘b’). The second step requires the construction of a graph based on this dependency and frequency information. In other words, rules are derived from the dependency and frequency information based on a predefined threshold on how often relations occur. This step shows how Heuristic miner can deal with noise, that is, infrequent process instances, in event logs. The third step calls for the design of a process model based on the result of the second step. Heuristic miner uses causal nets (C-nets) as its representation modelling language.
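The first two steps can be sketched as follows. The dependency measure used here, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), is the one commonly associated with the Heuristic miner literature; the text above does not state a formula, so treat it, the 0.8 threshold and all function names as assumptions of this sketch. Self-loops, which need a separate measure, are skipped.

```python
from collections import Counter

def directly_follows_counts(log):
    """Step 1: count how often each event is directly followed by another."""
    counts = Counter()
    for trace in log:
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] += 1
    return counts

def dependency(counts, a, b):
    """Dependency score in (-1, 1); values near 1 suggest a causes b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)

def dependency_graph(log, threshold=0.8):
    """Step 2: keep only edges whose dependency score meets the threshold."""
    counts = directly_follows_counts(log)
    graph = {}
    for a, b in list(counts):
        if a == b:
            continue  # self-loops are out of scope for this sketch
        score = dependency(counts, a, b)
        if score >= threshold:
            graph[(a, b)] = score
    return graph

# 20 mainstream traces and one noisy trace with 'b' and 'c' swapped.
log = [["a", "b", "c"]] * 20 + [["a", "c", "b"]]
graph = dependency_graph(log, threshold=0.8)
```

In this toy log the single swapped trace produces low or negative scores for `a -> c` and `c -> b`, so those edges fall below the threshold and are left out of the graph.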
A further improvement of Heuristic miner uses the time perspective to construct a causal dependency matrix. For instance, in some sequences of a log event ‘a’ is followed by event ‘b’, but in other sequences event ‘b’ occurs before event ‘a’ has finished. This means there is no causal dependency between events ‘a’ and ‘b’. Although Heuristic miner attempts to eliminate low-frequency sequences (noise), it can be impractical because it may generate an unreadable model for logs with a high number of events. In addition, it may generate an unsound model, that is, a model in which a fired transition cannot reach the end of the process, which violates model quality. The overall quality of process models generated by Heuristic miner depends highly on how the removal of infrequent sequences is configured, since this affects the extracted dependency relations [53].
2.5.3 Inductive miner
Inductive miner [54] was created to explore processes through the support of different configurations of the process model. It was also developed to cope with model unsoundness, which is a major limitation of Heuristic miner [55]. It applies a divide and conquer technique by recursively splitting the log into sub-logs. This is done by finding suitable cut-off relations such as sequence, exclusive choice and parallel. Moreover, Inductive miner supports a visualisation of deviating sequences with their frequencies, in addition to a number of techniques for filtering process instances. Models discovered by Inductive miner are built in the form of a process tree, which in turn can be converted to a Petri net.
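A single divide step can be sketched for the exclusive choice cut-off relation. The approach below, which groups events into connected components of the undirected directly-follows graph and then splits the log by group, is a simplified illustration under assumptions; the function names are hypothetical, and the full algorithm also searches for sequence, parallel and loop cuts and recurses on each sub-log.

```python
def directly_follows(log):
    """Set of pairs (a, b) where a is directly followed by b in some trace."""
    return {(x, y) for trace in log for x, y in zip(trace, trace[1:])}

def exclusive_choice_cut(log):
    """Group events into connected components of the undirected graph.

    More than one component means the groups never interact, so the log
    can be split as an exclusive choice between them.
    """
    activities = {event for trace in log for event in trace}
    neighbours = {a: set() for a in activities}
    for x, y in directly_follows(log):
        neighbours[x].add(y)
        neighbours[y].add(x)
    groups, seen = [], set()
    for start in sorted(activities):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        groups.append(component)
    return groups

def split_log(log, groups):
    """Assign each non-empty trace to the group containing its first event."""
    return [[t for t in log if t and t[0] in g] for g in groups]

log = [["a", "b"], ["a", "b"], ["c", "d"], ["c", "d", "d"]]
groups = exclusive_choice_cut(log)
sublogs = split_log(log, groups)
```

Splitting by the first event is safe here because, within one trace, consecutive events always belong to the same connected component.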
The most advantageous feature of Inductive miner is its ability to replay all process instances, which guarantees high model fitness. On the other hand, it has a limited number of cut-off relations and a problematic representation of long dependencies between events. From our experience and based on [55], in the case of large event logs with highly variable process instances, Inductive miner may generate a useless model because only imprecise cut-off points can be found. The discovered model will then take the form of a ‘flower model’, as described by [55], where all transitions between events are allowed.
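The weakness of a flower model can be shown with a toy replay check. The sketch below assumes a flower model is represented simply by its set of events and that a trace "replays" if it uses only those events; the names are hypothetical, and this is not a real fitness or precision computation.

```python
def flower_accepts(trace, activities):
    """A flower model allows any ordering of its events, any number of times."""
    return all(event in activities for event in trace)

log = [["a", "b", "c"], ["a", "c", "b"]]
activities = {event for trace in log for event in trace}

# Every logged trace replays, so the share of replayable traces is 1.0.
replayable_share = sum(flower_accepts(t, activities) for t in log) / len(log)

# A trace never observed in the log is accepted as well.
unseen_accepted = flower_accepts(["c", "c", "a", "a"], activities)
```

The model replays the whole log but also accepts behaviour that was never observed, which is why high fitness alone does not make it useful.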