misation are not necessarily Pareto optimal solutions with regard to the search space. In this research we refer to objectives as criteria, which are discussed in the following section.
5.3 Proposed criteria for model selection
In a multi-objective function, each function corresponds to a particular criterion that is significant in model selection. Despite the exploratory nature of our experiments in Chapter 4, the findings offered some insights into desirable model characteristics. We identify four key criteria that play significant roles in selecting the best process model: linearity, state compactness, cross state similarity and state importance. The rationale for proposing these criteria and the methods for calculating them are discussed below.
5.3.1 Rationale behind the proposed criteria
In order to avoid or mitigate the issues found in models selected by BIC, we need to understand the factors that may control such issues. We are motivated to adopt the principles of the ideal model selected in experiment 1 in Chapter 4, because of its good characteristics for modelling processes, as discussed earlier.
Investigating the structure of the hidden states of our desirable model has revealed important insights into the effect of increasing the number of states in a model. Figure 5.3 shows the projection of hidden states over sequences for models with different numbers of states. We used the event log that generated the ideal model with 3 states, which is characterised in Table 4.1. It has simple, tractable processes, as shown in Figure 5.3 (a), Accident and Emergency (A&E) fictional observations, which helps us analyse the process qualitatively. Iterative decoding using the Viterbi algorithm is applied, starting from 2 hidden states and going up to 9 hidden states, which is the total number of distinct events; the results are presented in Figure 5.3 (b - i).
[Figure 5.3. Panels: (a) Accident and Emergency room fictional processes; (b) 2-state model; (c) 3-state model; (d) 4-state model; (e) 5-state model; (f) 6-state model; (g) 7-state model; (h) 8-state model; (i) 9-state model.]
In Figure 5.3 (b), two hidden states are used, which results in poor process modelling. Too few states lead to a model with a high inverse transition rate (an unstable model), where a process changes rapidly from one state to another. Moreover, state 1 and state 2 contain groups of highly different events; for example, the events that occur in state 1 are arrive, seen by clinician, request a bed, admission to ward and also discharge. This gives both states a high variance, which contributes to under-fitting.
In Figure 5.3 (c), three states are used for training. This model represents the natural flow of the process which helps in providing a good segmentation of the process into blocks of events. The model is stable which results in a linear flow of the process. There are no overlapping events between states which leads to desirable variance in all states.
In Figure 5.3 (d), the model was trained using four hidden states. Although this model seems to be stable and provides a good segmentation of the process, it has a production state, which is state 1. A production state, as defined earlier, is a state that has a single event type. This type of state is characterised by a very low state variance.
The phenomenon of production states is expected here, since we are dealing with a small-scale event log, and it might be an indication of over-fitting.
In Figure 5.3 (e - i), the models were trained with 5, 6, 7, 8 and 9 hidden states respectively. The production state phenomenon grows across these models, which leads to highly undesirable over-fitted models.
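The production state pattern described above can be checked mechanically once a model has been decoded. The following is a minimal sketch, not the procedure used in this thesis: it assumes each case is available as a list of (event, state) pairs produced by Viterbi decoding, and the function name `production_states` and the toy cases are hypothetical.

```python
from collections import defaultdict


def production_states(decoded_cases):
    """Return the hidden states that emit exactly one distinct event type.

    decoded_cases: list of cases, each a list of (event, state) pairs
    (assumed format, e.g. the output of a Viterbi decoding step).
    """
    events_by_state = defaultdict(set)
    for case in decoded_cases:
        for event, state in case:
            events_by_state[state].add(event)
    return sorted(s for s, events in events_by_state.items() if len(events) == 1)


# Hypothetical toy decoding: state 1 only ever emits "arrive".
decoded_cases = [
    [("arrive", 1), ("seen by clinician", 2), ("discharge", 3)],
    [("arrive", 1), ("seen by clinician", 2), ("request a bed", 3),
     ("admission to ward", 3)],
]
print(production_states(decoded_cases))
```

On this toy input, state 2 also emits a single event type, so both state 1 and state 2 are reported; counting how many states are flagged as the number of hidden states grows gives a rough indicator of the over-fitting trend discussed above.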
5.3.2 The proposed criteria
Taking the previous analysis into consideration, we propose several criteria that may work as control factors. These factors contribute to characterising a good model. The suggested criteria are linearity, state compactness, cross state similarity and state importance.
(a) Linearity
Linearity can be defined as the sequential flow of the process, where staying in the same state is accepted but no inverse flow is allowed. A linear HMM may have either of the well-known structures, 'left-to-right' or 'right-to-left'. More linearity means fewer inverse transitions, which is preferable. The linearity principle is a natural idea that matches the intuitive understanding of healthcare processes. In other words, a patient goes through a series of healthcare steps, starting from the need for healthcare, passing through intermediate investigation and ending with a healthcare outcome. This flow is highlighted by [130] in the description of a general clinical pathway guide that implies movement from one stage to another.
We argue that although a healthcare process model looks complex at first glance, there must be a mainstream pattern of care that is followed. Taking into consideration the nature of the process in terms of the sequential direction of events, and under the assumption of hidden linearity in healthcare processes, it is reasonable to prefer the model that captures the highest linearity between states. This model is anticipated to have the best cut-off points between blocks of care (states) but is not necessarily the best fit for the data.
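As an illustration of the linearity idea, the sketch below scores a set of decoded state sequences by the share of transitions that are not inverse. This is one possible reading of the definition, not the calculation method of this thesis: it assumes a transition is inverse when a case re-enters a state it has already left, which avoids fixing a left-to-right or right-to-left order in advance. The name `linearity_score` and the example sequences are hypothetical.

```python
def linearity_score(state_sequences):
    """Share of consecutive transitions that do not re-enter a state
    the case has already left (1.0 means a perfectly linear flow).

    state_sequences: list of per-case lists of decoded state labels
    (assumed format).
    """
    inverse = 0
    total = 0
    for seq in state_sequences:
        left = set()  # states this case has already moved out of
        for current, nxt in zip(seq, seq[1:]):
            total += 1
            if nxt != current:      # staying in the same state is accepted
                left.add(current)
                if nxt in left:     # returning to an earlier state
                    inverse += 1
    return 1.0 if total == 0 else 1.0 - inverse / total


print(linearity_score([[1, 1, 2, 2, 3]]))  # no inverse flow
print(linearity_score([[1, 2, 1, 2, 3]]))  # two of four transitions go back
```

Under this assumed measure, the unstable 2-state behaviour described for Figure 5.3 (b), where a process moves back and forth between states, would receive a lower score than a model whose cases pass through each state once.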
(b) State compactness
State compactness aims to measure the similarity of the processes within a state. It is an important metric for demonstrating the validity of process clustering and for quantifying state variance. Different internal cluster validation metrics can be used, such as entropy-based metrics; however, we aim to use a metric that is more appropriate for sequence clustering validation. A high compactness score, which means high similarity of processes, is preferable.
(c) Cross state similarity
Cross state similarity aims to measure the similarity of processes between states. This is to ensure relatively distinct states and to reduce the chance of overlapping events between states. Cross state similarity is measured by the number of common nodes and common edges. Models with a high dissimilarity score between states are desirable.
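The counting of common nodes and common edges can be sketched as follows. This is an illustrative assumption rather than the thesis's exact calculation: nodes are taken to be the event types observed in a state, edges are taken to be consecutive event pairs that both fall in that state, and the function names and toy cases are hypothetical.

```python
from itertools import combinations


def state_graphs(decoded_cases):
    """Build, per state, the set of event types (nodes) and the set of
    consecutive event pairs observed inside that state (edges)."""
    nodes, edges = {}, {}
    for case in decoded_cases:
        for event, state in case:
            nodes.setdefault(state, set()).add(event)
            edges.setdefault(state, set())
        for (e1, s1), (e2, s2) in zip(case, case[1:]):
            if s1 == s2:
                edges[s1].add((e1, e2))
    return nodes, edges


def cross_state_overlap(decoded_cases):
    """For every pair of states, count the common nodes and common edges."""
    nodes, edges = state_graphs(decoded_cases)
    return {
        (a, b): (len(nodes[a] & nodes[b]), len(edges[a] & edges[b]))
        for a, b in combinations(sorted(nodes), 2)
    }


# Hypothetical decoding in which "seen by clinician" appears in both states.
decoded_cases = [
    [("arrive", 1), ("seen by clinician", 1), ("request a bed", 2),
     ("admission to ward", 2)],
    [("arrive", 1), ("seen by clinician", 2), ("request a bed", 2),
     ("discharge", 1)],
]
print(cross_state_overlap(decoded_cases))
```

Lower counts for a pair of states indicate more distinct states; how the two counts are combined into a single dissimilarity score is left open here.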
(d) State importance
A state can be defined as significant if it is activated by most of the cases. In this thesis we set the state importance threshold at 50% of cases or more, in order to capture the main process followed by at least half of the patients.
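The 50% threshold can be applied directly to decoded cases. The sketch below is a minimal illustration under the assumption that a state is "activated" by a case when it appears at least once in that case's decoded state sequence; the function name `state_importance` and the example data are hypothetical.

```python
def state_importance(state_sequences, threshold=0.5):
    """Return the share of cases that visit each state and the list of
    states whose share meets the threshold.

    state_sequences: list of per-case lists of decoded state labels
    (assumed format).
    """
    n_cases = len(state_sequences)
    states = sorted({s for seq in state_sequences for s in seq})
    share = {
        s: sum(s in seq for seq in state_sequences) / n_cases for s in states
    }
    important = [s for s in states if share[s] >= threshold]
    return share, important


# Four hypothetical cases; state 4 is visited by only one of them.
share, important = state_importance([[1, 2, 3], [1, 2], [1, 3], [1, 2, 4]])
print(share)
print(important)
```

In this toy example, states 1, 2 and 3 are visited by at least half of the cases and are kept as important, while state 4, visited by a quarter of the cases, is not.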