The sweeping strategy proposed in (Schetinin et al., 2004) exploits a new prior on moves that assign or update the rule of a DT splitting node, so as to decrease the probability of making unavailable moves (the problem with such moves was discussed in the previous section). The idea behind this prior is to use a proposal variable uniformly distributed within the min-max range of the data points assigned to the chosen node. It has been found that this prior is able to prevent DTs from growing excessively, which would impair their ability to generalise to unseen data.

For each birth or change move, the proposal parameters are drawn from the given priors and assigned to a chosen node. The proposed change may leave one or more terminal nodes of the DT with fewer data points than allowed by $p_{\min}$. If such a change is accepted, the sweeping strategy removes the node with too few samples from the DT, and the removal is counted as a death move. If, however, there is more than one such node, the MCMC algorithm marks the proposal as unavailable in order to keep the balance between the death and birth moves.

When the birth move adds a new splitting node with the parameters drawn from the given priors, the MCMC assigns a new splitting variable $svar_i$ as well as a new rule $srule_i$ taken from a uniform distribution over the values of variable $svar_i$ at the partition $l$:

$$srule_i \sim U\left(\min(X^{(l)}),\ \max(X^{(l)})\right), \tag{3.34}$$

where $X^{(l)}$ are the values of the variable $svar_i$ at that partition.
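The draw in Eq. 3.34 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `propose_split_rule`, the toy data and the fixed random seed are hypothetical and are not taken from Schetinin et al. (2004).

```python
import random

def propose_split_rule(values, rng):
    """Draw a threshold uniformly between the smallest and largest value
    of the chosen variable among the data points that reach the node."""
    return rng.uniform(min(values), max(values))

rng = random.Random(0)  # fixed seed, only to make the sketch repeatable

# Hypothetical values of one variable for the data points at the root node.
x_root = [0.2, 1.5, 3.1, 4.8, 7.0]
rule_1 = propose_split_rule(x_root, rng)

# The points sent down one branch define a narrower min-max range,
# so the rule proposed for the next partition is drawn from that range.
x_branch = [v for v in x_root if v < rule_1] or x_root
rule_2 = propose_split_rule(x_branch, rng)
```

Because each rule is drawn inside the range of the data that actually reach the node, a proposed threshold can never fall outside the observed values of the variable at that node.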

According to this prior, the first partition is made over all data points $X^{(1)}$ that arrive at the root node. The second partition is made over a subset of data points, $X^{(2)}$, coming from one of the two branches of the root node. The data are split further as long as the terminal nodes contain at least $2p_{\min}$ data points.

Figure 3.2 illustrates the changes in the boundaries $x_{\min}$ and $x_{\max}$, which are determined by Eq. 3.34, for the first two partitions.

[Figure 3.2 axis labels: $x_{\min}^{(1)}$, $x_{\min}^{(2)}$, $x_{\max}^{(2)}$, $x_{\max}^{(1)}$]

Figure 3.2: Illustration of changes in the boundaries $x_{\min}$ and $x_{\max}$ for the first and second partitions.

During MCMC integration, the birth or change moves can produce a splitting node in which one of the two branches contains fewer data points than $p_{\min}$. If this condition is met for one splitting node, this node is removed. More rarely, this condition is met for a branch with two or more nodes. When this happens, the proposal is marked as unavailable and the MCMC algorithm makes a new proposal.

The above condition can be met for the change move, when one terminal node is removed. In this case, the likelihood of the new DT model can be slightly smaller than that of the previous model, and so the proposed change will most likely be accepted. The sweeping strategy removes from the DT model any node in which the number of data samples falls below $p_{\min}$ after the change move. This strategy is applied during both the burn-in and post burn-in phases.
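The decision rule of the sweeping strategy can be summarised in a minimal Python sketch. It assumes that the number of data points in each terminal node of the proposed tree is already known; the function name `sweep_decision` and the labels it returns are hypothetical and serve only as an illustration.

```python
def sweep_decision(leaf_counts, pmin):
    """Classify a proposed birth or change move by the terminal-node sizes
    it would produce.

    Returns a label and the indices of the undersized terminal nodes:
      'keep'        : every terminal node holds at least pmin points
      'death'       : exactly one node is too small and is removed
      'unavailable' : more than one node is too small, so a new
                      proposal has to be made
    """
    small = [i for i, n in enumerate(leaf_counts) if n < pmin]
    if not small:
        return "keep", small
    if len(small) == 1:
        return "death", small
    return "unavailable", small

# Hypothetical terminal-node sizes after a change move, with pmin = 5.
decision, undersized = sweep_decision([12, 3, 20], pmin=5)
```

In this toy case only the second terminal node falls below the limit, so the move is treated as a death move for that node.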

3.6 Summary

We introduced the Bayesian methodology of averaging over decision tree models and showed that the Markov chain Monte Carlo simulation method allows us to implement this methodology. The analysis of this method has revealed a number of problems, which are mainly related to the variable dimensionality of decision tree models, their hierarchical structure and the large number of possible configurations. The main approaches to these problems were described and reviewed in the light of both the accuracy of approximation and their use in real-world applications.

Chapter 4

Influence of EEG Artefacts

Newborn EEGs are often contaminated by artefacts, which can affect the accuracy of maturity assessments. To improve the accuracy, it is important to detect the artefacts and mark the affected segments for removal. Experts can spend hours recognising the various types of artefacts within the context of changing EEG patterns. The wide variations in artefacts and EEG patterns make it difficult to apply standard rules to artefact detection. In the absence of rules, the manual marking of artefacts may be inconsistent between experts and recordings. The inconsistencies in artefact removal may affect the accuracy of Bayesian assessments.

In this chapter we hypothesise that automated techniques, which remove artefacts consistently in all recordings, will provide better results within Bayesian assessments than manual removal. To test the hypothesis, we explore how the removal of marked artefacts and automatic artefact detection with various techniques improve the accuracy of Bayesian assessments of brain maturity.

The manual and automated artefact removal techniques are discussed in Section 4.1. In Section 4.2, we describe experiments with the artefact removal techniques. The first experiments test whether the removal of marked artefacts improves the assessments; we compare the accuracies on EEG data including artefacts and on clean data after removal of artefacts marked by experts. Next, we test a standard technique of averaging over EEG segments to suppress the influence of artefacts. We then describe and test two techniques for automatic removal of artefacts with abnormally high amplitudes. We summarise the results of the techniques and conclude the chapter.

4.1 Manual and automated artefact removal

EEG artefacts need to be recognised and removed to reduce the chance of mistaken assessments. In the case of visual assessments, the artefacts can be mistaken for an EEG pattern. For example, electrode movement artefacts may be confused with the high-amplitude delta waves characteristic of very pre-term patterns. In the case of automated assessment, the features extracted from contaminated EEG will be affected by artefacts. In particular, the spectral features computed with the FFT, which assumes a stationary signal, may be biased, because the artefacts make EEG data highly non-stationary (Clarencon et al., 1996).

To remove the artefacts, EEG experts analyse the recordings and mark the affected segments. The detection of artefacts is time-consuming and difficult, as the artefacts vary widely in appearance and can occur within various EEG patterns, so that developing and applying standard rules for detection becomes infeasible. In the absence of rules, the marking of artefacts becomes subjective. According to van de Velde et al. (1999), the agreement between two experts marking artefacts in the same recording is on average 76%, whereas one expert analysing the same recording a second time marks only around 80% of the artefacts that were detected the first time. The inconsistencies in the marking of artefacts may affect the accuracy of automated maturity assessment.

Computer-based techniques of EEG artefact removal provide consistent results, and thus we hypothesise that the use of such techniques will improve the accuracy of Bayesian assessments of brain maturity. The artefact removal techniques are typically based on deleting EEG samples with abnormal features. In general, artefacts can be considered as abnormal events whose characteristics are different from those of normal EEG. For example, the artefacts caused by a patient's movements have much higher amplitudes than those of normal EEG (Nolan et al., 2010). Therefore, movement artefacts appear as outliers in the distribution of EEG amplitudes. A simple technique for removing the artefacts is to delete the samples whose amplitudes exceed a threshold given as the mean plus the standard deviation of the EEG amplitude distribution.
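The simple thresholding technique described above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes that the amplitude of a sample is its absolute value and that the population standard deviation is used, and the function name `remove_high_amplitude` and the toy signal are hypothetical.

```python
import statistics

def remove_high_amplitude(samples):
    """Delete samples whose amplitude exceeds mean + standard deviation
    of the amplitude distribution of the whole recording."""
    amplitudes = [abs(s) for s in samples]  # assumption: amplitude = |sample|
    threshold = statistics.mean(amplitudes) + statistics.pstdev(amplitudes)
    kept = [s for s, a in zip(samples, amplitudes) if a <= threshold]
    return kept, threshold

# Hypothetical recording (microvolts) with one movement-like outlier.
eeg = [10.0, -12.0, 11.0, -9.0, 10.0, 250.0, -11.0, 12.0]
clean, threshold = remove_high_amplitude(eeg)
```

In this toy recording the single outlier pulls the threshold well above the normal amplitudes, so only the outlier is deleted. The threshold is computed once for the whole recording, so it does not follow changes in amplitude between EEG patterns.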

A weakness of this technique is that a single threshold is used for the whole recording, and variations of EEG amplitudes over the patterns are not taken into account. This means that in EEG patterns with low dominant amplitudes the artefacts can be missed, whereas in patterns with high amplitudes EEG data can be lost. Therefore, it is desirable to adapt the threshold to EEG variations. In cases where the frequency of the artefact is well defined, the artefact can be removed by filtering out the corresponding frequency band without significant loss of EEG information. For example, a notch filter set to 50 Hz can be used to remove the electrical mains interference (Sanei and Chambers, 2007). Such filtering can be useful if high-frequency EEG waves need to be analysed. In the assessment of newborn EEG, however, frequencies above 30 Hz are not typically considered.
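As an illustration of notch filtering, the sketch below applies a standard second-order IIR notch (a biquad in the widely used "audio EQ cookbook" form) at 50 Hz. The filter design, the sampling rate of 250 Hz, the quality factor of 30 and all names are assumptions made for this example; the source only states that a notch filter set to 50 Hz can be used.

```python
import math

def notch_filter(signal, fs, f0=50.0, q=30.0):
    """Apply a second-order IIR notch centred at f0 (Hz) to a signal
    sampled at fs (Hz). A larger q gives a narrower notch."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b1, b2 = 1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in signal:
        # Direct-form difference equation of the biquad.
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

fs = 250.0  # hypothetical sampling rate in Hz
n = 2000
mains = [math.sin(2.0 * math.pi * 50.0 * t / fs) for t in range(n)]
slow = [math.sin(2.0 * math.pi * 10.0 * t / fs) for t in range(n)]

mains_out = notch_filter(mains, fs)
slow_out = notch_filter(slow, fs)
```

Once the initial transient has decayed, the 50 Hz component is suppressed almost completely, while the 10 Hz component passes with its amplitude nearly unchanged. This is why such a filter removes mains interference without significant loss of the lower-frequency EEG content.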

When EEG has been recorded from multiple channels, Independent Component Analysis (ICA) can be applied to minimise the artefacts. This technique attempts to separate the EEG signal into statistically independent sources. The sources found to be most strongly affected by the artefact are then eliminated, and the remaining sources are mixed to obtain a cleaned EEG signal. This technique has been shown to reduce the influence of artefacts successfully; however, EEG experts have raised concerns that ICA can distort the power spectrum of EEG (Castellanos and Makarov, 2006). ICA-based artefact removal requires the number of EEG channels to be at least the same as the number of sources, and so this technique cannot be applied to recordings with only two channels.

As an alternative to removing the artefacts, another standard approach is to suppress their influence by averaging over EEG features computed in multiple short segments. The averaging suppresses the transient variations and artefacts occurring in the individual segments, and therefore the averaged features are more reliable for EEG analysis (Cooper et al., 2003; Kropotov, 2009). Importantly, the short segments can often be considered as pseudo-stationary, unlike the whole EEG, which is highly variable. This means that the FFT applied to the segments can provide more reliable results. The choice of segment length is a trade-off between frequency resolution and stationarity. Segment lengths from 2 to 20 s are typically chosen (Cooper et al., 2003).
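The averaging of spectral features over short segments can be sketched as below. This is a minimal illustration under stated assumptions: non-overlapping segments without windowing, a naive DFT instead of an optimised FFT to keep the example dependency-free, and a hypothetical test signal (a 4 Hz sine sampled at 64 Hz with one spike-like artefact). None of the names come from the source.

```python
import cmath
import math

def power_spectrum(segment):
    """Power spectrum (bins 0..N/2) of one segment via a naive DFT."""
    n = len(segment)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(segment))
        powers.append(abs(coeff) ** 2 / n)
    return powers

def averaged_spectrum(signal, seg_len):
    """Split the signal into non-overlapping segments of seg_len samples
    and average the power spectra of the segments bin by bin."""
    starts = range(0, len(signal) - seg_len + 1, seg_len)
    spectra = [power_spectrum(signal[i:i + seg_len]) for i in starts]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

fs = 64        # hypothetical sampling rate in Hz
seg_len = 128  # 2 s segments, so the frequency resolution is 0.5 Hz
signal = [math.sin(2.0 * math.pi * 4.0 * t / fs) for t in range(8 * seg_len)]
signal[3 * seg_len + 40] += 20.0  # spike-like artefact in the fourth segment

average = averaged_spectrum(signal, seg_len)
contaminated = power_spectrum(signal[3 * seg_len:4 * seg_len])
peak_hz = max(range(len(average)), key=average.__getitem__) * fs / seg_len
```

The spike spreads its power over all frequency bins of the one segment it falls in. After averaging over eight segments, its contribution to each bin is reduced roughly eightfold, while the 4 Hz rhythm present in every segment is preserved. The frequency resolution equals `fs / seg_len`, which is the trade-off with stationarity noted above: longer segments resolve frequencies more finely but are less likely to be pseudo-stationary.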