The fundamental concept of the ACS is the data-driven computation of a low-dimensional representation of the distribution of signals in the high-dimensional signal space. The low-dimensional form of the signal space is given by a set of example signals or prototypes $\Theta = \{w_j\}$, $w_j \in S$, $j = 1, \ldots, n_p$, which are arranged in a two-dimensional lattice. Each prototype is a point in the considered signal space and can be displayed, like the signals, as a line or bar plot. Additionally, prototypes can be evaluated and associated with pseudo-colours in the same way as the recorded temporal kinetic signals.
In order to be able to analyse properties of multivariate signals by means of a low-dimensional representation, the structured set of prototypes should satisfy the following requirements:
• The set of prototypes has to represent the vast majority of signal variability. It is important to note that not the entire signal space needs to be represented, but only that part filled with signals of the considered data domain. Signals of data domains such as DCE-MRI typically expose several characteristic courses from which the major part of the data exhibits only minor deviations. In this case, the data forms a manifold in the signal space and only this manifold has to be reflected by the structured set of prototypes.
• The set of prototypes has to represent the variability of signals with a certain resolution. This resolution mainly depends on the number of prototypes. The number of prototypes should suffice to represent the main types of signals such as distinct benign, malignant and normal signals in the domain of DCE-MRI data, but also those nuances of signal courses that can be observed for tissue with a less distinct assessment.
• The similarities between prototypes should be reflected by their arrangement in the lattice structure. Prototypes that are close to each other in the signal space should also be close to each other in the lattice structure. This property facilitates navigating in the visualisation of the low-dimensional form of the signal space, since signal characteristics vary smoothly between neighbouring prototypes.
7.2.1 Low-Dimensional Forms of Signal Spaces Based on Self-Organising Maps
The low-dimensional representations of signal spaces are computed by applying the SOM algorithm [Ritter et al., 1992, Kohonen, 1995], an unsupervised artificial neural network which forms a topographic map of the input data. Signals which are nearby in the signal space are likely to be located nearby on this topographic map.
The two-dimensional SOM considered in this chapter consists of a lattice of $n_p$ neurons. The $j$-th neuron of the lattice is formally described by a weight vector $w_j \in S$. Furthermore, each neuron is parameterised with respect to an integer coordinate pair $l_j \in Q_1 \times Q_2$ with $Q_1 = \{1, \ldots, q_1\}$, $Q_2 = \{1, \ldots, q_2\}$ and $n_p = q_1 q_2$. The weight vectors can be interpreted as prototypes of signals located in certain subregions of the signal space and represent signal courses which are characteristic of the corresponding subsets of signals.
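The lattice parameterisation can be illustrated with a minimal sketch. The function name `make_lattice` and the row-major ordering of the neuron index $j$ are assumptions made for illustration only; the text does not prescribe how $j$ is mapped to $l_j$.

```python
def make_lattice(q1, q2):
    """Return the lattice coordinates l_j as (a, b) pairs.

    a runs over Q1 = {1, ..., q1} and b over Q2 = {1, ..., q2}.
    The row-major ordering of j is an assumption of this sketch.
    """
    return [(a, b) for a in range(1, q1 + 1) for b in range(1, q2 + 1)]


# Example: a 3 x 4 lattice holds n_p = q1 * q2 = 12 neurons.
lattice = make_lattice(3, 4)
n_p = len(lattice)
```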
7.2.2 Self-Organising Maps
In order to determine the set of prototypes forming the topographic map of a given data domain, the SOM is trained with a set of unlabelled training examples $\Gamma_{\text{Train}} = \{s_i\}$, $i = 1, \ldots, N$. The prototypes are first initialised with examples randomly selected from $\Gamma_{\text{Train}}$. In the online version of the algorithm, prototypes are adapted according to a random sequence of examples. For the example $s(t) \in \Gamma_{\text{Train}}$ selected at iteration step $t$, the $j$-th prototype is updated according to
$$\Delta w_j(t) = \lambda(t)\, h(l^*, l_j, \sigma(t)) \left[ s(t) - w_j(t) \right]$$
with a learning rate $1.0 \geq \lambda(t) > 0$ that decreases with the number of iterations. The neighbourhood function $h(l^*, l_j, \sigma(t))$ leads to stronger adaptations of those prototypes $w_j$ with lattice coordinates $l_j$ close to the lattice coordinate $l^*$ of the best matching prototype $w^*(t)$ (Fig. 7.2). The best matching prototype is determined by
$$w^*(t) = \underset{w_j \in \Theta}{\arg\min}\, \| w_j - s(t) \|$$
with a suitable metric $\| w_j - s(t) \|$ such as the Euclidean distance.

Figure 7.2: During training of the SOM, the lattice structure (reflected by gray lines) of prototypes (black discs) unfolds in the signal space according to the sequence of presented training examples. For each training example $x(t) \in \Gamma_{\text{Train}}$ (circle), the best matching prototype $w^*(t)$ is moved by $\Delta w^*(t)$ towards the stimulus pattern. Due to the neighbourhood function, prototypes $w_j$ which are topologically close to $w^*(t)$ according to the lattice structure are also slightly moved towards $x(t)$.
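The best-match search and the update rule can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used for the ACS: the function names, the list-based data layout and the caller-supplied neighbourhood callable `h` are assumptions, and the Euclidean distance is used as the metric.

```python
import math


def best_matching_index(prototypes, s):
    """Return the index j of the prototype closest to s (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda j: math.dist(prototypes[j], s))


def som_update(prototypes, lattice, s, lam, h):
    """Apply one online update step in place and return the best-match index.

    h(l_star, l_j) is a caller-supplied neighbourhood weight (hypothetical
    interface); lam is the current learning rate lambda(t).
    """
    j_star = best_matching_index(prototypes, s)
    l_star = lattice[j_star]
    for w, l_j in zip(prototypes, lattice):
        gain = lam * h(l_star, l_j)
        for k in range(len(w)):
            w[k] += gain * (s[k] - w[k])  # Delta w_j = lam * h * (s - w_j)
    return j_star


# Toy example: two prototypes on a 1 x 2 lattice, winner-only neighbourhood.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
lattice = [(1, 1), (1, 2)]
winner_only = lambda l_star, l_j: 1.0 if l_star == l_j else 0.0
j_star = som_update(prototypes, lattice, [0.2, 0.0], lam=0.5, h=winner_only)
```

In the toy example, only the first prototype is moved, halfway towards the presented example, because the winner-only weight is zero for every other lattice position.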
Neighbourhood Function
The neighbourhood function $h(l^*, l_j, \sigma(t))$ plays an important role in the formation of the topographical map of the signal space. The most frequently applied neighbourhood function is the Gaussian neighbourhood kernel
$$h(l^*, l_j, \sigma(t)) = \exp\left( -\frac{\| l^* - l_j \|^2}{2\sigma(t)^2} \right)$$
with the parameter $\sigma(t)$ controlling the width of the neighbourhood. The width is chosen such that the best-matching prototype $w^*(t)$ as well as the prototypes in a larger neighbourhood of $w^*(t)$ are initially noticeably adapted. In the course of the training, $\sigma(t)$ is decreased, leading to a decreasing weight for the adaptation of the neighbouring prototypes. For the case of $\sigma \to 0$, the update step is limited to the best matching prototype $w^*(t)$ and the SOM algorithm becomes an online version of the k-means algorithm. The utilisation of a neighbourhood function which initially encloses a considerable part of the prototypes and then contracts in the course of the training results in a relaxation or smoothing effect on the prototype vectors. By activating the best-matching prototype $w^*(t)$ and the prototypes in its neighbourhood, i.e. those prototypes that are topologically close to $w^*(t)$ in the SOM lattice, the selected prototypes learn from the
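The effect of the neighbourhood width can be made concrete with a small sketch of the neighbourhood kernel. The exponential decay schedule for $\sigma(t)$ and all parameter names are assumptions of this illustration; the text only states that $\sigma(t)$ is decreased during training.

```python
import math


def gaussian_neighbourhood(l_star, l_j, sigma):
    """Neighbourhood kernel h = exp(-||l* - l_j||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(l_star, l_j))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def sigma_schedule(t, t_max, sigma_start, sigma_end):
    """Exponential decay from sigma_start to sigma_end (assumed schedule)."""
    return sigma_start * (sigma_end / sigma_start) ** (t / t_max)


# Weight of a direct lattice neighbour for a wide and a narrow neighbourhood.
wide = gaussian_neighbourhood((2, 2), (2, 3), sigma=3.0)
narrow = gaussian_neighbourhood((2, 2), (2, 3), sigma=0.05)
```

For the wide neighbourhood, the direct neighbour is adapted almost as strongly as the best matching prototype, whereas for the narrow one its weight is practically zero, which corresponds to the winner-only limit $\sigma \to 0$.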
Magnification Factor
In order to obtain a low-dimensional form of $S$ that allows for examination of the varying characteristics of signals, the signal distribution has to be represented by a sufficient number of prototypes. Furthermore, if signals located in certain subregions of the signal space are of particular interest, such subregions have to be exposed on the topographic map by an increased number of prototypes. This magnified view on certain subregions enables the user to examine variations of the corresponding signals at a higher level of detail.
In vector quantisation with the squared error function, it has been shown that the density of prototypes is proportional to $[p(s)]^{n_{\text{in}}/(n_{\text{in}}+2)}$, with $n_{\text{in}}$ being the dimension of the signal space and $p(s)$ being the probability density function of the training data [Kohonen, 1995]. The inverse of this is called the magnification factor. Even though a connection between the density of prototypes of a SOM and $p(s)$ has not been derived for the general case, Ritter [1991a] derived a similar power law for the case of a SOM with a one-dimensional lattice structure. Due to this observation and the close relation to the k-means algorithm, it can be assumed that the density of SOM prototypes roughly follows the density of the training data [Vesanto and Alhoniemi, 2000].
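As a brief worked illustration of the vector quantisation power law, assuming the exponent $n_{\text{in}}/(n_{\text{in}}+2)$ for the squared error function, the exponent approaches 1 as the dimension of the signal space grows. The example dimensions below are chosen arbitrarily.

```latex
\[
\frac{n_{\text{in}}}{n_{\text{in}}+2} =
\begin{cases}
1/3 & \text{for } n_{\text{in}} = 1,\\
1/2 & \text{for } n_{\text{in}} = 2,\\
5/6 & \text{for } n_{\text{in}} = 10,
\end{cases}
\qquad
\lim_{n_{\text{in}} \to \infty} \frac{n_{\text{in}}}{n_{\text{in}}+2} = 1 .
\]
```

For higher-dimensional signal spaces, the prototype density in vector quantisation is therefore close to proportional to $p(s)$ itself.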
The conclusion that the density of prototypes roughly follows the probability density of the training data has a significant consequence for the design of the training set used for adapting the SOM. In many data domains, but most notably in domains of medical diagnosis, it can be observed that the number of normal signals $n_{\text{Normal}}$ exceeds the number of abnormal signals $n_{\text{Abnormal}}$ describing the phenomenon under investigation by far:
$$n_{\text{Normal}} \gg n_{\text{Abnormal}}.$$
In consequence, the prototypes of a SOM adapted with a data set reflecting the true $p(s)$ of the multivariate signals will predominantly expose the characteristics of normal signals, whereas the phenomenon under investigation is likely to be represented by only a few prototypes. Hence, the frequency of occurrence of different signal types does not reflect their importance for the given application and needs to be adjusted.
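One way such an adjustment of the class frequencies might look is sketched here: the abnormal signals are replicated and the normal signals are subsampled before the training sequence is drawn. The function `rebalance`, its parameters and the toy data are hypothetical and serve only to illustrate the idea; suitable replication and subsampling factors depend on the application.

```python
import random


def rebalance(normal, abnormal, replicate=1, keep_normal=1.0, seed=0):
    """Build a training set with an increased share of abnormal signals.

    replicate   -- how often each abnormal signal is included
    keep_normal -- fraction of normal signals kept (random subsample)
    """
    rng = random.Random(seed)
    n_keep = round(keep_normal * len(normal))
    train = rng.sample(normal, n_keep) + abnormal * replicate
    rng.shuffle(train)  # training examples are presented in random order
    return train


# Toy data: 90 normal and 10 abnormal signals; the first entry marks the class.
normal = [[0.0, float(i)] for i in range(90)]
abnormal = [[1.0, float(i)] for i in range(10)]
train = rebalance(normal, abnormal, replicate=3, keep_normal=0.5)
share_abnormal = sum(1 for s in train if s[0] == 1.0) / len(train)
```

In this toy setting, the share of abnormal signals rises from 10 % in the original data to 40 % in the rebalanced training set.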
In order to increase the number of prototypes representing signals caused by the phenomenon under investigation, the probability density in the corresponding subregions of the signal space has to be increased. This can be done, for example, by replicating the corresponding signals in the training set or by reducing the number of signals that do not represent the phenomenon under investigation. Both modifications increase the statistical frequency of occurrence of the underrepresented signals in the sequence of randomly selected training examples. As a result, prototypes are more often shifted towards the corresponding subregions of the signal space during adaptation of the SOM.
7.2.3 Visualising Adaptive Colour Scales
After adapting the SOM with a set of training signals, the lattice of prototypes forms the basis of the adaptive colour scales. Each prototype $w_j$ is displayed as a chart, e.g. as a line or bar plot. The background of the chart is coloured with the pseudo-colour $c(w_j)$ computed by evaluating $w_j$ with one of the investigated pixel-mapping techniques. The $n_p$ resulting charts are then arranged according to the lattice structure of the prototypes, leading to a $q_1 \times q_2$ matrix of diagrams, referred
Figure 7.3: The lattice of prototypes $w$, unfolded in the signal space $S$ after training, is visualised as a lattice of diagrams. Furthermore, each diagram is coloured with the background colour $c(w_j)$ determined by evaluating the corresponding prototype $w_j$ with one of the investigated pixel-mapping techniques.
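The arrangement of the coloured charts can be sketched as a simple data structure. The grey-level mapping `colour_of` is a hypothetical stand-in for the pixel-mapping techniques, which are not specified in this section, and the dictionary layout of a chart is likewise an assumption of this sketch.

```python
def colour_of(w):
    """Hypothetical pixel mapping: mean signal value, clipped to [0, 1], as grey RGB."""
    g = min(1.0, max(0.0, sum(w) / len(w)))
    return (g, g, g)


def chart_matrix(prototypes, lattice, q1, q2, colour=colour_of):
    """Arrange one chart record per prototype in a q1 x q2 matrix.

    The prototype with lattice coordinate (a, b) is placed in row a, column b.
    """
    grid = [[None] * q2 for _ in range(q1)]
    for w, (a, b) in zip(prototypes, lattice):
        grid[a - 1][b - 1] = {"signal": w, "background": colour(w)}
    return grid


# Toy example: four prototypes on a 2 x 2 lattice.
prototypes = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [2.0, 2.0]]
lattice = [(1, 1), (1, 2), (2, 1), (2, 2)]
grid = chart_matrix(prototypes, lattice, q1=2, q2=2)
```

Each entry of `grid` holds the signal to be drawn as a line or bar plot together with its background colour, so a plotting routine only needs to iterate over rows and columns.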