To see how an atlas relates to a classical pattern recognition problem, consider the general pattern recognition framework depicted in Fig. 6.1. Here, we distinguish between learning and classification. In learning, we are concerned with choosing a data representation and/or extracting characteristic features of the data. The goal is to find an appropriate statistical model that describes the salient properties of the data. The statistical model is typically parametric (in order to be compact), and its parameters are estimated from a set of learning patterns. Based on this model, a decision rule is defined, which is then implemented as the classifier itself. New patterns are processed in the same manner as the learning patterns before they are classified or otherwise characterized. In our case, this process consists of identifying regions of abnormal brain perfusion in SPECT images.
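The learning/classification split described above can be illustrated with a minimal sketch. It assumes a deliberately simple parametric model (an independent Gaussian per voxel) and a z-score threshold as the decision rule; the function names `learn_atlas` and `classify`, the threshold value and the toy data are hypothetical and are not the model developed in this thesis.

```python
from statistics import mean, stdev

def learn_atlas(learning_patterns):
    """Learning step: estimate a per-voxel Gaussian model (mean, std).

    Each pattern is a flat list of voxel intensities of equal length.
    This independent-voxel model is an assumption made for illustration.
    """
    voxels = list(zip(*learning_patterns))  # regroup values voxel by voxel
    return [(mean(v), stdev(v)) for v in voxels]

def classify(atlas, pattern, threshold=3.0):
    """Classification step: flag voxels whose |z-score| exceeds the threshold."""
    flags = []
    for (mu, sigma), x in zip(atlas, pattern):
        z = (x - mu) / sigma if sigma > 0 else 0.0
        flags.append(abs(z) > threshold)
    return flags

# Toy data: four "normal" images with three voxels each (hypothetical values).
normals = [
    [10.0, 20.0, 30.0],
    [11.0, 19.0, 31.0],
    [9.0, 21.0, 29.0],
    [10.0, 20.0, 30.0],
]
atlas = learn_atlas(normals)

# A test image whose third voxel is strongly reduced.
test_image = [10.0, 20.0, 12.0]
abnormal = classify(atlas, test_image)
```

In this toy case only the third voxel is flagged. Test images are assumed to have passed through the same preprocessing as the learning patterns, as the framework requires.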
The exact preprocessing steps may vary slightly. In general, however, registration, segmentation and normalization of the images are necessary (Fig. 6.2). When comparing brain images, the first step is to align them. This is done using registration algorithms. We describe the different problems and possibilities of registration in Sec. 6.4. Second, the observed gray values must be made comparable. As we have seen in Ch. 2, the total number of photons emitted may vary due to, among other things, variation in the injected dose. Such variations lead to global differences in the image intensities and must be compensated for. We describe and review the methods of intensity normalization of SPECT images in Sec. 6.6. Finally, only the brain is considered for statistical modeling, because the activity distribution in the surrounding tissues has properties different from those of the brain (Sec. 2.6.2). Furthermore, a segmentation is sometimes necessary for the intensity normalization of the images. The segmentation of the brain is not a major concern, but we point out the different techniques that have been applied in Sec. 6.5.
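As a rough illustration of why intensity normalization can depend on a brain segmentation, the sketch below rescales each image so that its mean intensity inside a brain mask equals a common target. Global mean scaling is only one simple choice, assumed here for illustration; the function name `normalize_intensity`, the `target_mean` parameter and the data are hypothetical and do not represent the methods reviewed in Sec. 6.6.

```python
def normalize_intensity(image, brain_mask, target_mean=100.0):
    """Rescale an image so that its mean inside the brain mask equals target_mean.

    image      : flat list of voxel intensities
    brain_mask : flat list of booleans, True for voxels inside the brain
    """
    brain_values = [v for v, inside in zip(image, brain_mask) if inside]
    if not brain_values:
        raise ValueError("brain mask is empty")
    brain_mean = sum(brain_values) / len(brain_values)
    if brain_mean == 0:
        raise ValueError("mean brain intensity is zero")
    scale = target_mean / brain_mean
    return [v * scale for v in image]

# Two acquisitions of the same toy activity pattern with different
# injected doses (hypothetical values); the last voxel lies outside the brain.
mask = [True, True, True, False]
low_dose = [40.0, 50.0, 60.0, 5.0]
high_dose = [80.0, 100.0, 120.0, 10.0]

norm_low = normalize_intensity(low_dose, mask)
norm_high = normalize_intensity(high_dose, mask)
```

After scaling, the two images coincide, so the global difference caused by the dose no longer affects a voxel-wise comparison. Only voxels inside the mask contribute to the scale factor, which is why the segmentation is needed first.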
6.2.1 Non-probabilistic approaches
The focus of this thesis has been on probabilistic atlases. Another possible, non-probabilistic approach could easily be devised based on prototypes and nearest-neighbor comparisons (template matching). Besides the large memory requirements of this method, we see two drawbacks. First, in such an approach there is a great risk of sampling prototypes in the tails of
1
Actually, what has fascinated the human being is the question of what a human is. The brain seems to be a good place to look for answers.
[Figure 6.1: A general model for statistical pattern recognition. Similar to [110, 58]. Diagram labels, learning path: learning patterns, preprocessing, choice of features and data representation, learning/model estimation. Classification path: test patterns, preprocessing, measure features, classification (decision).]
[Figure 6.2. Diagram labels, learning path: database of normal subjects, spatial normalization, brain segmentation, intensity normalization, atlas creation (estimation of model parameters). Classification path: image to evaluate, spatial normalization, brain segmentation, intensity normalization, test image, comparison with the atlas, score image.]
the distribution and therefore sometimes comparing an image to a prototype that is close to abnormal. This makes the approach less robust. The only way of avoiding this problem is to perform some kind of density estimation (parametric or non-parametric), which becomes prohibitive given the large dimension of the observation space (the curse of dimensionality). We are therefore in the situation described in Sec. 3.2.3, p. 31, where the use of a dimension reduction technique (such as PCA) becomes a necessity. Second, we expect an approach using a dimension reduction technique to generalize better to unseen images than an approach based on prototypes, as it tries to reveal “deeper causes” (structure or patterns) that can explain the observed data.
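The first drawback can be made concrete with a small sketch. It compares a nearest-prototype distance with a score derived from a simple density model, here an independent Gaussian per feature, which stands in for the density estimation mentioned above and is not the PCA-based model. All names and values are hypothetical.

```python
from math import dist
from statistics import mean, stdev

def nearest_prototype_distance(prototypes, pattern):
    """Template matching: Euclidean distance to the closest stored prototype."""
    return min(dist(p, pattern) for p in prototypes)

def max_abs_z(prototypes, pattern):
    """Largest per-feature |z-score| under an independent Gaussian model
    (an assumed, deliberately simple density estimate)."""
    features = list(zip(*prototypes))  # regroup values feature by feature
    return max(abs(x - mean(f)) / stdev(f) for f, x in zip(features, pattern))

# Toy database of "normal" patterns with two features each.
# The last prototype was sampled from the tail of the distribution.
prototypes = [
    [10.0, 10.0],
    [10.5, 9.5],
    [9.5, 10.5],
    [10.0, 10.2],
    [9.8, 10.0],
    [14.0, 10.0],
]

typical = [10.1, 10.0]    # close to the bulk of the data
near_tail = [13.9, 10.0]  # close only to the outlying prototype

d_typical = nearest_prototype_distance(prototypes, typical)
d_tail = nearest_prototype_distance(prototypes, near_tail)
z_typical = max_abs_z(prototypes, typical)
z_tail = max_abs_z(prototypes, near_tail)
```

Both test patterns lie at essentially the same distance from their nearest prototype, so template matching rates them as equally normal. The model-based score ranks the pattern near the tail as less typical. With realistic image dimensions such a density estimate is not feasible directly, which is the motivation for dimension reduction given in the text.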
6.2.2 Pattern recognition and hypothesis-testing
We have seen how the creation and use of an atlas can be considered a pattern recognition problem. However, most methods that have been applied in functional brain studies are based on statistical hypothesis testing (e.g. whether normal perfusion depends on age, gender, etc.). How does an atlas fit into a hypothesis testing framework? To answer this question, and in order to understand similar statistical models in the literature, it is necessary to clarify the concept of experimental design (also called study design).
6.2.3 Experimental design
In pattern recognition, we are typically given an application or some kind of system that we would like to reproduce. This can, for example, be an automatic system that recognizes a person in a picture. The input and (desired) output of the system are therefore defined. In statistics, however, we anticipate some behavior of a complex system and design an experiment that can highlight this behavior. The statistical analysis, of course, depends on this design. In statistical terms, we say that we make an inference about the behavior.
The goal of the design is to obtain a valid analysis that is as sensitive as possible. The validity of a statistical experiment is ensured by randomization and replication. The sensitivity can be increased by using analysis of variance (ANOVA) models. For further discussion of these issues, see for example [79]. Let us, however, clarify some typical aspects of design types that help in understanding the neuroscience literature on functional anatomy.
First, we distinguish between studies that are dynamic and those that are not. In dynamic studies (e.g. fMRI studies or cardiologic scintigraphic studies), the temporal order of the images is significant, and possible correlations between successive images must be accounted for. In fMRI studies, this is done by taking the so-called hemodynamic response into account. In studies that are not dynamic, the analysis is invariant to the ordering of the images.
Second, we can distinguish between single-subject and multi-subject studies and, in the latter, whether one or multiple scans are obtained of each subject. In addition to adding inter-scan variability to the model (two-way ANOVA), it can be of interest to model inter-subject variability in multi-subject studies (three-way ANOVA) [225]².
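To give a feel for how an ANOVA model attributes variability to separate sources, the following sketch splits the total sum of squares of a balanced subjects-by-scans table into a subject part, a scan part and a residual. The two factors, the function name and the numbers are assumptions chosen for illustration; they are not the designs discussed in [225].

```python
def two_way_sums_of_squares(table):
    """Additive two-factor decomposition for a balanced table.

    table[i][j] is the measurement for subject i and scan j
    (one value per cell, no replication).
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]        # per subject
    col_means = [sum(col) / n_rows for col in zip(*table)]  # per scan

    ss_subject = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_scan = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_residual = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    return {
        "subject": ss_subject,
        "scan": ss_scan,
        "residual": ss_residual,
        "total": ss_total,
    }

# Hypothetical data: three subjects (rows), three scans each (columns).
data = [
    [50.0, 52.0, 51.0],
    [60.0, 63.0, 61.0],
    [55.0, 56.0, 57.0],
]
ss = two_way_sums_of_squares(data)
```

For a balanced table the three parts add up to the total. In this toy data set most of the variation is between subjects, which is the kind of variability a multi-subject model would account for explicitly.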
Third, in order to determine the necessary sample size, one distinguishes between two classes of inference: inference about “typical” characteristics or about “average” characteristics of a population. This leads to the notions of fixed- or random-effect models, respectively [225, 65]. The former makes an inference about a specific sample (e.g. 5 out of 6 farmers owned a tractor, so it is typical that a farmer owns a tractor); the latter makes an inference
2