PET and fMRI characterise neurovascular activity using different dependent variables. These differences will influence subsequent data analysis. While PET averages over a time period of around half a minute and each scan is treated as independent, fMRI data are time series data with a theoretical resolution of less than a second. Accepting these differences, PET and fMRI analyses still share a common core of theoretical concepts. A common approach to the analysis of data
from both modalities is the use of a form of the general linear model (GLM). One of the most popular embodiments of the GLM in neuroimaging analysis is the Statistical Parametric Mapping (SPM) software package (http://www.fil.ion.ucl.ac.uk/spm), which is used throughout this thesis. The GLM will be introduced herein by first describing the terminology and some issues regarding its application to neuroimaging data. Issues specifically pertinent to the analysis of PET scans and fMRI time series will be introduced subsequently.
2.4.2.1 Statistical Parametric Mapping (SPM)
The concept of Statistical Parametric Mapping was introduced by Friston and colleagues, and although the approach owes much to previous work on change distribution analysis pioneered by the St. Louis group (Fox and Mintun, 1989), there are substantive differences. SPMs are three-dimensional images of statistical values, such as t or F values, recorded over a volume of interest that can range from the entire brain to a single plane through a structure of interest. The intensity of each voxel (volume element) represents the value of the statistic in question under the particular hypothesis being examined. The underlying philosophy of SPM is ably summarised by the following quote: ‘...one proceeds by analyzing each voxel using any (univariate) statistical parametric test. The resulting statistics are assembled into an image that is then interpreted as a spatially extended statistical process.’ (Friston et al., 1995b). SPMs efficiently summarise a vast body of data in a form that is far easier to examine and interpret.
2.4.2.2 Gaussian Fields and Non-Independence
Before introducing the GLM, it is important to discuss some issues that frequently arise in the analysis of neuroimaging data. In a typical single fMRI volume there can be as many as 200,000 voxels. As the researcher typically wishes to test a regionally specified hypothesis, it is necessary to perform the appropriate univariate statistical tests at each and every voxel. At the standard rejection rate of p<0.05, we would expect by chance to reject the null hypothesis ($H_0$) of no experimental effects in a twentieth of our sample, i.e. 5% of 200,000 = 10,000 voxels! One can therefore obtain a perfectly reasonable number of activated voxels in the imaging volume by simply testing one’s hypothesis enough times, and reporting only the instances in which $H_0$ is rejected while ignoring the times it is not (called the ‘file drawer’ problem; Abelson, 1989).
The standard solution to this problem is to use a technique called Bonferroni correction, which simply adjusts the p value at which $H_0$ is rejected such that it reflects the number of tests being performed. For example, in the above case one would divide the usual p value of 0.05 by 200,000, to get a corrected p value of $2.5 \times 10^{-7}$. However, although separate univariate tests are performed at each voxel, the resolution of most neuroimaging data renders voxels non-independent (see section 2.4.1 above). The Bonferroni correction is therefore too conservative.
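The two figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the voxel count and threshold are the ones used in the text.

```python
# Multiple-comparisons arithmetic for a voxelwise analysis.
# Names are illustrative; the numbers are those quoted in the text.
n_voxels = 200_000   # voxels tested in a single fMRI volume
alpha = 0.05         # uncorrected per-voxel rejection threshold

# Expected number of rejections if the null hypothesis holds at every
# voxel and each test is thresholded at alpha.
expected_false_positives = alpha * n_voxels

# Bonferroni correction: divide alpha by the number of tests performed.
bonferroni_threshold = alpha / n_voxels

print(f"expected false positives: {expected_false_positives:.0f}")
print(f"Bonferroni-corrected p:   {bonferroni_threshold:.1e}")
```

The corrected threshold shrinks in direct proportion to the number of tests, which is why it becomes overly strict once neighbouring voxels are correlated.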
Instead, the crucial statistic level at which to reject $H_0$ in the univariate tests used by SPM is calculated by the application of Gaussian random field (GRF) theory. GRF theory deals with ‘the behaviour of stochastic processes defined over a space of D dimensions’ (Poline et al., 1997). Knowing about the behaviour of Gaussian fields under certain constraints (Adler, 1981; Worsley, 1994), it is possible to determine the probability of a given ‘local excursion’ of the Gaussian field at any location: in simpler terms, to assign the correct significance to activations at a voxel-specific spatial scale.
Without dealing with this theory in exhaustive detail, it is important to appreciate that the pre-processing of neuroimaging data is necessary before GRF theory can be utilised. Images must approximate ‘a continuous, zero-mean, unit variance, homogenous, smoothed Gaussian random field’ (Poline et al., 1995). To fulfil the latter part of this assumption and to ensure that the spatial correlation structure of the data is stationary, the images must be spatially smoothed using Gaussian filter kernels, lowering the resolution of the data from its ‘raw’ state. The full-width-half-maximum (FWHM) of the smoothing kernel should typically be at least 2 or 3 times larger than the initial voxel size - for example, it is common practice to smooth fMRI data with a ‘raw’ resolution of 2x2x2mm with a Gaussian kernel of 6mm FWHM. Although this may appear to defeat the purpose of acquiring high-resolution functional data, it is important to remember that the underlying neurovascular signal has already spatially smoothed the BOLD signal. While it is possible to acquire functional data of submillimeter resolution, the effective resolution of the data is determined by the spatial congruence between metabolic demand and vascular supply.
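To make the 6mm example concrete, the sketch below converts a kernel's FWHM into the standard deviation of the corresponding Gaussian and samples a one-dimensional kernel on a 2mm voxel grid. The function names and the truncation at roughly three standard deviations are assumptions made for illustration; this is not the smoothing routine used by SPM.

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel_1d(fwhm_mm, voxel_mm):
    """Sampled 1-D Gaussian kernel, truncated near 3 sigma and normalised to sum to 1."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm   # sigma in voxel units
    radius = math.ceil(3.0 * sigma_vox)             # assumed truncation point
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

sigma = fwhm_to_sigma(6.0)              # 6mm FWHM kernel from the text
kernel = gaussian_kernel_1d(6.0, 2.0)   # sampled on 2mm voxels
print(f"sigma = {sigma:.3f} mm, kernel length = {len(kernel)} voxels")
```

Because a Gaussian kernel is separable, a three-dimensional smooth can be obtained by applying such a one-dimensional kernel along each image axis in turn.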
Smoothing neuroimaging data is an important process, for a variety of reasons: it raises the signal-to-noise ratio of the data; it facilitates the detection of activations which have the same size as the filter kernel (by the matched filter theorem); and, perhaps most importantly, smoothing facilitates the detection of group-level activations by removing residual differences in functional neuroanatomy that remain after normalisation.
A further point to note when using Gaussian field theory is that the statistic must have enough degrees of freedom (dfs), or Gaussian field assumptions break down. It is usually the case that one has more than enough dfs in the analysis of fMRI data, but in single subject PET designs with a large number of conditions this can be problematic. In these cases it may be more useful to use non-parametric analysis methods (Holmes et al., 1996).
2.4.2.3 The General Linear Model
At present there are a number of different analysis ‘packages’ used by the neuroimaging community. As mentioned above, the great majority of these packages employ univariate linear tests at each and every voxel to attempt to reject $H_0$. All parametric versions of these tests, ranging from simple correlation tests of a single stimulus vector (e.g. STIMULATE; Strupp, 1996) to more sophisticated analysis frameworks employing ANCOVAs, can be thought of as special cases of a unifying analysis framework, the general linear model. The general linear model lies at the heart of the SPM package, and was used for all subsequent analyses.
The GLM allows for a great deal of flexibility in the design and analysis of experiments. Variation in the dependent variable Y (rCBF or BOLD contrast for PET and fMRI respectively) is modeled as a linear combination of a number of explanatory variables, plus an error term. The explanatory (independent) variables are each denoted by $x_{jl}$, where j indexes the scan, l = 1, ..., L indexes the explanatory variable, and L is the total number of explanatory variables. An important point to note is that the use of the GLM is predicated upon the assumption that the variance attributable to the explanatory variables is linearly separable. The GLM formulation for a single observation at a voxel is:
$$Y_j = \beta_1 x_{j1} + \dots + \beta_l x_{jl} + \dots + \beta_L x_{jL} + \epsilon_j \tag{6}$$
The errors ($\epsilon_j$) are assumed to be independent and Normally distributed with zero mean and variance $\sigma^2$. This particular error term is between-scan or intra-session variance, and is denoted by $\sigma^2_\epsilon$ throughout the remainder of this thesis. Fitting the GLM at a voxel allows for different $\sigma^2_\epsilon$ across voxels, but does carry with it the assumption that $\sigma^2_\epsilon$ is constant across experimental conditions and different subjects. The xs can take two forms: covariates (‘real’ levels of a particular variable, such as time or the concentration of a pharmacological agent) and indicator variables (in which different values of an experimental factor are assigned integer values, e.g. using VAS measurements of subject pain as an explanatory variable) (Friston et al., 1995b). As the above difference does not influence the evaluation of the GLM, I will use the catchall term ‘covariate’ when referring to explanatory variables from here on. To fit the model of $x_{jl}$s to the data $Y_j$, the parameters of the model ($\beta_l$) must be estimated. For the single observation of Eqn. 6, a relationship between the values of each experimental factor x and Y must be evaluated. Although in neuroimaging experiments there is usually more than one x to fit to Y, it is helpful to remember that in the limiting case of a single covariate plus a constant term, Eqn. 6 reduces to:
$$Y_j = \beta_1 x_{j1} + \beta_2 x_{j2} + \epsilon_j$$
which is simply linear regression! The $\beta_1 x_{j1}$ term is the Y-axis ‘intercept’, and introduces the use of a ‘dummy’ variable ($x_{j1}$) whose value is one for all J scans. This allows mean or ‘constant’ terms to be included in the GLM formulation, and so explicitly model condition, subject or even population means in the design matrix.
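As a minimal sketch of this limiting case, the NumPy snippet below fits a straight line by placing a column of ones (the ‘dummy’ variable) alongside a single covariate. The data values and variable names are made up for illustration, and NumPy's `lstsq` merely stands in for the estimation step; it is not SPM's own code.

```python
import numpy as np

# Made-up data for one voxel over J = 5 scans (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])          # a single covariate
noise = np.array([0.1, -0.2, 0.2, -0.2, 0.1])    # chosen orthogonal to both columns
Y = 2.0 + 3.0 * x + noise

# Design matrix: a dummy column of ones (the intercept) plus the covariate.
X = np.column_stack([np.ones_like(x), x])

# Least squares estimates of the intercept and the regression slope.
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
print("intercept, slope:", beta)
```

Because the noise values were chosen to be orthogonal to both columns of the design matrix, the fit recovers an intercept of 2 and a slope of 3; with real data the estimates would of course differ from the generating values.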
The above example usefully illustrates that the $\beta_l$ may be helpfully thought of as ‘regression slope’ coefficients, describing the size of the relationship between Y and the xs. However, it is rare for a neuroimaging experiment to reduce to simple linear regression. Similarly, it is rare for only one observation to be taken of each voxel, and so for every scan j there is a corresponding Eqn. 6, such that:
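$$\begin{aligned}
Y_1 &= \beta_1 x_{11} + \dots + \beta_l x_{1l} + \dots + \beta_L x_{1L} + \epsilon_1 \\
&\;\;\vdots \\
Y_j &= \beta_1 x_{j1} + \dots + \beta_l x_{jl} + \dots + \beta_L x_{jL} + \epsilon_j \\
&\;\;\vdots \\
Y_J &= \beta_1 x_{J1} + \dots + \beta_l x_{Jl} + \dots + \beta_L x_{JL} + \epsilon_J
\end{aligned}$$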
This rather cumbersome formulation can be summarized by:
$$Y = X\beta + \epsilon \tag{7}$$
which is the GLM in matrix form, where Y is a column vector of observations, $\beta$ a column vector of parameters and $\epsilon$ a column vector of error terms. X represents a concept that is essential in the application of the GLM in neuroimaging analysis: the design matrix. To summarize, to test for experimental effects at a given voxel the data Y are collected over J scans, the experimental model of explanatory variables X is fitted to the data, and the column vector of parameters $\beta$ is estimated. The ‘goodness-of-fit’ of the model to the data is indexed by the errors $\epsilon$. Improving the fit of the model to the data reduces the error variance and so increases the statistical significance obtained in subsequent calculations. Intelligent formulation of the experimental model is therefore extremely important.
2.4.2.4 Model Fitting: Estimation, Overdetermination and Inference
Estimation refers to the process of generating values for the model parameters $\beta_1, \dots, \beta_L$ (the parameters must be estimated as there are typically fewer parameters in the model than there are scans). The ‘best’ values of $\beta$ are those that minimise the total distance between the model and the data. This quantity is formulated as the sum of squared errors (S) - the sum of the squared distances between each $Y_j$ (data) and its fitted value (model). Fitting the model to the data such that S is minimized is an example of least squares fitting. The parameter estimates that minimise S are denoted $\hat{\beta}$ such that:
$$\hat{\beta} = (X^T X)^{-1} X^T Y \tag{8}$$
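For readers who want the intermediate step, the least squares estimate referred to as Eqn. 8 follows from minimising S with respect to $\beta$. The lines below give the standard textbook derivation in the notation of Eqn. 7, under the assumption that $X^T X$ is invertible:

```latex
\begin{aligned}
S &= \sum_{j=1}^{J} \Big( Y_j - \sum_{l=1}^{L} \beta_l x_{jl} \Big)^2
   = (Y - X\beta)^T (Y - X\beta) \\
\frac{\partial S}{\partial \beta} &= -2\, X^T (Y - X\beta) = 0 \\
X^T X \hat{\beta} &= X^T Y \\
\hat{\beta} &= (X^T X)^{-1} X^T Y
\end{aligned}
```

The third line is usually called the normal equations; the final step is only available when $X^T X$ can be inverted.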
Eqn. 8 is solvable if and only if ($X^T X$) is invertible - that is, there is a unique solution to $(X^T X)^{-1}$. As X is the design matrix, ($X^T X$) is invertible if X is of full rank or non-singular - in matrix algebra terms, there are no columns that can be formed by a linear combination of the other columns (X does not show linear dependence). A design matrix that does show linear dependence is overdetermined. Consider the simple 2x3 matrix X:
$$X = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$
The third column of X is a linear combination of columns one and two, so X is overdetermined. This causes serious problems. Because $X^T X$ is then singular and has no unique inverse, there are also an infinite number of least squares estimates $\hat{\beta}$. As it is a simple fact of experimental design that one will often be faced with the above problem (e.g. in almost all PET studies), it must be overcome. The approach used within SPM to constrain the infinite set of $\hat{\beta}$s is to compute the pseudo-inverse of $X^T X$, or pinv($X^T X$).
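The consequence of linear dependence can be illustrated with a small numerical sketch. The four-scan design below is hypothetical (two condition indicators plus a constant), chosen so that its third column is the sum of the first two, as in the example above; the data are made up, and NumPy's `pinv` is used as a stand-in for the pseudo-inverse rather than as a reproduction of SPM's code.

```python
import numpy as np

# Hypothetical 4-scan design: two condition indicators plus a constant.
# Column 3 = column 1 + column 2, so the columns are linearly dependent.
X = np.array([[1., 0., 1.],
              [1., 0., 1.],
              [0., 1., 1.],
              [0., 1., 1.]])
Y = np.array([1., 3., 5., 7.])   # made-up data for one voxel

rank = np.linalg.matrix_rank(X)  # smaller than the number of columns

# Minimum-norm least squares solution via the Moore-Penrose pseudo-inverse.
# pinv(X) @ Y is identical to pinv(X.T @ X) @ X.T @ Y.
beta_pinv = np.linalg.pinv(X) @ Y

# Adding any multiple of a null-space vector yields another solution
# with exactly the same fitted values.
null_vec = np.array([1., 1., -1.])          # X @ null_vec is the zero vector
beta_other = beta_pinv + 5.0 * null_vec

print("rank of X:", rank)
print("fitted values (pinv): ", X @ beta_pinv)
print("fitted values (other):", X @ beta_other)
```

Both parameter vectors reproduce the two condition means equally well, which is the sense in which the individual estimates are not unique; the pseudo-inverse simply selects the solution of smallest norm. The difference between the first two parameters, by contrast, is the same in both solutions.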
After fitting the model and determining the parameters, the variance of the model fit is estimated by the residual mean squares: the residual sum of squares divided by the degrees of freedom (df = J-p, where p is the rank of X). Usually in neuroimaging, one is less interested in disproving the null hypothesis for the entire design matrix (i.e. $H_0$ = ‘the entire experimental model does not significantly reduce the error variance’) and more concerned with seeing how much experimental variance can be explained by some linear combination of the model parameters. Linear combinations of the parameter estimates that are invariant over the space of possible parameters are called contrasts. A contrast vector in SPM is one whose elements sum to zero. Contrasts are important when the design matrix is not of full rank.
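The steps just described can be sketched numerically: compute the residual mean squares with df = J-p, then form a t statistic for a contrast. The design, data and contrast below are hypothetical, and the t formula is the standard textbook form for a GLM contrast, given here as an assumption rather than as a transcription of SPM's implementation.

```python
import numpy as np

# Hypothetical rank-deficient design: two condition indicators plus a constant.
X = np.array([[1., 0., 1.],
              [1., 0., 1.],
              [0., 1., 1.],
              [0., 1., 1.]])
Y = np.array([1., 3., 5., 7.])   # made-up data for one voxel
c = np.array([1., -1., 0.])      # contrast weights sum to zero

beta_hat = np.linalg.pinv(X) @ Y
residuals = Y - X @ beta_hat

J = X.shape[0]                   # number of scans
p = np.linalg.matrix_rank(X)     # rank of X, not its number of columns
df = J - p
residual_mean_squares = residuals @ residuals / df

# Textbook t statistic: contrast of the estimates divided by its standard error.
contrast_value = c @ beta_hat
contrast_variance = residual_mean_squares * (c @ np.linalg.pinv(X.T @ X) @ c)
t_stat = contrast_value / np.sqrt(contrast_variance)

print("df:", df, "residual mean squares:", residual_mean_squares)
print("contrast:", contrast_value, "t:", t_stat)
```

Although the individual parameter estimates are not unique for this design, the contrast value is: it equals the difference between the two condition means whichever least squares solution is chosen.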
The eventual output of the above steps is an image where each voxel’s intensity value corresponds to a statistic value (usually a t or an F). The probability assigned to each statistic is achieved by treating each image volume as a Gaussian random field (Poline et al., 1997), as reviewed in section 2.4.2.2 above.
2.4.3 Experimental Models for fMRI
The versatility of the GLM allows for a large number of possible experimental designs to be tested. While not wanting to list an extensive taxonomy of design forms (interested readers should consult Friston et al., 1997), it is important to introduce a number of terms that will be used in the following chapters.
Usually experimenters will want to test for the significance of a particular linear combination of the explanatory variables: the significance of these variables is computed after the remainder of the experimental model has been fitted to the data. While there are an almost infinite number of different contrasts arising out of larger design matrices, the evaluation of linear combinations of design matrix columns can usually be classified according to a small number of experimental designs.
While early PET designs were dominated by ‘region-of-interest’ (ROI) approaches where researchers limited the brain areas that they evaluated according to a priori information, the introduction of ‘subtractive methodology’ allowed for assumption-free analyses. Frans Donders, pioneer of reaction time experiments, introduced the subtractive method to experimental psychology in the 19th century. By subtracting the time subjects took to respond to a simple stimulus from the time they took to respond to a more complex stimulus, Donders claimed that he could isolate the time that subjects needed to perform the ‘differentiation’ inherent in the complex task. Donders’ methodology allowed researchers to isolate specific cognitive task components: for example, if a researcher was only interested in the time for subjects to react to differently coloured stimuli, they could use this approach to design a similar control task that differed only in presenting monochrome stimuli. In neuroimaging, acquiring images of two different cognitive states that differ only in the cognitive component of interest (CCI) should allow the experimenter to isolate the CCI by subtracting the less complex image from the other. One of the biggest practical problems in neuroimaging is ensuring that a control task is good enough for the hypothesis in question.
Subtractive designs make the assumption of ‘pure insertion’; that is, it is possible to introduce a different category of the factor in question without it having any effect on the neurovascular processes mediating the base/control task. To use the example of Donders’ experiment, the ‘insertion’ of the more complex task should not cause any interactions. The presence of interactions makes subtraction analyses ambiguous, and violates the assumptions embodied in the simple subtraction design. Although rightly criticised (Friston et al., 1996), subtractive designs can be useful as long as the caveats accompanying their use are kept in mind.
The approach suggested by Friston and colleagues to overcome the problems of cognitive subtraction is that of the factorial design (after Sternberg’s similar extension to Donders’ method; Sternberg, 1969). A factorial design allows one to explicitly examine interactions between experimental factors consisting of a number of levels. While there are a great many possible ways to conceptualize factorial experiments, it is useful to think of a factorial experiment as measuring a ‘difference of a difference’. Whereas a simple subtractive design is interested in the difference in Y between two categories ($Y_1 - Y_2$), the factorial design allows one to evaluate this difference under two different contexts, ($Y_1^A - Y_2^A$) and ($Y_1^B - Y_2^B$). A psychopharmacological study is a good example: the experimenters are usually interested in drug effects on a particular cognitive process. This can be formulated as a subtraction - say ‘reaction time on the colour Stroop task’ ($Y_1$) versus ‘reaction times on veridical colour/word pairs’ ($Y_2$). However, if data are only acquired while subjects are exposed to the drug the experimenter cannot unambiguously attribute the difference in RTs to the drug itself. By acquiring data while the subject receives the drug in one instance and receives a placebo in the other (the two levels of factor two, A and B above), specific drug × task effects can be examined. As well as the increased power of such a design to unambiguously isolate cognitive components, the very non-linear nature of cognitive dynamics at the neuronal level suggests that experimental designs examining interaction effects are those that will produce the most ‘interesting’ results. One caveat is that, however hard one tries, some experimental designs cannot be formulated as factorial designs. Experimenters may find that by forcing their design into a factorial framework they ultimately lose rather than gain sensitivity.
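The ‘difference of a difference’ can be written out for a 2 x 2 drug-by-task design. The reaction times below are invented purely for illustration, and the cell labels follow the notation above (tasks $Y_1$ and $Y_2$, contexts A for drug and B for placebo).

```python
# Made-up mean reaction times (ms) for a 2 x 2 drug-by-task design.
# Task Y1 = colour Stroop, task Y2 = veridical colour/word pairs;
# context A = drug, context B = placebo. All values are illustrative.
mean_rt = {
    ("Y1", "A"): 700.0,
    ("Y2", "A"): 600.0,
    ("Y1", "B"): 650.0,
    ("Y2", "B"): 600.0,
}

# Simple subtractions within each context.
task_effect_drug = mean_rt[("Y1", "A")] - mean_rt[("Y2", "A")]
task_effect_placebo = mean_rt[("Y1", "B")] - mean_rt[("Y2", "B")]

# Interaction: the difference between the two differences.
interaction = task_effect_drug - task_effect_placebo

# The same quantity expressed as a weighted sum over the four cells.
cells = [("Y1", "A"), ("Y2", "A"), ("Y1", "B"), ("Y2", "B")]
weights = [1, -1, -1, 1]
interaction_from_weights = sum(w * mean_rt[k] for w, k in zip(weights, cells))

print(task_effect_drug, task_effect_placebo, interaction)
```

With these invented numbers the task effect is 100 ms under the drug and 50 ms under placebo, so the drug × task interaction is 50 ms. The weights sum to zero, so the interaction can be posed as a contrast over the four cell means.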
2.4.3.1 SPM formulations of GLM designs
As detailed above, neuroimaging seeks to explain variance in Y at each voxel by applying the GLM and fitting X (the design matrix) at each and every voxel. It is common in neuroimaging to graphically illustrate the design matrix as an ‘image’. The values of X are represented by grayscale values, so that a value of -1 will be black, 0 mid-gray, and +1 white (Holmes et al., 1997). Using the SPM software package it is possible to test a great variety of different contrasts (and thus hypotheses) by assigning different weights to the columns of the design matrix (and thus the experimental effects). One can think of this as ‘partitioning’ the design matrix so that the significance of particular weightings of its columns that specify a unique hypothesis can be evaluated.
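The grayscale convention described above amounts to a linear rescaling of the design matrix entries. The sketch below assumes entries already scaled to the range -1 to +1 and an 8-bit display range of 0 to 255; both the helper name `to_gray` and the example design are hypothetical, and SPM's own display routine is not reproduced here.

```python
def to_gray(value, lo=-1.0, hi=1.0):
    """Map a design matrix entry in [lo, hi] to an 8-bit gray level (0 = black)."""
    clipped = min(max(value, lo), hi)                  # guard against out-of-range entries
    return int((clipped - lo) / (hi - lo) * 255 + 0.5)  # round half up

# Hypothetical 4-scan design matrix: one condition regressor and a constant.
design = [
    [-1.0, 1.0],
    [ 1.0, 1.0],
    [-1.0, 1.0],
    [ 0.0, 1.0],
]

# Each row of the 'image' corresponds to a scan, each column to a regressor.
image = [[to_gray(v) for v in row] for row in design]
for row in image:
    print(row)
```

In this layout the constant term appears as a uniformly white column, while the condition regressor alternates between black, white and mid-gray.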
Not all of the columns of the design matrix will necessarily be of interest to the experimenter for a given contrast. Furthermore, some columns of the design matrix may model effects that the experimenter is never interested in evaluating.