To examine the relationship between chl-a and the potential forcing parameters (i.e. the meteorology), generalised linear modelling (GLM) is introduced. GLM can be used to examine the interaction between a response parameter (in this case chl-a) and one or more predictors (here, meteorological forcing). GLM is an umbrella name for a family of models relating a response parameter to predictors, in which the response may follow one of several distributions. The characteristics of a GLM are: the response has a distribution that may be normal, binomial, Poisson, gamma, or inverse Gaussian, with parameters including a mean µ; a coefficient vector b defines a linear combination X*b of the predictors X; and a link function f(·) defines the link between the two as f(µ) = X*b (Dobson, 1990).
For this study the simplest sort of GLM was employed: it assumes a normal distribution of response and predictors and a linear relationship between them. The response parameter is chl-a (C) and the predictors are net heat flux (Q), sea surface temperature (T), wind speed (U) and PAR (P). The form of the model is:
y = Xβ + ε [Eqn 7.1]
where y is the vector of observations (chl-a), X is the matrix of predictors (meteorology), β is the vector of regression coefficients and ε is the error. The term X contains not only Q, T, U and P, but also terms for the interaction of two or more variables. An example of the full equation is:
C = ε + β1.Q + β2.T + β3.U + β4.P + β5.QT + β6.QU + β7.QP + β8.TU + β9.TP
+ β10.UP + β11.QTU + β12.QTP + β13.QUP + β14.TUP + β15.QTUP [Eqn 7.2]
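The structure of Eqn 7.2 can be illustrated with a short sketch. It is written in Python purely for illustration (the study itself used Matlab), and the function names `interaction_terms` and `design_row` are hypothetical helpers, not part of the original analysis. The sketch enumerates every product of one to four predictors, which reproduces the 15 β terms of Eqn 7.2 in the same order, and builds the corresponding row of X for one observation.

```python
from itertools import combinations

# Predictor symbols as used in the text: Q (net heat flux), T (sea surface
# temperature), U (wind speed) and P (PAR).
PREDICTORS = ["Q", "T", "U", "P"]


def interaction_terms(names):
    """Return the labels of all product terms, lowest order first."""
    terms = []
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            terms.append("".join(combo))
    return terms


def design_row(values, names=PREDICTORS):
    """Build one row of X: the product of predictor values for each term.

    `values` maps each predictor symbol to its value for one observation.
    """
    row = []
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            product = 1.0
            for name in combo:
                product *= values[name]
            row.append(product)
    return row


terms = interaction_terms(PREDICTORS)
# Illustrative values only, not data from the study.
example_row = design_row({"Q": 2.0, "T": 3.0, "U": 5.0, "P": 7.0})
print(len(terms))
print(terms)
```

Each entry of `example_row` lines up with the label at the same position in `terms`, so a fitted coefficient vector can be read off against the term names directly.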
The solution of the equation will be essentially a linear least squares fit to the observations (chl-a).
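As a rough illustration of what a linear least squares fit involves, the sketch below solves the normal equations (XᵀX)β = Xᵀy in plain Python. It is not the routine used in the study, and the names `solve_linear_system` and `least_squares` are hypothetical. The normal equations are used only because they are easy to follow; library routines typically rely on more numerically stable factorisations.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def least_squares(X, y):
    """Return (beta, rss) for the model y = X beta + error."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, rss


# Toy data (not from the study): y = 1 + 2x exactly, with a column of ones
# added to X so that the first coefficient acts as a constant term.
X_demo = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y_demo = [1.0, 3.0, 5.0, 7.0]
beta_demo, rss_demo = least_squares(X_demo, y_demo)
print(beta_demo, rss_demo)
```

The residual sum of squares returned alongside the coefficients is the quantity needed when comparing a simpler model against a more complex one.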
After the parameters are transformed to a normal distribution using the Box-Cox method (for details see Chapter 6.1.1) the Matlab function regress is used to solve the equation. The equation is solved initially for all terms, then the terms are sequentially removed and the equation solved again. The purpose of this is to determine which parameters, when removed, have a significant impact on the results of the model. The question is whether a more complex model adds to the explanatory power over a simpler model (i.e. one with fewer terms). A ‘significant impact’ is judged by the F-statistic:
F = \frac{(RSS_n - RSS_{n+m}) / m}{RSS_{n+m} / (t - (n + m))} [Eqn 7.3]
where RSS is the residual sum of squares, n is the number of terms in the simpler model, m is the difference in number of terms between the simple and the more complex model and t is the total number of observations. If F is greater than a critical value then the model is deemed to have been improved by the increasing complexity. A table of critical F values can be found at the back of almost every statistics textbook. Finding the critical value requires knowledge of the number of degrees of freedom both between the terms (i.e. the total number of terms) and within the terms (i.e. the number of observations).
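A minimal sketch of the comparison follows, assuming Eqn 7.3 is the standard F-test for nested models: the reduction in RSS per added term, divided by the RSS of the more complex model per remaining degree of freedom, t − (n + m). The function name `f_statistic` and the numbers in the example are illustrative assumptions, not values from the study.

```python
def f_statistic(rss_simple, rss_complex, n, m, t):
    """F-statistic for comparing a simpler model with a more complex one.

    rss_simple  -- RSS of the model with n terms
    rss_complex -- RSS of the model with n + m terms
    n           -- number of terms in the simpler model
    m           -- number of extra terms in the more complex model
    t           -- total number of observations
    """
    if m <= 0:
        raise ValueError("the complex model must have more terms (m > 0)")
    dof = t - (n + m)
    if dof <= 0:
        raise ValueError("not enough observations for this many terms")
    return ((rss_simple - rss_complex) / m) / (rss_complex / dof)


# Illustrative numbers only: adding 2 terms to a 1-term model lowers the RSS
# from 10 to 6 with 23 observations.
f_demo = f_statistic(rss_simple=10.0, rss_complex=6.0, n=1, m=2, t=23)
print(f_demo)
```

The returned value would then be compared with the critical F value for m and t − (n + m) degrees of freedom at the chosen significance level.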
As varying the order of the predictors in the equation produces different results, a series of tests is required. An example of the result of this process is shown in Table 7.1. These results are for the East Greenland region in 2002. The first data row compares a model with just P (letters in first column) to a model with an increasing number of terms: first a model with just P is compared to a model with P and U, then a model with just P is compared to a model with P, U and T, etc. (letters in top row). The second data row compares a model with P and U together to a model with P, U and T, and so on. The highlighted numbers are those above the relevant critical F value. In this case a model with P on its own is always improved by adding more terms (row 2), and similarly for a model with P and U (row 3). Once T is added (row 4) the model's explanatory power is increased, and thereafter only a couple of the three-variable interaction terms improve the model; and so on. This indicates that a combination of P, U and T is a good predictor for chl-a.
A series of tests was carried out for each of the CIS, RR and EG zones, dividing the data into pre-bloom and post-bloom sections (pre-bloom is day of year 50 to 150: 19th February to 30th May; post-bloom is days 150 to 250: 30th May to 7th September). A selection of the results is presented in Tables 7.2 to 7.7 for the pre-bloom and post-bloom CIS, RR and EG regions respectively, in the same format as the example above. In summary, in the pre-bloom CIS region chl-a is well explained by net heat flux and PAR. Post-bloom, temperature and wind speed dominate the model, with PAR playing a secondary role. No one meteorological factor dominates the model in the pre-bloom RR region, although the interaction term between Q and P is influential as the model becomes more complex. Post-bloom, temperature and wind speed are influential and, unlike in the other regions, adding the higher order interaction terms to the model improves it. No one factor dominates the model in the pre-bloom EG region, whilst post-bloom, wind speed has limited influence.
The model suggests that the balance of forces affecting the chl-a is different in each of the three biological zones. The full 16-term GLM equation was used to derive the coefficients (βn of equation 7.2) which maximised the
correlation between the predicted chl-a and the measured chl-a. The equation derived at one location (representative of, for example, the CIS zone) was then applied across the whole basin. For the CIS the coefficients were derived for the time series of data at 60 °N 36 °W; for the RR at 58.5 °N 32.5 °W, and for the EG at 62 °N 40 °W (as in previous analyses). At each pixel the relevant
meteorological data and the derived coefficients were used to estimate chl-a. The correlation coefficient between the measured chl-a and the predicted chl-a, as derived separately for the CIS, RR and EG zones, is plotted in Figure 7.1. Although the coefficients are derived for one pixel only, the area over which the predicted chl-a correlates well with the measured chl-a extends throughout the biological zones. Within the region for which the coefficients were derived the correlation is ~0.7, but quickly decreases to <0.3 outside of the region. The boundaries between the regions are thus well delineated and correspond closely to the biological zones deduced from the EOF analysis (see Chapter 5.3 and Figure 5.9). This is further confirmation that the biological zones used throughout this study are robust, but more than that, it also indicates that the balance of physical processes influencing the chl-a signal is different in each region, and that this balance is different pre- and post-bloom. The conditions for initiation of the bloom are examined in the next section and the post-bloom period is returned to in Section 7.3.
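The per-pixel step described above (apply the coefficients derived at one location to the meteorological terms at another pixel, then correlate the predicted and measured chl-a time series) can be sketched as follows. This is an illustrative Python outline, not the code used in the study; `predict` and `pearson_r` are hypothetical helper names and the numbers are made up for the example.

```python
from math import sqrt


def predict(beta, X):
    """Predicted chl-a for each time step: the linear combination X beta."""
    return [sum(b * v for b, v in zip(beta, row)) for row in X]


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)


# Made-up example: coefficients from a reference location applied to the
# predictor terms at one pixel (one row per time step).
beta_ref = [0.5, 2.0]
X_pixel = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
measured = [0.6, 2.4, 4.6]

predicted = predict(beta_ref, X_pixel)
r_pixel = pearson_r(predicted, measured)
print(predicted, r_pixel)
```

Repeating this for every pixel, with the same coefficient vector, would give a map of correlation coefficients of the kind plotted in Figure 7.1.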