The basic idea of regression splines is to work with only a small number of knots ($\tilde{M} \ll N$).
One has to find suitable knots $K$, since both the placement and the number of knots determine the roughness of the curve and hence the degree of smoothing. This may be understood as an adaptive selection of the knots and their placement. The knots may be placed uniformly over the data, at appropriate quantiles, or by more complex data-driven schemes.
3.1 Short Review on Splines in Semi-Parametric Mixed Models 27
For a detailed discussion of these issues, see Friedman & Silverman (1989), Wand (2000) or Stone, Hansen, Kooperberg & Truong (1997).
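The two simplest placement rules mentioned above, uniform and quantile-based, can be sketched as follows. This is only an illustration: the function names `uniform_knots` and `quantile_knots` and the skewed sample are assumptions, not part of the original text.

```python
import numpy as np

def uniform_knots(u, n_knots):
    """Equidistant interior knots over the observed range of u."""
    return np.linspace(u.min(), u.max(), n_knots + 2)[1:-1]

def quantile_knots(u, n_knots):
    """Interior knots at equally spaced sample quantiles of u."""
    probs = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    return np.quantile(u, probs)

rng = np.random.default_rng(0)
u = rng.exponential(scale=1.0, size=500)  # skewed covariate (illustrative data)

print(uniform_knots(u, 5))   # evenly spread over [min(u), max(u)]
print(quantile_knots(u, 5))  # concentrated where the data are dense
```

For a skewed covariate the quantile rule puts most knots where observations are dense, whereas the uniform rule may leave some knot intervals with very few data points.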
Another idea is to take a fixed (equidistant) grid of knots. That is the main difference to smoothing splines, since the knots are chosen individually (how many knots are used and the range of the interval the knots come from). A spline function for a fixed grid without penalization is visualized in Figure 3.3. For this figure and Figure 3.4, a random effects model with only one smooth covariate ($\alpha(u) = \sin(u)$) was used. For this purpose, forty clusters with five repeated measurements each were simulated. The random effect was assumed to be $N(0, \sigma_b^2)$ with $\sigma_b^2 = 2$, and the error term was assumed to be $N(0, \sigma_\epsilon^2)$ with $\sigma_\epsilon^2 = 2$. In Figures 3.3 and 3.4, the concept of regression splines was used. In Figure 3.3, the smoothing parameter $\lambda$ was set to zero, and in Figure 3.4 it was set to sixty. It is obvious that the roughness of the curve has to be penalized. On the other hand, a spline function is desired that is independent of the placement and the number of knots.
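The simulation design described above can be sketched in a few lines. The distribution of the covariate is not stated in the text; drawing $u$ uniformly on $[0, 6]$ (the range shown on the horizontal axis of the figures) is an assumption, as are all variable names.

```python
import numpy as np

rng = np.random.default_rng(1)

n_clusters, n_rep = 40, 5          # forty clusters, five repeated measurements
sigma2_b, sigma2_eps = 2.0, 2.0    # variances of random effect and error term

# Assumption: covariate values drawn uniformly on [0, 6].
u = rng.uniform(0.0, 6.0, size=(n_clusters, n_rep))

# One random intercept per cluster, shared by its five measurements.
b = rng.normal(0.0, np.sqrt(sigma2_b), size=(n_clusters, 1))
eps = rng.normal(0.0, np.sqrt(sigma2_eps), size=(n_clusters, n_rep))

y = np.sin(u) + b + eps            # alpha(u) = sin(u)
print(y.shape)
```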
Figure 3.3: The red points are the given data; the blue line is a spline computed with 40 knots and B-splines of degree 3 (no penalization).
Figure 3.4: The spline in the figure is the optimal spline with respect to the penalty and $\lambda$. Using the penalty term reduces the influence of the knots. Here 40 knots and B-splines of degree 3 were used.
Again, the penalization problem for given $\rho$ is
$$\sum_{i=1}^{N} l_i(\beta, \alpha, \theta) - \lambda P(s(\cdot)) \to \min, \tag{3.9}$$
where $s(u)$ has the functional form $s(u) = \sum_{j=1}^{M} \phi_j(u)\alpha_j$ and $P(s(\cdot))$ is a roughness penalty, for example $P(s(\cdot)) = \int (s''(u))^2\,du$ as a measure of the curvature or the roughness of the function $s(\cdot)$. If the basis function representation of a spline is used, the penalty term has the form
$$P(s(\cdot)) = \alpha^T K \alpha, \tag{3.10}$$
where $K$ is a matrix with entries $K_{ij} = \int \phi_i''(u)\,\phi_j''(u)\,du$.
Eilers & Marx (1996) introduced a penalty term in which adjacent B-spline coefficients are connected to each other in a very distinct way. The penalty term is based on $\tilde{K} = (D_l)^T D_l$, where $D_l$ is a contrast matrix of order $l$ which contrasts polynomials of order $l$. Using B-splines with the penalty matrix $\tilde{K}$, one penalizes the differences between adjacent coefficients in the form $\lambda \alpha^T \tilde{K} \alpha = \lambda \sum_j \{\Delta^l \alpha_j\}^2$. Here $\Delta$ is the difference operator with $\Delta\alpha_j = \alpha_{j+1} - \alpha_j$, $\Delta^2\alpha_j = \Delta(\Delta\alpha_j)$, etc.; for details see Eilers & Marx (1996).
Usually the order of the penalized differences is the same as the order of the B-spline.
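The identity between the quadratic form and the sum of squared differences can be checked numerically. The following is a minimal sketch; the values of `M`, `l`, `lam` and the random coefficient vector are arbitrary illustrative choices.

```python
import numpy as np

M, l, lam = 8, 2, 60.0
rng = np.random.default_rng(2)
alpha = rng.normal(size=M)               # arbitrary coefficient vector

# Differencing the rows of the identity matrix l times gives the
# (M - l) x M matrix that maps alpha to its l-th order differences.
D_l = np.diff(np.eye(M), n=l, axis=0)
K_tilde = D_l.T @ D_l

quad_form = lam * alpha @ K_tilde @ alpha
sum_sq = lam * np.sum(np.diff(alpha, n=l) ** 2)

print(np.isclose(quad_form, sum_sq))
```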
In Figure 3.4, one can see that penalization reduces the influence of the knots, which affects the roughness of the curve. Penalization thus also reduces the influence of the number and placement of the knots: a different number of knots with different placements would deliver a quite similar spline function.
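The matrix $D_l$ corresponding to the B-spline penalization (see Eilers & Marx (1996)) can be obtained by a recursive scheme. With $D$ being the $(M-1) \times M$ contrast matrix
$$D = \begin{pmatrix} -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{pmatrix},$$
one obtains higher-order differences by the recursion $D_l = D D_{l-1}$, which is an $(M-l) \times M$ matrix; in each step the leading factor $D$ has to be taken with the matching dimension $(M-l) \times (M-l+1)$. This can be used for a simpler and more intuitive definition of the penalty than equation (3.10).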
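The recursion can be written out directly. In this sketch the helper names `first_diff` and `diff_matrix` are hypothetical; the only point is that the leading first-difference factor shrinks by one row and one column in every step.

```python
import numpy as np

def first_diff(m):
    """(m - 1) x m first-order difference matrix with rows (..., -1, 1, ...)."""
    return np.diff(np.eye(m), axis=0)

def diff_matrix(M, l):
    """D_l built by D_l = D D_{l-1}, where each D has the matching size."""
    D_l = first_diff(M)                     # D_1, shape (M - 1, M)
    for order in range(2, l + 1):
        # D_{order-1} has M - order + 1 rows, so the leading factor is
        # the (M - order) x (M - order + 1) first-difference matrix.
        D_l = first_diff(M - order + 1) @ D_l
    return D_l

D3 = diff_matrix(7, 3)
print(D3.shape)   # (4, 7)
print(D3[0])      # first row holds the third-difference weights -1, 3, -3, 1
```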
A similar argument is used for the truncated power series basis, where the penalty matrix is simply set to
$$K = \mathrm{bdiag}(0_{d \times d}, I_{M-d}),$$
where $0_{d \times d}$ is the $d \times d$ zero matrix and $I_{M-d}$ is the identity matrix of dimension $M - d$.
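The effect of this block-diagonal penalty can be illustrated with a small penalized least-squares fit. This is a simplified sketch, not the mixed-model estimation used in the text: it assumes independent errors, a linear truncated power basis, and reads $d$ as the number of unpenalized polynomial coefficients (here $d = 2$). All names and data are illustrative.

```python
import numpy as np

def truncated_power_basis(u, knots, p=1):
    """Columns 1, u, ..., u^p, (u - k_1)_+^p, ..., (u - k_q)_+^p."""
    poly = np.vander(u, p + 1, increasing=True)
    trunc = np.clip(u[:, None] - knots[None, :], 0.0, None) ** p
    return np.hstack([poly, trunc])

rng = np.random.default_rng(3)
u = np.sort(rng.uniform(0.0, 6.0, 200))
y = np.sin(u) + rng.normal(0.0, 0.5, u.size)   # illustrative data, not the mixed model

knots = np.linspace(0.0, 6.0, 12)[1:-1]        # 10 interior knots
X = truncated_power_basis(u, knots, p=1)
M, d = X.shape[1], 2                           # d unpenalized polynomial coefficients

K = np.zeros((M, M))
K[d:, d:] = np.eye(M - d)                      # K = bdiag(0_{d x d}, I_{M-d})

def penalized_fit(lam):
    """Penalized least squares: minimize ||y - X a||^2 + lam * a' K a."""
    return np.linalg.solve(X.T @ X + lam * K, X.T @ y)

for lam in (0.0, 60.0):
    a = penalized_fit(lam)
    print(lam, a @ K @ a)                      # penalty value shrinks as lam grows
```

Only the coefficients of the truncated terms are shrunk; the polynomial part stays unpenalized, which is what the zero block expresses.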
3.1.5 Identification Problems: The Need of a Semi-Parametric Representation
Problems in additive modeling based on splines arise if intercepts or splines for other covariates are used. Without further restrictions, the resulting splines are not uniquely identifiable. This is illustrated in the following example.
Example 3.1: Rewriting an additive term to a semi-parametric term. One can rewrite the additive term without parametric terms
$$\alpha(u) = 10 + u^2, \quad \text{for } u \in [-3, 3],$$
to the semi-parametric representation
$$\alpha(u) = \beta_0 + \tilde{\alpha}(u) = 10 + u^2, \quad \text{for } u \in [-3, 3],$$
with $\beta_0 = 10$ and $\tilde{\alpha}(u) = u^2$. But $\beta_0 = 5$ and $\tilde{\alpha}(u) = 5 + u^2$ is also a valid semi-parametric parametrization of the additive term $\alpha(u)$.
The interest is often in the population mean level and in the absolute deviations from this mean as a function of a continuous covariate. It is a natural idea to center the additive term of the continuous covariate around zero. Figure 3.5 shows the difference in interpretation. A nice benefit of this restriction is that the semi-parametric representation is identifiable. □
Figure 3.5: (a) may be seen as pure additive spline smoothing; (b) may be seen as $\alpha(u)$ describing the absolute deviation from zero. In this case the level is $\tilde{\tilde{\beta}}_0 = \frac{\int_a^b \alpha(u)\,du}{b - a}$. The desired property of the additive term $\tilde{\tilde{\alpha}}(u)$ is $\int_a^b \tilde{\tilde{\alpha}}(u)\,du = 0$.
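For the term of Example 3.1, the level and the centered term can be computed exactly with polynomial arithmetic. This is a small sketch; the variable names are illustrative, and the alternative split $\beta_0 = 5$, $\tilde{\alpha}(u) = 5 + u^2$ is included to show that centering leads to the same level.

```python
from numpy.polynomial import Polynomial

a, b = -3.0, 3.0
alpha = Polynomial([10.0, 0.0, 1.0])              # alpha(u) = 10 + u^2

antideriv = alpha.integ()
level = (antideriv(b) - antideriv(a)) / (b - a)   # mean of alpha over [a, b]
alpha_centered = alpha - level                     # integrates to zero on [a, b]

print(level)                 # 13.0
print(alpha_centered.coef)   # coefficients of u^2 - 3

# The alternative split beta_0 = 5, alpha_tilde(u) = 5 + u^2 gives the same level.
alt = Polynomial([5.0, 0.0, 1.0])
alt_level = 5.0 + (alt.integ()(b) - alt.integ()(a)) / (b - a)
print(alt_level)             # 13.0
```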
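Also two or more additive terms have to be rewritten to semi-parametric terms, since the additive terms should be identifiable. This is illustrated in the following example.
Example 3.2: Rewriting additive terms to semi-parametric terms. One can combine the additive terms
$$\alpha^{(1)}(u) = 10 + u^2, \quad \text{for } u \in [-3, 3] = [a^{(1)}, b^{(1)}],$$
$$\alpha^{(2)}(v) = -5 + v^3, \quad \text{for } v \in [-3, 3] = [a^{(2)}, b^{(2)}],$$
to the additive predictor
$$\mu_{add}(u, v) = \alpha^{(1)}(u) + \alpha^{(2)}(v).$$
Again, $\tilde{\alpha}^{(1)}(u) = 5 + u^2$ and $\tilde{\alpha}^{(2)}(v) = v^3$ correspond to the same additive predictor, since $\mu_{add}(u, v) = \tilde{\alpha}^{(1)}(u) + \tilde{\alpha}^{(2)}(v)$. Using the idea described in Example 3.1, one gets identifiable additive terms by reparametrizing the additive predictor to semi-parametric terms,
$$\mu_{add}(u, v) = \tilde{\tilde{\beta}}_0 + \tilde{\tilde{\alpha}}^{(1)}(u) + \tilde{\tilde{\alpha}}^{(2)}(v),$$
with the properties $\int_{a^{(1)}}^{b^{(1)}} \tilde{\tilde{\alpha}}^{(1)}(u)\,du = 0$, $\int_{a^{(2)}}^{b^{(2)}} \tilde{\tilde{\alpha}}^{(2)}(v)\,dv = 0$ and
$$\tilde{\tilde{\beta}}_0 = \frac{\int_{a^{(1)}}^{b^{(1)}} \alpha^{(1)}(u)\,du}{b^{(1)} - a^{(1)}} + \frac{\int_{a^{(2)}}^{b^{(2)}} \alpha^{(2)}(v)\,dv}{b^{(2)} - a^{(2)}}.$$
□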
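As a worked check of Example 3.2, the level and the centered terms can be computed by hand from the definitions above; the arithmetic below uses only the two functions given in the example.

```latex
\begin{align*}
\frac{1}{6}\int_{-3}^{3} (10 + u^2)\,du &= \frac{60 + 18}{6} = 13, &
\frac{1}{6}\int_{-3}^{3} (-5 + v^3)\,dv &= \frac{-30 + 0}{6} = -5, \\
\tilde{\tilde{\beta}}_0 &= 13 + (-5) = 8, \\
\tilde{\tilde{\alpha}}^{(1)}(u) &= u^2 - 3, &
\tilde{\tilde{\alpha}}^{(2)}(v) &= v^3, \\
\tilde{\tilde{\beta}}_0 + \tilde{\tilde{\alpha}}^{(1)}(u) + \tilde{\tilde{\alpha}}^{(2)}(v)
  &= 5 + u^2 + v^3 = \mu_{add}(u, v).
\end{align*}
```

Both centered terms integrate to zero over $[-3, 3]$, and the predictor is unchanged, so this parametrization is the unique one satisfying the restrictions.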
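So the basic idea for identifiable additive terms $\tilde{\alpha}(u)$ is to rewrite the additive terms $\alpha(u)$ in a semi-parametric form with $\int_a^b \tilde{\alpha}(u)\,du = 0$. Thus $\tilde{\alpha}(u)$ has the form
$$\tilde{\alpha}(u) = \alpha(u) - \int_a^b \frac{\alpha(u)}{b - a}\,du. \tag{3.11}$$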
Using simple analysis, one can show that $\tilde{\alpha}(u)$ as defined in equation (3.11) satisfies $\int_a^b \tilde{\alpha}(u)\,du = 0$.
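The calculation is short. Writing the subtracted level with a separate integration variable $t$ (a notational choice made here for clarity), it is a constant with respect to $u$, so integrating it over $[a, b]$ multiplies it by $b - a$:

```latex
\begin{align*}
\int_a^b \tilde{\alpha}(u)\,du
  &= \int_a^b \alpha(u)\,du - \int_a^b \left( \frac{1}{b-a} \int_a^b \alpha(t)\,dt \right) du \\
  &= \int_a^b \alpha(u)\,du - \frac{b-a}{b-a} \int_a^b \alpha(t)\,dt \\
  &= 0.
\end{align*}
```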
Since $\alpha(u)$ is often approximated by a spline function composed of basis functions, $\alpha(u) \approx \phi^T(u)\alpha$ with $\phi^T(u) = (\phi_1(u), \dots, \phi_M(u))$, the discrete version based on the coefficients of the basis functions can also be used to obtain regularized coefficients $\tilde{\alpha}^T = (\tilde{\alpha}_1, \dots, \tilde{\alpha}_M)$ with the restriction $\sum_{j=1}^{M} \tilde{\alpha}_j = 0$. A regularized version may be obtained by
$$\tilde{\alpha} = \alpha - \frac{\sum_{j=1}^{M} \alpha_j}{M}, \qquad \tilde{\alpha}_M = -\sum_{j=1}^{M-1} \tilde{\alpha}_j.$$
The term $\frac{\sum_{j=1}^{M} \alpha_j}{M}$ is often understood as a shift in the level of the function $\alpha(u)$. For detailed information on these restrictions, see Appendix A.1.
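The discrete centering of the coefficients amounts to subtracting their mean. A minimal sketch with an arbitrary, purely illustrative coefficient vector:

```python
import numpy as np

alpha = np.array([2.0, -1.0, 0.5, 4.0, 1.5])   # illustrative coefficients
shift = alpha.mean()                            # sum_j alpha_j / M
alpha_tilde = alpha - shift                     # centered coefficients

# The centered coefficients sum to zero, so the last one is determined
# by the first M - 1.
print(np.isclose(alpha_tilde.sum(), 0.0))
print(np.isclose(alpha_tilde[-1], -alpha_tilde[:-1].sum()))
```

Because the last coefficient is fixed by the others, only $M - 1$ coefficients remain free under this restriction.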