
Several alternatives are available for specifying priors on the unknown (smooth) functions $f_j$, $j = 1, \ldots, p$. These include basis function approaches with adaptive knot selection (e.g. Denison et al., 1998; Biller, 2000) and approaches based on smoothness priors. In addition, several alternatives have recently been proposed for specifying a smoothness prior for the effect $f$ of a metrical covariate $x$. These are random walk priors (Fahrmeir and Lang, 2001), Bayesian smoothing splines (Hastie and Tibshirani, 2000) and Bayesian P-splines (Lang and Brezger, 2005). Our focus in this work is on random walk and P-spline priors.

First and second order random walk

Let us consider the case of a metrical covariate $x$ with equally spaced observations $x_i$, $i = 1, \ldots, m$, $m \le n$. Then $x_{(1)} < \ldots < x_{(m)}$ defines the ordered sequence of distinct covariate values. Here $m$ denotes the number of different observations for $x$ in the data set. A common approach in dynamic or state

2.3. PRIOR DISTRIBUTIONS 31

space models is to estimate one parameter $f(t)$ for each distinct $x_{(t)}$. Define $f(t) := f(x_{(t)})$ and let $f = (f(1), \ldots, f(t), \ldots, f(m))'$ denote the vector of function evaluations. Then a first order random walk prior for $f$ is defined by

$$f(t) = f(t-1) + u(t). \tag{2.7}$$

A second order random walk is given by

$$f(t) = 2f(t-1) - f(t-2) + u(t), \tag{2.8}$$

$$u(t) \sim N(0, \tau^2),$$

with diffuse priors $f(1) \propto \text{const}$ for the first order random walk, and $f(1) \propto \text{const}$ and $f(2) \propto \text{const}$ for the second order random walk, for the initial values. A first order random walk penalizes abrupt jumps $f(t) - f(t-1)$ between successive states, while a second order random walk penalizes deviations from the linear trend $2f(t-1) - f(t-2)$. In addition, the variance $\tau^2$ controls the degree of smoothness of $f$.

Equivalently, the first order random walk (2.7) can be written in conditional form as

$$f(t) \mid f(t-1), \tau^2 \sim N(f(t-1), \tau^2). \tag{2.9}$$

Random walk priors may be equivalently defined in a more symmetric form by specifying the conditional distribution of the function evaluation $f(t)$ given its left and right neighbors. That is, we can write the priors (2.7) and (2.8) in the general form

$$f \mid \tau^2 \propto \exp\left(-\frac{1}{2\tau^2} f' K f\right). \tag{2.10}$$

Here $K$ is the penalty matrix that penalizes too abrupt jumps between neighboring parameters. In most cases $K$ does not have full rank, which implies that $f \mid \tau^2$ follows a partially improper Gaussian prior

$$f \mid \tau^2 \sim N(0, \tau^2 K^{-}),$$

where $K^{-}$ is a generalized inverse of the penalty matrix $K$.
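The link between the random walk priors and the penalty matrix can be sketched numerically. The following is a minimal illustration, assuming the common construction $K = D'D$ with $D$ a first or second order difference matrix; the function name `rw_penalty`, the example vector `f`, and the use of the Moore-Penrose pseudoinverse as one particular generalized inverse are choices made here for illustration, not taken from the text.

```python
import numpy as np

def rw_penalty(m: int, order: int = 1) -> np.ndarray:
    """Penalty matrix K = D'D for a random walk of the given order.

    D is the (m - order) x m difference matrix, so f'Kf is the sum of
    squared first (order=1) or second (order=2) differences of f.
    """
    D = np.diff(np.eye(m), n=order, axis=0)
    return D.T @ D

m = 6
K1 = rw_penalty(m, order=1)
K2 = rw_penalty(m, order=2)

# Hypothetical vector of function evaluations f(1), ..., f(m).
f = np.array([0.0, 1.0, 3.0, 2.0, 2.5, 4.0])

# The quadratic form reproduces the sum of squared increments u(t).
quad1 = f @ K1 @ f
quad2 = f @ K2 @ f

# K is rank deficient: constants (order 1) and linear trends (order 2)
# are not penalized, which is why the prior is partially improper.
rank1 = np.linalg.matrix_rank(K1)  # m - 1
rank2 = np.linalg.matrix_rank(K2)  # m - 2

# The Moore-Penrose pseudoinverse is one valid generalized inverse of K.
K1_ginv = np.linalg.pinv(K1)
```

The rank deficiency of one (first order) or two (second order) corresponds to the diffuse priors on the initial values.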

For the case of non-equally spaced observations, random walk or autoregressive priors have to be modified to account for the unequal distances $\delta_t = x_{(t)} - x_{(t-1)}$ between observations.

Random walks of first order are now specified by

$$f(t) = f(t-1) + u(t), \tag{2.11}$$

$$u(t) \sim N(0, \delta_t \tau^2),$$

i.e., by adjusting the error variance from $\tau^2$ to $\delta_t \tau^2$.

Random walks of second order are

$$f(t) = \left(1 + \frac{\delta_t}{\delta_{t-1}}\right) f(t-1) - \frac{\delta_t}{\delta_{t-1}} f(t-2) + u(t), \tag{2.12}$$

$$u(t) \sim N(0, w_t \tau^2),$$

where $w_t$ is an appropriate weight. Several choices of weights are conceivable. The simplest one is $w_t = \delta_t$, as for the first order random walk; see Fahrmeir and Lang (2001a) for a discussion.
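The modified random walks for unequal spacing can also be expressed as a penalty matrix. The sketch below assumes the form $K = D' W^{-1} D$, where the rows of $D$ encode the increments $u(t)$ from (2.11) or (2.12) and $W = \mathrm{diag}(w_t)$ with the simplest weights $w_t = \delta_t$. The function name `rw_penalty_unequal` and the example covariate values are hypothetical.

```python
import numpy as np

def rw_penalty_unequal(x, order: int = 1) -> np.ndarray:
    """Penalty K = D' W^{-1} D for sorted, distinct covariate values x.

    Assumes weights w_t = delta_t, the simplest choice mentioned in the text.
    """
    x = np.asarray(x, dtype=float)
    m = x.size
    delta = np.diff(x)  # delta[t - 1] = x[t] - x[t - 1] (0-based t)
    if order == 1:
        D = np.diff(np.eye(m), axis=0)  # rows: f(t) - f(t-1)
        w = delta
    elif order == 2:
        D = np.zeros((m - 2, m))
        for t in range(2, m):
            ratio = delta[t - 1] / delta[t - 2]  # delta_t / delta_{t-1}
            # u(t) = f(t) - (1 + ratio) f(t-1) + ratio f(t-2)
            D[t - 2, t] = 1.0
            D[t - 2, t - 1] = -(1.0 + ratio)
            D[t - 2, t - 2] = ratio
        w = delta[1:]
    else:
        raise ValueError("order must be 1 or 2")
    return D.T @ np.diag(1.0 / w) @ D

# Hypothetical unequally spaced covariate values.
x = np.array([0.0, 0.5, 2.0, 2.5, 4.0])
K1u = rw_penalty_unequal(x, order=1)
K2u = rw_penalty_unequal(x, order=2)

# A straight line in x is left unpenalized by the second order prior.
line = 2.0 + 3.0 * x
penalty_on_line = line @ K2u @ line
```

With equally spaced values at unit distance the construction reduces to the equally spaced penalty, since all ratios and weights equal one.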

Bayesian P-splines

A closely related approach for metrical covariates is based on P-splines, introduced by Eilers and Marx (1996). The basic assumption of this approach is that the unknown function $f_j$ can be approximated by a spline of degree $l$ with equally spaced knots $x_{min} = \xi_0 < \xi_1 < \ldots < \xi_{r-1} < \xi_r = x_{max}$ within the domain of $x_j$. The domain from $x_{min}$ to $x_{max}$ can be divided into $n'$ equal intervals by $n' + 1$ knots. Each interval will be covered by $l + 1$ B-splines of degree $l$. The total number of knots for construction of the B-splines will be $n' + 2l + 1$. The number of B-splines in the regression


is $n = n' + l$. It is well known that such a spline can be written as a linear combination of $M = r + l$ B-spline basis functions $B_{jp}$, i.e.

$$f_j(x_{ij}) = \sum_{p=1}^{M} \beta_{jp} B_{jp}(x_{ij}).$$

The basis functions $B_{jp}$ are defined locally in the sense that they are nonzero only on a domain spanned by $2 + l$ knots. The $n \times M$ design matrix $X_j$ for P-splines is more intricate than in the case of random walk priors. Each row $i$ contains the values of the B-spline basis functions evaluated at $x_{ij}$, hence $X_j(i, p) = B_{jp}(x_{ij})$. In accordance with the properties of B-splines (see De Boor, 1978), each row of $X_j$ has at most $l + 1$ non-zero values. As for the number of knots, Eilers and Marx (1996) recommended between 20 and 40 inner knots and introduced a penalty on the differences between regression coefficients of adjacent B-spline basis functions in order to generate a smoothing effect. In our analysis, we typically choose B-splines of degree 3 and 10 intervals, and second order random walk priors on the B-spline regression coefficients.
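The structure of the P-spline design matrix can be illustrated with a short sketch. It assumes the standard Cox-de Boor recursion on equally spaced knots that extend beyond both ends of the domain; the function name `bspline_design`, its parameters, and the simulated covariate values are hypothetical and do not refer to any particular software used in the analysis.

```python
import numpy as np

def bspline_design(x, x_min, x_max, n_intervals=10, degree=3):
    """Design matrix with entries B_p(x_i) for equally spaced knots.

    Uses the Cox-de Boor recursion; returns n_intervals + degree columns.
    """
    x = np.asarray(x, dtype=float)
    h = (x_max - x_min) / n_intervals
    # n_intervals + 2 * degree + 1 knots, `degree` of them beyond each boundary.
    knots = x_min + h * np.arange(-degree, n_intervals + degree + 1)
    # Degree 0: indicator functions of the half-open knot intervals.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    # Assign x == x_max to the last interval inside the domain.
    at_max = x == x_max
    B[at_max, :] = 0.0
    B[at_max, n_intervals + degree - 1] = 1.0
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, B.shape[1] - 1))
        for k in range(B.shape[1] - 1):
            left = (x - knots[k]) / (knots[k + d] - knots[k])
            right = (knots[k + d + 1] - x) / (knots[k + d + 1] - knots[k + 1])
            B_new[:, k] = left * B[:, k] + right * B[:, k + 1]
        B = B_new
    return B

# Hypothetical covariate values on [0, 1]; degree 3 and 10 intervals as in the text.
x = np.linspace(0.0, 1.0, 50)
X = bspline_design(x, 0.0, 1.0, n_intervals=10, degree=3)
M = X.shape[1]  # 10 + 3 basis functions

# Second order random walk penalty on the B-spline coefficients.
D = np.diff(np.eye(M), n=2, axis=0)
K = D.T @ D
```

Each row of `X` sums to one and has at most `degree + 1` non-zero entries, which is the sparsity property referred to above.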

Spatial Covariates

Consider first that the spatial index $s \in \{1, \ldots, S\}$ represents a location or site in connected geographical regions. It is assumed that neighboring sites that share boundaries are more homogeneous than arbitrary pairs of sites. Therefore, for a valid prior definition, a set of neighbors must be defined for each site $s$. Here, sites $s$ and $t$ are neighbors if they share a common boundary. Depending on the application, the spatial effect may be further split into a spatially correlated (structured) and an uncorrelated (unstructured) effect, i.e. $f_{spat} = f_{str} + f_{unstr}$. A rationale is that a spatial effect is usually a surrogate for many unobserved influential factors, some of which may obey a strong spatial structure while others may be present only locally. Besag, York and Mollié (1991) proposed a Markov random field prior for the correlated spatial effect $f_{str}$. The spatial smoothness prior for the function evaluations is

$$f_{str,s} \mid f_{str,t},\ t \neq s,\ \tau_{str}^2 \sim N\left(\sum_{t \in \delta_s} \frac{f_{str,t}}{N_s},\ \frac{\tau_{str}^2}{N_s}\right), \tag{2.13}$$

where $N_s$ is the number of adjacent sites and $t \in \delta_s$ denotes that site $t$ is a neighbor of site $s$. Thus the (conditional) mean of $f_{str,s}$ is an unweighted average of the function evaluations of neighboring sites. Note that for spatial data conditioning is undirected, since there is no natural ordering of the sites $s$ as there is for metrical covariates.

In a general form, (2.13) can be given by

$$f_{str,s} \mid f_{str,t},\ t \neq s,\ \tau_{str}^2 \sim N\left(\sum_{t \in \delta_s} \frac{w_{st}}{w_{s+}} f_{str,t},\ \frac{\tau_{str}^2}{w_{s+}}\right), \tag{2.14}$$

where $w_{st}$ are known weights and $w_{s+}$ denotes the marginal sum of $w_{st}$ over the missing subscript; equal weights $w_{st} = 1$ recover (2.13). Such a prior is called a Gaussian intrinsic autoregression. For more details, see Besag et al. (1991) and Besag and Kooperberg (1995).

The design matrix $X_{str}$ is an $n \times S$ incidence matrix whose entry in the $i$-th row and $s$-th column is equal to one if observation $i$ has been observed at location $s$ and zero otherwise.
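As a small illustration of this incidence matrix, the sketch below builds it from a vector of site labels. The function name `incidence_matrix`, the 0-based site labels, and the example values are hypothetical.

```python
import numpy as np

def incidence_matrix(site_of_obs, n_sites: int) -> np.ndarray:
    """n x S matrix with a one in row i, column s if observation i is at site s."""
    site_of_obs = np.asarray(site_of_obs)
    X = np.zeros((site_of_obs.size, n_sites))
    X[np.arange(site_of_obs.size), site_of_obs] = 1.0
    return X

# Hypothetical data: six observations located at four sites (labels 0..3).
site_of_obs = np.array([0, 2, 2, 1, 3, 0])
X_str = incidence_matrix(site_of_obs, n_sites=4)

# Multiplying by the vector of spatial effects picks out, for each
# observation, the effect of the site where it was observed.
f_str = np.array([0.3, -0.1, 0.5, 0.0])
spatial_effect_per_obs = X_str @ f_str
```

Several observations may share a site, so columns can contain more than one entry equal to one, while each row contains exactly one.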

For the uncorrelated effect, we assume i.i.d. Gaussian random effects, i.e.

$$f_{unstr}(s) \sim N(0, \tau_{unstr}^2), \quad s = 1, \ldots, S.$$

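Formally, the priors for $f_{str}$ and $f_{unstr}$ can both be brought into the form (2.10). For $f_{str}$, the elements of $K$ are given by

$$k_{ss} = w_{s+} \quad \text{and} \quad k_{st} = \begin{cases} -w_{st}, & t \in \delta_s, \\ 0, & \text{otherwise,} \end{cases}$$

so that $k_{st} = -1$ for neighboring sites under the unweighted prior (2.13).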
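The penalty matrix of the Markov random field prior can be sketched for a small map. The example assumes the unweighted case (all neighbor weights equal to one, so that $w_{s+} = N_s$); the function name `mrf_penalty`, the neighborhood structure, and the effect values are hypothetical.

```python
import numpy as np

def mrf_penalty(neighbors: dict) -> np.ndarray:
    """Penalty matrix with k_ss = N_s and k_st = -1 for neighboring sites."""
    S = len(neighbors)
    K = np.zeros((S, S))
    for s, nb in neighbors.items():
        K[s, s] = len(nb)
        for t in nb:
            K[s, t] = -1.0
    return K

# Hypothetical connected map with four sites (0-based labels).
neighbors = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
K = mrf_penalty(neighbors)

# Hypothetical structured spatial effects.
f = np.array([0.2, 0.0, -0.4, 0.6])

# Conditional mean of each site: unweighted average over its neighbors.
n_neighbors = np.diag(K)
cond_mean = -(K @ f - n_neighbors * f) / n_neighbors

# f'Kf is the sum of squared differences over all pairs of neighbors.
quad = f @ K @ f
```

For a connected map the rows of `K` sum to zero and its rank is `S - 1`, so this prior is partially improper in the same way as the first order random walk.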
