We restrict our attention to linear Multiregression Dynamic Models (MDMs), where the error distributions are Gaussian and the column vector $F_t(r)$ is a linear function of $x_t(r)$ with dimension $p_r \times 1$. Under these assumptions, the MDM equations 1.4.3a, 1.4.3b and 1.4.3c as outlined by Queen and Smith (1993) may be simplified so that each individual node $r$ may be considered in terms of a univariate Dynamic Linear Model (DLM), as described by West and Harrison (1997). If each state matrix $G_t(r)$ is a $p_r \times p_r$ identity matrix and the observation variance is assumed to be constant over time, the DLM equations are

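$$
\begin{array}{lll}
\text{Observation equation} & Y_t(r) = F_t(r)^\top \theta_t(r) + v_t(r) & v_t(r) \sim N[0, \phi(r)^{-1}] \\
\text{State equation} & \theta_t(r) = \theta_{t-1}(r) + w_t(r) & w_t(r) \sim N[0, W_t(r)] \\
\text{Initial information} & \theta_0(r) \mid D_0 \sim N[m_0(r), C_0(r)]. &
\end{array}
$$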

At each time $t$, there is a $p_r \times 1$ state vector $\theta_t(r)$. The $p_r \times 1$ state error vector is denoted by $w_t(r)$ and follows a mean-zero multivariate normal distribution with $p_r \times p_r$ covariance matrix $W_t(r)$. The observation error $v_t(r)$ is assumed to be normally and independently distributed with mean zero and constant variance $\phi(r)^{-1}$. At time $t = 0$, any information known about the system may be represented in the initial information set $D_0$. This may include, for notational convenience, the (known) values of $F_t(r)$ for all $t$. The $p_r \times 1$ prior mean vector $m_0(r)$ and the $p_r \times p_r$ covariance matrix $C_0(r)$ must be specified a priori.
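The generative model above can be illustrated with a short simulation. The following is a minimal sketch, assuming a single regressor ($p_r = 1$), a state variance `W` held constant over time, and hypothetical names (`simulate_dlm`, `seed`) that do not appear in the source; it is meant only to show how the state and observation equations interact.

```python
import random


def simulate_dlm(F, phi, W, m0, C0, seed=0):
    """Simulate a scalar-state DLM with identity state matrix.

    F   : list of known regressor values F_1, ..., F_T
    phi : observation precision (the observation variance is 1 / phi)
    W   : state variance, held constant here (an assumption of this sketch)
    m0, C0 : prior mean and variance of theta_0 given D_0
    """
    rng = random.Random(seed)
    theta = rng.gauss(m0, C0 ** 0.5)          # theta_0 | D_0 ~ N[m0, C0]
    thetas, ys = [], []
    for F_t in F:
        theta = theta + rng.gauss(0.0, W ** 0.5)         # state equation
        y_t = F_t * theta + rng.gauss(0.0, phi ** -0.5)  # observation equation
        thetas.append(theta)
        ys.append(y_t)
    return thetas, ys
```

Setting `W = 0.0` keeps the state fixed at its initial draw, which corresponds to the static model.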

As the state variance $W_t(r)$ is unknown, it is encoded through a scalar discount factor $\delta(r) \in [0.5, 1]$, such that

$$
W_t(r) = \frac{1 - \delta(r)}{\delta(r)} \, C_{t-1}(r) \tag{1.4.6}
$$

where $C_{t-1}(r)$ is the posterior variance of the state vector $\theta_{t-1}(r)$ at time $t-1$. From equation 1.4.6, it is straightforward to see that if $\delta(r) = 1$, then $W_t(r) = 0$ for all $t$, and the corresponding model is static. Lower values of $\delta(r)$ treat the state variance as some fraction of the posterior variance at the previous time point; while this fraction is fixed, $C_{t-1}(r)$ (and therefore $W_t(r)$) may vary over time.

The posterior variance at time $t-1$, inflated by the state variance, then becomes the 'prior' variance $R_t(r)$ at time $t$, that is,

$$
R_t(r) = C_{t-1}(r) + W_t(r) = \frac{C_{t-1}(r)}{\delta(r)}.
$$
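The effect of discounting can be checked numerically. The sketch below assumes a scalar variance for readability (the same arithmetic applies element-wise to a covariance matrix), and the function name `discount_variances` is hypothetical.

```python
def discount_variances(C_prev, delta):
    """Return (W_t, R_t) implied by the discount factor delta.

    C_prev : posterior variance C_{t-1} (scalar in this sketch)
    delta  : discount factor, restricted to [0.5, 1] as in the text
    """
    if not 0.5 <= delta <= 1.0:
        raise ValueError("delta must lie in [0.5, 1]")
    W = (1.0 - delta) / delta * C_prev   # state variance W_t
    R = C_prev / delta                   # prior variance R_t = C_{t-1} + W_t
    return W, R
```

For example, with `C_prev = 2.0` and `delta = 0.8`, the state variance is a quarter of the previous posterior variance, so the prior variance is inflated by 25%.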

The posterior variance Ct(r) is updated at each time point t using the most recent observation yt(r).

The variances in the DLM that we need to estimate, namely the prior variance $R_t$, the forecast variance $Q_t$ and the posterior variance $C_t$, may all be expressed as a product of the observation variance (inverse precision) $\phi(r)^{-1}$ and a 'starred scale-free' variance parameter (West and Harrison, 1997, p.109), denoted by a $*$, i.e.

$$
R_t(r) = \phi(r)^{-1} R^*_t(r), \qquad Q_t(r) = \phi(r)^{-1} Q^*_t(r), \qquad C_t(r) = \phi(r)^{-1} C^*_t(r).
$$

Defining ‘scale-free’ variances in this way allows for these variance expressions to be updated via the DLM updating equations without any knowledge of φ(r)−1.

Define $D_t = \{D_0, y_1(r), \ldots, y_t(r)\}$, that is, the initial information together with the set of observations available up to and including time $t$. Denote the posterior mean for $\theta_t(r)$ at time $t$ as $m_t(r)$, and the forecast mean at time $t$ as $f_t(r)$. Then the system evolves according to

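$$
\begin{array}{lll}
\text{Posterior at time } t-1 & p[\theta_{t-1}(r) \mid \phi(r), D_{t-1}] & \sim N[m_{t-1}(r),\ \phi(r)^{-1} C^*_{t-1}(r)] \\
\text{Prior at time } t & p[\theta_t(r) \mid \phi(r), D_{t-1}] & \sim N[m_{t-1}(r),\ \phi(r)^{-1} R^*_t(r)] \\
\text{One-step forecast} & p[Y_t(r) \mid \phi(r), D_{t-1}] & \sim N[f_t(r),\ \phi(r)^{-1} Q^*_t(r)] \\
\text{Posterior at time } t & p[\theta_t(r) \mid \phi(r), D_t] & \sim N[m_t(r),\ \phi(r)^{-1} C^*_t(r)]
\end{array}
$$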

with the parameters updated through

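$$
\begin{aligned}
f_t(r) &= F_t(r)^\top m_{t-1}(r) \\
Q^*_t(r) &= F_t(r)^\top R^*_t(r) F_t(r) + 1 \\
m_t(r) &= m_{t-1}(r) + \frac{R^*_t(r) F_t(r) [Y_t(r) - f_t(r)]}{Q^*_t(r)} \\
C^*_t(r) &= R^*_t(r) - \frac{R^*_t(r) F_t(r) F_t(r)^\top R^*_t(r)}{Q^*_t(r)}.
\end{aligned}
$$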
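A single pass through these updating equations might look as follows. This is a sketch only: it uses plain Python lists for the $p_r \times 1$ vectors and the $p_r \times p_r$ matrix, assumes $R^*_t(r)$ is symmetric (so that $F^\top R^*$ is the transpose of $R^* F$), and the function name `scale_free_update` is hypothetical.

```python
def scale_free_update(m_prev, R_star, F, y):
    """One step of the scale-free DLM update for a single node.

    m_prev : list of length p, posterior mean m_{t-1}
    R_star : p x p nested list, scale-free prior variance R*_t (assumed symmetric)
    F      : list of length p, regressor vector F_t
    y      : observed value y_t
    Returns (f, Q_star, m, C_star).
    """
    p = len(F)
    f = sum(F[i] * m_prev[i] for i in range(p))                  # forecast mean
    RF = [sum(R_star[i][j] * F[j] for j in range(p)) for i in range(p)]  # R* F
    Q_star = sum(F[i] * RF[i] for i in range(p)) + 1.0           # F' R* F + 1
    e = y - f                                                    # forecast error
    m = [m_prev[i] + RF[i] * e / Q_star for i in range(p)]       # posterior mean
    # R* F F' R* equals the outer product of RF with itself when R* is symmetric
    C_star = [[R_star[i][j] - RF[i] * RF[j] / Q_star for j in range(p)]
              for i in range(p)]
    return f, Q_star, m, C_star
```

Note that nothing in this step depends on $\phi(r)$, which is the point of working with the scale-free quantities.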

At time $t = 0$, the prior on the precision is

$$
p[\phi(r) \mid D_0] \sim G\left(\frac{n_0(r)}{2}, \frac{d_0(r)}{2}\right) \tag{1.4.8}
$$

where $G(\cdot, \cdot)$ denotes the gamma distribution with shape and rate parameters. The prior hyperparameters $n_0(r)$ and $d_0(r)$ must be specified a priori. Specification of the hyperparameters will be discussed further in subsection 1.6.1. At any time $t$, the updated prior on the precision is

$$
p[\phi(r) \mid D_t] \sim G\left(\frac{n_t(r)}{2}, \frac{d_t(r)}{2}\right)
$$

with the hyperparameters updated at each time point using

$$
\begin{aligned}
n_t(r) &= n_{t-1}(r) + 1 \\
d_t(r) &= d_{t-1}(r) + \frac{[Y_t(r) - f_t(r)]^2}{Q^*_t(r)}.
\end{aligned}
$$

At time t, the updated estimate for the observation variance is given by

$$
S_t(r) = \frac{1}{E[\phi(r) \mid D_t]} = \frac{d_t(r)}{n_t(r)}.
$$
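Putting the state update and the precision update together gives a complete forward filter. The following sketch assumes a single regressor ($p_r = 1$) so that all quantities are scalars, and uses hypothetical names (`filter_with_variance_learning`, the dictionary keys); it is one possible arrangement of the recursions above rather than a definitive implementation.

```python
def filter_with_variance_learning(ys, Fs, m0, C0_star, n0, d0, delta):
    """Forward filter for a scalar-state DLM with unknown observation variance.

    ys, Fs       : observations y_t and regressor values F_t, t = 1..T
    m0, C0_star  : prior mean and scale-free prior variance at t = 0
    n0, d0       : gamma hyperparameters of the precision prior
    delta        : discount factor in [0.5, 1]
    Returns a list with one dictionary of updated quantities per time point.
    """
    m, C_star, n, d = m0, C0_star, n0, d0
    history = []
    for y, F in zip(ys, Fs):
        R_star = C_star / delta                 # scale-free prior variance
        f = F * m                               # one-step forecast mean
        Q_star = F * R_star * F + 1.0           # scale-free forecast variance
        e = y - f                               # forecast error
        m = m + R_star * F * e / Q_star         # posterior mean
        C_star = R_star - (R_star * F) ** 2 / Q_star
        n = n + 1                               # precision hyperparameters
        d = d + e ** 2 / Q_star
        S = d / n                               # estimate of observation variance
        history.append({"f": f, "Q_star": Q_star, "m": m,
                        "C_star": C_star, "n": n, "d": d, "S": S})
    return history
```

A large forecast error relative to $Q^*_t(r)$ increases $d_t(r)$ and therefore the variance estimate $S_t(r)$, while the state recursion itself is unaffected.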

Let $T_{\cdot}(\cdot, \cdot)$ denote the t-distribution with degrees of freedom, and location and scale parameters. The final marginal distributions are then

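$$
\begin{array}{llll}
\text{Posterior at time } t-1 & p[\theta_{t-1}(r) \mid D_{t-1}] & \sim T_{n_{t-1}(r)}[m_{t-1}(r),\ C_{t-1}(r)] & (1.4.10\text{a}) \\
\text{Prior at time } t & p[\theta_t(r) \mid D_{t-1}] & \sim T_{n_{t-1}(r)}[m_{t-1}(r),\ R_t(r)] & (1.4.10\text{b}) \\
\text{One-step forecast} & p[Y_t(r) \mid D_{t-1}] & \sim T_{n_{t-1}(r)}[f_t(r),\ Q_t(r)] & (1.4.10\text{c}) \\
\text{Posterior at time } t & p[\theta_t(r) \mid D_t] & \sim T_{n_t(r)}[m_t(r),\ C_t(r)]. & (1.4.10\text{d})
\end{array}
$$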

The estimates for the scale parameters are

$$
R_t(r) = S_{t-1}(r) R^*_t(r), \qquad Q_t(r) = S_{t-1}(r) Q^*_t(r), \qquad C_t(r) = S_t(r) C^*_t(r).
$$
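The one-step forecast distribution 1.4.10c can be evaluated in closed form. The sketch below assumes that the second argument of the t-distribution enters as a squared scale (so that $Q_t(r)$ plays the role of a variance-like parameter), and the function names are hypothetical.

```python
import math


def student_t_logpdf(y, df, loc, scale_sq):
    """Log density of a t-distribution with df degrees of freedom,
    location loc and squared scale scale_sq (assumed parameterisation)."""
    z2 = (y - loc) ** 2 / scale_sq
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi * scale_sq)
            - (df + 1) / 2 * math.log1p(z2 / df))


def one_step_forecast_logpdf(y, f, Q_star, S_prev, n_prev):
    """Log density of y_t under the one-step forecast distribution.

    f, Q_star : forecast mean and scale-free forecast variance at time t
    S_prev    : variance estimate S_{t-1}
    n_prev    : degrees of freedom n_{t-1}
    """
    Q = S_prev * Q_star            # Q_t = S_{t-1} Q*_t
    return student_t_logpdf(y, n_prev, f, Q)
```

Summing these log densities over $t$ gives a log-likelihood contribution for the node under these assumptions.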

Retrospective Distributions

Equations 1.4.10b and 1.4.10c give the one-step ahead forecast distributions for $\theta_t(r)$ and $Y_t(r)$. The one-step forecast for $Y_t(r)$ provides a simple, closed-form formula for the likelihood stated in equation 1.6.1, while $\theta_t(r)$ estimates the strength of the regressors (the parent nodes) at time $t$ given data $y_1(r), \ldots, y_t(r)$. When examining the behaviour of $\theta(r)$ over time, it is informative to consider not only the one-step estimates, but also retrospective estimates, $\{\theta_T(r), \theta_{T-1}(r), \ldots, \theta_1(r)\}$, given all the data, $y(r) = \{y_1(r), \ldots, y_T(r)\}$. These may be obtained in a similar, one-step manner via the recursive relations outlined below. In order to maintain the notation used by West and Harrison (1997), the $(r)$ notation is dropped temporarily so that $\theta_t(r) = \theta_t$, $\phi(r) = \phi$, etc. The bracket notation $(-k)$ then denotes the parameters $k$ steps back in time. We have

$$
p(\theta_{t-k} \mid D_t) \sim T_{n_t}\left[a_t(-k),\ \frac{S_t}{S_{t-k}} R^*_t(-k)\right], \qquad k \geq 0. \tag{1.4.11}
$$

The parameters of this distribution may be obtained using the recursive relations

$$
a_t(-k) = m_{t-k} + B_{t-k}[a_t(-k+1) - m_{t-k}], \qquad a_t(0) = m_t
$$

$$
R_t(-k) = C_{t-k} + B_{t-k}[R_t(-k+1) - R_{t-k+1}] B_{t-k}, \qquad R_t(0) = C_t \tag{1.4.12a}
$$

where

$$
B_t = C_t R_{t+1}^{-1}.
$$

Note that $C_t R_{t+1}^{-1} = \phi^{-1} \phi \, C^*_t (R^*_{t+1})^{-1}$ and $R^*_t(0) = C^*_t$. For unknown variance $\phi^{-1}$, we may write equation 1.4.12a in terms of $S_t$, its best estimate at time $t$:

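$$
\begin{aligned}
S_t R^*_t(-k) &= S_{t-k} C^*_{t-k} + B_{t-k}[S_t R^*_t(-k+1) - S_{t-k} R^*_{t-k+1}] B_{t-k} \\
&= S_{t-k}\left[ C^*_{t-k} + B_{t-k}\left[ \frac{S_t}{S_{t-k}} R^*_t(-k+1) - R^*_{t-k+1} \right] B_{t-k} \right].
\end{aligned}
$$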
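As a worked consequence, and assuming the prior variance is obtained by discounting with a single factor $\delta$ as in equation 1.4.6 (so that $R^*_{t+1} = C^*_t / \delta$) and that $C^*_t$ is invertible, the matrix $B_t$ reduces to a multiple of the identity:

```latex
\begin{aligned}
B_t &= C^*_t \, (R^*_{t+1})^{-1}
     = C^*_t \left( \frac{C^*_t}{\delta} \right)^{-1}
     = \delta \, I, \\
a_t(-k) &= (1 - \delta)\, m_{t-k} + \delta \, a_t(-k+1).
\end{aligned}
```

Under these assumptions the retrospective mean is a weighted average of the filtered mean at that time and the retrospective mean one step later, with weight $\delta$ on the latter.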

Dynamic Linear Model theory is outlined in detail in West and Harrison (1997, Chapter 4).

Using these relations, it is possible to construct

$$
p[\theta_t(r) \mid y(r)] \sim T_{n_T(r)}[\mu_t(r), \Sigma_t(r)] \tag{1.4.13}
$$

with

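$$
\mu_t(r) = m_t(r) + C_t(r) R_{t+1}(r)^{-1} [\mu_{t+1}(r) - m_t(r)] \tag{1.4.14a}
$$

$$
\Sigma^*_t(r) = C^*_t(r) + C^*_t(r) R^*_{t+1}(r)^{-1} [\Sigma^*_{t+1}(r) - R^*_{t+1}(r)] \, C^*_t(r) R^*_{t+1}(r)^{-1} \tag{1.4.14b}
$$

$$
\Sigma_t(r) = S_T(r) \Sigma^*_t(r). \tag{1.4.14c}
$$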
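The backward recursion can be sketched as follows. The example assumes a single regressor ($p_r = 1$), so that all quantities are scalars, and carries the scale-free variance through the recursion before rescaling by $S_T(r)$ at the end; the function name `retrospective_moments` and the list-based layout are hypothetical.

```python
def retrospective_moments(m, C_star, R_star, S_T):
    """Backward recursion for retrospective means and scales (scalar state).

    m, C_star : filtered means m_t and scale-free variances C*_t, t = 1..T
    R_star    : scale-free prior variances R*_t, t = 1..T (same indexing)
    S_T       : final estimate of the observation variance
    Returns (mu, Sigma), each a list of length T.
    """
    T = len(m)
    mu = [0.0] * T
    Sigma_star = [0.0] * T
    mu[T - 1] = m[T - 1]                 # start from the filtered values at T
    Sigma_star[T - 1] = C_star[T - 1]
    for t in range(T - 2, -1, -1):
        B = C_star[t] / R_star[t + 1]    # B_t = C*_t / R*_{t+1}
        mu[t] = m[t] + B * (mu[t + 1] - m[t])
        Sigma_star[t] = C_star[t] + B * (Sigma_star[t + 1] - R_star[t + 1]) * B
    Sigma = [S_T * s for s in Sigma_star]   # rescale by S_T
    return mu, Sigma
```

Because the recursion starts from the filtered values at time $T$, the retrospective and filtered estimates coincide at the final time point and differ increasingly further back in time.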

In this work, we use $m_t(r)$ and $C_t(r)$ to denote the parameters of equation 1.4.10d (that is, estimates for $\theta_t(r)$ given the observations up until time $t$). We use $\mu_t(r)$ and $\Sigma_t(r)$ to denote the parameters of equation 1.4.11 (estimates for $\theta_t(r)$ given all the data $y(r)$).
