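$$E(\mathrm{vec}(\varphi) \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) = [Q_d + Q_p]^{-1}[h_d + h_p],$$
$$\mathrm{var}(\mathrm{vec}(\varphi) \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) = [Q_d + Q_p]^{-1}, \qquad (10)$$
where:
$$Q_d = (\alpha'\Sigma^{-1}\alpha) \otimes (Z_2'Z_2), \quad Q_p = \omega_\varphi I_{(p-r)r}, \quad h_d = \mathrm{vec}(Z_2'W\Sigma^{-1}\alpha), \quad h_p = \omega_\varphi\mu_\varphi,$$
$$W = \Delta Y - Z_1\alpha' - Y^{*}\Gamma - D\delta, \quad Y_{-1} = [Z_1 \mid Z_2], \text{ with } Z_1 \text{ a } (T \times r) \text{ matrix}.$$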
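Proof: from the joint posterior pdf (9), the conditional posterior pdf of $\varphi$ can be obtained as:
$$p(\varphi \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) \propto \exp\{-0.5[\mathrm{trace}(E'E\Sigma^{-1}) + \omega_\varphi(\mathrm{vec}(\varphi) - \mu_\varphi)'(\mathrm{vec}(\varphi) - \mu_\varphi)]\}.$$
Straightforward algebra then leads to the required result. ■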
The justification of this result is straightforward: combining multivariate normal data evidence for $\mathrm{vec}(\varphi)$ with multivariate normal prior information on it generates a normal conditional posterior distribution whose moments are functions of the data and prior moments.
A special case arises when the prior precision parameter is zero, i.e. in the absence of any a priori information. In this case, the conditional moments are:
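$$E(\mathrm{vec}(\varphi) \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) = [Q_d]^{-1}h_d = \mathrm{vec}[(Z_2'Z_2)^{-1}Z_2'W\Sigma^{-1}\alpha(\alpha'\Sigma^{-1}\alpha)^{-1}],$$
$$\mathrm{var}(\mathrm{vec}(\varphi) \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) = [Q_d]^{-1} = (\alpha'\Sigma^{-1}\alpha)^{-1} \otimes (Z_2'Z_2)^{-1}. \qquad (11)$$
Notice that in this case, when $\omega_\varphi$ is equal to zero, the precision matrix of the conditional posterior pdf of $\varphi$ is not invertible when $\alpha$ has less than full rank. When $\omega_\varphi \neq 0$, this case never arises.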
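As a rough numerical illustration of the conditional moments above, the following Python sketch builds the data precision and the linear term and then draws vec(φ) from the resulting normal distribution. It assumes NumPy; the function name `draw_vec_phi` and its argument names are hypothetical and not part of the original text, and the prior is assumed to enter as a scalar precision times an identity matrix.

```python
import numpy as np

def draw_vec_phi(alpha, Sigma, Z2, W, omega_phi=0.0, mu_phi=None, rng=None):
    """Draw vec(phi) from its normal conditional posterior (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    Sigma_inv = np.linalg.inv(Sigma)
    m = Z2.shape[1] * alpha.shape[1]            # number of elements in vec(phi)
    mu_phi = np.zeros(m) if mu_phi is None else mu_phi
    # Data precision and linear term, as in the definitions of Q_d and h_d
    Q_d = np.kron(alpha.T @ Sigma_inv @ alpha, Z2.T @ Z2)
    h_d = (Z2.T @ W @ Sigma_inv @ alpha).reshape(-1, order="F")  # vec stacks columns
    # Add the prior contribution (assumed form: omega_phi * I and omega_phi * mu_phi)
    Q = Q_d + omega_phi * np.eye(m)
    h = h_d + omega_phi * mu_phi
    mean = np.linalg.solve(Q, h)
    # If Q = L L', then mean + L'^{-1} z has covariance Q^{-1}
    L = np.linalg.cholesky(Q)
    draw = mean + np.linalg.solve(L.T, rng.standard_normal(m))
    return draw, mean
```

With `omega_phi=0` the returned mean should coincide with the closed-form expression of the no-prior special case, provided α has full column rank; otherwise the Cholesky factorisation fails, which mirrors the non-invertibility noted in the text.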
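The result contained in this lemma is important because it characterizes the marginal posterior distribution of the cointegration parameters contained in the matrix $\varphi$. In fact, it is immediate to notice that:
$$p(\mathrm{vec}(\varphi) \mid \text{data}) = \int\!\!\int\!\!\int\!\!\int p(\varphi \mid \alpha, \Gamma, \delta, \Sigma, \text{data})\, p(\alpha, \Gamma, \delta, \Sigma \mid \text{data})\, d\alpha\, d\Gamma\, d\delta\, d\Sigma. \qquad (12)$$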
What this expression says is that the finite sample marginal posterior distribution of $\varphi$ is an average of multivariate normal distributions, weighted by the marginal distribution of the other parameters of the system. In other words, it is a mixture of normals. From the viewpoint of the classical inference literature, this result has been shown by Johansen (1991, 1995a) to hold for the asymptotic distributions, and Phillips (1994) shows that the reduced rank regression cointegrating vectors have finite sample distributions with Cauchy tails and no moments. In this Bayesian framework, the behaviour of $p(\varphi \mid \text{data})$ does not present Cauchy tails when a proper prior for $\varphi$ is specified.
The conditional posterior distribution of $\varphi$ can be easily simulated.
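Lemma 4.2: The conditional posterior distribution of $\theta_2 = [\mathrm{vec}(\alpha')' \mid \mathrm{vec}(\Gamma)' \mid \mathrm{vec}(\delta)']'$ is $(r \times p + p^2 \times (k-1) + d \times p)$-variate normal, with moments:
$$E(\theta_2 \mid \varphi, \Sigma, \text{data}) = [R_d + R_p]^{-1}[g_d + g_p], \quad \mathrm{var}(\theta_2 \mid \varphi, \Sigma, \text{data}) = [R_d + R_p]^{-1},$$
where:
$$R_d = S'[\Sigma^{-1} \otimes (X'X)]S, \quad R_p = \mathrm{diag}\{\omega_\alpha I,\ \Sigma_\Gamma^{-1},\ \Sigma_\delta^{-1}\},$$
$$g_d = S'\,\mathrm{vec}(X'\Delta Y\Sigma^{-1}), \quad g_p = [\omega_\alpha\mu_\alpha',\ 0',\ (\Sigma_\delta^{-1}\mu_\delta)']', \quad X = [Y_{-1}\beta \mid Y^{*} \mid D],$$
and $S$ is a permutation matrix such that:
$$\mathrm{vec}\{[\alpha \mid \Gamma \mid \delta']'\} = S\theta_2.$$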
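Proof: starting from (9), the conditional posterior pdf of $\theta_2$ can be obtained as:
$$p(\theta_2 \mid \varphi, \Sigma, \text{data}) \propto \exp\{-0.5[\mathrm{trace}(E'E\Sigma^{-1}) + \omega_\alpha(\mathrm{vec}(\alpha) - \mu_\alpha)'(\mathrm{vec}(\alpha) - \mu_\alpha) + \mathrm{vec}(\Gamma)'\Sigma_\Gamma^{-1}\mathrm{vec}(\Gamma) + (\mathrm{vec}(\delta) - \mu_\delta)'\Sigma_\delta^{-1}(\mathrm{vec}(\delta) - \mu_\delta)]\}.$$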
The intuition behind this result is exactly the same as in Lemma 4.1. On the basis of this result and of the properties of the multivariate Gaussian distribution, it is immediate to see that the conditional distributions $p(\mathrm{vec}(\alpha') \mid \Gamma, \delta, \varphi, \Sigma, \text{data})$, $p(\mathrm{vec}(\Gamma) \mid \alpha, \delta, \varphi, \Sigma, \text{data})$ and $p(\mathrm{vec}(\delta) \mid \alpha, \Gamma, \varphi, \Sigma, \text{data})$ are likewise normal.
A special case arises in the absence of any a priori information about $\theta_2$, i.e. when $\omega_\alpha = 0$, $\Sigma_\Gamma^{-1} = [0]$, $\Sigma_\delta^{-1} = [0]$. In this case, the conditional moments are:
$$E(\theta_2 \mid \varphi, \Sigma, \text{data}) = R_d^{-1}g_d = S^{-1}\{\mathrm{vec}[(X'X)^{-1}X'\Delta Y]\},$$
$$\mathrm{var}(\theta_2 \mid \varphi, \Sigma, \text{data}) = S^{-1}[\Sigma \otimes (X'X)^{-1}]S^{-1\prime}.$$
The conditional posterior distribution of $\theta_2$ can be easily simulated.
Lemma 4.3: The conditional posterior distribution of $\Sigma$, $p(\mathrm{vech}(\Sigma) \mid \alpha, \varphi, \Gamma, \delta, \text{data})$, is inverted Wishart.
Proof: starting from (9), the conditional posterior pdf of $\Sigma$ is:
$$p(\Sigma \mid \alpha, \varphi, \Gamma, \delta, \text{data}) \propto |\Sigma|^{-(T+p+1)/2}\exp\{-0.5[\mathrm{trace}(E'E\Sigma^{-1})]\},$$
which can be recognised as inverted Wishart. ■
Such a distribution can be easily simulated: exploiting the properties of the Wishart distribution, one can easily draw from multivariate normal distributions, map this draw onto a draw from a Wishart distribution, and this latter one is mapped onto a draw for an inverted Wishart distribution, as required.
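The chain of mappings just described (normal draws, then a Wishart draw, then its inverse) can be sketched in Python as follows. This is only an illustration under stated assumptions: the scale matrix `S` is taken to be the residual cross-product matrix E'E, the degrees of freedom `nu` are taken to be an integer (the sample size T under the reading of the kernel in the proof), and the function name `draw_inverted_wishart` is hypothetical.

```python
import numpy as np

def draw_inverted_wishart(S, nu, rng=None):
    """Draw Sigma ~ inverted Wishart(S, nu) via multivariate normal draws (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    p = S.shape[0]
    # Factor S^{-1} = L L' so that rows e @ L.T, with e ~ N(0, I), are N(0, S^{-1})
    L = np.linalg.cholesky(np.linalg.inv(S))
    Z = rng.standard_normal((nu, p)) @ L.T
    wishart_draw = Z.T @ Z                 # Wishart(S^{-1}, nu) draw
    return np.linalg.inv(wishart_draw)     # mapped onto an inverted Wishart draw
```

For `nu > p + 1` the mean of this distribution is `S / (nu - p - 1)`, which gives a simple sanity check on a batch of draws.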
On the basis of these results, it is possible to generate as many draws from the marginal posterior pdfs as desired, and to put them in a Gibbs sampling sequence which defines a Markov updating scheme. This scheme converges in distribution to the joint posterior pdf, given that the conditions on the conditional pdfs described in Section [3.3 c] for achieving convergence are satisfied.
Being able to generate draws from this distribution, it is possible to estimate the posterior expectation (if it exists) of any well defined function of the parameters, and the marginal posterior distributions of any subset of parameters of interest. These estimates are obtained on the basis of the Monte Carlo principle, to any desired degree of accuracy:
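$$\bar f_N(\theta) = N^{-1}\sum_{i=1}^{N} f(\theta^{(i)}) \xrightarrow{a.s.} E(f(\theta) \mid \text{data}).$$
In order to obtain a Monte Carlo numerical estimate of the marginal posterior distribution of a certain subset of parameters, say $\theta_1$, the function $f(\theta)$ is defined as $p(\theta_1 \mid \theta_2, \ldots, \theta_k, \text{data})$:
$$\hat p(\theta_1 \mid \text{data}) = N^{-1}\sum_{i=1}^{N} p(\theta_1 \mid \theta_2^{(i)}, \ldots, \theta_k^{(i)}, \text{data}),$$
whereas, in order to obtain the posterior moments of such a distribution, one could define $f(\theta)$ as the corresponding conditional moment:
$$E(\theta_1 \mid \text{data}) \simeq N^{-1}\sum_{i=1}^{N} E(\theta_1 \mid \theta_2^{(i)}, \ldots, \theta_k^{(i)}, \text{data}),$$
$$\mathrm{var}(\theta_1 \mid \text{data}) \simeq N^{-1}\sum_{i=1}^{N} \mathrm{var}(\theta_1 \mid \theta_2^{(i)}, \ldots, \theta_k^{(i)}, \text{data}).$$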
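The averaging of conditional densities and conditional moments over Gibbs draws can be illustrated on a toy problem. The sketch below is not the cointegration model: it assumes a standard bivariate normal target with correlation `rho`, for which both full conditionals are known normals, and all function names are hypothetical.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_draws, burn_in, rng):
    """Two-block Gibbs sampler for a standard bivariate normal (toy target)."""
    s = np.sqrt(1.0 - rho**2)
    th1, th2 = 0.0, 0.0
    out = np.empty((n_draws, 2))
    for i in range(burn_in + n_draws):
        th1 = rho * th2 + s * rng.standard_normal()   # theta1 | theta2
        th2 = rho * th1 + s * rng.standard_normal()   # theta2 | theta1
        if i >= burn_in:
            out[i - burn_in] = th1, th2
    return out

def marginal_density_theta1(grid, theta2_draws, rho):
    """Average the conditional normal pdf of theta1 over the theta2 draws."""
    var = 1.0 - rho**2
    diff = grid[:, None] - rho * theta2_draws[None, :]
    dens = np.exp(-0.5 * diff**2 / var) / np.sqrt(2.0 * np.pi * var)
    return dens.mean(axis=1)

def posterior_mean_theta1(theta2_draws, rho):
    """Average the conditional mean E(theta1 | theta2) over the theta2 draws."""
    return float(np.mean(rho * theta2_draws))
```

In this toy case the true marginal of theta1 is standard normal, so the density estimate at zero should be close to 0.399 and the mean estimate close to zero.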
Due to the inherent correlation among draws in the Gibbs sample, the accuracy of the Monte Carlo estimates can be measured by means of heteroskedasticity and autocorrelation consistent (HAC) estimators of the standard error of the sample mean of $f(\theta)$, based on a consistent estimate of its spectral density function at frequency zero. The simplest one, which delivers a well behaved estimate of the standard error, is the Newey and West estimator reviewed in Section [3.3c]. This estimator is used in the applications presented in this chapter. Following Geweke (1992), I also evaluated a HAC diagnostic test to assess whether the Gibbs Sampling scheme (GSS) has converged to the joint posterior distribution in the applications presented in this chapter. The test compares the sample mean of a batch of early draws in the sequence with the sample mean of a batch of late draws. Under the null of equality of the two sub-sample means, the resulting test statistic has an asymptotic standard normal distribution. Acceptance of the null is interpreted as evidence that the GSS has converged. For the details see Section [3.3.c].
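A minimal Python sketch of both tools follows: a Newey and West standard error of a sample mean with Bartlett weights, and a Geweke-type statistic comparing early and late batches. The function names are hypothetical, the weights are assumed to take the usual form 1 - k/(q+1) for lag k and bandwidth q, and the two batches are treated as independent; the exact definitions in Section [3.3.c] may differ.

```python
import numpy as np

def newey_west_se(x, bandwidth):
    """HAC standard error of the sample mean of x, with Bartlett weights."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    lrv = d @ d / n                                  # lag-0 autocovariance
    for k in range(1, bandwidth + 1):
        w = 1.0 - k / (bandwidth + 1.0)              # Bartlett weight (assumed form)
        lrv += 2.0 * w * (d[k:] @ d[:-k]) / n        # weighted lag-k autocovariance
    return np.sqrt(lrv / n)

def geweke_z(x, first=0.1, last=0.1, bandwidth=9):
    """Difference of early and late batch means, scaled by HAC standard errors."""
    x = np.asarray(x, dtype=float)
    n = x.size
    a = x[: int(first * n)]
    b = x[n - int(last * n):]
    se = np.sqrt(newey_west_se(a, bandwidth) ** 2 + newey_west_se(b, bandwidth) ** 2)
    return (a.mean() - b.mean()) / se
```

Under the null of equal means the statistic is compared with standard normal critical values, so an absolute value below roughly 1.96 would not reject at the 5% level.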
[6.5] Inference on Cointegration Rank.
I now turn to the problem of how to conduct inference on the cointegration rank. The model described in the previous sections can be cast in a different parameterisation, based on the singular value decomposition of $\Pi = \alpha\beta'$ (see Dhrymes, 1978, p. 78):
$$\Pi = U\Lambda V', \quad (\Pi\Pi')U = U\Lambda^2, \quad U'U = UU' = I_p,$$
$$(\Pi'\Pi)V = V\Lambda^2, \quad V'V = VV' = I_p.$$
The matrix $\Lambda$ is diagonal with the square roots of the eigenvalues of $\Pi'\Pi$. Under the assumption of rank $r < p$, the singular value decomposition is:
$$\Pi = U_1\Lambda_1V_1',$$
with $U_1$ and $V_1$ $(p \times r)$ matrices and $\Lambda_1$ an $(r \times r)$ diagonal matrix with the square roots of the positive eigenvalues of $\Pi'\Pi$ on the diagonal. Thus the model can be equivalently written as:
$$\Gamma(L)\Delta y_t = U_1\Lambda_1V_1'y_{t-1} + \delta'D_t + \varepsilon_t. \qquad (13)$$
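To make the reduced rank decomposition concrete, the following Python sketch computes the leading singular triplet of a rank-r product of loadings and cointegrating vectors with `numpy.linalg.svd`. The function name `reduced_rank_svd` is hypothetical; the numerical values in the usage note are the ones used for the simulated data set later in the chapter.

```python
import numpy as np

def reduced_rank_svd(alpha, beta):
    """Return (U1, lam, V1) such that alpha @ beta.T = U1 @ diag(lam) @ V1.T."""
    Pi = alpha @ beta.T
    r = alpha.shape[1]
    U, s, Vt = np.linalg.svd(Pi)           # singular values in decreasing order
    return U[:, :r], s[:r], Vt[:r, :].T    # keep the r leading components
```

For `alpha = [[-0.3], [-0.03]]` and `beta = [[1.0], [-1.0]]` the single positive singular value is about 0.43. Note that the signs of the columns of `U1` and `V1` are not unique, although their product with the singular values is.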
Inference is then made on the number of diagonal elements of $\Lambda_1$ that are different from zero. The joint posterior distribution of the model as in expression (9) can be simulated by means of the Gibbs Sampling scheme described earlier. It is straightforward to map each draw of $\alpha$ and $\beta$ onto a draw of $U_1$, $\Lambda_1$ and $V_1$ by applying the singular value decomposition to $\Pi^{(i)} = \alpha^{(i)}\beta^{(i)\prime}$. In this way it is possible to obtain a Monte Carlo estimate of the marginal posterior distribution of $\lambda = \mathrm{diag}(\Lambda_1)$ and of its moments, just by analytically characterizing the conditional posterior distribution of $\lambda$.
This is done in the following lemma.
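Lemma [5.1]: The conditional posterior distribution of $\lambda = \mathrm{diag}(\Lambda_1)$ has the following kernel:
$$p(\lambda \mid U_1, V_1, \Gamma, \delta, \Sigma, \text{data}) \propto |\Omega|^{-1/2}\exp\{-0.5[(\lambda - \eta)'\Omega^{-1}(\lambda - \eta)]\},$$
$$\eta = c_1 - Q_{12}Q_{22}^{-1}c_2, \quad \Omega = Q_{11} - Q_{12}Q_{22}^{-1}Q_{21},$$
$$W = \Delta Y - Y^{*}\Gamma - D\delta,$$
and $G$ is a permutation matrix such that $\mathrm{diag}(\Lambda_1)$ is given by the first $r$ rows of $G\,\mathrm{vec}(\Lambda_1)$.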
Proof: considering the parameterisation (13), the conditional posterior of $\Lambda_1$ can be obtained from (9) as:
$$p(\Lambda_1 \mid U_1, V_1, \Gamma, \delta, \Sigma, \text{data}) \propto \exp\{-0.5[\mathrm{trace}(E'E\Sigma^{-1}) + (\mathrm{vec}(U_1\Lambda_1V_1') - \mu_\Pi)'\Sigma_\Pi^{-1}(\mathrm{vec}(U_1\Lambda_1V_1') - \mu_\Pi)]\}.$$
The usual algebra gives the joint posterior of $\Lambda_1$, and applying the standard factorisation results for a multivariate Normal proves the lemma. ■
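The conditional pdfs of the single elements of $\mathrm{diag}(\Lambda_1)$ are obtained by taking into consideration their nature as truncated normal distributions. For instance, the conditional pdf of the second element of $\lambda$ has support $\lambda_2 \in (\lambda_3, \lambda_1)$, and can be written as:
$$p(\lambda_2 \mid \lambda_1, \lambda_3, U_1, V_1, \Gamma, \delta, \Sigma, \text{data}) = \frac{\phi(\lambda_2 \mid \lambda_1, \lambda_3, U_1, V_1, \Gamma, \delta, \Sigma, \text{data})}{\Phi(\lambda_1) - \Phi(\lambda_3)},$$
where $\phi(\lambda_2 \mid \lambda_1, \lambda_3, U_1, V_1, \Gamma, \delta, \Sigma, \text{data})$ is a Gaussian pdf conditioned on $\lambda_1$, $\lambda_3$, $U_1$, $V_1$, $\Gamma$, $\delta$, $\Sigma$, and $\Phi(\cdot)$ is the corresponding cdf.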
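A truncated normal of this kind can be evaluated and simulated with the inverse-cdf method, as in the sketch below. It relies only on the Python standard library (`statistics.NormalDist`); the function names and the scalar arguments `mean`, `sd`, `lower` and `upper` are hypothetical stand-ins for the conditional mean, conditional standard deviation and the two neighbouring singular values.

```python
import random
from statistics import NormalDist

def truncated_normal_pdf(x, mean, sd, lower, upper):
    """Gaussian pdf renormalised by the probability mass on (lower, upper)."""
    if not lower < x < upper:
        return 0.0
    dist = NormalDist(mean, sd)
    return dist.pdf(x) / (dist.cdf(upper) - dist.cdf(lower))

def draw_truncated_normal(mean, sd, lower, upper, rng):
    """Inverse-cdf draw: uniform on (cdf(lower), cdf(upper)), mapped back by inv_cdf."""
    dist = NormalDist(mean, sd)
    u = rng.uniform(dist.cdf(lower), dist.cdf(upper))
    # inv_cdf requires 0 < u < 1, so this sketch is unsuitable for far-tail truncation
    return dist.inv_cdf(u)
```

A call such as `draw_truncated_normal(0.45, 0.1, 0.2, 0.7, random.Random(0))` returns a value inside the interval (0.2, 0.7).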
On the basis of this analytical result, which holds for any rank of $\Pi$ from one to $p$, it is possible to conduct inference on the true cointegration rank by means of the posterior distribution of $\lambda = \mathrm{diag}(\Lambda_1)$. In the present context, rank equal to $r$ is the maintained hypothesis. In order to check whether the rank is equal to $r-1$, one has to evaluate the posterior distribution of the $r$-th element of $\lambda$ and see whether zero falls within the highest posterior density (HPD) confidence interval at a chosen confidence level (say 95%). This test has Johansen's $\lambda$-max test as a classical inference counterpart. In order to see whether it is possible to reduce the rank from $r$ to $r-2$, one has to examine the joint posterior distribution of the last two elements of $\lambda$, and when the test is carried out at $r = p$, this has Johansen's trace test as a classical inference counterpart.
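One simple way to approximate a highest posterior density interval for a scalar parameter is to take the shortest interval containing the required share of the posterior draws. The Python sketch below does this. It is a sample-based approximation that assumes a unimodal posterior, it is not necessarily the procedure used in the applications, and the function name `hpd_interval` is hypothetical.

```python
import numpy as np

def hpd_interval(draws, level=0.95):
    """Shortest interval containing a share `level` of the sorted draws."""
    x = np.sort(np.asarray(draws, dtype=float))
    n = x.size
    k = int(np.ceil(level * n))               # number of draws inside the interval
    widths = x[k - 1:] - x[: n - k + 1]       # width of every window of k draws
    j = int(np.argmin(widths))
    return x[j], x[j + k - 1]

def zero_inside(draws, level=0.95):
    """True when zero falls within the approximate HPD interval."""
    lo, hi = hpd_interval(draws, level)
    return lo <= 0.0 <= hi
```

If `zero_inside` returns `False` for the draws of the r-th singular value, the posterior evidence would not support reducing the rank from r to r-1 at the chosen level.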
[6.6] Testing Restrictions on the Cointegration Space
Once the cointegrating rank has been decided, it might be interesting to check restrictions on the free parameters in the cointegrating vectors. We have already seen the lack of identification problem, which has to be solved by imposing a certain structure on the $\beta$ matrix. We choose to impose the normalisation $\beta = [I_r, \varphi']'$. Remember that this structure does not impose any restriction on the space spanned by the cointegrating vectors. It is possible to impose restrictions on the cointegrating space, i.e. "overidentifying" restrictions on the columns of $\beta$ of the kind:
The validity of these constraints can be tested by means of asymptotically $\chi^2$ distributed LR statistics (Johansen, 1995b).
As for the finite sample performance of these tests, no analytical result is available. Recently, Cappuccio and Lubian (1995) have shown via Monte Carlo simulation that the empirical size of those tests is dramatically different from the nominal one, leading to systematic over-rejection of the maintained hypothesis even in fairly large samples. For this reason, it is interesting to see what indications can be gathered from Bayesian techniques based on finite sample evidence. Writing the over-identifying restrictions in the following form:
$$R'\mathrm{vec}(\varphi) = d, \qquad (14)$$
I define the variable $\xi = R'\mathrm{vec}(\varphi) - d$, whose conditional posterior distribution can be readily obtained from Lemma 4.1 as $q$-dimensional Normal with moments:
$$E(\xi \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) = R'[Q_d + Q_p]^{-1}[h_d + h_p] - d,$$
$$\mathrm{var}(\xi \mid \alpha, \Gamma, \delta, \Sigma, \text{data}) = R'[Q_d + Q_p]^{-1}R.$$
If (14) holds, one would expect $\xi$ to have a posterior pdf with expected value equal to zero. Defining $SS = \xi'\xi$, it is therefore possible to write:
$$E(SS \mid \text{data}) = \mathrm{trace}[\mathrm{var}(\xi \mid \text{data})].$$
Hence, on the basis of a Gibbs sample from the joint posterior distribution of $\alpha$, $\Gamma$, $\delta$, $\Sigma$, one could at each pass evaluate $(SS)^{(i)}$ and $\mathrm{var}(\xi \mid \alpha^{(i)}, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}, \text{data})$, and obtain
$$\zeta_N = N^{-1}\sum_{i=1}^{N}\mathrm{trace}[\mathrm{var}(\xi \mid \alpha^{(i)}, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}, \text{data})],$$
a Monte Carlo consistent estimate of $\zeta = E(SS \mid \text{data})$. At this point, it is suggested to accept the hypothesis (14) at a desired confidence level if the corresponding HPD for $SS$ contains the value $\zeta_N$.
Another testing strategy could be to evaluate the "LM" statistic at each pass of the GSS as:
$$LM^{(i)} = \xi^{(i)\prime}[\mathrm{var}(\xi \mid \alpha^{(i)}, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}, \text{data})]^{-1}\xi^{(i)}. \qquad (15)$$
Also, measuring the distance of $\xi$ from zero with a different metric, one could simulate the "LR" test at each step in two different ways:
where $\xi^{(i)}$ is the $i$-th draw from $p(\xi \mid \alpha^{(i)}, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}, \text{data})$, $\xi_R^{(i)}$ is the $i$-th draw from $p(\xi \mid \alpha^{(i)}, R'\varphi^{(i)} = d, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}, \text{data})$, and the remaining two terms are the ML estimates of $\xi$ conditional on $\alpha^{(i)}, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}$ and on $\alpha^{(i)}, R'\varphi^{(i)} = d, \Gamma^{(i)}, \delta^{(i)}, \Sigma^{(i)}$ respectively.
HPD confidence intervals at the desired level could be evaluated for these three statistics, and one could then check whether the value of $q$, which is the number of overidentifying restrictions actually being imposed, falls within them or not. Notice that the validity of the procedure is only asymptotic for the $LR_1$ and $LR_2$ statistics, which are intended to provide only additional corroborating evidence for the tests based on the finite sample posterior distributions of $SS$ and $LM$.
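For a single Gibbs pass, the conditional moments of the restriction variable and the SS and LM quantities can be computed as in the following Python sketch. The precision matrix `Q` and linear term `h` stand for the combined data and prior terms of Lemma 4.1; the function name `restriction_stats` and the argument names are hypothetical.

```python
import numpy as np

def restriction_stats(Q, h, R, d, rng):
    """Moments of xi = R'vec(phi) - d for one pass, plus a draw of SS and LM."""
    V = np.linalg.inv(Q)                  # conditional covariance of vec(phi)
    m_xi = R.T @ V @ h - d                # conditional mean of xi
    V_xi = R.T @ V @ R                    # conditional covariance of xi
    xi = rng.multivariate_normal(m_xi, V_xi)
    ss = float(xi @ xi)                            # SS = xi'xi
    lm = float(xi @ np.linalg.solve(V_xi, xi))     # LM = xi' var(xi)^{-1} xi
    return m_xi, V_xi, ss, lm
```

When the conditional mean of the restriction variable is zero, the LM draws behave like chi-square variates with q degrees of freedom, so their average over many passes should be close to q.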
[6.7] Some Applications
In this section I present the results of four different applications of the technique described in the previous sections. The first application is on a vector of simulated data. The main rationale behind this exercise is to gather information on how the procedure works, and on how precise the results are, given perfect knowledge of the data generation process (DGP). The second and the third applications are on the Danish and Finnish money demand data sets analysed by Johansen and Juselius (1990). The fourth application is on the PPP-UIP data for the UK studied in Johansen and Juselius (1992).
For all the applications I present the results of the base case of complete ignorance priors ($\omega_\alpha = \omega_\varphi = 0$, $\Sigma_\delta^{-1} = [0]$). As for the hyperparameter $\omega_\varphi$, setting it to a value different from zero will surely avoid local non-identification of $\varphi$, which would occur whenever $\alpha$ has deficient rank. For this reason, in all three applications on "real" data I implement different values for this hyperparameter, and I monitor the sensitivity of the results in this respect.
In all the applications, the marginal posterior pdfs are obtained, when possible, via Monte Carlo integration of the corresponding conditional distributions, when the latter can be analytically computed. In the remaining cases, i.e. for the over-identifying restriction test statistics, the marginal posterior pdfs have been obtained by using the Gaussian kernel method with plug-in bandwidth (see Silverman,
For all the applications described in this section, the Monte Carlo simulations have been carried out on the basis of a sample of 10,000 Gibbs sampling draws, after having discarded the first 500 passes. A Bartlett window with bandwidth equal to 9 has been implemented to obtain the standard errors of the Monte Carlo estimates.
[6.7.1] A simulated data set example
The data generation process used in the analysis is a very simple one:
$$\Delta y_t = \alpha\beta'y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \Sigma), \quad \alpha = [-0.3, -0.03]', \quad \beta = [1, -1]', \quad \Sigma = \mathrm{diag}\{0.01, 0.01\},$$
where the sample size $T$ is equal to 200 and $y_t$ is a bivariate I(1) process with zero mean differences. This is the simplest possible framework, given the low dimensionality of the model and the total absence of short run dynamics and of deterministic components. The model being estimated is:
$$\Delta y_t = \alpha\beta'y_{t-1} + \delta'D_t + \Gamma_1\Delta y_{t-1} + \varepsilon_t,$$
where $D_t$ is a deterministic vector containing a time trend and a set of 4 seasonal dummies. The "true" parameters $\delta$ and $\Gamma_1$ are therefore zero in the simulated data generation process.
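The data generation process just described can be simulated in a few lines of Python. In the sketch below the parameter values are the ones stated in the text, while the function name `simulate_dgp`, the zero starting value and the seed are assumptions made for illustration.

```python
import numpy as np

def simulate_dgp(T=200, seed=0):
    """Simulate Delta y_t = alpha beta' y_{t-1} + eps_t with the stated parameters."""
    rng = np.random.default_rng(seed)
    alpha = np.array([[-0.3], [-0.03]])
    beta = np.array([[1.0], [-1.0]])
    Sigma = np.diag([0.01, 0.01])
    Pi = alpha @ beta.T
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=T)
    y = np.zeros((T + 1, 2))              # y[0] is the (assumed) zero starting value
    for t in range(1, T + 1):
        y[t] = y[t - 1] + Pi @ y[t - 1] + eps[t - 1]
    return y
```

The levels in `y` wander like a random walk, whereas the difference between the two columns, which is the cointegrating combination, follows a stationary autoregression with coefficient 1 - 0.3 + 0.03 = 0.73.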
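The results are presented in Table 6.1 below. The individual parameters referred to in the table are defined as follows: $\alpha = [\alpha_1, \alpha_2]'$, $\beta = [1, \beta_2]'$, $\Lambda$ is the positive diagonal element of $\Lambda_1$ under the assumption of cointegration rank one, $\sigma_{11}$, $\sigma_{12}$ and $\sigma_{22}$ are the elements of $\Sigma$, $\gamma_{ij}$ are the elements of $\Gamma_1$, and the parameters associated with the deterministic component (four seasonal dummies and a linear trend) are contained in
$$\delta' = \begin{bmatrix} \psi_{11} & \psi_{12} & \psi_{13} & \psi_{14} & \psi_{15} \\ \psi_{21} & \psi_{22} & \psi_{23} & \psi_{24} & \psi_{25} \end{bmatrix}.$$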
Table 6.1: Simulation results

Parameter        True value   Post. mean est.   HAC std error   Conv. diagn.
$\alpha_1$         -0.3         -0.3225           0.0013           0.3027
$\alpha_2$          0.03        -0.0013           0.0013           0.5065
$\beta_2$          -1.00        -0.9881           -                -
$\Lambda$           0.43         0.4579           0.0019          -0.2557
$\sigma_{11}$       0.01         0.0110           0.00003          0.8479
$\sigma_{12}$       0.00         0.0008           0.00002          0.0041
$\sigma_{22}$       0.01         0.0108           0.00003         -0.0564
$\gamma_{11}$       0.00        -0.0460           0.0018          -0.0623
$\gamma_{12}$       0.00         0.0009           0.0025          -0.4461
$\gamma_{21}$       0.00        -0.0329           0.0018           0.0234
$\gamma_{22}$       0.00         0.0305           0.0024          -0.0416
$\psi_{11}$         0.00         0.0292           0.0007          -0.1802
$\psi_{12}$         0.00         0.0038           0.0007          -0.1697
$\psi_{13}$         0.00        -0.0134           0.0007          -0.5543
$\psi_{14}$         0.00        -0.0129           0.0007          -0.2738
$\psi_{15}$         0.00         0.00008          0.000005         0.3754
$\psi_{21}$         0.00        -0.0182           0.0005           0.4801
$\psi_{22}$         0.00        -0.0151           0.0005           0.1604
$\psi_{23}$         0.00        -0.0111           0.0005           0.3823
$\psi_{24}$         0.00        -0.0067           0.0005           0.3534
$\psi_{25}$         0.00         0.00006          0.000003        -0.7380

Notes: The sample size used is T=200. The hyperparameter $\omega_\varphi$ is set to zero, given that there is no local identification problem here. The posterior means reported are obtained as sample averages over the draws. For the parameter $\beta_2$ the mode of the posterior distribution is reported, given that the posterior expectation does not exist. The standard error estimates are HAC in the Newey West specification with bandwidth = 9 and Bartlett weights. The convergence diagnostics are obtained by comparing the results of the first 10% and the last 10% of the Gibbs sample of values for each one of the parameters.

Looking at the convergence diagnostic results, it is immediately noticeable that none of them is significant at the usual 5% size, and therefore it is possible to conclude that the Gibbs sampling scheme used in this analysis has reached convergence satisfactorily.
The results indicate that, even in the absence of prior information, it is possible to obtain quite precise information about the parameters of the model, in terms of their marginal posterior distributions and moments. The parameters in $\alpha$ and $\beta$ have posterior means very close to their true values. The HAC estimated standard errors are very small, notably the ones associated with the linear trend coefficients, whose rate of convergence is the fastest ($T^{3/2}$). Together with the estimate of the posterior mean of $\Lambda$ given by the Gibbs sample average of $\Lambda$ (0.4579) reported in the table, we present a further estimate of it in terms of:
$$E(\Lambda \mid \text{data}) = N^{-1}\sum_{j=1}^{N} E(\Lambda \mid U_1^{(j)}, V_1^{(j)}, \Gamma^{(j)}, \delta^{(j)}, \Sigma^{(j)}, \text{data}) = 0.45225.$$
The posterior distribution of $\Lambda$, the main parameter of interest of the model, has been obtained in terms of:
$$p(\Lambda \mid \text{data}) = N^{-1}\sum_{j=1}^{N} p(\Lambda \mid U_1^{(j)}, V_1^{(j)}, \Gamma^{(j)}, \delta^{(j)}, \Sigma^{(j)}, \text{data}).$$
This marginal posterior distribution is the key element for conducting inference on the cointegrating rank. In fact, it is possible to compare the rank one hypothesis with the rank zero hypothesis by constructing a highest posterior density confidence interval for $\Lambda$ and checking whether $\Lambda = 0$ falls inside or outside that interval.
In the present context, the 95% HPD, obtained by means of numerical quadrature, is [0.2, 0.7]. It is therefore uncontroversial that the hypothesis supported by the posterior evidence is rank equal to one.
[6.7.2] The Danish Money Demand Example
VAR(2) model for the vector series $y_t = [LRM, LRY, IB, ID]'$, where LRM is the log of real M2, LRY is the log of real income, and IB and ID are the logs of the gross bond and deposit interest rates respectively. The Danish quarterly data run from 1974:1 to 1987:3. The results are collected in Tables 6.2.1 and 6.2.2, and Figures 6.2.1 to 6.2.5 contain the posterior pdfs of the relevant parameters. In Figure 6.2.1 I present the univariate posterior pdfs of the parameters $\lambda_1$ and $\lambda_2$ obtained in a model where the cointegrating rank has been set equal to two. A