Our thinning-based construction outlined in the previous section simplifies the structure of the sMJP posterior, and allows us to now define an auxiliary variable Gibbs sampler on the augmented space (V, L, W ). As in previous chapters, we alternately resample the thinned events given the current trajectory of the sMJP, and then the sMJP trajectory given the union of the thinned events with transition times of the previous trajectory, setting up a Markov chain over the thinned representation (V, L, W ) of the sMJP. We describe both operations in detail below.
6.4.1 Resampling the thinned events given the sMJP trajectory
Let $(S, T)$ be the current sMJP trajectory. We need to resample the thinned events (we called this set $\tilde{W}$) to recover the thinned representation $(V, L, W)$. Note that each thinned event $\tilde{w}_i \in \tilde{W}$ in the interval $(t_i, t_{i+1})$ has a corresponding label $(\tilde{v}_i, \tilde{l}_i)$ equal to $(s_i, \tilde{w}_i - t_i)$.
Posterior inference via MCMC 102
Define the instantaneous hazard function $A(t)$ and the instantaneous dominating hazard function $U(t)$ as
\[ A(t) = A_{S(t)}(L(t)) \tag{6.24} \]
\[ U(t) = U_{S(t)}(L(t)) \tag{6.25} \]
The black and coloured curves in figure 6.2 show these quantities. Observe that the sMJP trajectory completely determines these hazard rates. Loosely speaking, we can view the set of events $W$ as a sample from a Poisson process with intensity $U(t)$, and the actual set of transition times $T$ as a subset of $W$ sampled from a Poisson process with rate $A(t)$ (see figure 6.2). Corollary 2.1 to the thinning theorem then suggests that we can recover the thinned events $\tilde{W}$ by sampling from a Poisson process with intensity $(U(t) - A(t))$. The following proposition shows that this is indeed the case.
Proposition 6.2. Conditioned on a trajectory $(S, T)$ of the sMJP, the thinned events $\tilde{W}$ are distributed as a Poisson process with intensity $U(t) - A(t)$.
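Proof. We will consider the interval of time $[t_i, t_{i+1}]$, so that the sMJP entered state $s_i$ at time $t_i$ and remained there until time $t_{i+1}$, when it transitioned to state $s_{i+1}$. Exploiting the independence properties of the sMJP and the Poisson process, we only need to consider resampling thinned events on this interval. Call this set of thinned events $\tilde{W} \equiv \{\tilde{w}_1, \cdots, \tilde{w}_{n-1}\} \subset [t_i, t_{i+1}]$, and call the corresponding sets of labels $\tilde{V} \equiv \{\tilde{v}_1, \cdots, \tilde{v}_{n-1}\}$ and $\tilde{L} \equiv \{\tilde{l}_1, \cdots, \tilde{l}_{n-1}\}$ (to avoid notational clutter, we do not indicate that $\tilde{W}$ and $\tilde{L}$ are actually restrictions to $[t_i, t_{i+1}]$). Observe that each element $\tilde{v}_j \in \tilde{V}$ equals $s_i$, while each element $\tilde{l}_j \in \tilde{L}$ equals $\tilde{w}_j - t_i$. We write this as $\tilde{V} = s_i$ and $\tilde{L} = \tilde{W} - t_i$. Then, by Bayes' rule, with equations (6.19) and (6.15) as the joint and marginal, we have
\begin{align}
& P(\tilde{W}, \tilde{V} = s_i, \tilde{L} = \tilde{W} - t_i \mid s_i, t_i, s_{i+1}, t_{i+1}) \tag{6.26} \\
&= \frac{P(\tilde{W}, \tilde{V} = s_i, \tilde{L} = \tilde{W} - t_i, v_n = s_{i+1}, w_n = t_{i+1}, l_n = 0 \mid v_0 = s_i, w_0 = t_i, l_0 = 0)}{P(s_{i+1}, t_{i+1} \mid s_i, t_i)} \notag \\
&= \frac{\exp\left(-\int_{t_i}^{t_{i+1}} U(\tau)\,d\tau\right) \left(\prod_{k=1}^{n-1} \left(U(\tilde{w}_k) - A(\tilde{w}_k)\right)\right) A_{s_i s_{i+1}}(t_{i+1} - t_i)}{A_{s_i s_{i+1}}(t_{i+1} - t_i) \exp\left(-\int_{t_i}^{t_{i+1}} A(\tau)\,d\tau\right)} \notag \\
&= \exp\left(-\int_{t_i}^{t_{i+1}} \left(U(\tau) - A(\tau)\right) d\tau\right) \prod_{k=1}^{n-1} \left(U(\tilde{w}_k) - A(\tilde{w}_k)\right) \notag
\end{align}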
This is just the density of a Poisson process on $(t_i, t_{i+1})$ with intensity $(U(t) - A(t))$, which is what we set out to prove.
Observe that this step is independent of any observations. Sampling from the Poisson process is relatively straightforward by choosing the bounding rates Ui appropriately; we provide a concrete example in section 6.5. The hazard functions A(t) and U (t) remain unchanged at the end of this step.
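As an illustration of Proposition 6.2, the following is a minimal sketch of how the thinned events on one interval might be drawn by thinning a homogeneous Poisson process. The function name `sample_thinned_events`, the callables `A` and `U`, and the constant `rate_bound` are all assumptions made for this example; they are not taken from the text, and the concrete choice of bounding rates in section 6.5 may differ.

```python
import random


def sample_thinned_events(t_start, t_end, A, U, rate_bound, rng=random):
    """Draw a Poisson process with intensity U(t) - A(t) on (t_start, t_end).

    A and U are hypothetical callables giving the instantaneous hazards
    along the current trajectory. rate_bound is an assumed constant that
    satisfies U(t) - A(t) <= rate_bound everywhere on the interval.
    """
    events = []
    t = t_start
    while True:
        # Candidate times come from a homogeneous process with rate rate_bound.
        t += rng.expovariate(rate_bound)
        if t >= t_end:
            return events
        residual = U(t) - A(t)
        if not 0.0 <= residual <= rate_bound:
            raise ValueError("rate_bound does not dominate U(t) - A(t)")
        # Keep each candidate with probability (U(t) - A(t)) / rate_bound.
        if rng.random() * rate_bound < residual:
            events.append(t)


# Usage with an assumed linear hazard on the interval (1, 3) and U = 2A.
def example_A(t):
    return 2.0 * (t - 1.0)


def example_U(t):
    return 2.0 * example_A(t)


example_events = sample_thinned_events(
    1.0, 3.0, example_A, example_U, rate_bound=4.0, rng=random.Random(1)
)
```

Because the hazards depend only on the current trajectory, no observation likelihood appears anywhere in this sketch, in line with the remark above.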
Figure 6.4: Resampling the sMJP trajectory: observe that a new trajectory results in a new bounding rate U (t), and we need to account for the probability of the Poisson events W under this rate function.
6.4.2 Resampling the sMJP trajectory given the set of times W
This step is a bit more subtle than with the MJP. As in chapter 3, we want to assign each element $w_i \in W$ a label $(v_i, l_i)$ by running the forward filtering backward sampling algorithm over the set of times in $W$. Observe, however, that $l_i$ can take values in the set $\{0, w_i - w_{i-1}, \cdots, w_i - w_0\}$, so that the size of the state space at step $i$ grows with $i$ (and thus increases with $|W|$). Consequently, the forward-backward algorithm for the sMJP is more expensive than for the MJP. The usual $N^2 C$ scaling of the forward-backward algorithm ($N$ being the number of states and $C$ being the length of the chain) would suggest a computational cost that scales cubically with $|W|$. Note, however, that $l_i$ can only equal $0$ or $l_{i-1} + \Delta w_{i-1}$. Because of this sparsity in the possible state transitions, the cost scales only quadratically with $|W|$ (at worst).
Next, observe from figure 6.4 that changing the sMJP trajectory changes the instantaneous hazard functions $A(t)$ and $U(t)$. This is a consequence of the fact that, unlike uniformization, candidate jump times are now drawn from a point process whose intensity depends on the sMJP trajectory. A new trajectory results in a new hazard function, and we need to account for the probability of the events in $W$ under this new hazard function. It is, however, straightforward to adapt the forward-backward sampling algorithm to make this correction; we effectively treat the elements of $W$ as additional 'observations' of the state of the Markov chain. During the forward filtering pass, as we calculate the probability of being in a particular state $(v_i, l_i)$ over an interval $(w_i, w_{i+1})$, we also include the probability of waiting for a time $\Delta w_i = (w_{i+1} - w_i)$ until the next event under the resulting hazard function $U_{v_i}(\tau + l_i)$. We write this probability as $P(w_{i+1} \mid w_i, v_i, l_i)$; it is given by
\[ P(w_{i+1} \mid w_i, v_i, l_i) = U_{v_i}(l_i + \Delta w_i) \exp\left(-\int_{l_i}^{l_i + \Delta w_i} U_{v_i}(\tau)\,d\tau\right) \tag{6.27} \]
When running the forward-backward algorithm, we must also include this term in our calculations.
Figure 6.3 provides a graphical demonstration of the overall discrete-time system we have to solve. It includes observations X of the sMJP state, with xi representing all observations in the interval (wi, wi+1). Let P (xi|vi) be the corresponding likelihood function. Then, the joint distribution over the entire set of variables factorizes as:
\[ P(V, L, W, X) = P(v_0, l_0, w_0) \prod_{i=0}^{|W|-1} P(x_i \mid v_i)\, P(w_{i+1} \mid v_i, l_i)\, P(v_{i+1}, l_{i+1} \mid v_i, l_i, \Delta w_i) \tag{6.28} \]
Observe also that $w_{|W|} = t_{end}$ does not correspond to a real event; rather, it is the end of the observation interval. Consequently, while for $i < (|W| - 1)$, $P(w_{i+1} \mid w_i, v_i, l_i)$ is given by equation (6.27), we also have that
\[ P(w_{|W|} \mid w_{|W|-1}, v_{|W|-1}, l_{|W|-1}) = \exp\left(-\int_{l_{|W|-1}}^{l_{|W|-1} + \Delta w_{|W|-1}} U_{v_{|W|-1}}(\tau)\,d\tau\right) \tag{6.29} \]
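To make the two waiting-time terms concrete, here is a minimal sketch that evaluates their logarithms in closed form. It assumes, purely for illustration, that the dominating hazard is a constant multiple `omega` of a Weibull hazard with parameters `shape` and `scale`; these names, the helper functions, and the default `omega=2.0` are hypothetical and not taken from the text.

```python
import math


def weibull_hazard(tau, shape, scale):
    """Weibull hazard rate at elapsed time tau (assumed parametrization)."""
    return (shape / scale) * (tau / scale) ** (shape - 1)


def weibull_cum_hazard(tau, shape, scale):
    """Integral of the Weibull hazard from 0 to tau."""
    return (tau / scale) ** shape


def log_wait_prob(l, dw, shape, scale, omega=2.0, is_last=False):
    """Log of the waiting-time term for holding time l and gap dw.

    Assumes U_v(tau) = omega * weibull_hazard(tau, shape, scale).
    With is_last=False this is the log of the density in (6.27);
    with is_last=True only the survival term of (6.29) is returned.
    """
    integral = omega * (
        weibull_cum_hazard(l + dw, shape, scale)
        - weibull_cum_hazard(l, shape, scale)
    )
    if is_last:
        return -integral
    return math.log(omega * weibull_hazard(l + dw, shape, scale)) - integral


# Example: shape 2, scale 1, holding time 0.5, gap to the next event 1.0.
log_p_event = log_wait_prob(0.5, 1.0, shape=2.0, scale=1.0)
log_p_end = log_wait_prob(0.5, 1.0, shape=2.0, scale=1.0, is_last=True)
```

Working in log space is a design choice of this sketch: the exponential term can underflow for long gaps, and sums of logs are easier to combine with the likelihood terms in a forward pass.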
Calculations for an sMJP with Weibull hazards 105
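The forward filtering stage proceeds by calculating the probabilities $P(v_i, l_i, w_{1:i+1}, x_{1:i})$ using the recursion:
\begin{align}
P(v_i, l_i, w_{1:i+1}, x_{1:i}) = {} & P(x_i \mid v_i)\, P(w_{i+1} \mid v_i, l_i) \tag{6.30} \\
& \sum_{v_{i-1}, l_{i-1}} P(v_i, l_i \mid v_{i-1}, l_{i-1}, \Delta w_{i-1})\, P(v_{i-1}, l_{i-1}, w_{1:i}, x_{1:i-1}) \notag
\end{align}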
The transition probabilities $P(v_i, l_i \mid v_{i-1}, l_{i-1})$ are given in equations (6.15) and (6.17), with the probabilities of all other state transitions equal to 0. In the summation above, $v_i$ and $v_{i-1}$ take values in $S$. Additionally, $l_i$ either equals $0$ or $l_{i-1} + \Delta w_{i-1}$, while $l_{i-1}$ takes values in $\{0, w_{i-1} - w_{i-2}, \cdots, w_{i-1} - w_0\}$. Thus, the $i$th step of the forward filtering stage scales as $O(N^2 i)$. Since there are $|W|$ such updates, each iteration of the MCMC sampler scales as $O(N^2 |W|^2)$.
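The recursion and its sparsity can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the text: the function `forward_filter` and the callables `init`, `lik`, `wait`, and `trans` are hypothetical stand-ins for the initial distribution, the likelihood $P(x_i \mid v_i)$, the waiting-time term, and the transition probabilities of equations (6.15) and (6.17). The holding time $l_i$ is stored through the index $j$ of the last reset, so that $l_i = w_i - w_j$.

```python
from collections import defaultdict


def forward_filter(w, n_states, init, lik, wait, trans):
    """Forward pass over times w[0..m], where w[m] is the end of the interval.

    Hypothetical callables supplied by the caller:
      init(v)                   -> P(v_0 = v), with l_0 = 0
      lik(i, v)                 -> P(x_i | v_i = v)
      wait(v, l, dw, is_last)   -> P(w_{i+1} | v_i = v, l_i = l)
      trans(v0, l0, v1, l1, dw) -> P(v_i = v1, l_i = l1 | v_{i-1} = v0, l_{i-1} = l0)
    Returns a list of dicts mapping (v, j) to the filtered probability.
    """
    m = len(w) - 1
    alpha = {
        (v, 0): init(v) * lik(0, v) * wait(v, 0.0, w[1] - w[0], m == 1)
        for v in range(n_states)
    }
    alphas = [alpha]
    for i in range(1, m):
        dw_prev = w[i] - w[i - 1]
        dw = w[i + 1] - w[i]
        msg = defaultdict(float)
        for (v0, j0), p in alphas[-1].items():
            l0 = w[i - 1] - w[j0]
            for v1 in range(n_states):
                # No reset: the holding time grows to l0 + dw_prev (same j0).
                msg[(v1, j0)] += trans(v0, l0, v1, w[i] - w[j0], dw_prev) * p
                # Reset: the holding time restarts at 0 (last reset is now i).
                msg[(v1, i)] += trans(v0, l0, v1, 0.0, dw_prev) * p
        alpha = {
            (v1, j1): lik(i, v1) * wait(v1, w[i] - w[j1], dw, i == m - 1) * s
            for (v1, j1), s in msg.items()
        }
        alphas.append(alpha)
    return alphas


# Toy usage: two states, flat likelihood and waiting terms, and a made-up
# kernel that resets with probability 0.5 (uniform over states) or else
# stays in the same state and keeps counting.
def toy_trans(v0, l0, v1, l1, dw):
    if l1 == 0.0:
        return 0.25
    return 0.5 if v1 == v0 else 0.0


toy_w = [0.0, 0.5, 1.2, 2.0, 3.0]
toy_alphas = forward_filter(
    toy_w, 2,
    init=lambda v: 0.5,
    lik=lambda i, v: 1.0,
    wait=lambda v, l, dw, is_last: 1.0,
    trans=toy_trans,
)
```

At step $i$ the inner loops visit roughly $N(i+1)$ predecessor states and $2N$ successors each, which matches the $O(N^2 i)$ cost per step stated above. A backward sampling pass would still be needed to draw a trajectory; it is omitted here.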