points in T._?;N (under the NCP) actually have to be sampled. Thus, the algorithm can also deal with models for which Dl./ is not bounded but only almost-surely finite.

The key reason why this sampling scheme is often preferable to Algorithm 5.1 is that in Step 1, we are not fixing the points under our actual target distribution . Instead, is a function of and z. This will often allow greater movement in the-direction in Step 2.
Moreover, as pointed out by Roberts et al. (2004) and Griffin and Steel (2006), the particular NCP outlined above has another advantage: a decrease in D l. / coincides with the removal of those points from that are associated with the smallest jump sizes. Similarly, an increase in coincides with adding points to that have relatively small jumps. The above construction is therefore termed dependent thinning in Griffin and Steel (2006). Usually, adding or removing a single small jump has little impact on the posterior density. This property can further facilitate movement in the-direction compared with the other NCPs from Roberts et al. (2004), which add or remove jumps with arbitrary jump size.
5.3 Non-Centred Particle Gibbs Sampler

5.3.1 Motivation
The NCP discussed in the previous section can help reduce the impact of correlation between and on the efficiency of Algorithm 5.3, but it does not alleviate inefficiencies resulting from the correlation between individual components of if these are still updated one-at-a-time.
A strategy for systematically updating in one block is offered by the conditional sequential Monte Carlo (CSMC) kernels introduced by Andrieu et al. (2010) and described in Section 3.4 of this work. Simple sequential Monte Carlo (SMC) algorithms have been applied to latent point processes in Godsill and Vermaak (2004) and Chopin et al. (2013). More sophisticated SMC algorithms based around the SMC-sampler framework (Del Moral et al., 2006b) have been developed in Del Moral et al. (2007), Whiteley et al. (2011), Martin et al. (2013) and in Chapter 4 of this work. As pointed out in Whiteley et al. (2011) and further analysed in Chapter 4, the latter class of SMC algorithms can introduce a substantial bias in the case of exponentially-distributed interjump times (as is the case here).

We therefore employ simple SMC algorithms even though these are potentially very inefficient (in the sense that sample impoverishment is severe). Our SMC algorithm differs slightly from that described in Chopin et al. (2013) in two ways. Firstly, following Chopin (2002), we allow for more than one observation to be included per SMC step in order to speed up the algorithm. Secondly, we employ a slightly different parametrisation which permits the use of the variance-reduction techniques backward sampling (BS) and ancestor sampling (AS) (Whiteley, 2010; Lindsten et al., 2012) within PG
samplers. These were described in Section 3.4.

Alternatives. There are, of course, alternatives to PG samplers for conducting inference in the models described here. For instance, we could use SMC-based pseudo-marginal MH algorithms known as particle marginal Metropolis–Hastings (PMMH) algorithms (Andrieu et al., 2010) or pseudo-marginal SMC algorithms based around PMMH updates known as SMC-squared (Chopin et al., 2013).

By construction, these methods are robust to strong correlation of and under. However, these methods tend to require large numbers of particles. For instance, Chopin et al. (2013) report the need for around 500 to 3,000 particles for a moderately-long time series in the simplest version of the Lévy-driven stochastic volatility model discussed in Section 5.4. We have found such numbers of particles to be prohibitively high for implementations in high-level programming languages such as R (R Development Core Team, 2014) or Matlab (The MathWorks, Inc., 2015). With smaller numbers of particles, pseudo-marginal MH kernels are well known to suffer from the so-called 'stickiness' problem, i.e. from long periods of high rejection rates.

5.3.2 Conditional SMC Kernel
In this subsection, we describe some of the details of the (conditional) SMC algorithms which are needed to deal with the specific class of models analysed here (and in the previous chapter).

Step Size. For $I \in \mathbb{N}$ we let $0 = t_0 < t_1 < \dots < t_I = T$. Here, $(t_{i-1}, t_i]$ is the interval associated with the $i$th SMC step. That is, at the $i$th SMC step, we both assimilate observations and propose jumps in the interval $(t_{i-1}, t_i]$.
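
As a rough illustration of the "propose jumps" part of such a step, the following Python sketch simulates jump times on an interval $(t_{i-1}, t_i]$ from exponentially distributed interjump times, and computes the effective sample size of a set of particle weights, a standard diagnostic of weight degeneracy in SMC. The function names, the constant intensity `rate` and the choice of proposing from a homogeneous Poisson prior are assumptions made for this illustration only; they are not taken from the algorithm used in this work.

```python
import math
import random


def propose_jump_times(t_prev, t_next, rate, rng):
    """Simulate Poisson-process jump times in (t_prev, t_next].

    Interjump times are i.i.d. exponential with intensity `rate`
    (assumed constant here); by memorylessness the clock may be
    restarted at t_prev.
    """
    times = []
    s = t_prev + rng.expovariate(rate)
    while s <= t_next:
        times.append(s)
        s += rng.expovariate(rate)
    return times


def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum w^2, computed stably from log-weights."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


rng = random.Random(1)
jumps = propose_jump_times(2.0, 5.0, rate=1.5, rng=rng)
```

Equal weights give an effective sample size equal to the number of particles, whereas a single dominant weight drives it towards one.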
Without loss of generality and to simplify the presentation, we assume that $(t_i)_{i \in \mathbb{N}_I}$ is a subsequence of $(\tilde{t}_p)_{p \in \mathbb{N}_P}$. Note that the commonly-used strategy of assimilating one observation per SMC step corresponds to the special case $(t_i)_{i \in \mathbb{N}_I} = (\tilde{t}_p)_{p \in \mathbb{N}_P}$.

If the weights do not deteriorate too quickly over SMC steps, i.e. if the effective sample size does not decrease too steeply after a single SMC step, it can be preferable to increase this step size to reduce the computational cost of the algorithm (Chopin, 2002).

Reparametrisation. For the CSMC kernel, we need to apply a further reparametrisation to ensure that the computational cost of performing a single step of BS or AS does not grow with $T$ (on average). This can be achieved by parametrising the compound Poisson process not in terms of jump sizes but in terms of the values of the process at the jump times. The latter coincides with the representation used in the previous chapter. Recall that the compound Poisson process is denoted $L = (L_t)_{t \in T}$. We can apply another one-to-one reparametrisation of the form
.; / !.; P1WI/; (5.3)
where
\[ P_i := (K_i, S_{i,1:K_i}, \widetilde{L}_{i,1:K_i}) \]
denotes the points (as well as their number) of the latent compound Poisson process whose first component (the jump time) falls in the interval $(t_{i-1}, t_i]$. These points are again ordered according to their first component, i.e. $t_{i-1} < S_{i,1} < \dots < S_{i,K_i} \le t_i$. The second components, $\widetilde{L}_{i,1:K_i}$, no longer represent the actual jump sizes but are now taken to be the values of the compound Poisson process $L$ at the jump times. That is, $\widetilde{L}_{i,j} := L_{S_{i,j}}$, for any $j \in \mathbb{N}_{K_i}$. This corresponds to the terminology 'jump size' used in the previous chapter.
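
To make this reparametrisation concrete, the following Python sketch converts a list of jump times and jump sizes into per-interval blocks holding the number of jumps, the jump times and the values of the process at those times, and back again. It assumes that the process starts at zero and that the jump times are sorted and lie in $(t_0, t_I]$; the function names `to_blocks` and `from_blocks` are hypothetical, and the sketch is an illustration rather than the implementation used in this work.

```python
from itertools import accumulate


def to_blocks(jump_times, jump_sizes, grid):
    """Group jumps by the intervals (grid[i-1], grid[i]].

    Each block is (number of jumps, jump times, process values), where
    the process value at a jump time is the running sum of the jump
    sizes (assumption: the process starts at 0).
    """
    values = list(accumulate(jump_sizes))
    blocks = []
    k = 0
    for upper in grid[1:]:
        times, vals = [], []
        while k < len(jump_times) and jump_times[k] <= upper:
            times.append(jump_times[k])
            vals.append(values[k])
            k += 1
        blocks.append((len(times), times, vals))
    return blocks


def from_blocks(blocks):
    """Invert to_blocks: recover jump times and jump sizes."""
    times = [s for _, ts, _ in blocks for s in ts]
    values = [v for _, _, vs in blocks for v in vs]
    # A jump size is the difference between consecutive process values.
    sizes = [v - prev for prev, v in zip([0.0] + values[:-1], values)]
    return times, sizes
```

Because the two maps are mutually inverse, either parametrisation can be used to store a path; the block form keeps everything relating to one SMC step together.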
With this reparametrisation, we may write the distribution targeted by the (conditional) SMC algorithm as proportional to
\[ \prod_{i=1}^{I} \Pi^{P}_i(\mathrm{d}P_i \mid P_{1:i-1}) \, g(y_{(t_{i-1}, t_i]} \mid P_{1:i}, y_{(t_0, t_{i-1}]}). \]
Here, $\Pi^{P}_i(\mathrm{d}P_i \mid P_{1:i-1})$ denotes the conditional prior distribution of the points in the interval $(t_{i-1}, t_i]$, $P_i$, and we again slightly abuse notation by writing $g(y_T \mid P_{1:I})$ for the likelihood whenever the original parametrisation and $P_{1:I}$ are related as in Equation 5.3.
Note that the observations taken in disjoint intervals are not necessarily assumed to be independent given the PPP and given the static parameters. Indeed, in the example considered in Section 5.4, we analytically integrate out a subset of the static parameters, which means that the observations in disjoint intervals are no longer conditionally independent given the PPP and given the remaining parameters.

The SMC algorithm then targets this distribution using a sequence of intermediate distributions, the $i$th of which is proportional to
\[ \prod_{j=1}^{i} \Pi^{P}_j(\mathrm{d}P_j \mid P_{1:j-1}) \, g(y_{(t_{j-1}, t_j]} \mid P_{1:j}, y_{(t_0, t_{j-1}]}). \]
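
The way a conditional SMC sweep moves through such a sequence of intermediate distributions can be sketched generically in Python as follows. This is a minimal sketch under stated assumptions: blocks are proposed from the conditional prior (so that the incremental weight is the likelihood term), multinomial resampling is performed at every step, and neither backward nor ancestor sampling is included. The callables `propose` and `log_weight` are hypothetical placeholders supplied by the user, not part of the algorithm described in this work.

```python
import math
import random


def normalise(log_weights):
    """Turn log-weights into probabilities in a numerically stable way."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return [x / total for x in w]


def conditional_smc(ref_path, propose, log_weight, n_particles, rng):
    """One sweep of a conditional SMC kernel with multinomial resampling.

    ref_path   : reference path, a list with one block per SMC step
    propose    : propose(i, parent_path, rng) -> new block for step i
    log_weight : log_weight(i, path) -> incremental log-weight at step i
    Returns a path drawn from the final weighted particle system.
    """
    n = n_particles
    paths = [[] for _ in range(n)]
    probs = [1.0 / n] * n
    for i in range(len(ref_path)):
        if i == 0:
            ancestors = list(range(n))
        else:
            # Resample n - 1 ancestors; the last slot keeps the reference lineage.
            ancestors = rng.choices(range(n), weights=probs, k=n - 1) + [n - 1]
        new_paths = []
        for k in range(n):
            parent = paths[ancestors[k]]
            block = ref_path[i] if k == n - 1 else propose(i, parent, rng)
            new_paths.append(parent + [block])
        paths = new_paths
        probs = normalise([log_weight(i, p) for p in paths])
    return paths[rng.choices(range(n), weights=probs, k=1)[0]]
```

With a single particle the sweep necessarily returns the reference path, which illustrates why small particle numbers lead to slow mixing unless variance-reduction techniques such as BS or AS are used.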
5.3.3 Full Algorithm

The full PG sampler is outlined in Algorithm 5.5. Note that the comments made in Remark 5.4 fully apply to this algorithm, too. That is, Dl./ only needs to be almost-surely bounded.
5.5 Algorithm (non-centred particle Gibbs).
(1) Update z by // using the CP
    (i) reparametrising .; / !.; P1WI/,
    (ii) updating P using a CSMC kernel (with BS/AS, if possible),
    (iii) reparametrising .; P1WI/ !.; /,
    (iv) sampling y O.dOj; /D z˘j T.;N .d /O .