• No se han encontrado resultados

2.4 Ejecución de la evaluación ambiental

2.4.2 Problemática Ambiental existente

We have shown in the previous section that under regularity conditions of Theorems

10,11, and12, the AirMCMC algorithm produces a chain with various convergence properties. However, without any additional assumptions the chain might fail to converge in distribution, as we demonstrate in Example1 below. On the other, we show in Theorem13 that imposing an additional diminishing adaptation condition

(C1), guarantees ergodicity (i.e., convergence in distribution) of the AirMCMC algorithm. We argue that this is a minor condition that either holds in practice or can be easily enforced. In Theorem 14 we introduce an AirMCMC Algorithm

18, where the sequence of increasing lags {nk} is randomised, which ensures the

diminishing adaptation condition.

Recall that the diminishing adaptation condition(C1)is a restriction on the adaptation size of the algorithm:

sup

x∈X

kPγn(x,·)−Pγn+1(x,·)kT V

P

−→0 as n→ ∞. (3.15)

The following theorem demonstrates that the regularity conditions of the pre- vious section together with the diminishing adaptation condition imply convergence in distribution of the AirMCMC algorithms.

Theorem 13. Suppose that the diminishing adaptation condition (3.15) and As- sumption 1 hold. Let also one of the following conditions hold

(a) Assumption 2 and for the corresponding drift function V, sup

x∈C

V(x)<∞ and

EV(X0)<∞;

(b) Assumption 4 and for the corresponding collection of drift functionsVγ,

sup

x∈C

Vγ(x)<∞ andEVγ(X0)<∞ for all γ ∈Γ;

(c) Assumption 3 and for the corresponding drift function V, EV(X0) <∞ and

every level setCd:={x|V(x)≤d} is a uniform small set, i.e., satisfies (3.8).

Then any AirMCMC Algorithm 4 (or, in case (b), any modified AirMCMC Algorithm 17, where the set B in the algorithm settings is such that for the corre- sponding drift functionsVγ, sup

x∈B

Vγ(x)<∞,γ ∈Γ) produces an ergodic chain{Xn}, i.e.,

kL(Xn)−πkT V →0,

where L(Xn) is the distribution law of Xn.

As argued inRoberts and Rosenthal[2007], the diminishing adaptation con- dition is not an issue in practice. The condition holds for many typical adaptive MCMC algorithms (e.g., as for the standard Adaptive Metropolis or Adaptive Gibbs Samplers). For the adaptive algorithms where the condition does not hold (e.g., as for KAMH Sejdinovic et al. [2014]) or it is hard to verify the condition, we could, nevertheless, easily modify the algorithms in order to enforce (3.15). For example, at the adaptation timesNi, we could flip a coin with success probabilitypi to decide whether to adapt the Markov kernel. If limpi

i→∞

= 0, then (3.15) holds. Notice that the sequencepi can decay arbitrarily slowly.

Alternatively, for the AirMCMC algorithms, we could allow the sequence of increasing lags{nk}to be random. More precisely, let sequence{n?k}be determinis-

tic that satisfies (3.14) for someβ >0. We could consider an AirMCMC Algorithm

4, where in Step 2 we set nk = n?k+ Uniform[0,bkκc] for some κ ∈ (0, β). Since,

{nk} satisfies (3.14), we could still prove the statements of Theorems10,11,12 for

this randomised version of the AirMCMC. Moreover, the resulting Algorithm 18

would be ergodic and satisfy the statements of Theorems 10, 11 or 12 under the corresponding regularity conditions. We summarise our observations in Theorem14

below.

Theorem 14. Consider settings of Theorem 10 (alternatively, of Theorem 11 or

Algorithm 18: Randomised AirMCMC Sampler

Set some initial values forX0 ∈ X;γ0 ∈Γ;γ :=γ0. Let{n?k}be an

increasing sequence of positive integers. Fix some δ∈(0,1). Set

k:= 1;n:= 0,n1:=n?1.

Beginning of the loop 1. Fori= 1, .., nk

1.1. sampleXn+i ∼Pγ(Xn+i−1,·);

1.2. given{X0, .., Xn+i, γ0, .., γn+i−1}updateγn+i according to some

adaptation rule.

2. Setn:=n+nk,k:=k+ 1, nk=n?k+ Uniform

h

0,bkcδi,γ :=γn.

Go toBeginning of the loop

Algorithm18(in case of the settings of Theorem11, we allow adaptations in Step2

to happen only if the chain hits the corresponding set B). Then the adaptive chain

{Xn} produced by the algorithm satisfies statements of Theorem 10 (alternatively,

of Theorem 11or 12, respectively).

Moreover, for any sequence of lags {n∗k}, the AirMCMC Algorithm 18 sat- isfies the diminishing adaptation condition (3.15). Under regularity conditions of Theorem 13 (in case of the settings (b) of the theorem, we allow adaptations to happen only if the chain hits the corresponding set B), the adaptive chain {Xn}

produced by the algorithm converges in distribution.

We conclude this section with a counterexample that demonstrates that an AirMCMC Algorithm 4 might fail to be ergodic (i.e., the corresponding adaptive chain does not converge in distribution) without the diminishing adaptation condi- tion.

Example 1. This example is a modified version of Example 4 of Roberts & Rosenthal Roberts and Rosenthal[2007]. Our goal is to construct an AirMCMC algorithm that satisfies conditions of Theorem10 but fails to be ergodic. Let X ={1,2,3,4}. For someε >0, define a target asπ({1}) :=ε,π({2}) :=ε3,π({3}) =π({4}) := 1−ε2−ε3. Forγ ∈Γ :={1,2}, letPγcorrespond to a Metropolis-Hastings kernel with proposals

Q1(x,·)∼Uniform{x−1, x+ 1}, Q2(x,·)∼Uniform{x−2, x−1, x+ 1, x+ 2}.

Qγ(Xn,·), with probability min

(

1,π(Yn+1)

π(Xn)

)

set Xn+1 := Yn+1, otherwise, reject

the proposal, i.e.,Xn+1 := Xn. If the proposal is outsideX, then we always reject

it. Consider the following adaptive Algorithm19.

Algorithm 19:AMCMC with the SLLN but failing convergence in dis- tribution

Start withX0 =X1=X2 = 1, γ0 =γ1 =γ2 = 1,k= 1.

Beginning of the loop 1. SampleX2k2+1∼Pγ

2k2(X2k2,·);

2. Ifγ2k2 = 1, then update as follows. IfX2k2+1 6=X2k2, i.e., the proposal is accepted,γ2(k+1)2 := 2. Otherwise, γ2(k+1)2 := 1;

Ifγ2k2 = 2, then setγ2(k+1)2 = 1 if X2k2+1= 1, otherwise, set γ2(k+1)2 = 2.

3. Forn∈N, 2k2+ 2≤n≤2(k+1)2 run a Markov chain with the kernel

2(k+1)2; 4. k:=k+ 1.

Go toBeginning of the loop

Proposition 8. KernelsPγ,γ ∈Γ satisfy the simultaneous minorisation Assump-

tion 1 and the simultaneous drift Assumption 2. Therefore, by virtue of Theorem

10, the SLLN holds for Algorithm 19. However, the adaptive chain produced by the algorithm does not converge in distribution for sufficiently smallε >0 in the setting of the underlying target distribution.

Proof of Proposition8. Algorithm19is designed in such a way, that Steps1and

2 “drift” the adaptive {Xn} chain away from the correct stationary distribution. Since the chain approaches the stationary distribution arbitrarily closely after Step

3, we conclude that at times 2k2 + 2,

L X2k2+2|X0, γ0 −π T V > δ

for some δ > 0. Therefore, {Xn} does not converge in distribution. We provide a

detailed proof of the proposition in Appendix A.

3.3

Comparison with available Adaptive MCMC theory

Documento similar