
M-H type kernels make proposals for the next state of the Markov chain using a proposal distribution whose conditional density is denoted by $q(y \mid x)$. In standard M-H sampling, given the current state $X^{(i)}$, the proposal $Y \sim q(\cdot \mid X^{(i)})$ is accepted with probability

\[
\alpha(Y, X^{(i)}) := \min\left\{ 1, \frac{\pi(Y)\, q(X^{(i)} \mid Y)}{\pi(X^{(i)})\, q(Y \mid X^{(i)})} \right\}.
\]

This is algorithmically implemented by drawing a uniform random variable $\Lambda \sim \mathrm{unif}(0, 1)$ and accepting the proposal if and only if $\Lambda < \frac{\pi(Y)\, q(X^{(i)} \mid Y)}{\pi(X^{(i)})\, q(Y \mid X^{(i)})}$. When accepted, we take $X^{(i+1)} \leftarrow Y$, and when rejected, $X^{(i+1)} \leftarrow X^{(i)}$. This acceptance probability ensures that the detailed balance equations hold for the Markov chain $(X^{(i)})$ and that its stationary distribution is $\bar{\pi}$.
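The standard M-H update described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `mh_step`, the callables `log_pi`, `sample_q` and `log_q`, and the Gaussian target and random-walk proposal in the usage part are all assumptions made for the example. The comparison with $\Lambda$ is done on the log scale for numerical stability.

```python
import math
import random


def mh_step(x, log_pi, sample_q, log_q, rng):
    """One standard Metropolis-Hastings update from the current state x.

    log_pi(x)       : log of the (possibly unnormalized) target density
    sample_q(x, rng): draws a proposal Y ~ q(. | x)
    log_q(y, x)     : log q(y | x)
    """
    y = sample_q(x, rng)
    # log of pi(Y) q(X | Y) / (pi(X) q(Y | X))
    log_ratio = (log_pi(y) + log_q(x, y)) - (log_pi(x) + log_q(y, x))
    # Lambda ~ unif(0, 1); 1 - random() lies in (0, 1], so the log is finite.
    log_lam = math.log(1.0 - rng.random())
    # Accept the proposal if and only if Lambda < ratio.
    return y if log_lam < log_ratio else x


# Illustrative usage (assumed example): standard normal target,
# Gaussian random-walk proposal with unit step size.
def log_pi(x):
    return -0.5 * x * x


def sample_q(x, rng):
    return x + rng.gauss(0.0, 1.0)


def log_q(y, x):
    return -0.5 * (y - x) ** 2


rng = random.Random(0)
chain = [0.0]
for _ in range(20000):
    chain.append(mh_step(chain[-1], log_pi, sample_q, log_q, rng))
```

Drawing $\Lambda$ explicitly, rather than accepting with probability $\alpha$, is the form that carries over to the multiple proposal setting, where the same $\Lambda$ is compared against several ratios.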

In multiple proposal Metropolis-Hastings algorithms, we make subsequent proposals from the rejected values. The number of sequential proposals we make, denoted by $N$, can be any fixed or random number, provided that it is independent of the proposal draws and of the decisions on whether the proposals are acceptable. Each of these $N$ proposals is deemed either acceptable or not, and we take the $L$-th acceptable value as the next state of the Markov chain. If there are fewer than $L$ acceptable values among the $N$ proposals, the next state of the Markov chain remains the same as the current state. The number $L$ can be fixed or random, and can be drawn jointly with $N$.

Throughout this chapter, for two integers $n$ and $m$, we will denote by $n : m$ the sequence $(n, n+1, \dots, m)$ if $n \le m$ and the sequence $(n, n-1, \dots, m)$ if $n > m$. Also, given a sequence $(a_n)_{n \in \mathbb{Z}^+} = (a_1, a_2, \dots)$, we will denote by $a_{n:m}$ the subsequence $(a_j)_{n \le j \le m}$.

Algorithm 3: Multiple proposal Metropolis-Hastings algorithm

Input:
    The distribution $\nu(N, L)$ of the maximum number of proposals and the maximum number of accepted proposals
    Proposal kernels $\{q_n(y_n \mid y_{n-1:0})\}$
    Number of iterations, $M$
Output: Markov chain $(X^{(i)})_{i \in 1:M}$

Initialize: Set $X^{(0)}$ arbitrarily
for $i \leftarrow 0 : M-1$ do
    Draw $(N, L) \sim \nu(\cdot, \cdot)$
    Draw $\Lambda \sim \mathrm{unif}(0, 1)$
    Set $X^{(i+1)} \leftarrow X^{(i)}$
    Set $n_a \leftarrow 0$
    for $n \leftarrow 1 : N$ do
        Draw $Y_n \sim q_n(\cdot \mid Y_{n-1:0})$, where we understand $Y_0 := X^{(i)}$
        if $\Lambda < \dfrac{\pi(Y_n) \left\{ \prod_{j=2}^{n} q_{n-j+1}(Y_{j-1} \mid Y_{j:n}) \right\} q_n(X^{(i)} \mid Y_{1:n})}{\pi(X^{(i)})\, q_1(Y_1 \mid X^{(i)}) \left\{ \prod_{j=2}^{n} q_j(Y_j \mid Y_{j-1:1}, X^{(i)}) \right\}}$ then
            $n_a \leftarrow n_a + 1$
            if $n_a = L$ then
                Set $X^{(i+1)} \leftarrow Y_n$
                Break
            end
        end
    end
end

The algorithm starts at an arbitrarily chosen initial state $X^{(0)}$. We denote by $X^{(i)}$ the state of the Markov chain after $i$ updates. Let $N, L \in \mathbb{Z}^+ := \{1, 2, \dots\}$ with $N \ge L$ be drawn from a distribution whose probability mass function is denoted by $\nu(N, L)$. The algorithm draws $\Lambda \sim \mathrm{unif}(0, 1)$, independently of $N$ and $L$. The algorithm draws the first proposal $Y_1 \sim q_1(\cdot \mid X^{(i)})$, independently of $\Lambda$ and $N$. The proposal $Y_1$ is called acceptable if

\[
\Lambda < \frac{\pi(Y_1)\, q_1(X^{(i)} \mid Y_1)}{\pi(X^{(i)})\, q_1(Y_1 \mid X^{(i)})}.
\]

The second proposal $Y_2 \sim q_2(\cdot \mid Y_1, X^{(i)})$ is drawn given the values of $Y_1$ and $X^{(i)}$. The proposal $Y_2$ is acceptable if

\[
\Lambda < \frac{\pi(Y_2)\, q_1(Y_1 \mid Y_2)\, q_2(X^{(i)} \mid Y_1, Y_2)}{\pi(X^{(i)})\, q_1(Y_1 \mid X^{(i)})\, q_2(Y_2 \mid Y_1, X^{(i)})}.
\]

The $n$-th proposal $Y_n$, $n \le N$, is drawn from $q_n(\cdot \mid Y_{n-1:1}, X^{(i)})$ and called acceptable if

\[
\Lambda < \frac{\pi(Y_n) \left\{ \prod_{j=2}^{n} q_{n-j+1}(Y_{j-1} \mid Y_{j:n}) \right\} q_n(X^{(i)} \mid Y_{1:n})}{\pi(X^{(i)})\, q_1(Y_1 \mid X^{(i)}) \left\{ \prod_{j=2}^{n} q_j(Y_j \mid Y_{j-1:1}, X^{(i)}) \right\}}. \tag{4.1}
\]

The procedure is repeated until $L$ acceptable proposals are drawn or until $N$ proposals are drawn, whichever comes first. The next state of the Markov chain $X^{(i+1)}$ is set to the $L$-th acceptable value, or to $X^{(i)}$ if there are fewer than $L$ acceptable values among $Y_1, \dots, Y_N$. The pseudocode for this algorithm is shown in Algorithm 3. The standard Metropolis-Hastings algorithm corresponds to the case where $N$ and $L$ are both equal to 1.
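The right-hand side of the acceptability criterion (4.1) can be evaluated on the log scale by noting that its numerator is the target density at $Y_n$ times the density of the reversed path $Y_n, Y_{n-1}, \dots, Y_1, X^{(i)}$ under the same kernels $q_1, \dots, q_n$, while its denominator is the target density at $X^{(i)}$ times the density of the forward path. The Python sketch below only illustrates this bookkeeping. The interface `log_q(n, y, history)`, returning $\log q_n(y \mid \text{history})$ with `history` ordered from most recent to oldest, and the function names are assumptions made for the example.

```python
def log_path_density(path, log_q):
    """Sum over k of log q_k(z_k | z_{k-1:0}) for a path (z_0, z_1, ..., z_n).

    log_q(k, z, history) is assumed to return log q_k(z | history), where
    history = (z_{k-1}, ..., z_0) lists earlier states, most recent first.
    """
    total = 0.0
    for k in range(1, len(path)):
        history = tuple(reversed(path[:k]))
        total += log_q(k, path[k], history)
    return total


def log_acceptability_ratio(x, ys, log_pi, log_q):
    """Log of the right-hand side of (4.1) for proposals ys = [Y_1, ..., Y_n]."""
    forward = [x] + list(ys)   # X, Y_1, ..., Y_n
    backward = forward[::-1]   # Y_n, ..., Y_1, X
    log_num = log_pi(ys[-1]) + log_path_density(backward, log_q)
    log_den = log_pi(x) + log_path_density(forward, log_q)
    return log_num - log_den


def is_acceptable(log_lam, x, ys, log_pi, log_q):
    """True if the latest proposal ys[-1] is acceptable given log(Lambda)."""
    return log_lam < log_acceptability_ratio(x, ys, log_pi, log_q)
```

For $n = 1$ the ratio reduces to the standard M-H ratio, and for $n = 2$ it reproduces the expression given above for $Y_2$.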

We note that the proposal kernels $q_n$, $n \in 1 : N$, can be simply taken as

\[
q_n(y_n \mid y_{n-1:0}) \equiv q(y_n \mid y_{n-1})
\]

for some proposal kernel density $q(\cdot \mid \cdot)$. In addition, if the kernel $q$ is symmetric, that is, if $q(x \mid y) \equiv q(y \mid x)$, then the acceptability criterion (4.1) simplifies to

\[
\Lambda < \frac{\pi(Y_n)}{\pi(X^{(i)})}.
\]
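In the symmetric case the whole procedure becomes short enough to sketch end to end. The Python example below is a minimal illustration under stated assumptions: a Gaussian random-walk kernel for $q$, a user-supplied function `draw_nl` that plays the role of $\nu$ and returns a pair $(N, L)$, and a standard normal target in the usage part. All names are hypothetical and chosen for the example.

```python
import math
import random


def multiple_proposal_rw_chain(log_pi, x0, n_iter, draw_nl, step, rng):
    """Multiple proposal M-H with a symmetric Gaussian random-walk kernel.

    draw_nl(rng) is assumed to return (N, L) with N >= L >= 1 and stands in
    for the distribution nu; step is the random-walk standard deviation.
    """
    x = x0
    chain = []
    for _ in range(n_iter):
        n_max, l_target = draw_nl(rng)
        # One Lambda ~ unif(0, 1) is shared by all proposals of this iteration.
        log_lam = math.log(1.0 - rng.random())
        x_next = x          # default: stay at the current state
        y = x               # Y_0 := X^(i)
        n_acceptable = 0
        for _ in range(n_max):
            y = y + rng.gauss(0.0, step)   # Y_n ~ q(. | Y_{n-1})
            # Symmetric kernel: acceptable iff Lambda < pi(Y_n) / pi(X^(i)).
            if log_lam < log_pi(y) - log_pi(x):
                n_acceptable += 1
                if n_acceptable == l_target:
                    x_next = y
                    break
        x = x_next
        chain.append(x)
    return chain


# Illustrative usage (assumed example): standard normal target, N = 5, L = 2.
def log_pi(x):
    return -0.5 * x * x


chain = multiple_proposal_rw_chain(
    log_pi, x0=0.0, n_iter=20000,
    draw_nl=lambda rng: (5, 2), step=1.0, rng=random.Random(0),
)
```

Setting `draw_nl=lambda rng: (1, 1)` recovers the standard random-walk Metropolis algorithm.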

We note that the multiple proposal Metropolis-Hastings algorithm with $L = 1$ constructs Markov chains with the same distribution as those constructed by delayed rejection (DR) methods [Tierney and Mira, 1999, Mira et al., 2001, Green and Mira, 2001]. The proof of this claim, as well as a brief description of the delayed rejection method, is provided in the appendix (Section 4.A). However, the framework we present has several advantages over delayed rejection:

1. First, our algorithmic framework is conceptually and algorithmically simpler. The expression for the acceptance rule is more concise than that given in the original papers on delayed rejection. Our framework provides a new perspective on why the rather convoluted acceptance probability formula in [Mira et al., 2001] is necessary.

2. Our framework is more broadly applicable than the delayed rejection method. For example, some MCMC algorithms, such as Hamiltonian Monte Carlo or the bouncy particle sampler methods, use piecewise deterministic kernels to draw proposals. For these methods, applying delayed rejection is not straightforward, whereas our framework can be applied to them.

3. Our framework is more general than the delayed rejection method. As mentioned earlier, when we use random proposal kernels with well-defined densities, the delayed rejection approach is identical to the case where we take $L = 1$ in our framework, in terms of the law of the Markov chain.
