The main difficulty in performing Bayesian inference in biological systems is how to formalise the likelihood, which relies on the calculation of the transition probability and therefore on the solution of the CME. As we mentioned in section 3.9, it is not possible to solve the CME analytically for most biological systems, yet this solution is required to compute the likelihood in Bayesian inference. Several approaches have been proposed to solve the inference problem, such as MCMC and ABC. These approaches are based on statistical simulation. A different approximate inference method, based on variational inference, has also been proposed (Murphy, 2012; Opper and Sanguinetti, 2008). The construction of a variational approximation for CTMCs, in particular, has been illustrated by Opper and Sanguinetti (2008).
The approach relies on picking an approximation q(x) from a family of distributions Q, that is q(x) ∈ Q. It attempts to make the approximation as close as possible to the probability distribution of the system, π(x).
The main advantage of variational inference for Markov processes is speed: it is typically faster than MCMC algorithms. Variational inference can be applied to large data sets of different forms and scenarios, while MCMC can require intensive computation even for small data sets. There is therefore a trade-off between accuracy and speed: MCMC is asymptotically exact, whereas variational inference is suggested when speed is important.
Assume that π(x) is the true posterior distribution we are interested in, but that this distribution is intractable. Assume further that π(x) is approximated by q(x), which is chosen from a tractable family of distributions. The distribution q has free parameters which can be optimised.
The posterior can then be approximated by minimising the KL divergence between two Markov processes, which are defined through their path probabilities π(x) and q(x):

$$\mathrm{KL}(q(x) \,\|\, \pi(x)) = \sum_{x} q(x) \log \frac{q(x)}{\pi(x)},$$

where π represents the posterior of the Markov process and q represents the approximate distribution.
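As a small illustration of the KL divergence above, the following sketch evaluates it for discrete distributions over a handful of states. The function name `kl_divergence` and the toy distributions `pi`, `q_good` and `q_bad` are assumptions made for this example only; they stand in for the path probabilities, which are far larger objects in a real CTMC.

```python
import math


def kl_divergence(q, p):
    """Return KL(q || p) = sum_x q(x) * log(q(x) / p(x)).

    q and p are dicts mapping states to probabilities (toy inputs).
    States with q(x) == 0 contribute zero, by the usual convention.
    """
    total = 0.0
    for x, qx in q.items():
        if qx == 0.0:
            continue
        px = p.get(x, 0.0)
        if px == 0.0:
            # q places mass where p has none, so the divergence is infinite.
            return math.inf
        total += qx * math.log(qx / px)
    return total


# Hypothetical target distribution and two candidate approximations.
pi = {0: 0.5, 1: 0.3, 2: 0.2}
q_good = {0: 0.45, 1: 0.35, 2: 0.20}
q_bad = {0: 0.1, 1: 0.1, 2: 0.8}

kl_good = kl_divergence(q_good, pi)
kl_bad = kl_divergence(q_bad, pi)
print(kl_good, kl_bad)
```

The candidate closer to `pi` gives the smaller divergence, which is the quantity variational inference tries to drive down by adjusting the free parameters of q.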
One common choice of variational family is the mean field approximation, which assumes that the posterior approximation factorises as:

$$q(x) = \prod_{j=1}^{Q} q_j(x_j).$$
Then, the aim is to find the member of that family which minimises the KL divergence to the true distribution,
$$q^{*}(x) = \operatorname*{arg\,min}_{q_1, \dots, q_Q} \mathrm{KL}(q(x) \,\|\, \pi(x)),$$

where each marginal distribution q_j is optimised over its parameters. The true distribution is then approximated by the optimised member q∗(x).
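To make the mean field optimisation concrete, here is a minimal sketch for a toy target over two discrete variables, stored as a strictly positive table. It uses the standard coordinate-wise update, in which each factor is set proportional to the exponential of the expected log target under the other factor. The names `mean_field`, `kl_factorised` and the table `pi_corr` are hypothetical, and this is not the CTMC construction of Opper and Sanguinetti (2008).

```python
import math


def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def kl_factorised(q1, q2, pi):
    """KL(q1 * q2 || pi) for a strictly positive 2-D table pi[i][j]."""
    return sum(
        q1[i] * q2[j] * math.log(q1[i] * q2[j] / pi[i][j])
        for i in range(len(q1))
        for j in range(len(q2))
        if q1[i] * q2[j] > 0.0
    )


def mean_field(pi, n_iters=100):
    """Coordinate-wise minimisation of KL(q1 * q2 || pi).

    Each update sets log q1(i) = E_{q2}[log pi(i, j)] + const,
    and symmetrically for q2, starting from uniform factors.
    """
    n1, n2 = len(pi), len(pi[0])
    q1 = [1.0 / n1] * n1
    q2 = [1.0 / n2] * n2
    for _ in range(n_iters):
        q1 = normalise([
            math.exp(sum(q2[j] * math.log(pi[i][j]) for j in range(n2)))
            for i in range(n1)
        ])
        q2 = normalise([
            math.exp(sum(q1[i] * math.log(pi[i][j]) for i in range(n1)))
            for j in range(n2)
        ])
    return q1, q2


# Hypothetical correlated target: no factorised q can match it exactly.
pi_corr = [[0.5, 0.2], [0.1, 0.2]]
q1_opt, q2_opt = mean_field(pi_corr)
print(q1_opt, q2_opt, kl_factorised(q1_opt, q2_opt, pi_corr))
```

When the target itself factorises, the updates recover its marginals and the divergence is zero. For a correlated target such as `pi_corr`, the optimised divergence stays positive, which reflects the accuracy cost of the mean field assumption.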
More details about this method, how to use it for Bayesian inference, and its applications can be found in Opper and Saad (2001), Opper and Sanguinetti (2008), Cohn et al. (2009) and Murphy (2012).
Pseudo-marginal approaches which rely on random truncation are recently developed methods in statistics (Filippone and Engler, 2015; Lyne et al., 2015). Georgoulas et al. (2017) proposed a novel random truncation method for unbounded state spaces that ensures unbiasedness of the result, as detailed in section 3.8.
In addition, Boys et al. (2008) evaluate various MCMC methods in different data scenarios, applied to the Lotka-Volterra system. The paper shows how inference can be made given complete, regular and partially observed data sets. According to that investigation, the partially observed case was more challenging than the other scenarios.
3.10 Summary
In this chapter, we provided an overview of Monte Carlo methods, which use a set of samples to approximate the target distribution. We began with rejection sampling and importance sampling. These methods can be inefficient for complex models because they require a careful selection of a suitable proposal distribution. In contrast, SMC and MCMC can deal more efficiently with complex and high-dimensional problems. We have discussed these methods as a means of estimating a generally complicated target distribution.
In addition, these algorithms were described with reference to Bayesian inference in Markovian models. MCMC, and in particular the MH algorithm, was presented with an efficient adaptive scheme, based on the evolution of an unknown covariance matrix through the iterations. The main obstacle to the direct application of the standard MH algorithm is that its performance relies mainly on the evaluation of the acceptance probability, which itself requires an explicit evaluation of the likelihood.
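The dependence of the MH acceptance probability on an explicit likelihood can be seen in a minimal random-walk sketch. The toy model is an assumption chosen because its likelihood is available in closed form: Poisson counts with unknown rate theta and a Gamma(a, b) prior, so that the exact posterior is Gamma(a + sum(y), b + n). The function names, data values and tuning constants are all hypothetical, and the adaptive covariance scheme discussed in the chapter is not included.

```python
import math
import random


def log_posterior(theta, data, a=2.0, b=1.0):
    """Unnormalised log posterior: Gamma(a, b) prior times Poisson likelihood."""
    if theta <= 0.0:
        return -math.inf
    log_prior = (a - 1.0) * math.log(theta) - b * theta
    # The explicit likelihood evaluation that MH cannot do without.
    log_lik = sum(y * math.log(theta) - theta - math.lgamma(y + 1) for y in data)
    return log_prior + log_lik


def metropolis_hastings(data, n_iters=20000, step=0.5, theta0=1.0, seed=0):
    """Random-walk MH with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = theta0
    lp = log_posterior(theta, data)
    samples = []
    for _ in range(n_iters):
        proposal = theta + rng.gauss(0.0, step)
        lp_prop = log_posterior(proposal, data)
        # Acceptance probability min(1, pi(proposal) / pi(theta)).
        accept_prob = math.exp(min(0.0, lp_prop - lp))
        if rng.random() < accept_prob:
            theta, lp = proposal, lp_prop
        samples.append(theta)
    return samples


# Hypothetical count data.
data = [3, 5, 4, 6, 2, 4, 5, 3]
samples = metropolis_hastings(data)
burn_in = 2000
post_mean = sum(samples[burn_in:]) / len(samples[burn_in:])

# Exact posterior mean of Gamma(a + sum(y), b + n) with a = 2, b = 1.
exact_mean = (2.0 + sum(data)) / (1.0 + len(data))
print(post_mean, exact_mean)
```

Every iteration calls `log_posterior`, and hence the likelihood. When the likelihood has no closed form, as for most CTMC models, this call is the step that has to be replaced by an estimate or avoided altogether.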
likelihood is estimated with SMC. The challenge is to formulate this approach for models involving latent variables, where its computational complexity can limit its applicability. A recent effort on inference in models with intractable likelihoods resulted in an exact inference method, the pseudo-marginal Gibbs approach, which reduces the computational cost. The main advantage of the Gibbs sampler is that there is no need to select a proposal distribution, whereas the efficiency of PMMH relies on the choice of the proposal distribution.
Chapter 4
Approximate Bayesian Computation
4.1 Introduction
One of our main goals is to enable Bayesian parameter inference for CTMC models. The main problem in performing such inference lies in the lack of a closed form expression for the likelihood π(y|θ), because it depends on the transient probability of the latent states of the CTMC. The state space of a CTMC model may be very large, and working out the transient probability of an individual state will then be infeasible, if not impossible. A range of approximate inference methods exist that do not require an explicit evaluation of the likelihood and employ simulation from the model instead. We consider a family of likelihood-free inference methods called Approximate Bayesian Computation (ABC) methods. In this section, we review the state of the art in ABC.
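The likelihood-free idea can be sketched with a basic ABC rejection sampler: draw θ from the prior, simulate data from the model, and keep θ when a summary of the simulated data falls within a tolerance of the observed summary. Everything specific here is an assumption for illustration: a Poisson model stands in for the CTMC simulator, the prior is Uniform(0, 10), the summary statistic is the sample mean, and the names `sample_poisson` and `abc_rejection` and the tolerance `eps` are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= threshold:
            return k - 1


def abc_rejection(observed, n_draws=5000, eps=0.25, seed=1):
    """Basic ABC rejection sampler; the likelihood is never evaluated."""
    rng = random.Random(seed)
    obs_summary = sum(observed) / len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)  # draw from the assumed prior
        simulated = [sample_poisson(rng, theta) for _ in observed]
        sim_summary = sum(simulated) / len(simulated)
        # Keep theta if the simulated summary is close to the observed one.
        if abs(sim_summary - obs_summary) <= eps:
            accepted.append(theta)
    return accepted


# Hypothetical observed counts.
observed = [3, 5, 4, 6, 2, 4, 5, 3]
accepted = abc_rejection(observed)
abc_mean = sum(accepted) / len(accepted)
print(len(accepted), abc_mean)
```

The accepted values approximate the posterior of θ given that the summaries agree to within `eps`. A smaller tolerance gives a closer approximation at the price of fewer accepted draws, which is the central tuning trade-off in ABC.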
In an attempt to provide the reader with a full understanding of ABC, we begin with a brief overview of the methodology that was developed for a popular class of Bayesian inference algorithms and that demonstrated how this challenging problem could be solved.