

As outlined above, Bayesian inference rests on the specification of a prior distribution $Q$ encoding beliefs about the parameter set $\Theta$. The nonparametric development is to allow $\Theta$ to be infinite dimensional, yielding greater modelling flexibility at the cost of greater analytical and computational challenges. Typical nonparametric model sets include spaces of functions (e.g. $C_b^2(\Omega)$ for some $\Omega \subseteq \mathbb{R}^d$) or measures (e.g. $\mathcal{M}_1([0,1])$). I will abuse terminology and refer to elements of such spaces as “parameters”, with the understanding that they may be infinite dimensional.

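Given $n \in \mathbb{N}$ data points $x_{1:n}$, the central object of Bayesian inference is the posterior distribution, defined for sets $A \subset \Theta$ as

$$Q(A \mid x_{1:n}) := \frac{\int_A P_\theta(x_{1:n}) \, Q(d\theta)}{\int_\Theta P_\theta(x_{1:n}) \, Q(d\theta)}. \tag{1.16}$$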

It is well known that in the Bayesian setting the posterior contains all the information about the parameter carried by the data. In the nonparametric setting neither existence nor uniqueness of the posterior is guaranteed. When a unique posterior does exist, it is given by (1.16). I will neglect issues of existence and uniqueness, and simply assume that a unique posterior exists.
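As a minimal sketch of how (1.16) is evaluated, consider the simplest possible case of a finite parameter set, where both integrals reduce to sums. The grid of Bernoulli success probabilities, the uniform prior weights, the data and the function names below are all illustrative assumptions, not part of the development above.

```python
def posterior_mass(thetas, prior_weights, likelihood, data, event):
    """Discrete analogue of (1.16): posterior mass of {theta : event(theta)}."""
    # Numerator and denominator integrands, P_theta(x_{1:n}) Q(theta).
    joint = [likelihood(theta, data) * w for theta, w in zip(thetas, prior_weights)]
    evidence = sum(joint)  # integral over the whole of Theta
    in_event = sum(j for theta, j in zip(thetas, joint) if event(theta))
    return in_event / evidence


def bernoulli_likelihood(theta, data):
    """P_theta(x_{1:n}) for IID Bernoulli(theta) observations coded as 0/1."""
    successes = sum(data)
    return theta ** successes * (1.0 - theta) ** (len(data) - successes)


thetas = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical finite parameter set
prior_weights = [0.2] * 5            # uniform prior Q on the grid
data = [1, 1, 0, 1, 1, 1, 0, 1]      # made-up observations

mass_above_half = posterior_mass(
    thetas, prior_weights, bernoulli_likelihood, data, lambda t: t > 0.5
)
print(mass_above_half)
```

Taking the event to be the whole parameter set returns a mass of one, which is a quick sanity check on the normalisation.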

The next consideration is computing the posterior. This can be done analytically for so-called conjugate pairs of priors and likelihoods. These are pairs for which the prior and posterior both belong to the same family with altered parameters. Examples in the parametric case include beta priors and binomial likelihoods for estimating the binomial success probability, Gaussian priors and likelihoods for estimating the mean, and Gamma priors with Poisson likelihoods for estimating the rate. A nonparametric example is the Dirichlet process prior [Ferguson, 1973] with IID observations for estimating the unknown sampling distribution. Conjugate families impose very restrictive assumptions on the prior, and are often not available, most prominently whenever the likelihood is intractable. When conjugate priors are not available or appropriate, the posterior typically has to be approximated numerically, using Monte Carlo or other methods. This setting is the focus of this thesis.
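The beta-binomial pair mentioned above can be sketched in a few lines: the posterior is again a beta distribution, with the observed successes and failures added to the prior parameters. The class name, the uniform prior and the counts are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(a, b) distribution over a binomial success probability."""
    a: float
    b: float

    def update(self, successes: int, trials: int) -> "Beta":
        # Conjugacy: a Beta(a, b) prior combined with a binomial likelihood
        # gives a Beta(a + successes, b + failures) posterior.
        return Beta(self.a + successes, self.b + (trials - successes))

    def mean(self) -> float:
        return self.a / (self.a + self.b)


prior = Beta(1.0, 1.0)                             # uniform prior, chosen for illustration
posterior = prior.update(successes=7, trials=10)   # made-up data
print(posterior, posterior.mean())
```

No integration is needed at any point, which is exactly what is lost when the likelihood is intractable or the prior falls outside a conjugate family.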

In addition to reflecting prior information, the prior distribution $Q$ can be seen as specifying a model for learning parameters from data. From this perspective, it makes sense to ask that $Q(\cdot \mid x_{1:n})$ concentrates on the “true”, data generating parameter $\theta_0$ as $n$ increases, reflecting the potential to learn the true model from a sufficient amount of data. This property is known as posterior consistency, which can be stated more formally as

$$\lim_{n \to \infty} Q(\{\theta \in \Theta : \|\theta - \theta_0\| > \varepsilon\} \mid x_{1:n}) = 0$$

for any $\varepsilon > 0$, with an appropriate choice of norm $\|\cdot\|$ (or, more generally, topology) on $\Theta$ and mode of convergence. In the nonparametric setting, posterior consistency is an intricate property which depends in subtle ways on $\Theta$ and $Q$. However, it is also regarded as a minimal requirement for well justified Bayesian inference [Diaconis and Freedman, 1986]. The typical properties required of the prior and the parameter space for posterior consistency to hold are a prior mass condition, i.e. $Q$ must place sufficient mass in a neighbourhood of $\theta_0$ (and, in particular, not exclude it), and regularity conditions to suitably limit the “size” of $\Theta$.
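Posterior consistency is easiest to see in a parametric toy problem. The sketch below simulates Bernoulli data with a hypothetical true parameter, uses a uniform Beta(1, 1) prior, and numerically integrates the posterior mass lying further than a fixed radius from the truth. The true value, radius, seed, sample sizes and grid resolution are all assumptions made for illustration, and the example says nothing about the nonparametric case.

```python
import math
import random


def beta_log_pdf(theta, a, b):
    """Log density of Beta(a, b) at theta in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(theta) + (b - 1.0) * math.log1p(-theta)


def mass_outside_ball(a, b, theta0, eps, grid=20000):
    """Midpoint-rule estimate of Q(|theta - theta0| > eps | data) under Beta(a, b)."""
    h = 1.0 / grid
    total = 0.0
    for j in range(grid):
        theta = (j + 0.5) * h
        if abs(theta - theta0) > eps:
            total += math.exp(beta_log_pdf(theta, a, b)) * h
    return total


theta0, eps = 0.3, 0.05          # hypothetical truth and neighbourhood radius
rng = random.Random(0)           # fixed seed so the run is reproducible
flips = [rng.random() < theta0 for _ in range(5000)]

masses = {}
for n in (20, 500, 5000):
    s = sum(flips[:n])           # successes among the first n observations
    # Beta(1, 1) prior, so the posterior is Beta(1 + s, 1 + n - s).
    masses[n] = mass_outside_ball(1.0 + s, 1.0 + n - s, theta0, eps)
    print(n, masses[n])
```

The mass outside the neighbourhood should shrink towards zero as more observations are used, which is the behaviour the definition asks for.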

Stronger, and more analytically demanding, forms of posterior asymptotics consist of identification of contraction rates for consistent posteriors, and ultimately Bernstein-von Mises theorems:

$$\sup_{A \in \mathcal{B}(\Theta)} \left| Q(A \mid x_{1:n}) - \mu_{(\hat{\theta}, \Sigma)}(A) \right| \to 0,$$

where $\mu$ is a Gaussian measure on $\Theta$ (see Section 6.3 of [Dashti and Stuart, 2016] for an overview of Gaussian measures on infinite dimensional spaces), $\hat{\theta}$ is an efficient estimator of the posterior mean and $\Sigma$ is the posterior covariance. Contraction rates are typically established by constructing hypothesis tests with exponentially small error probability [Schwartz, 1965], for example in [Ghosal et al., 2000; Ghosal and van der Vaart, 2007; Gugushvili et al., 2015; Nickl and Söhl, 2015] in various nonparametric settings. The main drawback of this approach is the need to be able to construct exponentially consistent tests, which is rarely possible when the likelihood is intractable.
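For intuition only, the supremum over Borel sets of the difference between two measures with densities equals half the $L^1$ distance between those densities, so the Bernstein-von Mises statement can be checked numerically in a one-dimensional parametric toy model. The sketch below compares a beta posterior with a Gaussian matched to its mean and variance, which is one simple choice of centring and covariance. The idealised success counts and the grid resolution are assumptions, and the Gaussian mass outside $(0, 1)$ is ignored.

```python
import math


def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(theta) + (b - 1.0) * math.log1p(-theta))


def normal_pdf(theta, mean, sd):
    z = (theta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def tv_to_gaussian(a, b, grid=20000):
    """Midpoint-rule estimate of the total variation distance, restricted to (0, 1)."""
    mean = a / (a + b)                                        # posterior mean
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0)))    # posterior standard deviation
    h = 1.0 / grid
    return 0.5 * sum(
        abs(beta_pdf((j + 0.5) * h, a, b) - normal_pdf((j + 0.5) * h, mean, sd)) * h
        for j in range(grid)
    )


tv = {}
for n in (10, 100, 10000):
    s = (3 * n) // 10            # idealised data: exactly 30% successes
    # Beta(1, 1) prior, so the posterior is Beta(1 + s, 1 + n - s).
    tv[n] = tv_to_gaussian(1.0 + s, 1.0 + n - s)
    print(n, tv[n])
```

The distance should decrease as the sample size grows, reflecting the increasingly Gaussian shape of the posterior.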

The Bernstein-von Mises theorem bridges the Bayesian and frequentist worlds by enabling the computation of asymptotic, frequentist estimators and confidence regions from the posterior. The earliest proof in a parametric setting was published by Doob [1949], and the modern form for parametric statistics was developed by Le Cam [1986]. Like contraction rates, nonparametric versions of Bernstein-von Mises theorems are an emerging and challenging area of research [Castillo and Nickl, 2013, 2014]. In this thesis I focus on establishing posterior consistency, and neglect more advanced notions of posterior contraction.

I will conclude this section by presenting an example nonparametric prior. Analytic formulae are rarely available in infinite dimensional spaces, and priors are typically specified by providing a sampling algorithm instead. Perhaps the most famous nonparametric prior is the Dirichlet process [Ferguson, 1973], the support of which consists of a.s. discrete probability measures on a general space $\Omega$. Let $\zeta$ be a probability measure on $\Omega$ and let $\alpha > 0$. The following sampling algorithm is due to Sethuraman [1994], and is called the stick-breaking construction:

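• Sample $\{z_i\}_{i \in \mathbb{N}} \overset{\text{IID}}{\sim} \zeta$.

• Sample $\{\tilde{\beta}_i\}_{i \in \mathbb{N}} \overset{\text{IID}}{\sim} \mathrm{Beta}(1, \alpha)$.

• For each $i \in \mathbb{N}$ set $\beta_i = \prod_{k=1}^{i-1} (1 - \tilde{\beta}_k) \, \tilde{\beta}_i$, with the convention $\prod_{k=1}^{0} = 1$.

• A draw from the Dirichlet process is given by $\sum_{i=1}^{\infty} \beta_i \delta_{z_i}(\cdot)$.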
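The steps above can be sketched with a finite truncation level, which replaces the infinite sum by its first terms and discards the leftover stick. This is a minimal illustration using only the Python standard library. The function name, the standard Gaussian base measure, the concentration value and the truncation level are all assumptions chosen for the example.

```python
import random


def stick_breaking(alpha, base_sampler, truncation, rng):
    """Truncated stick-breaking draw from a Dirichlet process.

    Returns the atoms z_i, their weights beta_i, and the leftover stick
    length, i.e. the mass that the truncation discards.
    """
    atoms, weights = [], []
    remaining = 1.0  # prod_{k<i} (1 - beta_tilde_k); the empty product is 1
    for _ in range(truncation):
        beta_tilde = rng.betavariate(1.0, alpha)   # beta_tilde_i ~ Beta(1, alpha)
        atoms.append(base_sampler(rng))            # z_i ~ zeta
        weights.append(remaining * beta_tilde)     # beta_i
        remaining *= 1.0 - beta_tilde              # stick left for later atoms
    return atoms, weights, remaining


rng = random.Random(1)
atoms, weights, leftover = stick_breaking(
    alpha=2.0,                                     # hypothetical concentration
    base_sampler=lambda r: r.gauss(0.0, 1.0),      # zeta = N(0, 1), illustrative
    truncation=200,
    rng=rng,
)
print(sum(weights), leftover)
```

Since each $1 - \tilde{\beta}_k$ has mean $\alpha / (1 + \alpha)$ and the $\tilde{\beta}_k$ are independent, the expected leftover mass after $N$ atoms is $(\alpha / (1 + \alpha))^N$, so the discarded mass decays geometrically in the truncation level.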

The stick-breaking construction can be used to sample Dirichlet processes in practice using truncation with exponentially small error [Ishwaran and James, 2001], and an exact algorithm is available for sampling from measures generated by Dirichlet process priors [Papaspiliopoulos and Roberts, 2008]. A prior placing full mass on absolutely continuous densities can be obtained by using a Dirichlet process as a mixing measure for suitable kernels [Lo, 1984].

For a further overview of Bayesian nonparametric statistics, the interested reader is directed to [Hjort et al., 2010], and references therein.
