This section presents a sequential Monte Carlo implementation of our non-stationary Gaussian process. The approach relies on particle learning (Lopes et al., 2011), which blends naturally with active learning of the design. The discussion of a two-stage approximation of the proposed emulator is deferred to Section 5.6.
First, we need to identify particles $\{S_t^{(i)}\}_{i=1}^{N}$, which contain all the sufficient information about the uncertainties given data up to time $t$, with $N$ denoting the total number of particles. The sufficient information necessarily depends upon $[(x_1, f(x_1)), \ldots, (x_t, f(x_t))]$¹, thus $\{S_t^{(i)}\}_{i=1}^{N} = \{(Z_{1:t}, K_t, \tilde{K}_t)^{(i)}\}$, with $Z_{1:t} = (Z_1, \ldots, Z_t)^T$. The correlation functions have been indexed by $t$ to stress their dependence on the data collected up to time $t$. Particles contain neither $\beta$ nor $\sigma^2$, as these parameters can be marginalized out within our Bayesian construction (Gramacy and Polson, 2011). Particles are initialized at time $t_0 > p + 1$ with a sample of the unknown parameters from their prior distributions.
¹ For coherence, we remark that one should write $f(x_1, Z_1), \ldots, f(x_t, Z_t)$, since the simulator is regarded as a function of both the known and latent inputs. In the remainder, however, we will write $f(x_1), \ldots, f(x_t)$ to simplify the notation.
The algorithm for updating particles $\{S_t^{(i)}\}_{i=1}^{N}$ to $\{S_{t+1}^{(i)}\}_{i=1}^{N}$ cycles through the following steps:
• Resample: Generate index $\zeta \sim \mathrm{Multinomial}(w, N)$, with
$$w^{(i)} = \frac{\pi(f(x_{t+1}) \mid S_t^{(i)})}{\sum_{j=1}^{N} \pi(f(x_{t+1}) \mid S_t^{(j)})}, \quad i = 1, \ldots, N,$$
where $\pi(f(x_{t+1}) \mid S_t^{(i)}) = \pi(f(x_{t+1}) \mid [x, f(x)]_{1:t}, K_t^{(i)})$ denotes the probability of observing $f(x_{t+1})$ under a Student-t distribution (Gramacy and Polson, 2011).
• Propagate $S_t^{\zeta(i)}$ to $S_{t+1}^{(i)}$: propagate each particle $S_t^{\zeta(i)}$ to account for $[x_{t+1}, f(x_{t+1})]$.
– The first step requires constructing the “propagated” correlation function of the latent GP, which will be used to sample the latent coordinate at the new input $x_{t+1}$. Thus, we build $\tilde{K}_{t+1}^{(i)}$ from $\tilde{K}_t^{(i)}$ and $\tilde{k}_t^{(i)}(x_{t+1}) = \tilde{K}^{(i)}(x_{t+1}, x_j)$, with $j = 1, \ldots, t$, as
$$\tilde{K}_{t+1}^{(i)} = \begin{bmatrix} \tilde{K}_t^{(i)} & \tilde{k}_t^{(i)}(x_{t+1}) \\ \tilde{k}_t^{(i)\top}(x_{t+1}) & \tilde{K}^{(i)}(x_{t+1}, x_{t+1}) \end{bmatrix}.$$
– We obtain $Z_{t+1}^{(i)} = g^{(i)}(x_{t+1})$ from its predictive distribution $g^{(i)}(x_{t+1}) \mid g^{(i)}(x_{1:t}), \tilde{K}_t^{(i)} \sim N(\mu^{(i)}, \tilde{K}^{(i)})$, where the mean and covariance are obtained via standard kriging equations.
– We construct the “propagated” correlation function of $f$. We build $K_{t+1}^{(i)}$ from $K_t^{(i)}$ and $k_t^{(i)}(x_{t+1}) = K^{(i)}(x_{t+1}, x_j)$, $j = 1, \ldots, t$, as
$$K_{t+1}^{(i)} = \begin{bmatrix} K_t^{(i)} & k_t^{(i)}(x_{t+1}) \\ k_t^{(i)\top}(x_{t+1}) & K^{(i)}(x_{t+1}, x_{t+1}) \end{bmatrix}.$$
Notice that the three sub-steps above can be performed in parallel across particles, with considerable gain in terms of computational speed.
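The resample and propagate steps can be illustrated with a minimal Python sketch for a single particle. It is not the authors' prototype. The function names (`resample_indices`, `border_covariance`, `sample_latent`) are hypothetical. The log predictive densities are assumed to be computed elsewhere, and the latent GP is assumed to have zero mean so that the simple kriging equations apply.

```python
import numpy as np


def resample_indices(log_pred_dens, rng):
    """Draw N ancestor indices zeta ~ Multinomial(w, N).

    log_pred_dens[i] is assumed to hold log pi(f(x_{t+1}) | S_t^(i)),
    computed elsewhere (e.g. from a Student-t predictive density).
    """
    log_pred_dens = np.asarray(log_pred_dens, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    w = np.exp(log_pred_dens - log_pred_dens.max())
    w /= w.sum()
    zeta = rng.choice(len(w), size=len(w), replace=True, p=w)
    return zeta, w


def border_covariance(K_t, k_new, k_self):
    """Build K_{t+1} by appending one row and one column to K_t."""
    t = K_t.shape[0]
    K_next = np.empty((t + 1, t + 1))
    K_next[:t, :t] = K_t
    K_next[:t, t] = k_new   # k_t(x_{t+1})
    K_next[t, :t] = k_new   # its transpose
    K_next[t, t] = k_self   # K(x_{t+1}, x_{t+1})
    return K_next


def sample_latent(K_tilde_t, k_tilde_new, k_tilde_self, z_1t, rng):
    """Draw Z_{t+1} from the GP conditional (zero prior mean assumed)."""
    alpha = np.linalg.solve(K_tilde_t, k_tilde_new)
    mu = alpha @ z_1t
    # Clamp tiny negative values caused by round-off.
    var = max(k_tilde_self - alpha @ k_tilde_new, 0.0)
    return mu + np.sqrt(var) * rng.standard_normal()
```

Because each call touches only one particle's state, the propagate functions can be mapped over particles independently, which is what makes the parallelization remark above possible.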
The correlation range parameters and the latent input could be deterministically propagated by copying them from $S_t^{\zeta(i)}$ to $S_{t+1}^{(i)}$, since they do not change in $t$. Although this strategy is fast, it could lead to particle depletion in future resampling steps. To avoid degeneracy, we include a “rejuvenate” step which applies Markov Chain Monte Carlo (MCMC) moves to the particles after the propagating step (Gilks and Berzuini, 2001; Ridgeway and Madigan, 2003). The update is done via elliptical slice sampling (Murray et al., 2010).
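For reference, a single elliptical slice sampling update in the sense of Murray et al. (2010) can be sketched in Python as follows. This is a generic sketch for a vector with a zero-mean Gaussian prior. How the update is wired to the particles' latent inputs and correlation parameters is not specified here, so the function name `elliptical_slice` and the arguments `chol_sigma` and `log_lik` are illustrative assumptions.

```python
import numpy as np


def elliptical_slice(f, chol_sigma, log_lik, rng):
    """One elliptical slice sampling update.

    f          : current state, assumed to have prior N(0, Sigma)
    chol_sigma : lower Cholesky factor of Sigma
    log_lik    : callable returning the log-likelihood of a state
    """
    # Auxiliary draw from the prior defines the ellipse through f.
    nu = chol_sigma @ rng.standard_normal(len(f))
    # Log of the slice height: log L(f) + log(u), with u ~ Uniform(0, 1).
    log_y = log_lik(f) - rng.exponential(1.0)
    # Initial angle and bracket.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    lo, hi = theta - 2.0 * np.pi, theta
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(f_new) > log_y:
            return f_new
        # Shrink the bracket towards theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)
```

The update has no step size to tune and always returns a new state, which is convenient when it has to be applied to many particles at every time step.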
We remark that each particle returns an estimate of the predictive mean surface, $\hat{f}^{(i)}$, and of the predictive standard deviation, $\hat{\sigma}^{(i)}$. Likely, some of these particles will provide higher fidelity surfaces than others. We will take the average of the point-wise predictive distribution for each of the particles, the posterior mean predictive curve, as our prediction of $f$ at new inputs
$$\hat{f} = E(f \mid S^{(i)}) = \frac{1}{N} \sum_{i=1}^{N} \hat{f}^{(i)}, \qquad (5.6)$$
whereas the estimate for the predictive standard deviation is obtained as
$$\hat{\sigma} = \left\{ E\left[ \left( (\hat{\sigma}^{(i)})^2 \right)_{i=1}^{N} \right] + \mathrm{var}\left[ \left( \hat{f}^{(i)} \right)_{i=1}^{N} \right] \right\}^{1/2}. \qquad (5.7)$$
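The aggregation across particles amounts to the law of total variance: the average within-particle variance plus the between-particle variance of the means. A minimal Python sketch follows. The function name `combine_particles` and the array layout are assumptions, and the between-particle variance is taken as the population variance (divisor $N$), which is one of several reasonable choices.

```python
import numpy as np


def combine_particles(f_hat, sigma_hat):
    """Combine per-particle predictions at m input locations.

    f_hat, sigma_hat : arrays of shape (N, m) holding each particle's
    predictive mean and predictive standard deviation.
    """
    f_hat = np.asarray(f_hat, dtype=float)
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    # Posterior mean predictive curve: average of particle means.
    f_bar = f_hat.mean(axis=0)
    # Mean of particle variances plus variance of particle means.
    total_var = (sigma_hat ** 2).mean(axis=0) + f_hat.var(axis=0)
    return f_bar, np.sqrt(total_var)
```

For two particles with means `[[1, 2], [3, 2]]` and unit standard deviations, the combined mean is `[2, 2]` and the combined standard deviation is `[sqrt(2), 1]`: the particles disagree at the first location, and that disagreement inflates the uncertainty there.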
We are currently working on an efficient C++ implementation of the particle learning algorithm to be used in an R package. The prototype R code is available upon request.
Several authors have developed specific criteria for sequentially selecting new input points. For instance, Jones et al. (1998) proposed an expected improvement criterion to estimate the global minimum of a computer simulator via the maximum likelihood estimator for the emulator parameters. Equally popular are the so-called active learning criteria, such as active learning MacKay (MacKay, 1992) and active learning Cohn (Cohn, 1996). Seo et al. (2000) compared the two criteria and observed that active learning Cohn often performs better than active learning MacKay. For example, the active learning MacKay criterion embedded into a stationary Gaussian process emulator favors the selection of new points along the boundary of the input space, because the predictive variance is largest beyond the points that are already in the design (MacKay, 1992). However, the active learning Cohn criterion is more computationally intensive to implement; we therefore adopt active learning MacKay in our numerical examples for computational feasibility.
Active learning MacKay-based selection of future inputs sits comfortably within our particle learning implementation. After particles have been resampled, the algorithm performs prediction at a set of candidate input configurations based on the posterior predictive distribution (see Gramacy and Polson (2011) for more details). Active learning MacKay induces an ordering among candidate points based on their predictive standard deviation, and the point with the largest standard deviation in predicted output is chosen as the next input $x_{t+1}$. Consequently, particles are propagated with the new pair $[x_{t+1}, f(x_{t+1})]$, and the sequence is iterated until some pre-specified stopping criterion is met, e.g. the largest predictive standard deviation falls below a certain threshold or a total number, $T$, of points has been included in the design.
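The selection loop and its two stopping rules can be sketched in Python as below. To keep the example self-contained, the predictive standard deviation comes from a stand-in: a stationary, fixed-parameter GP with a squared exponential correlation (`gp_sd`), not the particle-based non-stationary emulator. With this stand-in the standard deviation does not depend on the simulator output, so no simulator call appears in the loop. All names (`sq_exp`, `gp_sd`, `alm_design`) and the length-scale value are illustrative assumptions.

```python
import numpy as np


def sq_exp(a, b, length=0.2):
    """Squared exponential correlation between 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length ** 2)


def gp_sd(design, cand, nugget=1e-8):
    """Stand-in predictive sd of a unit-variance stationary GP."""
    K = sq_exp(design, design) + nugget * np.eye(len(design))
    k = sq_exp(design, cand)                       # shape (t, m)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k), axis=0)
    return np.sqrt(np.clip(var, 0.0, None))


def alm_design(x_init, cand, T, tol, predict_sd=gp_sd):
    """Add the candidate with largest predictive sd until a stop rule fires."""
    design = np.asarray(x_init, dtype=float)
    while len(design) < T:                 # stop rule 2: budget of T points
        sd = predict_sd(design, cand)
        j = int(np.argmax(sd))
        if sd[j] < tol:                    # stop rule 1: sd below threshold
            break
        design = np.append(design, cand[j])
    return design


cand = np.linspace(0.0, 1.0, 11)
design = alm_design([0.5], cand, T=5, tol=1e-3)
```

Starting from a single design point at 0.5, the first two points added are the endpoints 0 and 1, which illustrates the boundary-seeking behavior of the criterion under a stationary emulator noted above.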