As explained in Remarks 1 and 2, rather than emulating the outputs in $\mathcal{M}$ directly, multivariate GP priors are placed over the reduced-dimensional representations in $\mathcal{F}_r$ or $\mathcal{D}_r^{(t)}$. In the actual approach described in Section 5.4, univariate GP priors are placed over the individual coefficients $r_i(\cdot)$ or $z_i(\cdot)$ and these coefficients are emulated separately. In this section, scalar GPE is therefore outlined.

A scalar-valued simulator is a function $\eta : \mathcal{X} \to \mathbb{R}$ of inputs $\boldsymbol{\xi} \in \mathcal{X} \subset \mathbb{R}^l$. In univariate GPE, a GP prior indexed by $\boldsymbol{\xi} \in \mathcal{X}$ is placed over $\eta(\boldsymbol{\xi})$ and the emulator is trained using simulator outputs $\eta(\boldsymbol{\xi}^{(i)})$ at design points $\boldsymbol{\xi}^{(i)}$. The notation $\mathbf{t} = (\eta(\boldsymbol{\xi}^{(1)}), \dots, \eta(\boldsymbol{\xi}^{(m)}))^T$ is used. The prior is $\eta(\boldsymbol{\xi}) \mid \boldsymbol{\theta} \sim \mathcal{GP}(m(\boldsymbol{\xi}), c(\boldsymbol{\xi}, \boldsymbol{\xi}'))$, where $\mathcal{GP}(m(\cdot), c(\cdot,\cdot))$ represents a GP with mean and covariance functions $m(\cdot)$ and $c(\cdot,\cdot)$, respectively. The most common choices for the mean function are a linear function or a constant. In this work, $m(\cdot) \equiv 0$ was assumed by centering the data. $\boldsymbol{\theta}$ is a vector of hyperparameters (e.g., parameters in the covariance function) that are typically unknown a priori.

Remark 3. A GP noise term can be added to the model, in which case $\eta(\boldsymbol{\xi})$ is a latent function while the simulator outputs are the observables: $t(\boldsymbol{\xi}) = \eta(\boldsymbol{\xi}) + \epsilon(\boldsymbol{\xi})$, in which $\epsilon(\boldsymbol{\xi}) \sim \mathcal{GP}(0, \sigma_n^2 \delta(\boldsymbol{\xi}, \boldsymbol{\xi}'))$, where $\delta(\cdot,\cdot)$ is the Kronecker delta.

The noise can represent modelling or simulation errors or can be included for numerical stability. It can be included directly as an additional term in the covariance function $c(\boldsymbol{\xi}, \boldsymbol{\xi}')$ (a so-called 'jitter' or 'nugget' [159]), which leads to the same result for GP priors over the noise and latent function.

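A squared exponential covariance function is used:

$$c(\boldsymbol{\xi}, \boldsymbol{\xi}') = \theta_0 \exp\left(-(\boldsymbol{\xi} - \boldsymbol{\xi}')^T \mathrm{diag}(\theta_1, \dots, \theta_l)(\boldsymbol{\xi} - \boldsymbol{\xi}')\right) + \sigma_n^2 \delta(\boldsymbol{\xi}, \boldsymbol{\xi}'), \qquad (5.16)$$

where the last term is the jitter, and $\boldsymbol{\theta} = (\theta_0, \dots, \theta_l, \sigma_n^2)^T$. The parameters $\theta_1, \dots, \theta_l$ are the inverse square correlation lengths. Alternatives to Eq. (5.16) include the Matérn class of functions and piecewise polynomials, which are also stationary [58].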
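As a minimal sketch of Eq. (5.16), the covariance matrix between two sets of inputs could be assembled as follows. NumPy and the function name `sq_exp_cov` are assumptions made for illustration and do not appear in the original text. The Kronecker delta is interpreted here as an exact match between input vectors.

```python
import numpy as np

def sq_exp_cov(X1, X2, theta0, theta, sigma_n2=0.0):
    """Squared exponential covariance with jitter, following Eq. (5.16).

    X1, X2   : arrays of shape (n1, l) and (n2, l) holding input vectors.
    theta0   : prefactor of the exponential.
    theta    : length-l array of inverse square correlation lengths.
    sigma_n2 : jitter variance, applied only where two inputs coincide.
    """
    X1 = np.atleast_2d(np.asarray(X1, dtype=float))
    X2 = np.atleast_2d(np.asarray(X2, dtype=float))
    theta = np.asarray(theta, dtype=float)
    diff = X1[:, None, :] - X2[None, :, :]                 # shape (n1, n2, l)
    quad = np.einsum("ijk,k,ijk->ij", diff, theta, diff)   # weighted squared distance
    delta = np.all(diff == 0.0, axis=-1)                   # Kronecker delta on inputs
    return theta0 * np.exp(-quad) + sigma_n2 * delta

# Illustrative inputs (hypothetical values, not taken from the original).
X = np.array([[0.0, 0.0], [1.0, 2.0]])
C = sq_exp_cov(X, X, theta0=2.0, theta=[1.0, 0.5], sigma_n2=0.1)
```

With the jitter included, the diagonal of `C` equals `theta0 + sigma_n2`, which is what keeps the matrix well conditioned when design points lie close together.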

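The conditional predictive distribution at new inputs $\boldsymbol{\xi}$ is obtained in a straightforward manner from the joint distribution $p(\eta(\boldsymbol{\xi}), \mathbf{t} \mid \boldsymbol{\theta})$ [58]:

$$\eta(\cdot) \mid \mathbf{t}, \boldsymbol{\theta} \sim \mathcal{GP}(m'(\cdot; \boldsymbol{\theta}), \nu'(\cdot, \cdot; \boldsymbol{\theta})),$$
$$m'(\boldsymbol{\xi}; \boldsymbol{\theta}) = \mathbf{c}(\boldsymbol{\xi})^T \mathbf{C}^{-1} \mathbf{t}, \qquad \nu'(\boldsymbol{\xi}, \boldsymbol{\xi}'; \boldsymbol{\theta}) = c(\boldsymbol{\xi}, \boldsymbol{\xi}') - \mathbf{c}(\boldsymbol{\xi})^T \mathbf{C}^{-1} \mathbf{c}(\boldsymbol{\xi}'), \qquad (5.17)$$

where $\mathbf{C} = [C_{ij}]$ is the covariance matrix with entries $C_{ij} = c(\boldsymbol{\xi}^{(i)}, \boldsymbol{\xi}^{(j)})$, $i, j = 1, \dots, m$, and $\mathbf{c}(\boldsymbol{\xi}) = (c(\boldsymbol{\xi}^{(1)}, \boldsymbol{\xi}), \dots, c(\boldsymbol{\xi}^{(m)}, \boldsymbol{\xi}))^T$.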
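The predictive mean and covariance of Eq. (5.17) can be sketched as below. This is an illustration under stated assumptions rather than the implementation used in this work: NumPy is assumed, the names `sq_exp_cov` and `gp_predict` are hypothetical, the hyperparameters are fixed by hand, and the jitter is added only to the training matrix $\mathbf{C}$, so the returned covariance refers to the latent function. A Cholesky factorisation is used in place of an explicit inverse.

```python
import numpy as np

def sq_exp_cov(A, B, theta0, theta):
    """Noise-free part of Eq. (5.16) between rows of A (n1, l) and B (n2, l)."""
    diff = A[:, None, :] - B[None, :, :]
    return theta0 * np.exp(-np.einsum("ijk,k,ijk->ij", diff, theta, diff))

def gp_predict(X_train, t, X_new, theta0, theta, sigma_n2=1e-8):
    """Predictive mean m' and covariance nu' of Eq. (5.17) at the rows of X_new."""
    m = X_train.shape[0]
    C = sq_exp_cov(X_train, X_train, theta0, theta) + sigma_n2 * np.eye(m)
    c_new = sq_exp_cov(X_train, X_new, theta0, theta)     # column k is c(xi_k)
    L = np.linalg.cholesky(C)                             # C = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t))   # C^{-1} t
    mean = c_new.T @ alpha                                # c(xi)^T C^{-1} t
    V = np.linalg.solve(L, c_new)                         # L^{-1} c(xi)
    cov = sq_exp_cov(X_new, X_new, theta0, theta) - V.T @ V
    return mean, cov

# Illustrative data (not from the original): a centred 1-D test function.
X_train = np.linspace(0.0, 3.0, 8)[:, None]
t = np.sin(X_train[:, 0])
t = t - t.mean()
theta = np.array([4.0])
mean_train, cov_train = gp_predict(X_train, t, X_train, 1.0, theta)
mean_far, cov_far = gp_predict(X_train, t, np.array([[100.0]]), 1.0, theta)
```

At the design points the mean reproduces the training outputs up to the jitter, while far from the data the prediction reverts to the zero prior mean with prior variance $\theta_0$.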

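The hyperparameters $\boldsymbol{\theta}$ are unknown. Point estimates [19, 160] such as the maximum likelihood estimate (MLE) are employed in most cases; that is, the predictive distribution is given by Eq. (5.17) using the MLE. The MLE is given by $\arg\max_{\boldsymbol{\theta}} R(\boldsymbol{\theta})$, where $R(\boldsymbol{\theta}) = \log p(\mathbf{t} \mid \boldsymbol{\theta})$ is the log likelihood:

$$R(\boldsymbol{\theta}) = -\frac{1}{2} \ln|\mathbf{C}| - \frac{1}{2} \mathbf{t}^T \mathbf{C}^{-1} \mathbf{t} - \frac{m}{2} \ln(2\pi). \qquad (5.18)$$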
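The log likelihood of Eq. (5.18) can be evaluated stably through a Cholesky factor of $\mathbf{C}$. The sketch below assumes NumPy, uses the hypothetical names `sq_exp_cov` and `log_likelihood`, and replaces a proper optimiser with a coarse grid search over a single inverse square correlation length, with $\theta_0$ and the jitter held fixed. That simplification is for illustration only; in practice a gradient-based optimiser over all hyperparameters would normally be used.

```python
import numpy as np

def sq_exp_cov(A, B, theta0, theta):
    """Noise-free part of Eq. (5.16) between rows of A (n1, l) and B (n2, l)."""
    diff = A[:, None, :] - B[None, :, :]
    return theta0 * np.exp(-np.einsum("ijk,k,ijk->ij", diff, theta, diff))

def log_likelihood(X, t, theta0, theta, sigma_n2):
    """Log likelihood of Eq. (5.18) for a zero-mean GP."""
    m = len(t)
    C = sq_exp_cov(X, X, theta0, theta) + sigma_n2 * np.eye(m)
    L = np.linalg.cholesky(C)                             # C = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t))   # C^{-1} t
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # ln|C|
    return -0.5 * log_det - 0.5 * (t @ alpha) - 0.5 * m * np.log(2.0 * np.pi)

# Illustrative data (not from the original): a centred 1-D test function.
X = np.linspace(0.0, 1.0, 10)[:, None]
t = np.sin(2.0 * np.pi * X[:, 0])
t = t - t.mean()

# Coarse grid search over theta_1; theta_0 = 1 and the jitter are held fixed.
grid = np.logspace(-1, 3, 17)
scores = np.array([log_likelihood(X, t, 1.0, np.array([th]), 1e-6) for th in grid])
theta_mle = grid[np.argmax(scores)]
```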

In a Bayesian inference approach, predictions at a new input $\boldsymbol{\xi}$ are made by integrating over $\boldsymbol{\theta}$ in the joint distribution of $\boldsymbol{\theta}$ and $\eta(\boldsymbol{\xi})$ given $\mathbf{t}$ (the posterior predictive distribution). The integral is analytically intractable but can be approximated using Monte Carlo integration, e.g., importance sampling, or Markov chain Monte Carlo [161] to sample from the posterior over the hyperparameters $p(\boldsymbol{\theta} \mid \mathbf{t})$.
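One way to approximate the posterior predictive mean is self-normalised importance sampling with the prior as the proposal, so that each weight is proportional to the likelihood $p(\mathbf{t} \mid \boldsymbol{\theta})$. The sketch below is a rough illustration only. It assumes NumPy, a log-uniform prior on each inverse square correlation length, $\theta_0 = 1$ and a fixed jitter; none of these choices come from the original text, and the function names are hypothetical.

```python
import numpy as np

def sq_exp_cov(A, B, theta):
    """Noise-free squared exponential covariance with theta_0 = 1."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.einsum("ijk,k,ijk->ij", diff, theta, diff))

def fit_terms(X, t, theta, sigma_n2):
    """Return log p(t | theta) from Eq. (5.18) and the vector C^{-1} t."""
    m = len(t)
    L = np.linalg.cholesky(sq_exp_cov(X, X, theta) + sigma_n2 * np.eye(m))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t))
    ll = -np.sum(np.log(np.diag(L))) - 0.5 * (t @ alpha) - 0.5 * m * np.log(2.0 * np.pi)
    return ll, alpha

def posterior_predictive_mean(X, t, X_new, n_samples=200, sigma_n2=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # Proposal = assumed prior: log10(theta_i) uniform on [-1, 2].
    thetas = 10.0 ** rng.uniform(-1.0, 2.0, size=(n_samples, X.shape[1]))
    log_w = np.empty(n_samples)
    means = np.empty((n_samples, X_new.shape[0]))
    for k, theta in enumerate(thetas):
        log_w[k], alpha = fit_terms(X, t, theta, sigma_n2)
        means[k] = sq_exp_cov(X_new, X, theta) @ alpha    # m'(xi; theta_k), Eq. (5.17)
    w = np.exp(log_w - log_w.max())                       # subtract max to avoid underflow
    w /= w.sum()                                          # self-normalised weights
    return w @ means, w

# Illustrative data (not from the original): a centred 1-D test function.
X = np.linspace(0.0, 1.0, 10)[:, None]
t = np.sin(2.0 * np.pi * X[:, 0])
t = t - t.mean()
mean, weights = posterior_predictive_mean(X, t, X)
```

Importance sampling from the prior degrades quickly as the number of hyperparameters grows, which is one reason Markov chain Monte Carlo is often preferred in higher dimensions.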

5.4 Multi-output emulation using manifold learning

The problem of emulating $\boldsymbol{\eta} : \mathcal{X} \to \mathcal{M}$ has been replaced with the problem of emulating the map $\mathbf{z}_r(\boldsymbol{\eta}(\cdot))$ defined by Eq. (2.8) or the diffusion map of $\boldsymbol{\eta}(\cdot)$ defined by Eq. (5.15). Multivariate GP priors are placed over these maps, with training points for emulation given by Algorithms 1 and 2 for kPCA and diffusion maps, respectively. These multivariate GP priors take a particularly convenient form by assuming independence of the coordinates, as explained below.

The kPCA coefficients $z_i(\boldsymbol{\xi})$, $i = 1, \dots, r$, are mutually uncorrelated; following Higdon et al. [43] (see also the wavelet decomposition approach in [44]), the approximation is therefore made that they arise from independent GPs. The diffusion map coefficients $\lambda_i r_i(\boldsymbol{\xi})$, $i = 1, \dots, r$, on the other hand, are not uncorrelated. As a simplification, however, the underlying GPs are treated as independent (see Remark 4). For both manifold learning methods, univariate GPE is then performed separately on each coefficient to approximate its value for a new input $\boldsymbol{\xi}$. The process is summarized below for each case, making clear the link between the notation of Sections 5.2 and 5.3.

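1. kPCA: For a fixed $i = 1, \dots, r$, $\eta(\boldsymbol{\xi}) = z_i(\boldsymbol{\xi})$ is set. The training points are given by Eq. (2.6): $\eta(\boldsymbol{\xi}^{(j)}) = z_i(\boldsymbol{\xi}^{(j)}) = \widetilde{\boldsymbol{\alpha}}_i^T \mathbf{H}(\mathbf{k}_j - \mathbf{K}\mathbf{1})$, $j = 1, \dots, m$. Recall that $z_i(\boldsymbol{\xi}^{(j)}) = z_i(\boldsymbol{\eta}(\boldsymbol{\xi}^{(j)})) = z_i(\mathbf{y}^{(j)})$. The expected (mean) value at an input $\boldsymbol{\xi}$, given by Eq. (5.17), yields a prediction that is denoted $z_i(\boldsymbol{\xi})$ (to avoid introducing new notation, no distinction is made between $z_i(\boldsymbol{\xi})$ and $E[z_i(\boldsymbol{\xi})]$). Set $\mathbf{z}_r(\boldsymbol{\eta}(\boldsymbol{\xi})) = (z_1(\boldsymbol{\xi}), \dots, z_r(\boldsymbol{\xi}))^T$. Again, this is the expected value $E[\mathbf{z}_r(\boldsymbol{\eta}(\boldsymbol{\xi}))]$.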

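2. Diffusion maps: For a fixed $i = 1, \dots, r$, $\eta(\boldsymbol{\xi}) = r_i(\boldsymbol{\xi})$ is set. The training points are given by Eq. (5.14): $\eta(\boldsymbol{\xi}^{(j)}) = r_i(\boldsymbol{\xi}^{(j)}) = r_{ji}$, $j = 1, \dots, m$. Recall that $r_i(\boldsymbol{\xi}^{(j)}) = r_i(\boldsymbol{\eta}(\boldsymbol{\xi}^{(j)})) = r_i(\mathbf{y}^{(j)})$. For a new input $\boldsymbol{\xi}$, Eq. (5.17) yields $E[r_i(\boldsymbol{\xi})]$, denoted simply as $r_i(\boldsymbol{\xi})$. One then obtains (the expected value of) the diffusion map coordinates $((\lambda_1')^t r_1(\boldsymbol{\xi}), \dots, (\lambda_r')^t r_r(\boldsymbol{\xi}))^T$, which approximates $(\lambda_1^t r_1(\boldsymbol{\xi}), \dots, \lambda_r^t r_r(\boldsymbol{\xi}))^T$. Note that while the GPE provides a prediction of the function $r_i(\boldsymbol{\xi})$, it can provide no information on the eigenvalues $\lambda_i = \lim_{m \to \infty} \lambda_i'$, which do not depend on $\boldsymbol{\xi}$. Thus, the $\lambda_i'$ found from Algorithm 2 are used to compute the predicted value of the diffusion map coordinates.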
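The two cases above share the same computational pattern: each column of a coefficient matrix is emulated by its own univariate GP, and the predictive means are stacked into a vector. The sketch below illustrates this pattern only. It assumes NumPy, uses hypothetical function names, takes hyperparameters as given rather than estimating them, and uses synthetic coefficients in place of the outputs of Algorithms 1 and 2. The optional `scale` argument stands in for the eigenvalue powers applied in the diffusion map case.

```python
import numpy as np

def sq_exp_cov(A, B, theta0, theta):
    """Noise-free part of Eq. (5.16) between rows of A (n1, l) and B (n2, l)."""
    diff = A[:, None, :] - B[None, :, :]
    return theta0 * np.exp(-np.einsum("ijk,k,ijk->ij", diff, theta, diff))

def gp_mean(X, t, X_new, theta0, theta, sigma_n2):
    """Predictive mean of Eq. (5.17) for one scalar coefficient."""
    C = sq_exp_cov(X, X, theta0, theta) + sigma_n2 * np.eye(X.shape[0])
    return sq_exp_cov(X_new, X, theta0, theta) @ np.linalg.solve(C, t)

def emulate_coefficients(X, Z, X_new, hypers, scale=None):
    """Emulate r coefficients with r independent univariate GPs.

    Z      : (m, r) coefficient values at the design points, one column each.
    hypers : list of r tuples (theta0, theta, sigma_n2), one per coefficient.
    scale  : optional length-r factors applied to the predicted coefficients.
    """
    pred = np.column_stack([
        gp_mean(X, Z[:, i], X_new, *hypers[i]) for i in range(Z.shape[1])
    ])
    return pred if scale is None else pred * np.asarray(scale, dtype=float)

# Synthetic, centred coefficients (placeholders, not kPCA or diffusion map output).
X = np.linspace(0.0, 1.0, 8)[:, None]
Z = np.column_stack([np.sin(2.0 * np.pi * X[:, 0]), np.cos(2.0 * np.pi * X[:, 0])])
Z = Z - Z.mean(axis=0)
hypers = [(1.0, np.array([50.0]), 1e-8)] * 2
Z_hat = emulate_coefficients(X, Z, X, hypers)
Z_scaled = emulate_coefficients(X, Z, X, hypers, scale=[2.0, 3.0])
```

Because the GPs are independent, the loop over coefficients can be run in parallel, and each coefficient can carry its own hyperparameters.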

Remark 4. To take account of the correlations between the coefficients when using diffusion maps, the linear model of coregionalization (LMC) [36, 162] could be used to emulate the coefficients simultaneously. Alternatively, the GP model could be replaced by an artificial neural network (ANN). For moderately sized $r$, neither approach is computationally expensive. In this chapter, the approach of univariate GPs is compared with ANN using Bayesian regularization [107, 108].

To complete the emulation, the inverse map must be approximated from the reduced-dimensional space $\mathcal{F}_r$ or $\mathcal{D}_r^{(t)}$ to the physical space $\mathcal{M} \subset \mathbb{R}^d$. This so-called pre-image problem can be solved in a number of ways for kPCA but a stable, computationally efficient solution for diffusion maps in high-dimensional spaces does not exist. In the next section, details of the inverse map approximations are provided for both methods, including a new pre-image solution for diffusion maps. The main algorithm for GPE of outputs in high-dimensional spaces is given in Section 5.5.3.

Remark 5. The GPE framework furnishes predictive variances, given by Eq. (5.17). The variances pertain to the coefficients ($z_i$ or $r_i$) in an abstract space and there is no obvious method to translate this information into variances in the predictions $\mathbf{y} = \boldsymbol{\eta}(\boldsymbol{\xi}) \in \mathcal{M}$. The inverse maps discussed below provide only the predictive means of the points $\mathbf{y}$. However, Monte Carlo (MC) estimates of higher-order statistics can be derived for a fixed input $\boldsymbol{\xi}$ by drawing samples from the posterior predictive Gaussian distribution (defined by Eq. (5.17)) over the coefficients $r_i(\mathbf{y}) = r_i(\boldsymbol{\xi})$ or $z_i(\mathbf{y}) = z_i(\boldsymbol{\xi})$ and using

5.5 Inverse mappings: Reconstruction of points