By a branch-and-bound algorithm we were able to find maximin LHDs in three dimensions for small $n$ and the three distance measures $\ell_2$, $\ell_1$, and $\ell_\infty$. The maximin distances are given in Table 3.13.

Size n                             2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Squared maximin $\ell_2$-distance  3  6  6 11 14 17 21 22 27 30 36 41 42 48  -  -
Maximin $\ell_1$-distance          3  4  4  5  6  6  7  8  8  8  9 10 10 11 11  -
Maximin $\ell_\infty$-distance     1  2  2  2  3  3  4  4  4  4  5  5  5  6  6  6

Table 3.13: Maximin distances for LHDs in three dimensions.
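The smallest entries of Table 3.13 can be checked by plain exhaustive enumeration. The sketch below is not the branch-and-bound algorithm referred to above; it is a naive search, assuming LHDs on the integer levels {0, ..., n-1}, and it is only practical for very small n. The function names are illustrative.

```python
from itertools import combinations, permutations, product


def sq_l2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def l1(p, q):
    """Rectangular (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def linf(p, q):
    """Maximum (Chebyshev) distance."""
    return max(abs(a - b) for a, b in zip(p, q))


def separation(points, metric):
    """Smallest pairwise distance under the given metric."""
    return min(metric(p, q) for p, q in combinations(points, 2))


def maximin_exhaustive(n, k, metric):
    """Best separation distance over all n-point LHDs on {0, ..., n-1}^k.

    The first column is fixed to (0, 1, ..., n-1): relabelling the points
    does not change any pairwise distance, so no design is lost.
    """
    first = tuple(range(n))
    best = 0
    for cols in product(permutations(range(n)), repeat=k - 1):
        points = list(zip(first, *cols))
        best = max(best, separation(points, metric))
    return best


# Three dimensions, n = 2 and n = 3, for the three distance measures.
for n in (2, 3):
    print(n, [maximin_exhaustive(n, 3, m) for m in (sq_l2, l1, linf)])
```

The number of designs grows as (n!)^(k-1), which is why a branch-and-bound scheme is needed beyond the first few columns of the table.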

The corresponding maximin designs and all other (approximate) maximin LHDs that appeared in this chapter can be downloaded from http://www.spacefillingdesigns.nl. In two dimensions, the $\ell_\infty$-maximin distance is equal to $\lfloor n^{1/2} \rfloor$; see Van Dam et al. (2007). The results in three dimensions suggest that the corresponding $\ell_\infty$-maximin distance equals $\lfloor n^{2/3} \rfloor$. A natural extension would be that the $\ell_\infty$-maximin distance in $k$ dimensions equals $d = \lfloor n^{(k-1)/k} \rfloor$. However, this is not the case in general, because for the case $n = 17$ and $k = 23$ the optimal distance is smaller than $\lfloor 17^{22/23} \rfloor = 15$ according to Proposition 3.6. The expression for $d$ may, however, still provide an upper bound for the maximin distance.
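The expression for $d$ is easy to evaluate exactly. The sketch below uses integer arithmetic only, because a floating-point evaluation of n ** ((k - 1) / k) may land just below an exact integer (for example when n = 8 and k = 3, where the true value is 4). The helper name is illustrative, and the list of table values assumes the $\ell_\infty$ row of Table 3.13 covers n = 2, ..., 17.

```python
def floor_root_power(n, k):
    """Largest integer d with d**k <= n**(k-1), i.e. floor(n**((k-1)/k))."""
    target = n ** (k - 1)
    d = 0
    while (d + 1) ** k <= target:
        d += 1
    return d


# l_infinity row of Table 3.13 for n = 2, ..., 17 (three dimensions).
table_linf_3d = [1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6]

formula_3d = [floor_root_power(n, 3) for n in range(2, 18)]
print(formula_3d == table_linf_3d)   # the formula reproduces the table row
print(floor_root_power(17, 23))      # the value d for n = 17, k = 23
```

For n = 17 and k = 23 the function returns 15, the value that the text identifies as exceeding the optimal distance.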

Another interesting point is that we conjecture, but were unable to prove, that the analogue of Lemma 3.5 holds for the $\ell_2$- and $\ell_1$-distance measures, i.e., that also for these distance measures the maximin distance is non-decreasing in $n$.

3.5.2 Conclusions

We have obtained bounds for the separation distance of LHDs for several distance measures. These bounds are useful to assess the quality of approximate maximin LHDs by comparing their separation distances with the corresponding upper bounds. For the $\ell_2$- and $\ell_1$-distances we obtain bounds by considering the average distance. These bounds are almost tight when the dimension $k$ is relatively large. For the $\ell_2$-distance in two dimensions we obtain a method that produces a bound that is better than Oler’s bound if the number of points of the LHD is at most 400. For the $\ell_\infty$-distance we obtain a bound by looking at it as a graph covering problem. Besides this bound we construct maximin LHDs attaining Baer’s bound for infinitely many values of $n$ (the number of points) in all dimensions. Finally, we present a method for obtaining a bound for three-dimensional LHDs that is better than Baer’s bound for many values of $n$.
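The averaging idea can be sketched numerically. The minimum pairwise distance never exceeds the average pairwise distance, and for an LHD on the integer levels {0, ..., n-1} every column is a permutation of the same levels, so that average is identical for all designs. Working it out gives $k\,n(n+1)/6$ for the squared $\ell_2$-distance and $k(n+1)/3$ for the $\ell_1$-distance. The code below is an illustration under these assumptions, not a restatement of the exact propositions of this chapter; the table rows are taken from Table 3.13, assuming they start at n = 2.

```python
from itertools import combinations


def average_bounds(n, k):
    """Upper bounds (squared l2, l1) from the average pairwise distance.

    Distances on the integer grid are integers, so the average may be
    rounded down.
    """
    pairs = list(combinations(range(n), 2))
    sq_sum = sum((i - j) ** 2 for i, j in pairs)   # per coordinate
    abs_sum = sum(abs(i - j) for i, j in pairs)    # per coordinate
    return (k * sq_sum) // len(pairs), (k * abs_sum) // len(pairs)


# Rows of Table 3.13 (three dimensions), assumed to start at n = 2.
table_sq_l2 = [3, 6, 6, 11, 14, 17, 21, 22, 27, 30, 36, 41, 42, 48]
table_l1 = [3, 4, 4, 5, 6, 6, 7, 8, 8, 8, 9, 10, 10, 11, 11]

for n, value in enumerate(table_sq_l2, start=2):
    assert value <= average_bounds(n, 3)[0]
for n, value in enumerate(table_l1, start=2):
    assert value <= average_bounds(n, 3)[1]

print(average_bounds(3, 3))   # bounds for n = 3, k = 3
```

For n = 2 and n = 3 in three dimensions the bounds coincide with the table entries; for larger n in such a low dimension they become loose, in line with the remark that the bounds are almost tight only when $k$ is relatively large.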

3.A Bounds on two-dimensional $\ell_2$-maximin LHDs

  n Oler Y(d) Ỹ(d)  d^2 |   n Oler Y(d) Ỹ(d)  d^2 |   n Oler Ỹ(d)  d^2
  2    5    2    2    2 |  59   85   73   73   61 | 120  162  148  128
  3    5    2    2    2 |  60   85   73   73   65 | 130  173  160  145
  4    8    5    5    5 |  61   85   74   74   65 | 140  185  173  149
  5   10    5    5    5 |  62   89   74   74   65 | 150  200  185  170
  6   10    5    5    5 |  63   90   74   74   65 | 160  212  202  178
  7   13    8    8    8 |  64   90   74   74   65 | 170  225  208  185
  8   13    8    8    8 |  65   90   80   80   68 | 180  234  225  202
  9   17   10   10   10 |  66   90   80   80   68 | 190  245  234  208
 10   18   13   13   10 |  67   90   80   82   74 | 200  261  250  218
 11   20   13   13   10 |  68   97   80   85   74 | 210  274  261  241
 12   20   13   13   13 |  69   98   85   85   74 | 220  281  274  245
 13   20   13   13   13 |  70   98   85   85   74 | 230  298  290  250
 14   25   17   17   17 |  71  100   85   85   74 | 240  306  298  269
 15   26   17   17   17 |  72  101   85   89   74 | 250  320  314  277
 16   26   18   18   17 |  73  101   85   89   74 | 260  333  325  292
 17   29   20   20   18 |  74  104   89   89   74 | 270  346  338  305
 18   29   20   20   18 |  75  106   89   90   80 | 280  360  349  320
 19   32   25   25   18 |  76  106   90   90   85 | 290  370  365  320
 20   32   25   25   18 |  77  106   97   97   85 | 300  377  373  338
 21   34   25   25   20 |  78  109   97   97   85 | 310  394  388  346
 22   34   26   26   25 |  79  109   97   97   85 | 320  405  401  356
 23   37   29   29   26 |  80  109   97   97   85 | 330  416  410  370
 24   37   29   29   26 |  81  113  100  100   85 | 340  433  425  386
 25   40   29   29   26 |  82  113  100  101   85 | 350  445  442  401
 26   41   29   29   26 |  83  116  100  104   90 | 360  457  450  409
 27   41   32   32   26 |  84  117  100  104   90 | 370  468  464  410
 28   41   34   34   29 |  85  117  100  106   90 | 380  481  477  425
 29   45   34   34   29 |  86  117  104  106   97 | 390  493  490  442
 30   45   34   34   29 |  87  117  106  106   97 | 400  505  505  450
 31   45   34   37   32 |  88  122  106  106   97 | 410  514  514  461
 32   45   37   40   32 |  89  122  106  109   97 | 420  522  530  466
 33   50   40   40   34 |  90  125  109  109   98 | 430  541  544  485
 34   52   41   41   37 |  91  125  109  109   98 | 440  549  549  490
 35   53   41   41   37 |  92  125  113  113   98 | 450  565  565  509
 36   53   41   41   37 |  93  128  113  116  100 | 460  578  580  509
 37   53   45   45   37 |  94  130  116  116  100 | 470  586  592  533
 38   53   45   45   41 |  95  130  117  117  100 | 480  601  601  545
 39   58   45   45   41 |  96  130  117  117  101 | 490  613  617  549
 40   58   45   50   41 |  97  130  117  117  101 | 500  626  629  565
 41   61   45   52   41 |  98  130  117  122  101 | 510  637  641  578
 42   61   50   52   41 |  99  136  117  125  101 | 520  650  656  586
 43   61   52   52   41 | 100  137  117  125  109 | 529  661  661  586
 44   65   52   52   50 | 101  137  117  125  109
 45   65   52   53   50 | 102  137  125  125  113
 46   68   53   53   50 | 103  137  125  125  113
 47   68   58   58   50 | 104  137  125  130  117
 48   68   58   58   50 | 105  137  128  130  117
 49   72   58   58   50 | 106  145  130  130  117
 50   73   61   61   52 | 107  146  130  130  117
 51   74   61   61   52 | 108  146  130  130  117
 52   74   61   65   58 | 109  149  130  136  117
 53   74   61   65   58 | 110  149  130  136  117
 54   74   61   65   58 | 111  149  136  136  128
 55   80   65   65   58 | 112  149  136  137  128
 56   80   65   68   58 | 113  153  137  137  128
 57   82   68   68   58 | 114  153  137  137  128
 58   82   68   73   61

Table 3.14: Oler bound, bounds $d^{*2}_n(Y(d))$ and $d^{*2}_n(\tilde{Y}(d))$ based on $Y(d)$ and $\tilde{Y}(d)$ (columns Y(d) and Ỹ(d)), and $d^2$ of the best known LHD.

Nested maximin Latin hypercube designs

Mathematic puns are the first sine of madness. (Johann Von Haupkoph)

4.1 Introduction

Latin hypercube designs are very useful in the approximation of black-box functions. By definition, black-box functions have no explicit description, but can be evaluated to obtain output values for specific input values. As evaluations of a black-box function often involve time-consuming computer simulations, we would like to construct an approximating model (or metamodel) based on evaluations in a (small) number of points; see, e.g., Montgomery (2009), Sacks et al. (1989a, 1989b), Myers (1999), Jones et al. (1998), Booker et al. (1999), Den Hertog and Stehouwer (2002), Santner et al. (2003), Queipo et al. (2005), Wang and Shan (2007), and Kleijnen (2008). A review of metamodeling applications in structural optimization can be found in Barthelemy and Haftka (1993), and in multidisciplinary design optimization in Sobieszczanski-Sobieski and Haftka (1997) and Simpson et al. (2008).

We use the term design to denote the set of evaluation points. As observed by many researchers, there is an important distinction between designs for computer experiments and designs for the more traditional response surface methods. Physical experiments exhibit random errors whereas computer experiments are often deterministic (see, e.g., Simpson et al. (2004), Forrester et al. (2006), and Forrester et al. (2008)). Therefore, designs for physical experiments often evaluate certain points multiple times. For designs for computer experiments, replication is redundant because the same input always results in the same output. Because of this distinction, one of the main aims in the field of design of computer experiments (DoCE) is to obtain efficient designs for computer experiments.

As is recognized by several authors, a design for computer experiments should at least satisfy the following two criteria (see Johnson et al. (1990) and Morris and Mitchell (1995)). First of all, the design should be space-filling in some sense. When no details on the functional behavior of the response parameters are available, it is important to be able to obtain information for the entire design space. Therefore, design points should be “evenly spread” over the entire region. Secondly, the design should be non-collapsing. When one of the design parameters has (almost) no influence on the black-box function value, two design points that differ only in this parameter will “collapse”, i.e., they can be considered as the same point that is evaluated twice. As evaluation of the deterministic black-box function is often time-consuming, this is not a desirable situation. Therefore, two design points should not share any coordinate values when it is not known a priori which parameters are important. Moreover, we would like the projections of the points onto the axes to be separated as much as possible. When we consider a black-box function on a box-constrained domain, this can be accomplished by using Latin hypercube designs. A Latin hypercube design (LHD) of $n$ points in $k$ dimensions can be defined as an $n \times k$ matrix, where each column is a permutation of the set $\{0, \frac{1}{n-1}, \frac{2}{n-1}, \ldots, 1\}$. The rows $x_i = (x_{i1}, x_{i2}, \ldots, x_{ik})$, $i = 1, \ldots, n$, then define the $n$ design points. Because the columns are permutations of the above set, for all of the $k$ coordinates it holds that no two design points have the same value.
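The definition translates directly into code. The following is a minimal sketch that draws a random (not maximin) LHD by shuffling the level set once per column and then checks the non-collapsing property. Exact fractions are used so that the levels i/(n-1) compare without rounding error; the function names are illustrative.

```python
import random
from fractions import Fraction


def random_lhd(n, k, rng=None):
    """Return n points in k dimensions; each column permutes {0, 1/(n-1), ..., 1}."""
    rng = rng or random.Random()
    levels = [Fraction(i, n - 1) for i in range(n)]
    columns = []
    for _ in range(k):
        column = levels[:]
        rng.shuffle(column)          # independent permutation per coordinate
        columns.append(column)
    # Row i of the n x k matrix is the i-th design point.
    return [tuple(column[i] for column in columns) for i in range(n)]


def is_lhd(design):
    """True if every coordinate takes each level i/(n-1) exactly once."""
    n = len(design)
    levels = {Fraction(i, n - 1) for i in range(n)}
    return all(set(column) == levels for column in zip(*design))


design = random_lhd(5, 3, random.Random(0))
print(is_lhd(design))
```

A design such as [(0, 0), (1, 0)] fails the check because both points share the second coordinate, which is exactly the collapsing situation described above.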

To obtain space-filling designs, the evaluation points are chosen in such a way that the separation distance (i.e., the minimal distance among any pair of points) is maximized, leading to so-called maximin designs. Other space-filling designs, like minimax, integrated mean squared error (IMSE), Audze-Eglais, discrepancy and maximum entropy designs, are also used in the literature. For a good survey of these designs see the book of Santner et al. (2003). Goel et al. (2008) argue that it would be better to use several criteria when selecting a design. However, Santner et al. (2003) show that maximin Latin hypercube designs—generally speaking—yield good approximations.
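Written as an optimization problem, the maximin criterion for LHDs reads as follows. The symbol $\mathcal{L}(n,k)$ for the set of all LHDs of $n$ points in $k$ dimensions is notation introduced here for illustration only, and $d(\cdot,\cdot)$ stands for whichever distance measure is chosen.

```latex
\[
  \max_{X \in \mathcal{L}(n,k)} \;\; \min_{1 \le i < j \le n} \; d(x_i, x_j),
  \qquad
  d(x_i, x_j) = \lVert x_i - x_j \rVert_p ,
  \quad p \in \{1, 2, \infty\}.
\]
```

The inner minimum is the separation distance of the design $X$ with rows $x_1, \ldots, x_n$; the outer maximum selects a design whose closest pair of points is as far apart as possible.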

Maximin Latin hypercube designs were first constructed by Morris and Mitchell (1995) using simulated annealing. Ye et al. (2000) considered only the class of symmetric approximate maximin LHDs to reduce the computing effort. Jin et al. (2005) introduce the enhanced stochastic evolutionary (ESE) algorithm for finding various space-filling designs, including approximate maximin LHDs. Husslage et al. (2008) use the ESE algorithm to construct approximate maximin LHDs for up to 10 dimensions and up to 300 design points. Furthermore, they also construct approximate maximin LHDs by optimizing the maximin criterion over all LHDs having a certain periodic structure. This approach is an extension of the method used by Van Dam et al. (2007) to obtain two-dimensional approximate maximin LHDs. In that paper, two-dimensional maximin LHDs are also found using a branch-and-bound algorithm. Finally, Grosso et al. (2009) use Iterated Local Search heuristics to find good approximate maximin LHDs for up to 10 dimensions. The best designs found in these papers are published online at http://www.spacefillingdesigns.nl. This website also contains the upper bounds on the separation distance for certain classes of maximin LHDs found by Van Dam et al. (2009b). These upper bounds can be used to assess the quality of approximate maximin LHDs.

In practice, there are situations where we need a special type of design called a nested design. This type of design consists of two separate designs, with the requirement that one design is a subset of the other design. Van Dam et al. (2009a) show how to construct one-dimensional nested maximin designs; the current chapter focuses on two- and higher-dimensional designs. Four main reasons for nesting maximin designs are: validation, models with different levels of accuracy, linking parameters, and sequential evaluations.

To start with the first reason, consider the problem of fitting and validating a particular metamodel. In practice, the following approach is often used. First, a metamodel is fitted to the responses obtained when evaluating the design points in the training set. Then, a new set of design points, called the test set, is evaluated and the responses are compared with the response values predicted by the metamodel. If the differences between the predicted and the actual response values are small, the metamodel is considered to be valid; see also Cherkassky and Mulier (1998) for a more detailed description of the use of training and test sets. Because a metamodel should be a global approximation model, i.e., it should be valid for the entire feasible region, both the training set and the test set should cover the entire region. Moreover, the design points in the test set should not lie too close to the design points in the training set, i.e., the total set of design points should be space-filling. This can be accomplished by nesting two designs that are optimized with respect to, for example, the maximin criterion. The design points that are in both designs then form the training set and the points that are only in the large design make up the test set.
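The split into training and test sets can be sketched as follows. The six-point design and the choice of its three-point subset are hypothetical examples chosen for illustration, not designs constructed in this chapter, and the function names are assumptions.

```python
from itertools import combinations
from math import dist


def split_nested(large, small):
    """Split a nested pair of designs into (training set, test set).

    Points in both designs form the training set; points that occur only
    in the large design form the test set.
    """
    small_set = set(small)
    if not small_set <= set(large):
        raise ValueError("small design is not a subset of the large design")
    training = [p for p in large if p in small_set]
    test = [p for p in large if p not in small_set]
    return training, test


def separation(points):
    """Smallest Euclidean distance between any two points."""
    return min(dist(p, q) for p, q in combinations(points, 2))


# Hypothetical nested pair on the integer grid {0, ..., 5}^2.
large = [(0, 2), (1, 4), (2, 0), (3, 5), (4, 1), (5, 3)]
small = [(0, 2), (3, 5), (4, 1)]

training, test = split_nested(large, small)
print(training, test)
print(separation(large), separation(training))
```

Comparing the separation distance of the full set with that of the training set shows how well each design is spread over the region, which is what the nesting is meant to control.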

The second reason applies when an output variable of a process, product, or system is modeled by two black-box functions with different accuracy levels. These black-box functions could, for instance, be simulation models with different levels of detail. As a more accurate model is in general also more time-consuming, we can perform fewer evaluations of the high-accuracy model than of the low-accuracy model in the same amount of time. Instead of choosing to use either the high- or the low-accuracy model, we can choose to use both. We can then evaluate the high-accuracy model at all points in the small design and the low-accuracy model at all points in the large design. By using a nested design, high- and low-accuracy evaluations are thus performed at all points in the small design. Multi-fidelity methods can combine the results from both models to obtain a metamodel that is better than a metamodel obtained by using only one of the two models and the same amount of time. More information on multi-fidelity methods can be found in Cressie (1993), Kennedy and O’Hagan (2000), Qian et al. (2006), and Forrester et al. (2007).

A third reason for using nested designs concerns linking parameters. Consider a product that consists of two components, each represented by a separate black-box function. To obtain an approximating model describing the behavior of the complete product, we need function evaluations of each black-box function. When one black-box function is more time-consuming to evaluate than the other, it could be better to perform different numbers of function evaluations of each black-box function. Moreover, in practice it may occur that these functions have input parameters in common; such parameters are called linking parameters, see Husslage et al. (2003). Evaluating the linking parameters at the same setting in both functions leads to an evaluation of the product. Not only do product evaluations provide a better understanding of the product, they are also very useful in the product optimization process. Another reason for using the same settings for (linking) parameters is due to physical restrictions on the black-box functions. Setting the parameters for computer experiments can be a time-consuming job in practice, because characteristics, such as shape and structure, have to be redefined for every new experiment. Therefore, it is preferable to use the same settings as much as possible. By constructing nested designs, we can determine the settings for linking parameters.

Nested designs are also useful when dealing with sequential evaluations. In practice it is common that after evaluating an initial set of points, extra evaluations are needed. As an example, suppose we construct an approximating model for some black-box function based on $n_1$ function evaluations. However, after validating the obtained model, it turns out that an extra set of function evaluations is needed to build a better model. We then face the problem of constructing a design on a total of $n_2$ points given the initial design on $n_1$ points with $n_2 > n_1$. To anticipate the possibility of extra evaluations, one can construct the two designs (on $n_1$ and $n_2$ points) at once, by constructing a nested design.

An alternative method to deal with this situation would be sequential sampling. As this is beyond the scope of this chapter, we refer to Jones et al. (1998), Jin et al. (2002), and Kleijnen and Van Beers (2004) for more information on sequential sampling.

Above, we described why both Latin hypercube designs and nested designs are important. In this chapter, we construct nested maximin Latin hypercube designs in $k$ dimensions with $k \ge 2$. Section 4.2 gives a more detailed formulation of this problem. When nesting two designs, it is not always possible to satisfy the LHD-structure for both designs. Therefore, we introduce in Section 4.3 three different grid-structures that approximate the LHD-structure as well as possible. In Section 4.4, we present a branch-and-bound method for determining two-dimensional nested maximin designs and discuss two-dimensional Pareto optimal nested designs. For higher dimensions, determining the nested LHD that maximizes $d$ becomes too time-consuming. In Section 4.5, we therefore introduce a heuristic that also aims to maximize $d$ but is not guaranteed to find the optimal $d$. In Section 4.6, numerical results obtained with different variants of this heuristic are presented and compared. Furthermore, we discuss how to select a grid-structure and design based on these results. Finally, Section 4.7 contains conclusions and suggestions for further research.
