MGK optimal neighbors and kriging variance
MGK allows rapid mapping with nscores and was used to obtain the optimal number of neighbors for the kriging procedure. Because CV does not yield a unique solution for SGS, the MGK solution was used as a reference for optimizing the neighbor search. Figure 5.15 presents the CV curve for the neighbor optimization. For MGK, the CV error improves only slightly beyond 20 neighbors. This value was therefore taken as a reference for the minimum number of neighbors to be used.
Figure 5.15: CV neighbors optimization curve for set 3 by MGK of nscores
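A CV curve of this kind can be sketched as follows. This is only a minimal illustration, not the procedure used in the study: it assumes ordinary kriging with an exponential covariance, leave-one-out cross-validation and synthetic data. The function names (`exp_cov`, `ok_predict`, `loo_cv_rmse`) and all parameter values are hypothetical.

```python
import numpy as np

def exp_cov(h, nugget=0.3, sill=1.0, prange=3000.0):
    """Exponential covariance; all parameter values are illustrative assumptions."""
    h = np.asarray(h, dtype=float)
    c = (sill - nugget) * np.exp(-3.0 * h / prange)
    return np.where(h == 0.0, sill, c)

def ok_predict(xy, z, x0, k, cov=exp_cov):
    """Ordinary kriging estimate at x0 using the k nearest data points."""
    d0 = np.linalg.norm(xy - x0, axis=1)
    idx = np.argsort(d0)[:k]
    n = len(idx)
    pts = xy[idx]
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    # Kriging system with a Lagrange multiplier enforcing sum(weights) = 1
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = cov(d0[idx])
    w = np.linalg.solve(A, b)
    return float(w[:n] @ z[idx])

def loo_cv_rmse(xy, z, k, cov=exp_cov):
    """Leave-one-out RMSE for a given number of neighbors k."""
    errors = []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        errors.append(ok_predict(xy[keep], z[keep], xy[i], k, cov) - z[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic stand-in for the training set (coordinates in metres, nscores)
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 10000.0, size=(60, 2))
z = rng.standard_normal(60)
cv_curve = {k: loo_cv_rmse(xy, z, k) for k in (5, 10, 20, 30)}
```

Plotting `cv_curve` against the number of neighbors gives a curve comparable in form to Figure 5.15; the point beyond which the error stops decreasing appreciably suggests a minimum number of neighbors.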
MGK also shows that the kriging variance is lower within the constrained net than within the rectangular net (Figure 5.16), indicating a possible advantage of using the constrained net.
Figure 5.16: MGK kriging variance distribution for the set 3 using a) a rectangular net and b) a constrained net
Combined effect of neighbors and nets on variogram reproduction
The objective of the proposed tests is to verify whether, as required for SGS, the proposed variogram model was reproduced after simulation. Variogram reproduction was evaluated for different numbers of neighbors and different types of simulation nets. The previous results have shown that a minimum of 20 neighbors might be required for the simulations. The next step is to check whether there is an optimal maximum, or whether the use of specific neighborhood ranges can influence variogram reproduction. Hence, the neighbor search parameter was tested with a first option of a minimum of 20 and a maximum of 30 neighbors, and a second option ranging from 40 to 60 neighbors. The use of more neighbors is expected to produce more smoothing. These tests were combined with the two types of simulation nets.
For the first run, the minimum number of neighbors was fixed at 20 and the maximum at 30. Figure 5.17a shows the simulated variogram using the constrained simulation net (dashed line) and the variogram using the rectangular net (continuous thin line) on top of the experimental variogram (bold line). For the comparison, a single simulation with the same random seed was used. The middle range is better reproduced with the rectangular net, while the short range seems to be better reproduced with the constrained net (Figure 5.17b).
For the second run, the number of neighbors was increased to a minimum of 40 and a maximum of 60. The results in Figure 5.18a show a better reproduction of the variogram at the long range for both nets and almost the same reproduction at the short range. The use of a defined interval of neighbors has a clear influence on variogram reproduction. In theory, all points can be used in a kriging system, but the weights decrease rapidly for distant points. Hence, it is important to limit the neighbors to the optimal maximum necessary in order to reduce computation time.
Figure 5.17: Variogram reproduction using a constrained net (dashed line) and a rectangular net (continuous thin line) for a minimum of 20 and a maximum of 30 neighbors for a) the long range and b) the short range
Figure 5.18: Variogram reproduction using a constrained net (dashed line) and a rectangular net (continuous thin line) for a minimum of 40 and a maximum of 60 neighbors for a) the long range and b) the short range
According to these results, a fairly accurate reproduction of the variogram during simulations can be obtained with a minimum of 40 neighbors. It is also important to verify that the fluctuations of these simulated variograms do not exceed the model. In Figure 5.19a, the fluctuations of 5 simulations are shown using a minimum of 40 neighbors and the rectangular net. In Figure 5.19b, the corresponding reproduction of the data histogram is also shown.
While the influence of the number of neighbors seems to be clear, the effect of the simulation net is not. On close inspection of the variogram reproduction in Figure 5.17a, some similarities in shape can be seen because the same random seed was used. Nevertheless, a difference in the number of points between simulation nets may produce a different spatial distribution of values. The use of a single realization was useful to detect the effect of the neighbors, because this effect was consistent across the nets, but not to test the nets themselves. Fluctuations (as shown in Figure 5.19a) may be larger than the effect of the net, even though the variograms have been smoothed by using a large lag tolerance.
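The comparison of variogram fluctuations across realizations can be illustrated with a short sketch. It assumes an omnidirectional experimental variogram computed with a fixed lag tolerance; the function name `experimental_variogram`, the lag spacing and the random placeholder values standing in for SGS realizations are all hypothetical.

```python
import numpy as np

def experimental_variogram(xy, z, lags, tol):
    """Omnidirectional semivariance at each lag centre, using all pairs
    whose separation lies within +/- tol of that lag."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    sq = 0.5 * (z[:, None] - z[None, :]) ** 2
    iu = np.triu_indices(len(z), k=1)      # count each pair once
    d, sq = d[iu], sq[iu]
    gamma = np.full(len(lags), np.nan)     # NaN where no pair falls in the lag
    for j, h in enumerate(lags):
        sel = np.abs(d - h) <= tol
        if sel.any():
            gamma[j] = sq[sel].mean()
    return gamma

# Placeholder realizations: in practice these would be SGS outputs on the net
rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 10000.0, size=(200, 2))
sims = rng.standard_normal((5, 200))
lags = np.arange(500.0, 5001.0, 500.0)
curves = np.array([experimental_variogram(xy, s, lags, tol=250.0) for s in sims])
envelope = (np.nanmin(curves, axis=0), np.nanmax(curves, axis=0))
```

The width of `envelope` at each lag gives a rough measure of the fluctuations between realizations, against which a difference between two nets can be judged.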
Figure 5.19: Fluctuations (thin lines) of the a) variogram model and b) the histogram (bold line) for 5 realizations by sequential gaussian simulation
Optimized simulation net and variogram fluctuations
A third simulation net was built as a compromise between the rectangular and the constrained net. It is a constrained net that accounts for interior empty spaces and includes a buffer zone around the training points (Figure 5.20). The buffer area has a radius of 1500 meters, which increases the number of points in the net, while the empty interior spaces reduce it. In the end, there are more points in this net than in the first constrained net.
Figure 5.20: Constrained net with empty interior spaces and buffering zone of 1500 meters
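The buffering step can be sketched as below, assuming the net is derived from a regular grid by keeping only the nodes that lie within the buffer radius of at least one training point. The function name `buffered_net`, the grid spacing and the coordinates are hypothetical; how the original net was actually built is not stated here.

```python
import numpy as np

def buffered_net(grid_xy, train_xy, radius=1500.0):
    """Keep grid nodes within `radius` metres of at least one training point."""
    d = np.linalg.norm(grid_xy[:, None, :] - train_xy[None, :, :], axis=2)
    return grid_xy[d.min(axis=1) <= radius]

# Regular grid with an assumed 1000 m spacing around one training point
gx, gy = np.meshgrid(np.arange(-3000.0, 3001.0, 1000.0),
                     np.arange(-3000.0, 3001.0, 1000.0))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])
train_xy = np.array([[0.0, 0.0]])
net = buffered_net(grid_xy, train_xy)
```

Nodes farther than the radius from every training point are dropped, which is what produces the empty interior spaces. The brute-force distance matrix is adequate for small nets; a spatial index would be preferable for large ones.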
A series of SGS simulations was launched with this constrained net for neighborhood ranges of 1 to 20, 20 to 50 and 40 to 60 neighbors (Figures 5.21 and 5.22). The series of 5 simulation variograms in Figure 5.22b is superimposed on the smoothed training variogram so that it can be compared with the results in Figure 5.19a.
Figure 5.21: Fluctuations (thin lines) of the variograms using a) 1 to 20 neighbors and b) 20 to 50 neighbors for 10 realizations by SGS
Figure 5.22: a) Fluctuations (thin lines) of the variograms using 40 to 60 neighbors and b) detail of the short range reproduction
In general, the reproduced variograms do not fit at the middle range when the constrained net with a buffer zone is used. It seems that the sequential simulation achieves a better variogram reproduction when a bounding box net is used. The presence of filling points may contribute to reproduction at all ranges.
It can be concluded that the reproduced variogram does not fit the experimental variogram at some ranges. The question is whether the reproduced variogram is wrong or whether it represents a possible realization. The point here is that variogram reproduction also depends on the simulation net. In the case of indoor radon, this net provides some realistic information, namely the global sampling domain. Therefore, it is reasonable to think that the reproduced variogram reflects, to a certain extent, the spatial distribution of the variable.
A model was fitted to a simulated variogram in order to analyze the posterior spatial distribution. The variogram has a coding of nug0.55+exp0.24R850m+exp0.21R24000. The main difference from the training variogram model is that the posterior variogram is better fitted by an exponential model than by a spherical one. The exponential model implies a higher variance at middle ranges (between 3000 and 10000 meters). This can reflect larger differences between localities. It can also be an artificial effect created by the net; in any case, the urban domain used for this test is only an approximation, which still requires some refinement. The use of more accurate domains will be emphasized in the following analysis for set3B.
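The coded model can be evaluated numerically with a small sketch. It assumes that each R value is a practical range, so that an exponential structure reaches about 95% of its sill at h = R (hence the factor 3); the software used may follow a different range convention. The function name `nested_variogram` is hypothetical.

```python
import numpy as np

def nested_variogram(h, nugget=0.55, structures=((0.24, 850.0), (0.21, 24000.0))):
    """Nugget plus exponential structures given as (sill, practical range in m)."""
    h = np.asarray(h, dtype=float)
    g = np.where(h > 0.0, nugget, 0.0)   # the nugget applies only for h > 0
    for sill, prange in structures:
        g = g + sill * (1.0 - np.exp(-3.0 * h / prange))
    return g

h = np.array([0.0, 850.0, 3000.0, 10000.0, 24000.0])
gamma = nested_variogram(h)
```

The nugget and the two partial sills add up to 1.0, the unit variance expected for nscores. Under the stated convention the long-range structure is still rising between 3000 and 10000 meters.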
5.3.4 Probability maps for set3 with SGS
Taking a minimum of 40 neighbors and the previously defined simulation nets as hyperparameters, 100 simulations were run for the training data. Each simulation produced a prediction map; Figure 5.23 presents four of these maps, from the first simulations, after back-transformation from nscores. These images are possible realizations of the joint Gaussian distribution obtained with the sequential mechanism. A large number of realizations is required to build a complete pdf for every location in the simulation net. From this pdf, the probability of exceeding a given threshold value can be calculated. Figure 5.24 shows the map of the probability of exceeding 200 Bq/m3 for the rectangular and the constrained nets. Visually, the probability maps obtained with both nets look alike.
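The exceedance probability at each node can be estimated as the fraction of realizations above the threshold. The following is a minimal sketch, assuming the back-transformed realizations are stored as an array of shape (number of realizations, number of nodes); the name `exceedance_probability` and the sample values are hypothetical.

```python
import numpy as np

def exceedance_probability(sims, threshold):
    """Fraction of realizations exceeding `threshold` at each net node."""
    sims = np.asarray(sims, dtype=float)
    return (sims > threshold).mean(axis=0)

# Four illustrative realizations at two nodes (values in Bq/m3)
sims = np.array([[100.0, 300.0],
                 [250.0, 300.0],
                 [150.0, 100.0],
                 [ 50.0, 500.0]])
p200 = exceedance_probability(sims, 200.0)   # array([0.25, 0.75])
```

With only a few realizations the estimate is coarse (steps of 1/n), which is one reason a large number of realizations is needed.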
Figure 5.23: Four simulation maps of the joint distribution realization using SGS with rectangular net
Figure 5.24: Probability maps for the 200 Bq/m3 threshold using the SGS method with a) a rectangular net and b) a constrained net
Figure 5.25: Probability maps with SGS for cutoff values of a) 200 and b) 400 Bq/m3
A measure of the estimation error, or uncertainty, is useful information for decision making and should accompany the probability maps. For simulations, the uncertainty can be expressed by the variance of the simulated values at each location, calculated for each point from the set of realizations. It expresses the fluctuations and depends mainly on the conditional data. The proportional effect between local mean and variance, analyzed in chapter 3, appears here once more. Clearly, higher local variances generate larger fluctuations of the estimates. It is therefore proposed that maps should be produced by combining the probability information with the uncertainty of the simulations.
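The per-location variance and a simple check of the proportional effect can be sketched as follows, again assuming the realizations are stored as an array of shape (number of realizations, number of nodes). The name `local_statistics` and the lognormal placeholder data are hypothetical.

```python
import numpy as np

def local_statistics(sims):
    """Mean and sample variance of the simulated values at each net node."""
    sims = np.asarray(sims, dtype=float)
    return sims.mean(axis=0), sims.var(axis=0, ddof=1)

# Placeholder realizations: lognormal values whose spread grows with the level
rng = np.random.default_rng(2)
level = np.linspace(50.0, 400.0, 30)
sims = level * rng.lognormal(mean=0.0, sigma=0.5, size=(100, 30))
local_mean, local_var = local_statistics(sims)

# A clearly positive correlation is one indication of a proportional effect
r = np.corrcoef(local_mean, local_var)[0, 1]
```

The correlation coefficient is only a rough diagnostic; a scatter plot of local variance against local mean shows the relation more directly.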
In Figure 5.26a, the p-value map for the 200 Bq/m3 threshold is presented using a simplified categorization. Next to it is a map of the kriging variance displayed with proportional symbols (Figure 5.26b); larger dark circles correspond to a larger variance. The probability map can then be filtered by superimposing the proportional variance representation, as shown in Figure 5.27.
This mixed cartography is intended to mask, proportionally, the areas where uncertainty is higher. In the case of set3, uncertainty is mainly influenced by the conditional data and is higher in the northwest area.
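The two ingredients of such a combined map can be prepared as in the sketch below: probability classes for the base map and symbol sizes proportional to the variance for the overlay. The class limits (0.1, 0.3, 0.5), the maximum symbol size and the function names are hypothetical and do not reproduce the categorization used in the figures.

```python
import numpy as np

def categorize_probability(p, bins=(0.1, 0.3, 0.5)):
    """Assign each probability to a class index 0..len(bins) (assumed limits)."""
    return np.digitize(p, bins)

def symbol_sizes(variance, max_size=200.0):
    """Symbol size proportional to variance; the largest variance gets max_size."""
    v = np.asarray(variance, dtype=float)
    vmax = v.max()
    return np.zeros_like(v) if vmax == 0.0 else max_size * v / vmax

classes = categorize_probability([0.05, 0.2, 0.4, 0.9])   # array([0, 1, 2, 3])
sizes = symbol_sizes([1.0, 2.0, 4.0])                     # array([ 50., 100., 200.])
```

Drawing dark circles of these sizes over the categorized probability map hides more of the map where the variance is larger, which is the masking effect described above.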