
Stevens and Olsen (2004) presented a spatially balanced design called Generalized Randomized Tessellation Stratified sampling, or simply GRTS. The GRTS method is explained below. Compared with the previously described spatially balanced sampling designs, GRTS has some desirable features. One advantage is that GRTS remains spatially balanced in the case of an

Figure 3.2: Illustration of Stratified Random Sampling with one unit per stratum. In this case there are 121 strata.

early termination of a planned survey, as long as the units are visited in the specified order. Visiting sites in a specific order can, however, be troublesome because of logistics, time and cost constraints. GRTS also makes it possible to add units to a sample while maintaining spatial balance, although this feature has to be traded off against the same travel time problems. In addition, GRTS allows for unequal probability sampling while remaining spatially balanced. These properties make GRTS increasingly popular among environmental scientists for designing surveys. GRTS has been applied to sample different types of natural resources, such as aquatic resources (Hill et al., 2013; Lackey and Stein, 2013; Widmer et al., 2010), invasive plant species (Lemke et al., 2013) and chemical compounds (Dodder et al., 2012), and has been cited by more than 380 papers. For example, Widmer et al. (2010) used GRTS to design a sampling scheme for small-bodied fishes in a sand-bed river, Dodder et al. (2012) used GRTS to estimate the distribution of polybrominated diphenyl ethers in the Southern California Bight, and Lemke et al. (2013) used GRTS in a study to estimate the effect of open surface mines on invasive plant species. Given this popularity, we will use GRTS as the benchmark reference for spatially balanced sampling designs.

Methodology

The GRTS methodology is illustrated step by step in Figure 3.3. First, the study area is rescaled to the unit box. Next, the unit box is partitioned hierarchically into quadrants. This hierarchical partitioning is continued until the expected number of units in each quadrant is one. Each quadrant is assigned a unique hierarchical address. Figure 3.3 shows the first three levels of hierarchical quadrant partitioning and the structure of the hierarchical addresses. For example, the shaded quadrant in Figure 3.3 has the address 301.
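The addressing step can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name `quadrant_address` and the digit layout (0 = top left, 1 = top right, 2 = bottom left, 3 = bottom right) are assumptions, chosen only so that the top left quadrant gets address 000 and the bottom right quadrant gets address 333, as in the text.

```python
def quadrant_address(x, y, levels):
    """Hierarchical base-4 address of the point (x, y) in the unit box.

    Assumed digit layout per level (not specified in the text):
    0 = top left, 1 = top right, 2 = bottom left, 3 = bottom right.
    """
    if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
        raise ValueError("point must lie in the unit box [0, 1) x [0, 1)")
    digits = []
    for _ in range(levels):
        # Zoom in by a factor of two; the integer part tells which half we are in.
        x, y = 2.0 * x, 2.0 * y
        col, row_from_bottom = int(x), int(y)
        x, y = x - col, y - row_from_bottom
        row_from_top = 1 - row_from_bottom
        digits.append(str(2 * row_from_top + col))
    return "".join(digits)


top_left = quadrant_address(0.1, 0.9, levels=3)
bottom_right = quadrant_address(0.9, 0.1, levels=3)
```

With three levels the unit box is split into 4^3 = 64 quadrants, each with its own three-digit address.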

After this partitioning, all addresses are transformed by applying a permutation algorithm to the separate digits of each quadrant's address. This transformation introduces stochasticity into the sampling design: without it, the top left quadrant in Figure 3.3 would always have the address 000, the bottom right quadrant would always have the address 333, and so on.
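As an illustration of the digit permutation, the sketch below applies the mapping shown in Figure 3.3 (0 to 2, 1 to 3, 2 to 0, 3 to 1) to the example addresses from that figure. The helper names are hypothetical, and using a single permutation for every digit position is a simplifying assumption taken from the figure; an implementation may well draw separate permutations per level.

```python
import random


def random_digit_permutation(rng):
    """Draw a random one-to-one mapping of the digits 0-3 (hypothetical helper)."""
    digits = ["0", "1", "2", "3"]
    shuffled = digits[:]
    rng.shuffle(shuffled)
    return dict(zip(digits, shuffled))


def permute_address(address, perm):
    """Replace every digit of an address by its image under the permutation."""
    return "".join(perm[d] for d in address)


def invert_permutation(perm):
    """Mapping that undoes `perm`."""
    return {new: old for old, new in perm.items()}


# Permutation and addresses taken from the example in Figure 3.3.
FIGURE_PERM = {"0": "2", "1": "3", "2": "0", "3": "1"}
example = ["301", "213", "013", "221", "130"]
permuted = [permute_address(a, FIGURE_PERM) for a in example]
# permuted == ["123", "031", "231", "003", "312"]
```

Because the permutation is one-to-one, applying `invert_permutation(FIGURE_PERM)` to the permuted addresses recovers the original ones, which is what the final mapping step relies on.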

Next, the permuted addresses are reversed. For example, address 201 becomes 102. Finally, the reversed addresses are placed in increasing order on a line, also known as the real line. Each quadrant obtains a segment of equal length on the real line. A systematic sample is then selected using the design of Brewer and Hanif (1983). In short, to select a systematic sample from the real line, the first segment is chosen as the starting point.

Next, as with systematic sampling, segments are selected at a fixed interval k = N/n, where n is the sample size and N the number of segments. Thus k is the number of segments on the real line from one selected segment to the next. The permuted addresses of the selected segments are transformed back to the original addresses by first undoing the reversal and then applying the same permutation algorithm in the reverse direction. Finally, the selected quadrants are mapped back to the study area using the original addresses.
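The reversal, the sorting and the mapping back to the original addresses can be traced on the example addresses of Figure 3.3. The following is a minimal sketch under the same simplifying assumption as the figure (one permutation applied to every digit); the function names `to_line_address` and `to_original_address` are hypothetical.

```python
# Permutation taken from the example in Figure 3.3 (0->2, 1->3, 2->0, 3->1).
PERM = {"0": "2", "1": "3", "2": "0", "3": "1"}
INVERSE_PERM = {new: old for old, new in PERM.items()}


def to_line_address(address):
    """Permute the digits of an original address, then reverse them."""
    permuted = "".join(PERM[d] for d in address)
    return permuted[::-1]


def to_original_address(line_address):
    """Undo the reversal, then undo the permutation."""
    permuted = line_address[::-1]
    return "".join(INVERSE_PERM[d] for d in permuted)


originals = ["301", "213", "013", "221", "130"]

# Addresses of equal length sort the same way as strings and as numbers.
line_order = sorted(to_line_address(a) for a in originals)
# line_order == ["130", "132", "213", "300", "321"]

# Mapping the ordered line addresses back gives the original quadrants.
recovered = [to_original_address(a) for a in line_order]
# recovered == ["213", "013", "130", "221", "301"]
```

The two functions are exact inverses of each other, so any segment selected on the real line can be traced back to a unique quadrant of the study area.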

GRTS selects well-spread samples because of the hierarchical stratification. Assume, for example, that there is only one level of partitioning, so that there are only four quadrants. If four units are selected, one unit is selected in each quadrant. The remaining part of the algorithm is necessary to decide within which quadrant the first unit is selected and to allow GRTS to perform unequal probability sampling. When the hierarchical partitioning has more than one level, the logic behind the GRTS algorithm becomes more complex. The basic logic is that the first four units are each selected within a different quadrant of the first level of partitioning, the second set of four units is likewise each selected within a different first-level quadrant, and so on. The second level of partitioning ensures that the first four units selected within a given first-level quadrant are each selected within a different second-level quadrant, and so on. An example of a GRTS sample is shown in Figure 2.3a.
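This logic can be checked on a small case. The sketch below lists all 16 addresses of a two-level partitioning in the order of their reversed digits and inspects where consecutive quadrants fall. The random permutation is left out to keep the pattern visible, and the function name is hypothetical.

```python
from itertools import product


def reverse_hierarchical_order(levels):
    """All addresses with `levels` base-4 digits, sorted by their reversed digits."""
    addresses = ["".join(d) for d in product("0123", repeat=levels)]
    return sorted(addresses, key=lambda a: a[::-1])


order = reverse_hierarchical_order(2)

# The first four quadrants each lie in a different first-level quadrant.
first_four = order[:4]          # ['00', '10', '20', '30']

# The same holds for every following block of four.
blocks = [order[i:i + 4] for i in range(0, len(order), 4)]
spread_per_block = [len({a[0] for a in block}) for block in blocks]   # [4, 4, 4, 4]

# Within first-level quadrant 0, successive visits hit different second-level quadrants.
within_quadrant_0 = [a for a in order if a[0] == "0"]   # ['00', '01', '02', '03']
```

Any initial part of this ordering therefore stays spread over the first-level quadrants.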

Because of the way this algorithm works, the sample remains spatially balanced even if sampling is terminated early, but only if the selected units are sampled in the order specified by the GRTS algorithm. In practice, ensuring that points are sampled in such an order could come at a large economic cost given the travel time, and could turn out to be very inefficient.

Figure 3.3 illustrates the selection of an equal probability sample using GRTS. Unequal probability sampling can be implemented as introduced in Brewer and Hanif (1983): the length of each unit's segment on the real line is rescaled in proportion to the inclusion probability of that unit. For example, if unit A has an inclusion probability that is double that of unit B, then the segment of unit A on the real line is twice as long as that of unit B. Figure 3.4 visualizes how unequal probability sampling is implemented in GRTS. With this approach, units with higher inclusion probabilities are more likely to be included in the sample, without loss of spatial balance.
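The rescaling of segment lengths can be sketched as a systematic selection on a line where each unit occupies a segment whose length equals its inclusion probability. This is a generic illustration rather than the exact procedure of Brewer and Hanif (1983): the function name `systematic_on_line` is hypothetical, and the uniformly random start in [0, 1) is an assumption of this sketch. The units are assumed to be supplied in the order in which they lie on the real line.

```python
import random
from itertools import accumulate


def systematic_on_line(units, inclusion_probs, start=None, rng=None):
    """Systematic sample on a line where unit i has a segment of length pi_i.

    Assumes 0 < pi_i <= 1 and that the pi_i sum to the integer sample size n.
    `start` is the position of the first selection point in [0, 1); if it is
    None, a random start is drawn (an assumption of this sketch).
    """
    total = sum(inclusion_probs)
    n = round(total)
    if n < 1 or abs(total - n) > 1e-9:
        raise ValueError("inclusion probabilities must sum to the sample size")
    if any(p <= 0.0 or p > 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    if start is None:
        start = (rng or random.Random()).random()
    upper_ends = list(accumulate(inclusion_probs))  # right end of each segment
    sample, idx = [], 0
    for j in range(n):
        point = start + j  # selection points are one unit of length apart
        # Walk along the line until the segment containing the point is found.
        while idx < len(upper_ends) - 1 and upper_ends[idx] <= point:
            idx += 1
        sample.append(units[idx])
    return sample


# Unit A has twice the inclusion probability of units B, C and D (n = 2).
units = ["A", "B", "C", "D"]
probs = [0.8, 0.4, 0.4, 0.4]
sample = systematic_on_line(units, probs, start=0.5)   # ['A', 'C']
```

With equal inclusion probabilities n/N, all segments have the same length and the procedure reduces to selecting every k-th segment with k = N/n.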

Population Estimation

GRTS has the ability to select an unequal probability sample. For population estimation, standard design-based estimators that allow for unbiased estimation, such as the Horvitz-Thompson (HT) estimator, can be used. Second order inclusion probabilities, however, are difficult to compute for GRTS. Furthermore, since GRTS is a spatially balanced sampling design, the second order inclusion probabilities of neighbouring units are often (near) zero, which makes variance estimators that rely on them unstable. Therefore, Stevens and Olsen (2003) derived the

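[Figure 3.3, panel content: example addresses 301, 213, 013, 221, 130, …; the permutation 0 → 2, 1 → 3, 2 → 0, 3 → 1 gives 123, 031, 231, 003, 312, …; reversing gives 321, 130, 132, 300, 213, …; sorting gives 130, 132, 213, 300, 321, …; units are then selected on the real line (segments 000, 001, 002, 003, 010, …, 330, 331, 332, 333), after which the addresses are reverted and permuted back, and the resulting addresses are used to locate the sampled units.]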

Figure 3.3: Step by step illustration of the GRTS methodology. (1) The study area is rescaled to the unit box and hierarchically partitioned into quadrants, each of which is assigned a hierarchical address. In this case, the first three levels of the hierarchical quadrant partitioning are shown. The address of the shaded unit is 301. (2) The addresses of all the quadrants are transformed using a digit-specific permutation algorithm. Next, the transformed addresses are reversed. Finally, the reversed addresses are sorted and placed on the real line. (3) Units are selected using the method of Brewer and Hanif (1983) for systematic sampling on the real line. (4) The addresses of the selected units are reversed and transformed back, and the units are mapped back onto the study area.

[Figure 3.4, panels: "Equal" and "Unequal"; segments on the real line labelled 000, 001, 002, 003, 010, …, 330, 331, 332, 333.]

Figure 3.4: Illustration of the method of Brewer and Hanif (1983) for systematic sampling on the real line, for equal probability sampling and unequal probability sampling.

local mean variance estimator for the GRTS method. This variance estimator is very similar to the approach suggested for systematic sampling where two units are selected in each stratum. The local mean variance estimator is given by

$$\hat{V}_{NBH}(\hat{Y}) = \sum_{i=1}^{N} \sum_{j \in D_i} w_{ij} \left( \frac{y_i}{\pi_i} - \bar{y}_{D_i} \right)^2 I_s(ij), \tag{3.16}$$

where $D_i$ is the neighbourhood of unit $i$, which contains at least four observed units, and $\bar{y}_{D_i}$ is unit $i$'s neighbourhood mean. The weights $w_{ij}$ decrease as the distance between unit $i$ and unit $j$ increases and satisfy $\sum_{j \in D_i} w_{ij} = 1$. For details on how to compute the weights, see Stevens and Olsen (2003). The local mean variance estimator is not unbiased; it generally tends to overestimate the variance relative to the variance observed in simulations, as shown in Stevens and Olsen (2004), Grafström et al. (2012a) and Robertson et al. (2013). The local mean variance estimator performs best when the selected samples are well spread over the study area. This means that it works well for sampling designs that have small second order inclusion probabilities for units that are near each other, such as spatially balanced sampling designs.
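For the point estimate itself, the HT estimator mentioned at the start of this section weights each observed value by the inverse of its inclusion probability. The sketch below shows this for a population total; the function name and the numbers are illustrative assumptions, not data from any of the cited studies.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """HT estimate of a population total: sum of y_i / pi_i over the sampled units."""
    if len(values) != len(inclusion_probs):
        raise ValueError("one inclusion probability is needed per sampled unit")
    if any(p <= 0.0 or p > 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Illustrative sample of three units (made-up values).
y = [10.0, 4.0, 6.0]
pi = [0.5, 0.25, 0.25]
total_hat = horvitz_thompson_total(y, pi)   # 10/0.5 + 4/0.25 + 6/0.25 = 60.0
```

Only first order inclusion probabilities are needed here, which is why the point estimate is straightforward for GRTS while the variance requires the local mean approach described above.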