Serial dependence or autocorrelation is one of the more common ways that independence can fail. Serial dependence arises when results close in time tend to be too similar (positive dependence) or too dissimilar (negative dependence). Positive dependence is far more common. Serial dependence could result from a “drift” in the measuring instruments, a change in skill of the experimenter, changing environmental conditions, and so on. If the units have no meaningful time order, then there can be no serial dependence.

A graphical method for detecting serial dependence is to plot the residuals on the vertical axis versus time sequence on the horizontal axis. The plot is sometimes called an index plot (that is, residuals-against-time index). Index plots give a visual impression of whether neighbors are too close together (positive dependence), or too far apart (negative dependence). Positive dependence appears as drifting patterns across the plot, while negatively dependent data have residuals that center at zero and rapidly alternate positive and negative.

6.3 Assessing Violations of Assumptions

Table 6.2: Temperature differences in degrees Celsius between two thermocouples for 64 consecutive readings, time order along rows.

3.19 3.15 3.13 3.14 3.14 3.13 3.13 3.11
3.16 3.17 3.17 3.14 3.14 3.14 3.15 3.15
3.14 3.15 3.12 3.05 3.12 3.16 3.15 3.17
3.15 3.16 3.15 3.16 3.15 3.15 3.14 3.14
3.14 3.15 3.13 3.12 3.15 3.17 3.16 3.15
3.13 3.13 3.15 3.15 3.05 3.16 3.15 3.18
3.15 3.15 3.17 3.17 3.14 3.13 3.10 3.14
3.07 3.13 3.13 3.12 3.14 3.15 3.14 3.14
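The index plot described above can be sketched in a few lines of Python. This is only an illustration: the function names and the readings are hypothetical, the residuals are taken as deviations from the overall mean (an assumption that fits a single-mean model, not every design), and the drawing step assumes matplotlib is installed.

```python
def index_plot_points(values):
    """Return (time index, residual) pairs, with residuals taken about the mean."""
    mean = sum(values) / len(values)
    return [(k, v - mean) for k, v in enumerate(values, start=1)]


def show_index_plot(values):
    """Draw residuals against time order; needs matplotlib (imported lazily)."""
    import matplotlib.pyplot as plt

    points = index_plot_points(values)
    plt.plot([k for k, _ in points], [r for _, r in points], marker="o")
    plt.axhline(0.0, linestyle="--")  # reference line at zero residual
    plt.xlabel("Time order")
    plt.ylabel("Residual")
    plt.show()


# Hypothetical readings with a slow upward drift (positive dependence).
readings = [10.0, 10.1, 10.1, 10.3, 10.4, 10.4, 10.6, 10.7]
points = index_plot_points(readings)
for k, r in points:
    print(k, round(r, 3))
```

With drifting data like these, the early residuals are negative and the late ones positive, which is the pattern the plot is meant to expose.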

The Durbin-Watson statistic is a simple numerical method for checking serial dependence. Let $r_k$ be the residuals sorted into time order. Then the Durbin-Watson statistic is:

$$DW = \frac{\sum_{k=1}^{n-1} (r_k - r_{k+1})^2}{\sum_{k=1}^{n} r_k^2}.$$

If there is no serial correlation, DW should be about 2, give or take sampling variation. Positive serial correlation will make DW less than 2, and negative serial correlation will make DW more than 2. As a rough rule, serial correlations corresponding to DW outside the range 1.5 to 2.5 are large enough to have a noticeable effect on our inference techniques. Note that DW itself is random and may be outside the range 1.5 to 2.5, even if the errors are uncorrelated. For data sets with long runs of units from the same treatment, the variance of DW is a bit less than 4/N.
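As a sketch of why DW sits near 2 when there is no serial correlation, the numerator can be expanded. Here $\hat\rho_1$ is notation introduced only for this illustration (it does not appear in the text) for the lag-1 sample autocorrelation of the residuals, and the approximation ignores the end terms $r_1^2$ and $r_n^2$:

$$\sum_{k=1}^{n-1} (r_k - r_{k+1})^2 = \sum_{k=1}^{n-1} r_k^2 + \sum_{k=2}^{n} r_k^2 - 2\sum_{k=1}^{n-1} r_k r_{k+1} \approx 2\sum_{k=1}^{n} r_k^2 - 2\sum_{k=1}^{n-1} r_k r_{k+1},$$

$$DW \approx 2\,(1 - \hat\rho_1), \qquad \hat\rho_1 = \frac{\sum_{k=1}^{n-1} r_k r_{k+1}}{\sum_{k=1}^{n} r_k^2}.$$

Under this approximation, $\hat\rho_1$ near 0 gives DW near 2, positive $\hat\rho_1$ pulls DW below 2, and negative $\hat\rho_1$ pushes it above 2. The rough range 1.5 to 2.5 corresponds to $|\hat\rho_1|$ below about 0.25.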

Example 6.3: Temperature differences

Christensen and Blackwood (1993) provide data from five thermocouples that were inserted into a high-temperature furnace to ascertain their relative bias. Sixty-four temperature readings were taken using each thermocouple, with the readings taken simultaneously from the five devices. Table 6.2 gives the differences between thermocouples 3 and 5.

We can estimate the relative bias by the average of the observed differences. Figure 6.4 shows the residuals (deviations from the mean) plotted in time order. There is a tendency for positive and negative residuals to cluster in time, indicating positive autocorrelation. The Durbin-Watson statistic for these data is 1.5, indicating that the autocorrelation may be strong enough to affect our inferences.

Checking Assumptions

[Figure 6.4 plot: residuals (vertical axis, -0.1 to 0.06) against time order (horizontal axis, 0 to 60).]

Figure 6.4: Deviations from the mean for paired differences of 64 readings from two thermocouples, using MacAnova.
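The Durbin-Watson value quoted for these data can be checked directly from Table 6.2. The following Python sketch is an illustration, not the MacAnova session used in the text. The names `DIFFS` and `durbin_watson` are made up here, and using the number of readings as N in the 4/N yardstick is an assumption.

```python
import math

# Table 6.2: 64 thermocouple differences (degrees Celsius), in time order.
DIFFS = [
    3.19, 3.15, 3.13, 3.14, 3.14, 3.13, 3.13, 3.11,
    3.16, 3.17, 3.17, 3.14, 3.14, 3.14, 3.15, 3.15,
    3.14, 3.15, 3.12, 3.05, 3.12, 3.16, 3.15, 3.17,
    3.15, 3.16, 3.15, 3.16, 3.15, 3.15, 3.14, 3.14,
    3.14, 3.15, 3.13, 3.12, 3.15, 3.17, 3.16, 3.15,
    3.13, 3.13, 3.15, 3.15, 3.05, 3.16, 3.15, 3.18,
    3.15, 3.15, 3.17, 3.17, 3.14, 3.13, 3.10, 3.14,
    3.07, 3.13, 3.13, 3.12, 3.14, 3.15, 3.14, 3.14,
]


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum((a - b) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(r ** 2 for r in residuals)
    return numerator / denominator


# Residuals are deviations from the mean difference (the estimated relative bias).
mean = sum(DIFFS) / len(DIFFS)
residuals = [d - mean for d in DIFFS]
dw = durbin_watson(residuals)

# Rough yardstick from the text: Var(DW) is a bit less than 4/N.
# Treating N as the number of readings is an assumption of this sketch.
approx_se = math.sqrt(4 / len(DIFFS))

print(f"mean difference = {mean:.5f}")
print(f"DW = {dw:.3f}")
print(f"approximate SE = {approx_se:.3f}, (DW - 2) / SE = {(dw - 2) / approx_se:.2f}")
```

Run as written, this gives a DW of about 1.51, in line with the 1.5 reported above. By the rough 4/N yardstick, that is a little under two approximate standard errors below 2.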

Spatial association, another common form of dependence, arises when units are distributed in space and neighboring units have responses more similar than distant units. For example, spatial association might occur in an agronomy experiment when neighboring plots tend to have similar fertility, but distant plots could have differing fertilities.

One method for diagnosing spatial association is the variogram. We make a plot with a point for every pair of units. The plotting coordinates for a pair are the distance between the pair (horizontal axis) and the squared difference between their residuals (vertical axis). If there is a pattern in this figure (for example, the points in the variogram tend to increase with increasing distance), then we have spatial association.
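The construction of the variogram points can be sketched as follows. This is a minimal illustration with hypothetical names and made-up data (four plots in a row whose residuals drift from left to right). It uses Euclidean distance, which the text adopts in Example 6.4 but does not require in general.

```python
import math
from itertools import combinations


def variogram_points(locations, residuals):
    """Return one (distance, squared residual difference) point per pair of units."""
    points = []
    for i, j in combinations(range(len(locations)), 2):
        distance = math.dist(locations[i], locations[j])  # Euclidean distance
        sq_diff = (residuals[i] - residuals[j]) ** 2
        points.append((distance, sq_diff))
    return points


# Hypothetical example: four plots in a row, residuals drifting left to right.
locations = [(0, 0), (1, 0), (2, 0), (3, 0)]
residuals = [-1.5, -0.5, 0.5, 1.5]
points = variogram_points(locations, residuals)
for distance, sq_diff in sorted(points):
    print(distance, sq_diff)
```

In this made-up example the squared differences grow with distance, which is the increasing pattern that signals spatial association. With n units there are n(n-1)/2 points, which is why the raw plot gets crowded quickly.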

This plot can look pretty messy, so we usually do some averaging. Let $D_{\max}$ be the maximum distance between a pair of units. Choose some number of bins $K$, say 10 or 15, and then divide the distance values into $K$ groups. Now plot the average of the squared difference in residuals for each group of pairs. This plot should be roughly flat for data with no spatial association; it will usually have small average squared differences for small distances when there is spatial association.

[Figure 6.5 plot: chip positions, x (horizontal axis, 1 to 9) against y (vertical axis, 1 to 8), each chip marked 1 or 0.]

Figure 6.5: Horizontal (x) and vertical (y) locations of good (1) and bad (0) integrated circuits on a wafer.

Example 6.4: Defective integrated circuits on a wafer

Taam and Hamada (1993) provide an example from the manufacture of integrated circuit chips. Many IC chips are made on a single silicon wafer, from which the individual ICs are cut after manufacture. Figure 6.5 (Taam and Hamada’s Figure 1) shows the location of good (1) and bad (0) chips on a single wafer.

Describe the location of each chip by its x (1 to 9) and y (1 to 8) coordinates, and compute distances between pairs of chips using the usual Euclidean distance. Bin the pairs into those with distances from 1 to 2, 2 to 3, and so on. Figure 6.6 shows the variogram with this binning. We see that chips close together, and also chips far apart, tend to be more similar than those at intermediate distances. The similarity close together arises because the good chips are clustered together on the wafer. The similarity at large distances arises because almost all the edge chips are bad, and the only way to get a pair with a large distance is for the pair to span the wafer completely.
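The unit-width binning used in this example can be sketched as follows. The wafer below is a hypothetical 4 by 4 layout, not the Taam and Hamada data, and the function name is made up. The sketch also assumes the residuals are deviations from a single overall mean, in which case the difference between two residuals equals the difference between the two 0/1 values.

```python
import math
from collections import defaultdict
from itertools import combinations


def unit_bin_variogram(chips):
    """chips maps (x, y) -> 1 (good) or 0 (bad).

    Returns {d: average squared difference} over the pairs whose Euclidean
    distance lies in [d, d + 1), for d = 1, 2, ...
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (p, a), (q, b) in combinations(chips.items(), 2):
        d = math.floor(math.dist(p, q))
        totals[d] += (a - b) ** 2  # 1 if the chips differ, 0 if they agree
        counts[d] += 1
    return {d: totals[d] / counts[d] for d in sorted(counts)}


# Hypothetical wafer (NOT the Taam and Hamada data): good chips in the center.
layout = [
    "0000",
    "0110",
    "0110",
    "0000",
]
chips = {
    (x + 1, y + 1): int(ch)
    for y, row in enumerate(layout)
    for x, ch in enumerate(row)
}
result = unit_bin_variogram(chips)
for d, avg in result.items():
    print(f"distance {d} to {d + 1}: {avg:.3f}")
```

For 0/1 responses the squared difference is 1 when the two chips differ and 0 when they agree, so each binned average is simply the fraction of pairs in that distance range that disagree.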

[Figure 6.6 plot: average squared difference (vertical axis, 0.3 to 0.55) against distance (horizontal axis, 1 to 8).]

Figure 6.6: Variogram for chips on a wafer.
