
In many cases, assumptions on distributional characteristics are difficult to verify or difficult to satisfy for both populations. In this case, several distribution-free test procedures are available that compare the shape and location of the two distributions instead of a statistical parameter (such as a mean or median). The statistical tests described below test the null hypothesis "H0: the distributions of population 1 and population 2 are identical (or, the site is not more contaminated than background)" versus the alternative hypothesis "HA: part of the distribution of population 1 is located to the right of the distribution of population 2 (or the site is more contaminated than background)." Because of the structure of the hypothesis tests, the labeling of populations 1 and 2 is of importance. For most environmental applications, population 1 is the area of interest (i.e., the potentially contaminated area) and population 2 is the reference area.

There is no formal statistical parameter of interest in the hypotheses stated above. However, the concept of false rejection and false acceptance error rates still applies.

3.3.3.1 The Wilcoxon Rank Sum Test

PURPOSE

The Wilcoxon rank sum test can be used to compare two population distributions based on m independent random samples X1, X2, . . . , Xm from the first population, and n independent random samples Y1, Y2, . . . , Yn from the second population. When applied with the Quantile test (Section 3.3.3.2), the combined tests are most powerful for detecting true differences between two population distributions.

ASSUMPTIONS AND THEIR VERIFICATION

The validity of the random sampling and independence assumptions should be verified by review of the procedures used to select the sampling points. The two underlying distributions are assumed to have the same shape and dispersion, so that one distribution differs by some fixed amount (or is increased by a constant) when compared to the other distribution. For large samples, to test whether both site distributions have approximately the same shape, one can create and compare histograms for the samples.
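As an illustration of the histogram comparison, the following minimal sketch bins two samples on a shared set of equal-width intervals so that the counts can be compared side by side. The function name `histogram_counts`, the choice of four bins, and the data values are assumptions made for illustration only; they do not come from this guidance.

```python
def histogram_counts(sample, low, high, bins):
    """Count observations in `bins` equal-width intervals covering [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in sample:
        # A value equal to `high` is placed in the last bin.
        index = min(int((x - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical measurements, for illustration only.
site = [2.0, 3.5, 4.0, 4.5, 5.0, 6.5, 7.0, 9.0]
background = [1.0, 2.5, 3.0, 3.5, 4.0, 5.5, 6.0, 8.0]

# Use common bin edges so the two histograms are directly comparable.
low = min(site + background)
high = max(site + background)
for name, sample in (("site", site), ("background", background)):
    counts = histogram_counts(sample, low, high, bins=4)
    print(name, counts, ["#" * c for c in counts])
```

Using the same bin edges for both samples matters here: a difference in shape is only visible when the intervals line up.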

LIMITATIONS AND ROBUSTNESS

The Wilcoxon rank sum test may produce misleading results if many data values are the same. When values are the same, their relative ranks are the same, and this has the effect of diluting the statistical power of the Wilcoxon rank sum test. Estimated concentrations should be reported for data below the detection limit, even if these estimates are negative, because their relative magnitude to the rest of the data is of importance. An important advantage of the Wilcoxon rank sum test is its partial robustness to outliers, because the analysis is conducted in terms of rankings of the observations. This limits the influence of outliers because a given data point can be no more extreme than the first or last rank.
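The point about outliers can be seen in a small sketch: inflating the largest observation changes the mean considerably but leaves every rank unchanged. The helper `simple_ranks` and the data are hypothetical, and the helper assumes there are no tied values.

```python
def simple_ranks(values):
    """Rank distinct values from 1 (smallest) to len(values) (largest)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


# Hypothetical data; the second list inflates the largest value 100-fold.
data = [3.0, 7.0, 12.0, 26.0]
with_outlier = [3.0, 7.0, 12.0, 2600.0]

print(sum(data) / len(data), simple_ranks(data))
print(sum(with_outlier) / len(with_outlier), simple_ranks(with_outlier))
```

Both lists produce the same ranks, so any statistic built from the ranks is unaffected by how extreme the largest value is.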

SEQUENCE OF STEPS

Directions and an example for the Wilcoxon rank sum test are given in Box 3-20 and Box 3-21. However, if a relatively large number of samples have been taken, it is more efficient in terms of statistical power to use a large sample approximation to the Wilcoxon rank sum test (Box 3-22) to obtain the critical values of W.

Box 3-20: Directions for the Wilcoxon Rank Sum Test for Simple and Systematic Random Samples

Let X1, X2, . . . , Xn represent the n data points from population 1 and Y1, Y2, . . . , Ym represent the m data points from population 2 where both n and m are less than or equal to 20. For Case 1, the null hypothesis will be that population 1 is shifted to the left of population 2 with the alternative that population 1 is either the same as or shifted to the right of population 2; for Case 2, the null hypothesis will be that population 1 is shifted to the right of population 2 with the alternative that population 1 is the same as or shifted to the left of population 2; for Case 3, the null hypothesis will be that there is no difference between the two populations and the alternative hypothesis will be that population 1 is shifted either to the right or left of population 2. If either m or n is larger than 20, use Box 3-22.

STEP 1: List and rank the measurements from both populations from smallest to largest, keeping track of which population contributed each measurement. The rank of 1 is assigned to the smallest value, the rank of 2 to the second smallest value, and so forth. If there are ties, assign the average of the ranks that would otherwise have been assigned to the tied observations.

STEP 2: Calculate R as the sum of the ranks of the data from population 1, then calculate

$$W = R - \frac{n(n+1)}{2}.$$

STEP 3: Use Table A-7 of Appendix A to find the critical value $w_\alpha$ (or $w_{\alpha/2}$ for Case 3). For Case 1, reject the null hypothesis if $W > nm - w_\alpha$. For Case 2, reject the null hypothesis if $W < w_\alpha$. For Case 3, reject the null hypothesis if $W > nm - w_{\alpha/2}$ or $W < w_{\alpha/2}$. If the null hypothesis is rejected, go to Step 5. Otherwise, go to Step 4.

STEP 4: If the null hypothesis (H0) was not rejected, the power of the test or the sample size necessary to achieve the false rejection and false acceptance error rates should be calculated. For small sample sizes, these calculations are too complex for this document.

STEP 5: The results of the test could be:

1) the null hypothesis was rejected and it seems that population 1 is shifted to the right (Case 1), to the left (Case 2) or to the left or right (Case 3) of population 2.

2) the null hypothesis was not rejected and it seems that population 1 is shifted to the left (Case 1) or to the right (Case 2) of population 2, or there is no difference between the two populations (Case 3).
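Steps 1 through 3 of Box 3-20 can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`midranks`, `wilcoxon_w`, `reject_null`) and the data are hypothetical, and the critical value must still be read from Table A-7 and passed in by the user.

```python
def midranks(values):
    """Return ranks (1 = smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + 1 + end + 1) / 2  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def wilcoxon_w(pop1, pop2):
    """Steps 1-2: R = rank sum of population 1, W = R - n(n + 1)/2."""
    n = len(pop1)
    ranks = midranks(list(pop1) + list(pop2))
    r = sum(ranks[:n])
    return r, r - n * (n + 1) / 2


def reject_null(w, n, m, critical, case):
    """Step 3: `critical` is w_alpha (w_alpha/2 for Case 3) from Table A-7."""
    if case == 1:
        return w > n * m - critical
    if case == 2:
        return w < critical
    if case == 3:
        return w > n * m - critical or w < critical
    raise ValueError("case must be 1, 2, or 3")


# Hypothetical data, for illustration only.
pop1 = [4.0, 6.0, 9.0]
pop2 = [1.0, 2.0, 6.0, 7.0]
r, w = wilcoxon_w(pop1, pop2)
print(r, w)  # 14.5 8.5
```

In the hypothetical data the value 6.0 appears in both samples, so the two tied observations each receive the average rank 4.5.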

EPA QA/G-9 Final

QA00 Version 3 - 33 July 2000

Box 3-21: An Example of the Wilcoxon Rank Sum Test for Simple and Systematic Random Samples

At a hazardous waste site, area 1 (cleaned using an in-situ methodology) was compared with a similar (but relatively uncontaminated) reference area, area 2. If the in-situ methodology worked, then the two sites should be approximately equal in average contaminant levels. If the methodology did not work, then area 1 should have a higher average than the reference area. The null hypothesis will be that area 1 is shifted to the right of area 2 and the alternative hypothesis will be that there is no difference between the two areas or that area 1 is shifted to the left of area 2 (Case 2). The false rejection error rate was set at 10% and the false acceptance error rate was set at 20% ($\beta$) if the difference between the areas is 2.5 ppb. Seven random samples were taken from area 1 and eight samples were taken from area 2:

Area 1: 17, 23, 26, 5, 13, 13, 12

Area 2: 16, 20, 5, 4, 8, 10, 7, 3

STEP 1: The data listed and ranked by size are (Area 1 denoted by *):

Data (ppb): 3, 4, 5, 5*, 7, 8, 10, 12*, 13*, 13*, 16, 17*, 20, 23*, 26*

Rank: 1, 2, 3.5, 3.5*, 5, 6, 7, 8*, 9.5*, 9.5*, 11, 12*, 13, 14*, 15*

STEP 2: R = 3.5 + 8 + 9.5 + 9.5 + 12 + 14 + 15 = 71.5. W = 71.5 - 7(7 + 1)/2 = 43.5.

STEP 3: Using Table A-7 of Appendix A, $\alpha$ = 0.10 and $w_\alpha$ = 17. Since 43.5 > 17, do not reject the null hypothesis.

STEP 4: The null hypothesis was not rejected and it would be appropriate to calculate the probable power of the test. However, because the number of samples is small, extensive computer simulations are required to estimate the power of this test, which is beyond the scope of this guidance.

STEP 5: The null hypothesis was not rejected. Therefore, it is likely that there is no difference between the investigated area and the reference area, although the statistical power is low due to the small sample sizes involved.
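The value W = 43.5 in this example can be cross-checked without any ranking. W as defined in Box 3-20 equals the number of pairs (one value from area 1, one from area 2) in which the area 1 value is larger, with each tied pair counted as one half; this is the Mann-Whitney form of the same statistic. The sketch below applies that counting rule to the data of Box 3-21; the function name `w_by_pair_counting` is hypothetical.

```python
area1 = [17, 23, 26, 5, 13, 13, 12]   # Box 3-21, area 1 (ppb)
area2 = [16, 20, 5, 4, 8, 10, 7, 3]   # Box 3-21, area 2 (ppb)


def w_by_pair_counting(pop1, pop2):
    """Count pairs (x, y) with x > y; a tied pair counts as one half."""
    total = 0.0
    for x in pop1:
        for y in pop2:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total


w = w_by_pair_counting(area1, area2)
print(w)  # 43.5
```

The count ranges from 0 to nm (here 7 x 8 = 56), which is why the Case 1 and Case 3 rejection rules in Box 3-20 compare W against nm minus the tabled critical value.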

Box 3-22: Directions for the Large Sample Approximation to the Wilcoxon Rank Sum Test for Simple and Systematic Random Samples

Let X1, X2, . . . , Xn represent the n data points from population 1 and Y1, Y2, . . . , Ym represent the m data points from population 2 where both n and m are greater than 20. For Case 1, the null hypothesis will be that population 1 is shifted to the left of population 2 with the alternative that population 1 is the same as or shifted to the right of population 2; for Case 2, the null hypothesis will be that population 1 is shifted to the right of population 2 with the alternative that population 1 is the same as or shifted to the left of population 2; for Case 3, the null hypothesis will be that there is no difference between the populations and the alternative hypothesis will be that population 1 is shifted either to the right or left of population 2.

STEP 1: List and rank the measurements from both populations from smallest to largest, keeping track of which population contributed each measurement. The rank of 1 is assigned to the smallest value, the rank of 2 to the second smallest value, and so forth. If there are ties, assign the average of the ranks that would otherwise have been assigned to the tied observations.

STEP 2: Calculate R as the sum of the ranks of the data from population 1, then calculate W = R - n(n + 1)/2, as in Box 3-20.

STEP 3: Calculate

$$w_p = \frac{mn}{2} + z_p \sqrt{\frac{mn(n + m + 1)}{12}}$$

where $p = 1 - \alpha$ for Case 1, $p = \alpha$ for Case 2, and $z_p$ is the pth percentile of the standard normal distribution (Table A-1 of Appendix A). For Case 3, calculate both $w_{\alpha/2}$ ($p = \alpha/2$) and $w_{1-\alpha/2}$ ($p = 1 - \alpha/2$).

STEP 4: For Case 1, reject the null hypothesis if $W > w_{1-\alpha}$. For Case 2, reject the null hypothesis if $W < w_\alpha$. For Case 3, reject the null hypothesis if $W > w_{1-\alpha/2}$ or $W < w_{\alpha/2}$. If the null hypothesis is rejected, go to Step 6. Otherwise, go to Step 5.

STEP 5: If the null hypothesis (H0) was not rejected, calculate either the power of the test or the sample size necessary to achieve the false rejection and false acceptance error rates. If only one false acceptance error rate ($\beta$) has been specified (at $\delta_1$), it is possible to calculate the sample size that achieves the DQOs, assuming the true mean and standard deviation are equal to the values estimated from the sample, instead of calculating the power of the test. If m and n are large, calculate:

$$m^* = n^* = \frac{2s^2 (z_{1-\alpha} + z_{1-\beta})^2}{(\delta_1 - \delta_0)^2} + (0.25)\, z_{1-\alpha}^2$$

where $z_p$ is the pth percentile of the standard normal distribution (Table A-1 of Appendix A). If $1.16\,m^* \le m$ and $1.16\,n^* \le n$, the false acceptance error rate has been satisfied.

STEP 6: The results of the test could be:

1) the null hypothesis was rejected, and it seems that population 1 is shifted to the right (Case 1), to the left (Case 2) or to the left or right (Case 3) of population 2.

2) the null hypothesis was not rejected, the false acceptance error rate was satisfied, and it seems that population 1 is shifted to the left (Case 1) or to the right (Case 2) of population 2, or there is no difference between the two populations (Case 3).

3) the null hypothesis was not rejected, the false acceptance error rate was not satisfied, and it seems that population 1 is shifted to the left (Case 1) or to the right (Case 2) of population 2, or there is no difference between the two populations (Case 3).
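The large-sample critical value of Step 3 and the sample-size check of Step 5 can be sketched with the Python standard library (`statistics.NormalDist`, available from Python 3.8) in place of Table A-1. The function names and the example inputs are hypothetical. The sketch assumes that W is on the scale W = R - n(n + 1)/2 used in Box 3-20, since mn/2 is the mean of that statistic when the two populations are identical.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(n, m, p):
    """Step 3: w_p = mn/2 + z_p * sqrt(mn(n + m + 1)/12)."""
    z_p = NormalDist().inv_cdf(p)  # p-th percentile of the standard normal
    return m * n / 2 + z_p * sqrt(m * n * (n + m + 1) / 12)


def required_sample_size(s, alpha, beta, delta1, delta0):
    """Step 5: m* = n* = 2 s^2 (z_{1-a} + z_{1-b})^2 / (d1 - d0)^2 + 0.25 z_{1-a}^2."""
    z = NormalDist().inv_cdf
    z_a, z_b = z(1 - alpha), z(1 - beta)
    return 2 * s**2 * (z_a + z_b) ** 2 / (delta1 - delta0) ** 2 + 0.25 * z_a**2


def false_acceptance_satisfied(n, m, n_star):
    """Step 5 check: 1.16 m* <= m and 1.16 n* <= n, with m* = n* = n_star."""
    return 1.16 * n_star <= m and 1.16 * n_star <= n


# Hypothetical inputs: 25 samples per population, alpha = 0.10, beta = 0.20,
# sample standard deviation 1.0, and delta1 - delta0 = 1.0.
n = m = 25
alpha = 0.10
lower = critical_value(n, m, alpha)       # Case 2 critical value w_alpha
upper = critical_value(n, m, 1 - alpha)   # Case 1 critical value w_{1-alpha}
n_star = required_sample_size(1.0, alpha, 0.20, 1.0, 0.0)
print(lower, upper, n_star, false_acceptance_satisfied(n, m, n_star))
```

Because the normal approximation is symmetric about mn/2, the Case 1 and Case 2 critical values are mirror images of each other around that center.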


3.3.3.2 The Quantile Test

PURPOSE

The Quantile test can be used to compare two populations based on the independent random samples X1, X2, . . ., Xm from the first population and Y1, Y2, . . ., Yn from the second population. When the Quantile test and the Wilcoxon rank sum test (Section 3.3.3.1) are applied together, the combined tests are the most powerful at detecting true differences between two populations. The Quantile test is useful in detecting instances where only parts of the data are different rather than a complete shift in the data. It essentially looks at a certain number of the largest data values to determine if too many data values from one population are present to be accounted for by pure chance.

ASSUMPTIONS AND THEIR VERIFICATION

The Quantile test assumes that the data X1, X2, . . ., Xm are a random sample from population 1, and the data Y1, Y2, . . ., Yn are a random sample from population 2, and the two random samples are independent of one another. The validity of the random sampling and independence assumptions is assured by using proper randomization procedures, either random number generators or tables of random numbers. The primary verification required is to review the procedures used to select the sampling points. The two underlying distributions are assumed to have the same underlying dispersion (variance).

LIMITATIONS AND ROBUSTNESS

The Quantile test is not robust to outliers. In addition, the test assumes either a systematic (e.g., a triangular grid) or simple random sampling was employed. The Quantile test may not be used for stratified designs. In addition, exact false rejection error rates are not available, only approximate rates.

SEQUENCE OF STEPS

The Quantile test is difficult to implement by hand. Therefore, directions are not included in this guidance but the DataQUEST software (EPA, 1996) can be used to conduct this test. However, directions for a modified Quantile test that can be implemented by hand are contained in Box 3-23 and an example is given in Box 3-24.

Box 3-23: Directions for a Modified Quantile Test for Simple and Systematic Random Samples

Let there be 'm' measurements from population 1 (the reference area or group) and 'n' measurements from population 2 (the test area or group). The Modified Quantile test can be used to detect differences in shape and location of the two distributions. For this test, the significance level ($\alpha$) can either be approximately 0.10 or approximately 0.05. The null hypothesis for this test is that the two populations are the same (i.e., the test group is the same as the reference group) and the alternative is that population 2 has larger measurements than population 1 (i.e., the test group has larger values than the reference group).

STEP 1: Combine the two samples and order them from smallest to largest keeping track of which sample a value came from.

STEP 2: Using Table A-13 of Appendix A, determine the critical number (C) for a sample size n from the reference area, sample size m from the test area using the significance level $\alpha$. If the Cth largest measurement of the combined population is the same as others, increase C to include all of these tied values.

STEP 3: If the largest C measurements from the combined samples are all from population 2 (the test group), then reject the null hypothesis and conclude that there are differences between the two populations. Otherwise, the null hypothesis is not rejected and it appears that there is no difference between the two populations.
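The three steps of Box 3-23 can be sketched as follows. This is an illustrative sketch only: the function name `modified_quantile_test` and the data are hypothetical, and the critical number C is not computed here; it must be read from Table A-13 and supplied by the user.

```python
def modified_quantile_test(reference, test, c):
    """Return True if the null hypothesis is rejected.

    `c` is the critical number C read from Table A-13 of Appendix A.
    """
    # Step 1: combine the samples, keeping track of the source of each value.
    combined = [(v, "reference") for v in reference] + [(v, "test") for v in test]
    if not 1 <= c <= len(combined):
        raise ValueError("c must be between 1 and the combined sample size")
    combined.sort(key=lambda pair: pair[0], reverse=True)  # largest first

    # Step 2: extend C to include every value tied with the C-th largest.
    cutoff = combined[c - 1][0]
    largest = [pair for pair in combined if pair[0] >= cutoff]

    # Step 3: reject only if all of the largest values come from the test group.
    return all(source == "test" for _, source in largest)


# Hypothetical data and critical numbers, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
test = [3.0, 5.0, 6.0, 7.0]
print(modified_quantile_test(reference, test, c=3))  # True: 7, 6, 5 are all test values
print(modified_quantile_test(reference, test, c=4))  # False: the 4th largest (4.0) is a reference value
```

The tie rule in Step 2 can only make rejection harder, because it adds values to the set that must all come from the test group.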
