BACKGROUND AND PURPOSE
Another test for normality that is essentially equivalent to the Shapiro-Wilk and Shapiro-Francía tests is the probability plot correlation coefficient test described by Filliben (1975). This test meshes perfectly with the use of probability plots, because the essence of the test is to compute the usual correlation coefficient for points on a probability plot. Since the correlation coefficient is a measure of the linearity of the points on a scatterplot, the probability plot correlation coefficient, like the SW test statistic, will be high when the plotted points fall along a straight line and low when there are significant bends and curves in the probability plot. Comparison of the Shapiro-Wilk and probability plot correlation coefficient tests has indicated very similar statistical power for detecting non-normality (Ryan and Joiner, 1990).
It should be noted that although some statistical software may not compute Filliben’s test directly, the usual Pearson’s correlation coefficient computed on the data pairs used to construct a probability plot will provide a very close approximation to the Filliben statistic. Some users may find this latter correlation easier to compute or more accessible in their software.
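As a rough illustration of that shortcut, the sketch below computes the ordinary Pearson correlation between the ordered data and the normal quantiles of a set of plotting positions. The function name `probability_plot_correlation` and the default plotting position (i - 0.375)/(n + 0.25) are assumptions chosen for illustration, not something this guidance prescribes; substitute whatever positions your probability plot actually uses.

```python
from math import sqrt
from statistics import NormalDist


def probability_plot_correlation(data, a=0.375):
    """Pearson correlation of the (normal quantile, ordered value) pairs.

    `a` sets the plotting position (i - a) / (n + 1 - 2a). The default is
    an illustrative assumption, not a value taken from the guidance.
    """
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    qs = [std_normal.inv_cdf((i - a) / (n + 1 - 2 * a)) for i in range(1, n + 1)]

    # Ordinary Pearson correlation written out from sums of squares.
    x_bar = sum(xs) / n
    q_bar = sum(qs) / n
    sxy = sum((x - x_bar) * (q - q_bar) for x, q in zip(xs, qs))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sqq = sum((q - q_bar) ** 2 for q in qs)
    return sxy / sqrt(sxx * sqq)
```

Because the correlation coefficient is unaffected by shifting or rescaling either axis, only the choice of plotting positions separates this value from Filliben's statistic.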
PROCEDURE
Step 1. List the observations in order from smallest to largest, denoting x(i) as the ith smallest rank statistic in the data set. Then let n = sample size and compute the sample mean (x̄) and the standard deviation (s).
Step 2. Consider a random sample drawn from a standard normal distribution. The ith rank statistic of this sample is fixed once the sample is drawn, but beforehand it can be considered a random variable, denoted as X(i). Likewise, by considering all possible datasets of size n that might be
drawn from the normal distribution, one can think of the sampling distribution of the statistic X(i). This sampling distribution has its own mean and variance, and, of importance to the
probability plot correlation coefficient, its own median, which can be denoted Mi.
To compute the median of the ith rank statistic, first compute intermediate probabilities mi for i = 1…n using the equation:

$$m_i = \begin{cases} 1 - (.5)^{1/n} & \text{for } i = 1 \\ \dfrac{i - .3175}{n + .365} & \text{for } 1 < i < n \\ (.5)^{1/n} & \text{for } i = n \end{cases}$$ [10.13]

Then compute the medians Mi as the standard normal quantiles or z-scores associated with the intermediate probabilities mi. These can be determined from Table 10-1 in Appendix D or computed according to the following equation, where Φ−1 represents the inverse of the standard normal distribution:

$$M_i = \Phi^{-1}(m_i)$$ [10.14]
Step 3. With the rank statistic medians in hand, calculate the arithmetic mean of the Mi’s, denoted M̄, and the intermediate quantity Cn, given by the equation:

$$C_n = \sqrt{\sum_{i=1}^{n} M_i^2 - n\bar{M}^2}$$ [10.15]

Note that when the dataset is “complete” (meaning it contains no non-detects, ties, or censored values), the mean of the order statistic medians reduces to M̄ = 0. This in turn reduces the calculation of Cn to:

$$C_n = \sqrt{\sum_{i=1}^{n} M_i^2}$$ [10.16]

Step 4. Finally compute Filliben’s probability plot correlation coefficient:
$$r = \frac{\sum_{i=1}^{n} x_{(i)} M_i - n\bar{x}\bar{M}}{C_n \cdot s\sqrt{n-1}}$$ [10.17]

When the dataset is complete, the equation for the probability plot correlation coefficient also has a simplified form:

$$r = \frac{\sum_{i=1}^{n} x_{(i)} M_i}{C_n \cdot s\sqrt{n-1}}$$ [10.18]

Step 5. Given the level of significance (α), determine the critical point (rcp) for Filliben’s test with sample size n from Table 10-5 in Appendix D. Compare the probability plot correlation coefficient (r) against the critical point (rcp). If r ≥ rcp, conclude that normality is a reasonable model for the underlying population at the α-level of significance. If, however, r < rcp, reject the null hypothesis and conclude that another distributional model would provide a better fit.
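Steps 1 through 5 can be sketched in a few lines of Python using only the standard library. The function names below (`order_statistic_medians`, `filliben_r`, `normality_is_plausible`) are hypothetical names chosen for this illustration, and the critical point is left as an argument because it must still be looked up in Table 10-5 for the given n and α. Treat this as a minimal sketch of the general equations [10.13], [10.15], and [10.17], assuming n ≥ 3 and no non-detects, not as a reference implementation.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def order_statistic_medians(n):
    """Step 2: intermediate probabilities m_i [10.13], then M_i as normal quantiles."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)   # case i = n
    m[0] = 1.0 - m[-1]         # case i = 1
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    return [std_normal.inv_cdf(p) for p in m]


def filliben_r(data):
    """Steps 1-4: probability plot correlation coefficient via [10.15] and [10.17]."""
    xs = sorted(data)                          # Step 1: order the observations
    n = len(xs)
    x_bar, s = mean(xs), stdev(xs)             # stdev uses the n - 1 divisor
    medians = order_statistic_medians(n)       # Step 2
    m_bar = sum(medians) / n                   # Step 3
    c_n = sqrt(sum(v * v for v in medians) - n * m_bar ** 2)
    numerator = sum(x * v for x, v in zip(xs, medians)) - n * x_bar * m_bar
    return numerator / (c_n * s * sqrt(n - 1))  # Step 4


def normality_is_plausible(r, r_cp):
    """Step 5: r_cp is supplied by the caller from the critical-point table."""
    return r >= r_cp
```

Using the general forms [10.15] and [10.17] rather than the simplified ones costs nothing here, since the extra terms simply vanish when the mean of the medians is zero.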
►EXAMPLE 10-3
Use the data of Example 10-1 to compute Filliben’s probability plot correlation coefficient test at the α = .01 level of significance.
SOLUTION
Step 1. Order and rank the nickel data from smallest to largest and list, as in the table below. The sample size is n = 20, with sample mean x̄ = 169.52 and the standard deviation s = 259.72.
Step 2. Compute the intermediate probabilities mi from equation [10.13] for each i in column 3 and the rank statistic medians, Mi, in column 4 by applying the inverse normal transformation to column 3.
Step 3. Since this sample contains no non-detects or ties, the simplified equations for Cn in equation [10.16] and for r in equation [10.18] may be used. First compute Cn using the squared order statistic medians in column 5:

$$C_n = \sqrt{3.328 + 1.926 + \cdots + 3.328} = 4.138$$
Step 4. Next compute the products x(i) × Mi in column 6 and sum to get the numerator of the correlation coefficient (equal to 3,836.81 in this case). Then compute the final correlation coefficient:

$$r = \frac{3{,}836.81}{4.138 \times 259.72\sqrt{19}} = 0.819$$

 i     x(i)      mi        Mi      (Mi)2    x(i) × Mi
 1      1.0    .03406   –1.8242    3.328      –1.824
 2      3.1    .08262   –1.3877    1.926      –4.302
 3      8.7    .13172   –1.1183    1.251      –9.729
 4     10.0    .18082   –0.9122    0.832      –9.122
 5     14.0    .22993   –0.7391    0.546     –10.347
 6     19.0    .27903   –0.5857    0.343     –11.129
 7     21.4    .32814   –0.4451    0.198      –9.524
 8     27.0    .37724   –0.3127    0.098      –8.444
 9     39.0    .42634   –0.1857    0.034      –7.242
10     56.0    .47545   –0.0616    0.004      –3.448
11     58.8    .52455    0.0616    0.004       3.621
12     64.4    .57366    0.1857    0.034      11.959
13     81.5    .62276    0.3127    0.098      25.488
14     85.6    .67186    0.4451    0.198      38.097
15    151.0    .72097    0.5857    0.343      88.445
16    262.0    .77007    0.7391    0.546     193.638
17    331.0    .81918    0.9122    0.832     301.953
18    578.0    .86828    1.1183    1.251     646.376
19    637.0    .91738    1.3877    1.926     883.941
20    942.0    .96594    1.8242    3.328    1718.408
Step 5. Compare Filliben’s test statistic of r = 0.819 to the 1% critical point for a sample of size 20 in Table 10-5 of Appendix D, namely rcp = 0.925. Since r < 0.925, the sample shows significant evidence of non-normality by the probability plot correlation coefficient. The data should be transformed and the correlation coefficient re-calculated before proceeding with further statistical analysis. ◄
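The arithmetic of Example 10-3 can be checked with a short script. The sketch below re-enters the twenty nickel values from the x(i) column of the table and recomputes r, which should come out near the 0.819 reported in Step 4. It then repeats the calculation on the natural logarithms of the data. The choice of a log transformation is an assumption made here for illustration, since the example says only that the data should be transformed, and the names `NICKEL`, `R_CP`, and `filliben_r` are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist, mean, stdev

# Nickel values copied from the x(i) column of the Example 10-3 table.
NICKEL = [1.0, 3.1, 8.7, 10.0, 14.0, 19.0, 21.4, 27.0, 39.0, 56.0,
          58.8, 64.4, 81.5, 85.6, 151.0, 262.0, 331.0, 578.0, 637.0, 942.0]

# 1% critical point for n = 20, as quoted in Step 5 of the example.
R_CP = 0.925


def filliben_r(data):
    """Probability plot correlation coefficient via [10.13], [10.15], [10.17]."""
    xs = sorted(data)
    n = len(xs)
    # Intermediate probabilities m_i from [10.13].
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    # Order statistic medians M_i as standard normal quantiles.
    medians = [NormalDist().inv_cdf(p) for p in m]
    m_bar = sum(medians) / n
    c_n = sqrt(sum(v * v for v in medians) - n * m_bar ** 2)
    numerator = sum(x * v for x, v in zip(xs, medians)) - n * mean(xs) * m_bar
    return numerator / (c_n * stdev(xs) * sqrt(n - 1))


r_raw = filliben_r(NICKEL)
# Assumed transformation: natural log (not specified in the example itself).
r_log = filliben_r([log(x) for x in NICKEL])

print(f"raw data: r = {r_raw:.3f}, reject normality: {r_raw < R_CP}")
print(f"log data: r = {r_log:.3f}, reject normality: {r_log < R_CP}")
```

The same critical point applies to both runs because the sample size and significance level are unchanged; only the data values fed into the correlation differ.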