32.07 position =
Class interval that contain
Class boundary that contain =24.5 -29.5
26.81
The range is defined as the difference between the largest and smallest observations in the data. It is the crudest measure of dispersion. Because the range is a measure of absolute dispersion, it cannot usefully be employed to compare the variability of two distributions expressed in different units.
Range = $X_{\max} - X_{\min}$
where $X_{\max}$ = highest (maximum) value in the given distribution and $X_{\min}$ = lowest (minimum) value in the given distribution.
In the example given above (the two data sets):
The range of the data in set 1 is 70 − 30 = 40.
The range of the data in set 2 is 53 − 48 = 5.

Characteristics
i. Since it is based upon the two extreme cases in the entire distribution, the range may change considerably if either of the extreme cases happens to drop out, while the removal of any other case would not affect it at all.
ii. It wastes information, for it takes no account of the rest of the data.
iii. The extreme values may be unreliable; that is, they are the most likely to be faulty.
iv. It is not suitable for the mathematical treatment required in deriving the techniques of statistical inference.
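Characteristic (i) can be illustrated with a minimal Python sketch. The observations below are made up for illustration (they are not the two data sets referred to above), and `value_range` is a hypothetical helper name.

```python
def value_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)

# Hypothetical observations, chosen only for illustration
scores = [30, 45, 50, 55, 70]

full_range = value_range(scores)

# Drop the largest (extreme) observation: the range changes
without_max = value_range([x for x in scores if x != max(scores)])

# Drop a middle observation: the range is unaffected
without_middle = value_range([x for x in scores if x != 50])

print(full_range, without_max, without_middle)
```

Only the two extreme observations enter the calculation, which is why the range ignores most of the information in the data.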
Variance and Standard Deviation
Variance is a measure of the spread of the original values about the mean. When we are concerned with a population, the variance is written in terms of the Greek letter sigma and is denoted by $\sigma^2$ (sigma squared).
The variance is a very useful measure of variability because it uses the information provided by every observation in the sample and also it is very easy to handle mathematically. Its main disadvantage is that the units of variance are the square of the units of the original observations.
However, a more useful measure of the spread or variability in a set of data is the standard deviation, which is defined as the square root of the variance.
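Standard Deviation (SD) = $\sqrt{\text{Variance}}$
Since the standard deviation is the square root of the variance $\sigma^2$, the population standard deviation is denoted by $\sigma$ and is found from the formula
$$\sigma = \sqrt{\frac{\sum (x-\mu)^2}{N}} \quad\text{or}\quad \sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$$
where $\mu$ is the population mean and $N$ is the number of observations in the population.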
One special advantage of working with the standard deviation is that it is measured in the same units as the original data. Thus, if the original set of numbers represents weights of a certain type of item, then both the mean and the standard deviation are measured in units of weight.
Sample standard deviation:
$$s = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}} \quad\text{or}\quad s = \sqrt{\frac{n\sum x^2 - \left(\sum x\right)^2}{n(n-1)}}$$
That is, instead of dividing by the n data points, we divide by n − 1. Just as $\sigma^2$ and $\sigma$ represent the variance and standard deviation of a population, respectively, we use the symbols $s^2$ and $s$ to stand for the variance and standard deviation, respectively, of a sample.
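The difference between the two divisors can be sketched in Python. The data values and the helper names `population_sd` and `sample_sd` are assumptions made for illustration; the results are cross-checked against `statistics.pstdev` and `statistics.stdev` from the standard library.

```python
import math
import statistics

def population_sd(data):
    """Square root of the mean squared deviation, divisor N."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / n)

def sample_sd(data):
    """Same sum of squared deviations, but divisor n - 1."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Hypothetical observations, chosen only for illustration
values = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = population_sd(values)
s = sample_sd(values)

# Cross-check against the standard library implementations
assert math.isclose(sigma, statistics.pstdev(values))
assert math.isclose(s, statistics.stdev(values))

print(round(sigma, 3), round(s, 3))
```

Because the divisor n − 1 is smaller than n, the sample standard deviation is always slightly larger than the population value computed from the same numbers.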
Coefficient of Variation (C.V)
This is a dimensionless quantity that measures the relative variation between two series observed in different units. Comparison of two distributions with different means and unit of measurement is done using the coefficient of variation.
It is defined as the ratio of the standard deviation and the mean of a set of data expressed as a percentage.
$$\text{C.V} = \frac{S}{\bar{X}} \times 100$$
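As a rough illustration of why a dimensionless ratio is useful, the sketch below compares two made-up series recorded in different units. The data, the helper name `coefficient_of_variation`, and the choice of the sample standard deviation (divisor n − 1) are all assumptions made for this example.

```python
import statistics

def coefficient_of_variation(data):
    """C.V = (standard deviation / mean) * 100, expressed in percent."""
    return statistics.stdev(data) / statistics.mean(data) * 100

# Hypothetical series in different units, chosen only for illustration
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(round(cv_height, 2), round(cv_weight, 2))
```

Both hypothetical series have the same standard deviation, yet their C.V values differ because the spread is judged relative to each mean.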
The distribution with the smaller C.V is said to be better, in the sense that it is less variable relative to its mean.

Examples on Measures of Dispersion
UNGROUPED DATA
Below are the values recorded for 10 heads of household randomly selected from a community for a Covid-19 laboratory test: 54, 59, 35, 41, 46, 25, 47, 60, 54, 46.
Find the (i) Range (ii) Mean deviation from the mean (iii) Mean deviation from the median (iv) variance (v) standard deviation (vi) coefficient of variation.
SOLUTION
i. Range = $X_{\max} - X_{\min}$
Range = 60 − 25 = 35
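ii. Mean deviation from the mean:
$$MD_{\bar{X}} = \frac{\sum |X - \bar{X}|}{n}$$
Mean: $\bar{X} = \frac{\sum X}{n} = \frac{54 + 59 + \dots + 46}{10} = 46.7$
$$MD_{\bar{X}} = \frac{|54 - 46.7| + |59 - 46.7| + \dots + |46 - 46.7|}{10}$$
= (7.3 + 12.3 + 11.7 + 5.7 + 0.7 + 21.7 + 0.3 + 13.3 + 7.3 + 0.7)/10
= 81/10 = 8.10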
iii. Mean deviation from the median ($MD_{\hat{X}}$)
Array: 25, 35, 41, 46, 46, 47, 54, 54, 59, 60
Median: $\hat{X} = \frac{X_{(n/2)} + X_{(n/2+1)}}{2} = \frac{46 + 47}{2} = 46.5$
$$MD_{\hat{X}} = \frac{\sum |X - \hat{X}|}{n} = \frac{|54 - 46.5| + |59 - 46.5| + \dots + |46 - 46.5|}{10}$$
= (7.5 + 12.5 + 11.5 + 5.5 + 0.5 + 21.5 + 0.5 + 13.5 + 7.5 + 0.5)/10
= 81/10 = 8.1
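(iv) Variance:
$$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{(54 - 46.7)^2 + \dots + (46 - 46.7)^2}{10} = \frac{1076.1}{10} = 107.61$$
The divisor used here is n = 10; with the n − 1 divisor of the sample formula, the variance would be 1076.1/9 ≈ 119.57.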
PHS702 COURSE
GUIDE
(v) Standard deviation:
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{(54 - 46.7)^2 + \dots + (46 - 46.7)^2}{10}} = \sqrt{107.61} \approx 10.37$$
(vi) Coefficient of variation:
$$\text{C.V} = \frac{S}{\bar{X}} \times 100 = \frac{10.37}{46.7} \times 100 = 22.21\%$$
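The six measures for the ungrouped data can be recomputed with a short Python sketch. It assumes the divisor n for the variance, as in the worked solution; the variable names are illustrative.

```python
import math
import statistics

# The ten observations from the ungrouped-data example
data = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46]
n = len(data)

data_range = max(data) - min(data)
mean = sum(data) / n
median = statistics.median(data)

# Mean absolute deviations about the mean and about the median
md_mean = sum(abs(x - mean) for x in data) / n
md_median = sum(abs(x - median) for x in data) / n

# Variance with divisor n (assumption: matches the worked solution)
variance = sum((x - mean) ** 2 for x in data) / n
sd = math.sqrt(variance)
cv = sd / mean * 100

print("range:", data_range)
print("mean:", round(mean, 1), "median:", median)
print("MD about mean:", round(md_mean, 2), "MD about median:", round(md_median, 2))
print("variance:", round(variance, 2), "SD:", round(sd, 2), "C.V (%):", round(cv, 2))
```

Replacing the divisor `n` with `n - 1` in the `variance` line gives the sample version of the variance and standard deviation.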
GROUPED DATA
The table below shows the frequency distribution of clinical data.

Class      Frequency (f)
0 – 10     2
10 – 20    5
20 – 30    8
30 – 40    12
40 – 50    9
50 – 60    5
60 – 70    1
Find the mean deviation from the mean, variance, standard deviation and coefficient of variation for the data.
SOLUTION
Here X is the class midpoint and $\bar{X}$ is the mean.

Class      X    f    fX     $X-\bar{X}$   $|X-\bar{X}|$   $f|X-\bar{X}|$   $(X-\bar{X})^2$   $f(X-\bar{X})^2$
0 – 10     5    2    10     −29.52        29.52           59.04            871.43            1742.86
10 – 20    15   5    75     −19.52        19.52           97.60            381.03            1905.15
20 – 30    25   8    200    −9.52         9.52            76.16            90.63             725.04
30 – 40    35   12   420    0.48          0.48            5.76             0.23              2.76
40 – 50    45   9    405    10.48         10.48           94.32            109.83            988.47
50 – 60    55   5    275    20.48         20.48           102.40           419.43            2097.15
60 – 70    65   1    65     30.48         30.48           30.48            929.03            929.03
Total           42   1450                                 465.76                             8390.46

$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{1450}{42} = 34.52$$
i. Mean deviation from the mean:
$$MD_{\bar{X}} = \frac{\sum f|X - \bar{X}|}{\sum f} = \frac{465.76}{42} \approx 11.09$$
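ii. Variance:
$$S^2 = \frac{\sum f(X - \bar{X})^2}{\sum f} = \frac{1742.86 + 1905.15 + 725.04 + 2.76 + 988.47 + 2097.15 + 929.03}{42} = \frac{8390.46}{42} \approx 199.77$$
iii. Standard deviation: $S = \sqrt{199.77} \approx 14.13$
iv. Coefficient of variation:
$$\text{C.V} = \frac{S}{\bar{X}} \times 100 = \frac{14.13}{34.52} \times 100 \approx 40.9\%$$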
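The grouped-data calculation can be checked with a minimal Python sketch that uses the class midpoints and frequencies from the table. The variable names are illustrative, and the divisor is the total frequency, as in the solution. Because the code does not round the mean to 34.52 before taking deviations, the last decimal place may differ slightly from a hand calculation.

```python
import math

# (lower limit, upper limit, frequency) for each class in the table
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 8), (30, 40, 12),
           (40, 50, 9), (50, 60, 5), (60, 70, 1)]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]
total_f = sum(freqs)

# Weighted mean of the class midpoints
mean = sum(f * x for f, x in zip(freqs, midpoints)) / total_f

# Frequency-weighted mean absolute deviation and variance
mean_dev = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints)) / total_f
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / total_f
sd = math.sqrt(variance)
cv = sd / mean * 100

print("total f:", total_f, "mean:", round(mean, 2))
print("mean deviation:", round(mean_dev, 2))
print("variance:", round(variance, 2), "SD:", round(sd, 2), "C.V (%):", round(cv, 2))
```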
i. Skewness: If extremely low or extremely high observations are present in a distribution, then the mean tends to shift towards those scores. Distributions are classified by their type of skewness:
(i) Negatively skewed distribution: occurs when the majority of scores are at the right end of the curve and a few extreme small scores are scattered at the left end.
(ii) Positively skewed distribution: occurs when the majority of scores are at the left end of the curve and a few extreme large scores are scattered at the right end.
(iii) Symmetrical distribution: is neither positively nor negatively skewed. A curve is symmetrical if one half of the curve is the mirror image of the other half.
In unimodal (one-peak) symmetrical distributions, the mean, median and mode are identical. In unimodal skewed distributions, on the other hand, the mean, median and mode occur in alphabetical order (mean < median < mode) when the longer tail is at the left of the distribution, and in reverse alphabetical order (mode < median < mean) when the longer tail is at the right of the distribution.
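The ordering rule can be illustrated with a small Python sketch. Both data sets are made up for illustration: the first has its longer tail at the right, and the second is its mirror image, with the longer tail at the left. `central_values` is a hypothetical helper name.

```python
import statistics

def central_values(data):
    """Return (mean, median, mode) of a unimodal data set."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

# Hypothetical data with a few extreme large scores (longer tail at the right)
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

# Mirror image of the same data (longer tail at the left)
left_tailed = [16 - x for x in right_tailed]

r_mean, r_median, r_mode = central_values(right_tailed)
l_mean, l_median, l_mode = central_values(left_tailed)

print("right tail:", r_mode, r_median, r_mean)   # mode, median, mean
print("left tail: ", l_mean, l_median, l_mode)   # mean, median, mode
```

In both cases the mean is pulled furthest towards the longer tail, the mode stays at the peak, and the median lies between them.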
4.0 CONCLUSION
Apart from frequency distribution tables and graphical displays for data presentation, summary measures of different kinds can be employed to summarize data.
5.0 SUMMARY
In this module, further exploratory analysis was studied using data summaries. Statistical measures for summarizing data were identified and described. These include measures of location, partition and variation.