The basic concept of probability plays a significant role in our everyday life. We try to determine the probability of rain and the prospects of our promotion, the odds that the Australian cricket team will win the next test match, and the likelihood of winning a million dollars in Tattslotto.
The concept of probability has a long history that goes back thousands of years when words like ‘probably’, ‘likely’, ‘maybe’, ‘perhaps’ and ‘possibly’ were introduced into spoken languages (Good, 1959). However, the mathematical theory of probability was formulated only in the 17th century.
How can we define probability?
The probability of an event is the proportion of cases in which the event occurs (Good, 1959). Probability can also be defined as a scientific measure of chance. Detailed analysis of modern probability theory can be found in such well-known textbooks as Feller (1957) and Fine (1973). In this chapter, we examine only the basic ideas used in representing uncertainties in expert systems.
Probability can be expressed mathematically as a numerical index ranging from zero (an absolute impossibility) to unity (an absolute certainty). Most events have a probability index strictly between 0 and 1, which means that each event has at least two possible outcomes: a favourable outcome or success, and an unfavourable outcome or failure.

Table 3.1 Quantification of ambiguous and imprecise terms on a time-frequency scale

| Term (Ray Simpson, 1944) | Mean value | Term (Milton Hakel, 1968) | Mean value |
|---|---|---|---|
| Always | 99 | Always | 100 |
| Very often | 88 | Very often | 87 |
| Usually | 85 | Usually | 79 |
| Often | 78 | Often | 74 |
| Generally | 78 | Rather often | 74 |
| Frequently | 73 | Frequently | 72 |
| Rather often | 65 | Generally | 72 |
| About as often as not | 50 | About as often as not | 50 |
| Now and then | 20 | Now and then | 34 |
| Sometimes | 20 | Sometimes | 29 |
| Occasionally | 20 | Occasionally | 28 |
| Once in a while | 15 | Once in a while | 22 |
| Not often | 13 | Not often | 16 |
| Usually not | 10 | Usually not | 16 |
| Seldom | 10 | Seldom | 9 |
| Hardly ever | 7 | Hardly ever | 8 |
| Very seldom | 6 | Very seldom | 7 |
| Rarely | 5 | Rarely | 5 |
| Almost never | 3 | Almost never | 2 |
| Never | 0 | Never | 0 |
The probability of success and failure can be determined as follows:
$$
P(\text{success}) = \frac{\text{the number of successes}}{\text{the number of possible outcomes}} \tag{3.1}
$$

$$
P(\text{failure}) = \frac{\text{the number of failures}}{\text{the number of possible outcomes}} \tag{3.2}
$$
Therefore, if $s$ is the number of times success can occur, and $f$ is the number of times failure can occur, then

$$
P(\text{success}) = p = \frac{s}{s + f} \tag{3.3}
$$

$$
P(\text{failure}) = q = \frac{f}{s + f} \tag{3.4}
$$

and

$$
p + q = 1 \tag{3.5}
$$
Let us consider classical examples with a coin and a die. If we throw a coin, the probability of getting a head will be equal to the probability of getting a tail. In a single throw, $s = f = 1$, and therefore the probability of getting a head (or a tail) is 0.5.
Consider now a die and determine the probability of getting a 6 from a single throw. If we assume a 6 as the only success, then $s = 1$ and $f = 5$, since there is just one way of getting a 6, and there are five ways of not getting a 6 in a single throw. Therefore, the probability of getting a 6 is

$$
p = \frac{1}{1 + 5} \approx 0.1667
$$

and the probability of not getting a 6 is

$$
q = \frac{5}{1 + 5} \approx 0.8333
$$
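The coin and die examples can be checked with a few lines of Python. This is only a minimal sketch: the function name `success_failure_probabilities` is an illustrative choice rather than anything defined in the text, and exact fractions are used so that Eq. (3.5) can be verified without rounding error.

```python
from fractions import Fraction


def success_failure_probabilities(s, f):
    """Return (p, q) following Eqs (3.3) and (3.4): p = s/(s+f), q = f/(s+f)."""
    total = s + f
    if total <= 0:
        raise ValueError("s + f must be positive")
    return Fraction(s, total), Fraction(f, total)


# Coin: one way of getting a head, one way of not getting it.
p_head, q_head = success_failure_probabilities(1, 1)

# Die: one way of getting a 6, five ways of not getting a 6.
p_six, q_six = success_failure_probabilities(1, 5)

print(float(p_head), float(q_head))  # 0.5 0.5
print(p_six, q_six)                  # 1/6 5/6
print(p_six + q_six == 1)            # True, as required by Eq. (3.5)
```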
So far, we have been concerned with events that are independent and mutually exclusive (i.e. events that cannot happen simultaneously). In the dice experiment, the two events of obtaining a 6 and, for example, a 1 are mutually exclusive because we cannot obtain a 6 and a 1 simultaneously in a single throw. However, events that are not independent may affect the likelihood of one or the other occurring. Consider, for instance, the probability of getting a 6 in a single throw, knowing this time that a 1 has not come up. There are still five ways of not getting a 6, but one of them can be eliminated as we know that a 1 has not been obtained. Thus,

$$
p = \frac{1}{1 + (5 - 1)} = 0.2
$$
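The same count can be reproduced by enumerating the faces of the die. The sketch below assumes a fair six-sided die; the variable names are illustrative only.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # faces of a fair die
not_one = outcomes - {1}        # we are told that a 1 has not come up

# Among the remaining faces, exactly one is a 6.
successes = len({6} & not_one)
failures = len(not_one) - successes   # 5 - 1 = 4 ways of not getting a 6

p_six_given_not_one = Fraction(successes, successes + failures)
print(p_six_given_not_one)            # 1/5
```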
Let $A$ be an event in the world and $B$ be another event. Suppose that events $A$ and $B$ are not mutually exclusive, but occur conditionally on the occurrence of the other. The probability that event $A$ will occur if event $B$ occurs is called the conditional probability. Conditional probability is denoted mathematically as $p(A|B)$, in which the vertical bar represents GIVEN and the complete probability expression is interpreted as ‘Conditional probability of event $A$ occurring given that event $B$ has occurred’.

$$
p(A|B) = \frac{\text{the number of times } A \text{ and } B \text{ can occur}}{\text{the number of times } B \text{ can occur}} \tag{3.6}
$$
The number of times $A$ and $B$ can occur, or the probability that both $A$ and $B$ will occur, is called the joint probability of $A$ and $B$. It is represented mathematically as $p(A \cap B)$. The number of ways $B$ can occur is the probability of $B$, $p(B)$, and thus

$$
p(A|B) = \frac{p(A \cap B)}{p(B)} \tag{3.7}
$$
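Equation (3.7) can be illustrated on a small sample space of equally likely outcomes. The sketch below is an assumption-laden example, not part of the original text: the helper names `prob` and `conditional` and the two die events are chosen purely for illustration.

```python
from fractions import Fraction


def prob(event, sample_space):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional(a, b, sample_space):
    """p(A|B) = p(A ∩ B) / p(B), following Eq. (3.7)."""
    p_b = prob(b, sample_space)
    if p_b == 0:
        raise ZeroDivisionError("p(B) is zero, so p(A|B) is undefined")
    return prob(a & b, sample_space) / p_b


die = frozenset(range(1, 7))
A = frozenset({2, 4, 6})   # hypothetical event A: the face is even
B = frozenset({4, 5, 6})   # hypothetical event B: the face is greater than 3

print(prob(A & B, die))        # 1/3, the joint probability p(A ∩ B)
print(conditional(A, B, die))  # 2/3
```

Of the three faces in $B$, two are even, which agrees with the value returned by `conditional`.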
Similarly, the conditional probability of event $B$ occurring given that event $A$ has occurred equals

$$
p(B|A) = \frac{p(B \cap A)}{p(A)} \tag{3.8}
$$
Hence,

$$
p(B \cap A) = p(B|A) \times p(A) \tag{3.9}
$$

The joint probability is commutative, thus

$$
p(A \cap B) = p(B \cap A)
$$

Therefore,

$$
p(A \cap B) = p(B|A) \times p(A) \tag{3.10}
$$
Substituting Eq. (3.10) into Eq. (3.7) yields the following equation:
$$
p(A|B) = \frac{p(B|A) \times p(A)}{p(B)}, \tag{3.11}
$$

where:
$p(A|B)$ is the conditional probability that event $A$ occurs given that event $B$ has occurred;
$p(B|A)$ is the conditional probability of event $B$ occurring given that event $A$ has occurred;
$p(A)$ is the probability of event $A$ occurring;
$p(B)$ is the probability of event $B$ occurring.
Equation (3.11) is known as the Bayesian rule, which is named after Thomas Bayes, an 18th-century British mathematician who introduced this rule.
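A short numerical check may help to make Eq. (3.11) concrete. The sketch below assumes a fair die and two illustrative events that do not appear in the text; the function name `bayes` is likewise hypothetical.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Eq. (3.11): p(A|B) = p(B|A) * p(A) / p(B)."""
    if p_b == 0:
        raise ZeroDivisionError("p(B) must be non-zero")
    return p_b_given_a * p_a / p_b


# Fair die. A: the face is even {2, 4, 6}; B: the face exceeds 4 {5, 6}.
p_a = Fraction(3, 6)
p_b = Fraction(2, 6)
p_b_given_a = Fraction(1, 3)   # of the three even faces, only 6 exceeds 4

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(p_a_given_b)             # 1/2: of the faces {5, 6}, only 6 is even
```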
The concept of conditional probability introduced so far considered that event $A$ was dependent upon event $B$. This principle can be extended to event $A$ being dependent on a number of mutually exclusive events $B_1, B_2, \ldots, B_n$. The following set of equations can then be derived from Eq. (3.7):

$$
\begin{aligned}
p(A \cap B_1) &= p(A|B_1) \times p(B_1) \\
p(A \cap B_2) &= p(A|B_2) \times p(B_2) \\
&\;\;\vdots \\
p(A \cap B_n) &= p(A|B_n) \times p(B_n)
\end{aligned}
$$

or when combined:

$$
\sum_{i=1}^{n} p(A \cap B_i) = \sum_{i=1}^{n} p(A|B_i) \times p(B_i) \tag{3.12}
$$
If Eq. (3.12) is summed over an exhaustive list of events for $B_i$ as illustrated in Figure 3.1, we obtain

$$
\sum_{i=1}^{n} p(A \cap B_i) = p(A) \tag{3.13}
$$
This reduces Eq. (3.12) to the following conditional probability equation:

$$
p(A) = \sum_{i=1}^{n} p(A|B_i) \times p(B_i) \tag{3.14}
$$
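The sum over mutually exclusive and exhaustive events $B_i$ can be sketched in Python as follows. The function name `total_probability`, the partition of the die faces and the event $A$ are all assumptions made for illustration.

```python
from fractions import Fraction


def total_probability(likelihoods, priors):
    """p(A) = sum over i of p(A|B_i) * p(B_i).

    likelihoods[i] is p(A|B_i) and priors[i] is p(B_i). The events B_i are
    assumed to be mutually exclusive and exhaustive, so the priors sum to 1.
    """
    if len(likelihoods) != len(priors):
        raise ValueError("need exactly one likelihood per prior")
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("priors of an exhaustive set of events must sum to 1")
    return sum(l * p for l, p in zip(likelihoods, priors))


# Fair die partitioned into B1 = {1, 2}, B2 = {3, 4}, B3 = {5, 6}.
# A: the face is at least 4, i.e. {4, 5, 6}.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(0), Fraction(1, 2), Fraction(1)]

p_a = total_probability(likelihoods, priors)
print(p_a)   # 1/2, matching the direct count of 3 faces out of 6
```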
If the occurrence of event $A$ depends on only two mutually exclusive events, i.e. $B$ and NOT $B$, then Eq. (3.14) becomes

$$
p(A) = p(A|B) \times p(B) + p(A|\neg B) \times p(\neg B), \tag{3.15}
$$

where $\neg$ is the logical function NOT.
Similarly,

$$
p(B) = p(B|A) \times p(A) + p(B|\neg A) \times p(\neg A) \tag{3.16}
$$

Let us now substitute Eq. (3.16) into the Bayesian rule (3.11) to yield

$$
p(A|B) = \frac{p(B|A) \times p(A)}{p(B|A) \times p(A) + p(B|\neg A) \times p(\neg A)} \tag{3.17}
$$

Equation (3.17) provides the background for the application of probability theory to manage uncertainty in expert systems.
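As a rough illustration of Eq. (3.17), the sketch below computes $p(A|B)$ from $p(B|A)$, $p(B|\neg A)$ and $p(A)$. The function name `posterior` and the numbers are invented for the example and do not come from the text; the code also assumes $p(\neg A) = 1 - p(A)$.

```python
from fractions import Fraction


def posterior(p_b_given_a, p_b_given_not_a, p_a):
    """Eq. (3.17), with p(NOT A) taken as 1 - p(A)."""
    numerator = p_b_given_a * p_a
    # Denominator is p(B) expanded as in Eq. (3.16).
    p_b = numerator + p_b_given_not_a * (1 - p_a)
    if p_b == 0:
        raise ZeroDivisionError("p(B) is zero, so p(A|B) is undefined")
    return numerator / p_b


# Made-up values: p(A) = 0.01, p(B|A) = 0.9, p(B|NOT A) = 0.1.
p_a_given_b = posterior(Fraction(9, 10), Fraction(1, 10), Fraction(1, 100))
print(p_a_given_b)          # 1/12
print(float(p_a_given_b))   # roughly 0.083
```

With these values the evidence $B$ raises the probability of $A$ from 0.01 to about 0.083, because $B$ is nine times more likely when $A$ holds than when it does not.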