In IAL, the discrimination ability of each feature can be estimated in that feature's one-dimensional
space, and features can then be ordered by ranking these values. For
two-class classification problems (c = 2), based on Eq.(6.1), the discrimination ability of feature $f_i$ can
be given by
(6.3)
where μ1 and μ2 are the means of two classes, and s1 and s2 are within-class variances.
However, Eq.(6.3) is too simple to cope with multi-category classifications, because the
between-class scatter is difficult to describe merely by distance between patterns. Here, the
difference between the centres of these multiple classes should be replaced by standard
deviations of centres and standard deviations of patterns, so that the influence brought by classes
whose mean is not the smallest or the largest of all the means of classes can be measured.
Definition 6.1: The Single Discriminability (SD) of a feature is the ratio between the standard deviation of all class centres and the sum of the standard deviations of all patterns in each
class.
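SD for both two-category and n-category classification problems can be written as

$$SD(f_i) = \frac{\operatorname{std}\left[\mu_1(f_i), \ldots, \mu_n(f_i)\right]}{\sum_{j=1}^{n} \operatorname{std}_j(f_i)} \tag{6.4}$$

where n is the total number of classes, $\mu_j(f_i)$ is the mean of class $c_j$ in feature i, and std denotes a standard deviation: $\operatorname{std}_j(f_i)$ is taken over all patterns
belonging to $c_j$ in feature i, and the numerator is taken over the vector consisting of the means of all classes in
feature i. Let x be the vector for standard deviation calculation; the standard deviation of x is:

$$\operatorname{std}(x) = \left(\frac{1}{r-1}\sum_{k=1}^{r}(x_k-\bar{x})^2\right)^{1/2} \tag{6.5}$$

where the vector $x = (x_1, \ldots, x_r)$ has mean $\bar{x}$, $x_k$ is the value of the kth
pattern, and r is the total number of patterns.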
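The definition of SD in Eq.(6.4) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: it assumes the sample standard deviation (`statistics.stdev`, which divides by r-1), at least two patterns per class, and a non-zero within-class spread. The function names and the toy data are hypothetical.

```python
from statistics import mean, stdev


def single_discriminability(values, labels):
    """SD of one feature: std of the class centres divided by the
    sum of the within-class stds (sample std is an assumption here)."""
    classes = sorted(set(labels))
    groups = [[v for v, y in zip(values, labels) if y == c] for c in classes]
    centres = [mean(g) for g in groups]      # one centre per class
    within = sum(stdev(g) for g in groups)   # sum of within-class stds
    return stdev(centres) / within


def order_features(patterns, labels):
    """Return (feature indices sorted by decreasing SD, SD per feature)."""
    n_features = len(patterns[0])
    scores = [
        single_discriminability([p[i] for p in patterns], labels)
        for i in range(n_features)
    ]
    ranking = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return ranking, scores


# Hypothetical toy data: 4 patterns, 2 features, 2 classes.
patterns = [
    [0.10, 0.40],
    [0.20, 0.60],
    [0.80, 0.50],
    [0.90, 0.70],
]
labels = [0, 0, 1, 1]
ranking, scores = order_features(patterns, labels)
```

In this toy data the first feature separates the two classes far better than the second, so it receives the larger SD and is placed first in the ordering.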
In Eq.(6.5), the term $|x_k - \bar{x}|$ is the distance between the kth pattern and the mean of x. Letting dist denote this term, Eq.(6.5) can be rewritten as:
FEATURE ORDERING BASED ON LINEAR DISCRIMINANT
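$$\operatorname{std}(x) = \left(\frac{1}{r-1}\sum_{k=1}^{r}\operatorname{dist}(x_k,\bar{x})^2\right)^{1/2} \tag{6.6}$$

where $\operatorname{dist}(x_k,\bar{x})$ denotes the distance between the kth pattern in x and the mean $\bar{x}$.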
According to Eq.(6.6), SD essentially combines two kinds of distance: the distance between
classes and the distance within each class. This is similar to FLD, where the greater the
distance between different classes and the smaller the distance between each pattern and its
class centre, the more easily the classes can be distinguished. Here, "more easily" means that
the probability of correct prediction in pattern recognition is higher. For example, Figure 6.1
shows a normalized dataset with two classes. The class centres are a and b, and x is one of
its features. Using a and b, the feature space of x can be divided into three parts: [0, a], (a,
b], and (b, 1]. If a classifier places its segmentation point at a uniformly random position in
[0, 1], the probability that the point falls in [0, a] is P1 = a/1 = a, the probability that it
falls in (a, b] is P2 = (b-a)/1 = b-a, and the probability that it falls in (b, 1] is
P3 = (1-b)/1 = 1-b. A segmentation point separates the two centres only when it lies between
them, so to make classification easier we must increase P2 and reduce P1 and P3. Reducing P1
requires a smaller a, increasing P2 requires a larger b and a smaller a, and reducing P3
requires a larger b. As a result of reducing a and increasing b, the
distance between a and b becomes larger.
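The effect of moving the class centres apart can be checked with a short Python sketch. It assumes, as in the example above, a feature normalized to [0, 1] and a segmentation point drawn uniformly from that interval. The function name and the chosen centre values are illustrative only.

```python
def segment_probabilities(a, b):
    """Return (P1, P2, P3) for class centres a < b on a feature
    normalized to [0, 1], assuming a uniformly placed segmentation point."""
    if not 0.0 <= a < b <= 1.0:
        raise ValueError("expected 0 <= a < b <= 1")
    return a, b - a, 1.0 - b


# Two hypothetical configurations: centres close together vs. far apart.
narrow = segment_probabilities(0.4, 0.6)
wide = segment_probabilities(0.2, 0.8)
```

With the wider configuration, P2 grows while P1 and P3 shrink, which is the behaviour the argument relies on.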
Figure 6.1: Segmentations on x.

This is similar to FLD, where the greater the standard deviation of a and b, the easier the
classification. In the example shown in Figure 6.1, if $\mu$ is the mean of a and b, the standard deviation of a and b is

$$\operatorname{std}(a,b) = \left((a-\mu)^2 + (b-\mu)^2\right)^{1/2} \tag{6.7}$$

Substituting $(a+b)/2$ for $\mu$ and simplifying gives $\operatorname{std}(a,b) = \frac{\sqrt{2}}{2}|b-a|$. Since $b > a$, we get

$$\operatorname{std}(a,b) = \frac{\sqrt{2}}{2}(b-a) \tag{6.8}$$
Therefore, according to Eq.(6.8), if the distance between a and b is greater, the standard
deviation of a and b will also be greater. Namely, greater distance indicates easier classification,
and greater standard deviation will also imply easier classification.
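The proportionality between the distance b-a and the standard deviation of the two centres can be verified numerically. The sketch below is only a check under stated assumptions: it evaluates both the sample convention (`statistics.stdev`, dividing by r-1) and the population convention (`statistics.pstdev`, dividing by r), since the conclusion holds for either. The helper name and the centre values are hypothetical.

```python
from math import isclose, sqrt
from statistics import pstdev, stdev


def centre_spread(a, b):
    """Return (gap, sample std, population std) of two class centres a < b."""
    return b - a, stdev([a, b]), pstdev([a, b])


# Hypothetical centre pairs with increasing distance between them.
rows = [centre_spread(a, b) for a, b in [(0.4, 0.6), (0.3, 0.7), (0.1, 0.9)]]

for gap, sample, population in rows:
    # Sample convention: std = (sqrt(2)/2) * (b - a).
    assert isclose(sample, sqrt(2) / 2 * gap)
    # Population convention: std = (b - a) / 2.
    assert isclose(population, gap / 2)
```

Under both conventions the standard deviation is a fixed multiple of b-a, so a larger distance between the centres always yields a larger standard deviation.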
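If there are three or more classes in one feature space, Eq.(6.5) still applies.
Assume that there are two pattern sets, $a = \{a_1, \ldots, a_{n-1}\}$ and
$b = \{a_1, \ldots, a_{n-1}, a_n\}$, with means $\bar{a}$ and $\bar{b}$. Then the relations between their means and standard deviations are:

$$\bar{b} = \frac{(n-1)\,\bar{a} + a_n}{n} \tag{6.9}$$

$$\operatorname{std}(b) = \left(\frac{n-2}{n-1}\operatorname{std}(a)^2 + \frac{1}{n}(a_n-\bar{a})^2\right)^{1/2}. \tag{6.10}$$

According to Eq.(6.9) and Eq.(6.10), when n=3, if the two pattern sets are $a = \{a_1, a_2\}$ and
$b = \{a_1, a_2, a_3\}$, then

$$\bar{b} = \frac{2\bar{a} + a_3}{3} \tag{6.11}$$

$$\operatorname{std}(b) = \left(\frac{1}{4}(a_1-a_2)^2 + \frac{1}{3}(a_3-\bar{a})^2\right)^{1/2}. \tag{6.12}$$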
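Clearly, if $|a_1-a_2|$ and $|a_3-\bar{a}|$ both increase, then $\operatorname{std}(b)$ will increase correspondingly. The key elements here are the distance between a1 and a2, and the
distance between the centre of a and a3. The greater these distances, the lower the classification error rate.
Similarly, when n=4, if the two pattern sets are $a = \{a_1, a_2, a_3\}$ and $b = \{a_1, a_2, a_3, a_4\}$, then

$$\bar{b} = \frac{3\bar{a} + a_4}{4} \tag{6.13}$$

$$\operatorname{std}(b) = \left(\frac{2}{3}\operatorname{std}(a)^2 + \frac{1}{4}(a_4-\bar{a})^2\right)^{1/2} \tag{6.14}$$

As in the case of n=3, the greater the distances between a1, a2, a3, and a4, the better
the pattern recognition performance.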
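Assume that there is an n-category classification problem, which needs n-1 segmentation
points. When n=k, for the set $a = \{a_1, \ldots, a_k\}$ with mean $\bar{a}$,

$$\operatorname{std}(a) = \left(\frac{1}{k-1}\sum_{j=1}^{k}(a_j-\bar{a})^2\right)^{1/2} \tag{6.15}$$

Eq.(6.15) shows that the standard deviation depends on the distances between patterns. Then,
when n=k+1, the two pattern sets are $a = \{a_1, \ldots, a_k\}$ and $b = \{a_1, \ldots, a_k, a_{k+1}\}$, and

$$\bar{b} = \frac{k\,\bar{a} + a_{k+1}}{k+1} \tag{6.16}$$

$$\operatorname{std}(b) = \left(\frac{k-1}{k}\operatorname{std}(a)^2 + \frac{1}{k+1}(a_{k+1}-\bar{a})^2\right)^{1/2} \tag{6.17}$$

Therefore, when n=k+1, $\operatorname{std}(b)$ also depends on the distances between patterns. More specifically, $\operatorname{std}(b)$ is decided by $\operatorname{std}(a)$ and the distance between the centre of a and $a_{k+1}$. However, $\operatorname{std}(a)$ and $\bar{a}$ depend on the values of $a_1, \ldots, a_k$. Eventually, they depend on the distance between every two samples.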
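The inductive step can be checked numerically. The sketch below assumes the sample (r-1) standard deviation and uses the standard update identities for the mean and variance when one value is appended to a set. The function name `extend_mean_std` and the centre values are hypothetical, chosen only for illustration.

```python
from math import isclose, sqrt
from statistics import mean, stdev


def extend_mean_std(mean_a, std_a, n, a_n):
    """Given the mean and sample std of the first n-1 values (n >= 2),
    return the mean and sample std after appending a_n."""
    mean_b = ((n - 1) * mean_a + a_n) / n
    var_b = (n - 2) / (n - 1) * std_a ** 2 + (a_n - mean_a) ** 2 / n
    return mean_b, sqrt(var_b)


# Hypothetical class centres on one normalized feature.
centres = [0.10, 0.35, 0.50, 0.90]

results = []
m, s = centres[0], 0.0  # a single value has zero spread
for n in range(2, len(centres) + 1):
    m, s = extend_mean_std(m, s, n, centres[n - 1])
    # Compare the incremental values with a direct computation.
    assert isclose(m, mean(centres[:n]))
    assert isclose(s, stdev(centres[:n]))
    results.append((n, m, s))
```

Each step only needs the previous standard deviation and the distance between the new centre and the previous mean, which mirrors the dependence described in the text.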
Therefore, based on these properties of the standard deviation and the mean, the calculation of SD
presented in Eq.(6.4) is applicable to computing feature discrimination ability in IAL.
During the process, standard deviations between multiple classes and within classes
are calculated. Generally, a greater "between" standard deviation means a greater
total distance between class centres, and hence a lower probability of errors. Meanwhile, the
"within" standard deviation, which is influenced by the pattern distribution of each class,
reflects how tightly the patterns cluster around each class centre.
Generally speaking, to distinguish patterns from each other effectively, the total distance
between pattern centres should be as large as possible. SD,
which is inspired by FLD, can therefore estimate feature discrimination ability well, and is suitable
for multi-category classification problems. However, similar to FS, SD
computes the discrimination ability of features one by one, so it also cannot handle feature redundancy during the
feature ordering calculations.
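The redundancy limitation can be illustrated with a small Python sketch. It assumes the sample standard deviation and uses a hypothetical helper `sd` together with toy data. Because SD scores each feature in isolation, an exact duplicate of a feature, or a positively rescaled copy of it, receives the same score as the original, even though it adds no new information.

```python
from math import isclose
from statistics import mean, stdev


def sd(values, labels):
    """SD of one feature: std of class centres / sum of within-class stds."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    centres = [mean(g) for g in groups.values()]
    return stdev(centres) / sum(stdev(g) for g in groups.values())


labels = [0, 0, 1, 1]
f1 = [0.10, 0.20, 0.80, 0.90]
f2 = list(f1)                # exact duplicate of f1
f3 = [2.0 * v for v in f1]   # rescaled copy of f1

# Each feature is scored on its own, so redundancy goes unnoticed.
scores = [sd(f, labels) for f in (f1, f2, f3)]
```

All three features obtain the same SD (up to floating-point rounding for the rescaled copy), so a ranking based on SD alone would place them next to each other without detecting that two of them are redundant.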