The main issue in the development of fuzzy classifiers is the proper training, that is, the creation of an efficient rule base. The rule base can be derived from the expert previous knowledge or more easily can be derived directly from numerical input-output pairs. In the literature, many approaches have been proposed for generating fuzzy rules from numerical data, such as heuristic approaches [2, 70, 93], neuro-fuzzy techniques [103, 110, 111, 112, 150], clustering methods [3, 130], and genetic algorithms [19, 51, 68, 73].
In order to meet the PRTools base philosophy (see Appendix A), we are interested in existing batch-mode oriented approaches to automatically generate fuzzy rules from data, that are well-assessed and widely accepted, that need few free parameters to be specified, and that are associated with fast heuristic training methods. In our opinion, among the approaches existing in the literature, the technique that best meets these requirements is the Wang and Mendel method extended to classification problems, an adaptation of the well-known Wang and Mendel method for regression problems [154]. This method assumes that the fuzzy partitions of the input and output variables are provided by the user. A typical and easy way to achieve this is to resort to uniform fuzzy partitions consisting, for input variables, of a limited number of fuzzy sets, while, for output variables, of as many fuzzy sets as there are classes. So, the Wang and Mendel method is the training technique available in frbc [28]. This method seems to be simpler and with less construction time than a comparable neural network, maintaining the comparability of the results [106]. In addition, it allows to combine in the same framework both numerical and linguistic information [154].
In the following, the formal steps of the Wang and Mendel algorithm extended to classification problems are briefly introduced:
STEP A. generate a uniform fuzzy partition of the input domain, made of Qf, f= 1,…, F, fuzzy sets for each input variable f (usually Qf = Q = 3, 5, 7 or 9);
STEP B. generate a uniform fuzzy partition of the output variable, made of M fuzzy sets (one set for each class);
STEP C. generate an initial raw rule base (one rule for each training pattern);
STEP D. remove duplicated rules; STEP E. compute certainty factors;
STEP F. remove conflicting rules (i.e., rules having the same if part but different then parts) by keeping, for each set of conflicting rules, only the rule with the highest CF.
More in detail, this algorithm generates the fuzzy rule base assuming a uniform fuzzy partition for the input variables. Thus, the domain of each input variable is divided into Q=2N+1 regions, typically N∈{1, 2,3, 4} (STEP A). The length and the number of the regions may be different for the considered variables. A fuzzy membership function is then defined for each region. Typically this function has its maximum value in the middle point of the region and assumes its minimum value in the central points of the two neighboring regions, although different definitions are possible. Figure 2.2 shows an example of a fuzzy partition built according to the Wang and Mendel approach, on a generic variable y whose domain interval [y-, y+] has been divided into Q=5 regions (N=2).
Doing so, the thresholds identify a grid on the input variable space. So, for instance, if we have two input variables and we define two evenly spaced
thresholds for each variable, we obtain a uniform 9-area grid.
y
µ(y)
y$ y+
VL L M H VH
Figure 2.2. – An example of a fuzzy partition for the input variables, built according to the Wang and
Mendel approach.
Regarding STEP B, the fuzzy sets used to represent the output partition are usually fuzzy singletons, as we deal with a classification problem. Since each data pair generates a fuzzy rule in the rule base (STEP C), there will possibly be some duplicate rules and some conflicting rules, i.e., rules having the same if parts but different then parts. The duplicated rules are simply deleted (STEP D), while to solve a conflict, a CF is assigned to each conflicting rule of a set. The CF is defined so as to take into account the importance of each rule in the entire rule base (STEP E). The winning rule, within a conflicting set, is the one that has the maximum CF. The other rules of the set are discarded (STEP F).
From the field of data mining [5], the CF is tipically computed as the confidence of the fuzzy association rule
k
k
A
⇒C
δ , corresponding to the fuzzy rule Rk: if Akthen k
Cδ , where Ak represents the antecedent part of the fuzzy rule, and Cδk the
class appearing in the consequent. The CF is calculated as follows:
, , :t 1 , k N k t k t k t classCδ t
γ
=∑
x∈ Ω∑
= Ω (2.4)where Ωk t, is the strength of activation of the antecedent of rule R
k for the t-th
pattern, and N is the number of training patterns.
However, in past research, many heuristic measures have been proposed to specify the weight of a fuzzy classification rule [72, 93, 166]. Nozaki et al. [115]
proposed a method of learning rule weight using Reward & Punishment, in which, considering the classification of a pattern using the single winner FRM, the weight of the winner rule is increased or decreased depending on whether the pattern has been correctly classified or not. In other relevant methods [66, 72], the computation consists of two phases. First, the certainty factor is calculated as the confidence of the fuzzy rule (as in Equation (2.4)), then, the certainty factor is refined with a measure that depends on the specific method, with the aim to improve the classification performance.
We recall that any shape and number for membership functions can be selected. Clearly, the higher the number of membership functions, the bigger the accuracy obtained. On the other hand, a large number of membership functions leads to a large rule base dimension and, consequently, to a higher complexity.