Septic shock is defined as “...a serious, abnormal condition that occurs when an overwhelming infection leads to low blood pressure and low blood flow. Vital organs, such as the brain, heart, kidneys, and liver may not function properly or may fail. Decreased urine output from kidney failure may be one symptom.” [MedlinePlus Medical Encyclopedia, 2004]. Septic shock is associated with a mortality rate of around 50%, and remains an important research subject for medical experts and data analysts [Paetz, 2003]. During the MEDAN project, sponsored by the Deutsche Forschungsgemeinschaft (DFG), medical experts and data analysts cooperated to gather data on septic shock patients. The H16 data set contains the sixteen most frequently measured physiological parameters of 138 septic shock patients, of whom 68 survived.
Paetz [2002] used a Fuzzy Rectangular Basis Function Dynamic Decay Adjustment Neural Network (Fuzzy-RecBF-DDA-NN) [Berthold and Huber, 1995; Huber and Berthold, 1995] to learn rules for predicting whether a patient will survive. The reported results were obtained from 5-fold cross validation. The rule sets attained a classification accuracy with a mean of 84.02% and a standard deviation of 4.44%. The average rule set size was 16 rules. For direct comparison with FCF, we obtained the same test and training sets as used in the experiments of Paetz [2002]. Table 11.6 reports the results for the three different configurations of FCF shown in Table 11.7, as well as the results from Paetz [2002].

Table 11.6: Performance of different configurations of FCF on the MEDAN data.
          Accuracy (%)         Rule Set Size
Config    Mean     Std Dev     Mean     Std Dev
RecBF     84.02    4.44        16.0     N/A
HI        87.93    3.70        40.8     7.00
GO        87.00    3.49        26.2     3.03
SR        84.48    1.66         3.2     0.84

Table 11.7: Different configurations of FCF used with the MEDAN data.
Config    Beam Width    Spec. Mod.    Inf. Thres.    Weighted    SCL    Eval. Meth.
HI        1             FuzzConRI     0.1            T           F      lscontent
GO        4             FuzzConRI     0.1            F           F      lscontent
SR        3             FuzzyBexa     0.4            F           T      accuracy
The goal of configurations HI and GO was high classification accuracy. The best classification accuracy obtained by FCF was 87.93% (configuration HI), which is 3.91 percentage points better than the previous result obtained by the neural network (NN). The weighted cover used in HI resulted in the induction of many overlapping rules, so the resulting rule sets are relatively large (40.8 rules on average). If no weighted cover is used, as in configuration GO, a good overall result is obtained, with 87.0% classification accuracy and 26.2 rules on average. For configuration SR we used the simultaneous concept learning induction strategy. The classification accuracy of these rule sets was only slightly better than that of the NN; however, the rule sets were dramatically smaller, with 3.2 rules per rule set on average. An example of one such rule set is the following:
IF [!BlutdruckSystolisch.mf0][!Temperatur.mf4][!Thrombozyten.mf0][!Urinmenge.mf0] THEN class.ueberlebt
ELSE IF [!BlutdruckDiastolisch.mf4] THEN class.verstorben ELSE class.ueberlebt
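To make the reading of such a decision list concrete, the following Python sketch evaluates the rule set above on made-up inputs. It rests on several assumptions that are not stated in this section: each attribute has five linguistic terms (mf0 to mf4), a conjunct such as [!Temperatur.mf4] is the disjunction (maximum) of the terms that remain after excluding mf4, the conjunction is the minimum over its conjuncts, and a rule fires when its antecedent degree reaches a threshold alpha (0.5 here, chosen arbitrarily). The helper names and the three example patients are hypothetical and serve illustration only.

```python
from typing import Dict, List, Tuple

# Assumption: every attribute has five linguistic terms, mf0..mf4.
TERMS = ["mf0", "mf1", "mf2", "mf3", "mf4"]

Instance = Dict[str, Dict[str, float]]      # attribute -> term -> membership
Rule = Tuple[List[Tuple[str, str]], str]    # ([(attribute, excluded term)], class)

# The decision list from the text; the final ELSE branch is the default class.
RULES: List[Rule] = [
    ([("BlutdruckSystolisch", "mf0"), ("Temperatur", "mf4"),
      ("Thrombozyten", "mf0"), ("Urinmenge", "mf0")], "ueberlebt"),
    ([("BlutdruckDiastolisch", "mf4")], "verstorben"),
]
DEFAULT = "ueberlebt"


def conjunct_degree(inst: Instance, attribute: str, excluded: str) -> float:
    """Degree of [!attribute.excluded]: maximum over the remaining terms."""
    return max(inst[attribute][t] for t in TERMS if t != excluded)


def antecedent_degree(inst: Instance, conjuncts: List[Tuple[str, str]]) -> float:
    """Fuzzy conjunction (minimum) of all conjuncts in the antecedent."""
    return min(conjunct_degree(inst, a, e) for a, e in conjuncts)


def classify(inst: Instance, alpha: float = 0.5) -> str:
    """Return the class of the first rule whose antecedent reaches alpha."""
    for conjuncts, label in RULES:
        if antecedent_degree(inst, conjuncts) >= alpha:
            return label
    return DEFAULT


def patient(**terms: str) -> Instance:
    """Hypothetical helper: full membership in one term per attribute."""
    return {a: {t: 1.0 if t == term else 0.0 for t in TERMS}
            for a, term in terms.items()}


# Made-up inputs, for illustration only (not MEDAN data).
base = dict(BlutdruckSystolisch="mf2", BlutdruckDiastolisch="mf2",
            Temperatur="mf2", Thrombozyten="mf2", Urinmenge="mf2")
patient_a = patient(**base)
patient_b = patient(**{**base, "BlutdruckSystolisch": "mf0"})
patient_c = patient(**{**base, "BlutdruckSystolisch": "mf0",
                       "BlutdruckDiastolisch": "mf4"})

for name, p in [("a", patient_a), ("b", patient_b), ("c", patient_c)]:
    print(name, classify(p))
```

Because the rules are ordered, the second rule is only consulted when the first does not fire, and the default class applies when neither does.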
FCF was thus able to improve significantly on the previous results, both with respect to rule set comprehensibility and rule set classification accuracy.⁸

⁸ In English, “BlutdruckSystolisch” means systolic blood pressure, “Temperatur” means temperature, “Thrombozyten” means platelets, “Urinmenge” means urine quantity, “ueberlebt” means survived, “BlutdruckDiastolisch” means diastolic blood pressure, and “verstorben” means deceased.

11.6 Summary

In this chapter we provided arguments why set covering is a good methodology for the induction of fuzzy rules. Crisp covering algorithms are a special case of fuzzy covering algorithms, and as such fuzzy covering algorithms are at least as powerful as crisp covering algorithms. We also provided theoretical arguments why fuzzy covering algorithms are more powerful, for example that, unlike for crisp rules, the decision boundary in the fuzzy case need not be axis-parallel. We also proposed a series of experiments to demonstrate cases where crisp rule induction fails, but fuzzy rules provide good results. We provided an empirical evaluation of different fuzzy methods on benchmark data sets to substantiate our claim that fuzzy set covering often performs better than other fuzzy learning methods, such as fuzzy decision trees or beam search. FCF was able to convincingly outperform the other fuzzy classifiers with respect to classification performance. In addition, FCF obtained significantly less complex rule sets. Finally, we provided two applications where FCF improved upon the performance of previously used methods.
CHAPTER 12
Conclusions and Directions for Future Research
The objective of this dissertation was to prove that set covering can be applied successfully for the induction of fuzzy classification rules from training data. Set covering has proven to be a very successful concept learning methodology in the crisp case, and many different algorithms applying this approach have been proposed. Fuzzy sets are a generalization of crisp sets, and a crisp set is a special case of a fuzzy set. As such, many different methods for the induction of fuzzy rules have been proposed. Some of the more successful induction methodologies are fuzzy decision trees, genetic algorithms, and partitioning methods. One drawback of most of these methods is that the induced rule sets are often not very comprehensible due to their rather large number of rules. There are also comparatively few methods that allow the induction of incomplete rules. Furthermore, most methods concentrate on extracting fuzzy set membership functions, and thus forgo the use of fuzzy sets as linguistic labels with meaning to domain experts. However, according to Guillaume [2001], the use of linguistic terms, small rule sets, and the induction of incomplete rules are exactly the criteria for obtaining comprehensible fuzzy rule sets.
By developing the fuzzy set covering rule induction methodology, this dissertation addressed the problem of inducing accurate, but also comprehensible fuzzy classification rules. Thus, we have extended the different classes of rule induction methods by adding fuzzy set covering to them. We have also developed four new fuzzy rule induction algorithms implementing this new methodology. The first algorithm, FUZZYBEXA, inherits its structure from its crisp ancestor BEXA. FUZZYBEXA induces a single rule through a conjunction specialization process based on excluding linguistic terms. It starts with the most general conjunction in its description language, and specializes it using a local beam search until certain stopping criteria are met. We have also proved various characteristics of the algorithm, for example that its description language induces a lattice, and that the fuzzy extension operator is an order-preserving mapping from descriptions to associated instance sets.
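The general-to-specific search by exclusion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the FUZZYBEXA implementation: a conjunction is represented as one set of allowed linguistic terms per attribute, an instance's degree is the minimum over attributes of the maximum membership among the allowed terms, the evaluation function `purity` is a hypothetical stand-in for the conjunction evaluation functions of the dissertation, and the only stopping criterion is that no specialization improves on the best score so far. All names and the toy data are invented for the example.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

Instance = Dict[str, Dict[str, float]]            # attribute -> term -> membership
Conj = Tuple[Tuple[str, FrozenSet[str]], ...]     # (attribute, allowed terms) pairs
Data = List[Tuple[Instance, bool]]                # (instance, is_positive)


def most_general(terms: Dict[str, List[str]]) -> Conj:
    """Most general conjunction: every term of every attribute is allowed."""
    return tuple((a, frozenset(terms[a])) for a in sorted(terms))


def degree(conj: Conj, inst: Instance) -> float:
    """Min over attributes of the max membership among allowed terms."""
    return min(max(inst[a][t] for t in allowed) for a, allowed in conj)


def specializations(conj: Conj) -> Iterator[Conj]:
    """Exclude one term from one attribute, never emptying a conjunct."""
    for i, (a, allowed) in enumerate(conj):
        if len(allowed) > 1:
            for t in sorted(allowed):
                yield conj[:i] + ((a, allowed - {t}),) + conj[i + 1:]


def purity(conj: Conj, data: Data) -> float:
    """Hypothetical evaluation: share of covered membership that is positive."""
    total = sum(degree(conj, x) for x, _ in data)
    pos = sum(degree(conj, x) for x, y in data if y)
    return pos / total if total > 0 else 0.0


def canon(conj: Conj) -> tuple:
    """Deterministic tie-break key."""
    return tuple((a, tuple(sorted(ts))) for a, ts in conj)


def induce_rule(terms: Dict[str, List[str]], data: Data,
                beam_width: int = 2) -> Tuple[Conj, float]:
    best = most_general(terms)
    best_score = purity(best, data)
    beam = [best]
    while beam:
        candidates = {s for c in beam for s in specializations(c)}
        ranked = sorted(candidates, key=lambda c: (-purity(c, data), canon(c)))
        beam = ranked[:beam_width]
        # Assumed stopping criterion: stop when nothing improves on the best.
        if not beam or purity(beam[0], data) <= best_score:
            break
        best, best_score = beam[0], purity(beam[0], data)
    return best, best_score


# Toy data, invented for illustration: positives have high "temp".
TERMS = {"temp": ["low", "high"], "bp": ["low", "high"]}
DATA: Data = [
    ({"temp": {"low": 0.0, "high": 1.0}, "bp": {"low": 1.0, "high": 0.0}}, True),
    ({"temp": {"low": 0.2, "high": 0.8}, "bp": {"low": 0.0, "high": 1.0}}, True),
    ({"temp": {"low": 1.0, "high": 0.0}, "bp": {"low": 1.0, "high": 0.0}}, False),
    ({"temp": {"low": 1.0, "high": 0.0}, "bp": {"low": 0.0, "high": 1.0}}, False),
]

rule, score = induce_rule(TERMS, DATA)
print(dict(rule), score)
```

On this toy data the search excludes the term "low" from "temp" and then stops, since no further exclusion improves the score; the "bp" conjunct keeps all its terms and could be dropped when the rule is printed.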
We also presented several experiments with FUZZYBEXA. An experimental evaluation with benchmark data sets investigated its different learning parameters. We measured the effect of the beam width, FUZZYBEXA’s sensitivity to noise, its pre- and post-training sensitivity to the value of the α-cut, and the effect of its various stop growth tests. The principal results are that FUZZYBEXA’s search effort grows at most linearly with increasing beam width, that it behaves well in the presence of noise, that it is not overly sensitive to the antecedent threshold, and that the use of the stop growth criteria significantly improves the search in terms of rule set complexity and search effort. The experiments also show that although FUZZYBEXA’s hypothesis space for most problems can be very large, the algorithm easily copes with normal size data sets, and that even very large data sets can be successfully searched.

The conjunction evaluation measure plays an important role in guiding the search for single rules. As such, we proposed a range of conjunction evaluation functions specially adapted to the fuzzy case. We also conducted experiments to investigate their performance for different data sets. The results showed that the evaluation function should be matched to the data set’s characteristics, and that no single evaluation function always performs best. However, our proposed Accuracy evaluation function performed very well in most circumstances, especially as measured by the size of the rule set.
We also presented a survey of different algorithms for the induction of fuzzy rules. These algorithms can be grouped into seven classes, depending on their induction strategy: greedy incremental rule learners, divide-and-conquer, similarity, stochastic, partitioning, hierarchical, and gradient descent. Of course, there are also some algorithms that do not fit neatly into one of these classes. We provided a comparison between the different classes and FUZZYBEXA, as an example of a fuzzy set covering algorithm. None of the algorithms has all of FUZZYBEXA’s characteristics; in fact, most have very little in common with FUZZYBEXA.
Since one new algorithm is not enough to establish a paradigm, we developed three more fuzzy algorithms applying the set covering approach: FUZZYSEEDSEARCH, FUZZCONRI, and FUZZYPRISM. FUZZCONRI and FUZZYPRISM use FuzzyCAL as description language, and employ append as specialization operator. FUZZYBEXA and FUZZYSEEDSEARCH use FuzzyAL as description language, and employ exclude as specialization operator.
FCF was introduced as a general framework for set covering algorithms, both crisp and fuzzy. The top layers of the framework encapsulate everything that is similar between different set covering algorithms. This includes the fuzzy set covering approach, and search heuristics such as conjunction evaluation, beam search, prepruning, and efficiency improvements. Any improvement to the top layers, or the addition of new or more advanced heuristics, will automatically benefit all algorithms that fit in the framework. FCF allows different covering algorithms to be characterized and compared. Thus, FCF allows the rapid development of new covering algorithms, since the designer needs to concentrate only on what differentiates a new algorithm from the rest. To demonstrate the applicability of the framework we showed that all four proposed covering algorithms fit within the framework. We also characterised each algorithm and described its various properties.
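The layering idea can be illustrated with a small Python sketch in which the outer covering loop is shared and the rule learner is passed in as a plug-in. This is a simplified sketch under stated assumptions, not the FCF architecture itself: covered positive instances are simply removed (no weighted cover), an instance counts as covered when its degree reaches a threshold alpha, and the plug-in `learn_single_term` is a deliberately trivial, hypothetical rule learner that picks one linguistic term. All function names and data are invented for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instance = Dict[str, float]              # linguistic term -> membership
Example = Tuple[Instance, bool]          # (instance, is_positive)


def set_cover(data: List[Example],
              learn_rule: Callable[[List[Example]], Optional[str]],
              degree: Callable[[str, Instance], float],
              alpha: float = 0.5,
              max_rules: int = 10) -> List[str]:
    """Shared covering loop: learn a rule, remove the positives it covers, repeat."""
    rules: List[str] = []
    remaining = list(data)
    while any(y for _, y in remaining) and len(rules) < max_rules:
        rule = learn_rule(remaining)
        if rule is None:
            break
        covered = [x for x, y in remaining if y and degree(rule, x) >= alpha]
        if not covered:
            break  # the rule covers no remaining positive instance
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining
                     if not (y and degree(rule, x) >= alpha)]
    return rules


# --- Hypothetical plug-in: a rule is a single linguistic term -------------

def term_degree(rule: str, inst: Instance) -> float:
    return inst[rule]


def learn_single_term(data: List[Example]) -> Optional[str]:
    """Pick the term with the largest positive-minus-negative membership sum."""
    if not data:
        return None
    terms = sorted(data[0][0])

    def score(t: str) -> float:
        return sum(x[t] if y else -x[t] for x, y in data)

    return max(terms, key=score)


# Invented toy data for illustration.
DATA: List[Example] = [
    ({"hot": 0.9, "wet": 0.1}, True),
    ({"hot": 0.1, "wet": 0.8}, True),
    ({"hot": 0.2, "wet": 0.3}, False),
]

rules = set_cover(DATA, learn_single_term, term_degree)
print(rules)
```

Swapping in a different `learn_rule` (for example, an exclusion-based or append-based specializer) leaves `set_cover` untouched, which is the sense in which improvements to the shared layer benefit every algorithm that plugs into it.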
To the best of our knowledge, there existed no algorithm for the induction of ordered fuzzy rule sets, or fuzzy decision lists. FUZZYBEXAII is a novel fuzzy rule induction algorithm following the simultaneous concept learning approach, and is capable of inducing decision lists (ordered rule sets). We showed that decision lists can compare favourably to unordered rule sets under the right conditions. If an appropriate conjunction evaluation function is used, the induced rule set can be very descriptive and highly accurate, while being extremely compact.
To motivate the use of fuzzy set covering we have provided arguments for the use of fuzzy set covering as opposed to crisp set covering or other fuzzy rule learning methods. Fuzzy set covering as a generalization of crisp set covering is far more powerful, and includes crisp set covering as a special case. We have also compared fuzzy set covering to other algorithms that are also capable of inducing incomplete rules and use fuzzy sets as linguistic labels. On average, FCF outperforms methods such as decision trees (e.g. FID) or beam search (e.g. FBS) in terms of classification accuracy. At the same time, FCF significantly outperforms these methods in terms of rule set comprehensibility. Finally, we provided results on two real world applications where FCF improved upon the state of the art. In the next section we list the major scientific contributions made by this dissertation. We then provide some directions for future research in Section 12.2, and Section 12.3 concludes the dissertation.
12.1 Scientific Contributions
We list the major scientific contributions made by this dissertation:
1. Establishing a new paradigm for the induction of fuzzy classification rules (“Fuzzy rule induction in a set covering framework”, [Cloete and van Zyl, 2006]);
2. Narrowing the gap between the symbolic and sub-symbolic machine learning communities (“A machine learning framework for fuzzy set covering algorithms”, [Cloete and van Zyl, 2004c]);
3. The first ever algorithm for the induction of fuzzy decision lists (“Simultaneous concept learning of fuzzy rules”, [van Zyl and Cloete, 2004f]);
4. A general fuzzy set covering framework (“Specialization models for a general fuzzy set covering framework”, [van Zyl and Cloete, 2006]);
5. Novel fuzzy rule evaluation functions, and their importance during rule induction (“Heuristic functions for learning fuzzy conjunctive rules”, [van Zyl and Cloete, 2004c], “Evaluation function guided search for fuzzy set covering”, [Cloete and van Zyl, 2004a]);
6. The algorithm FUZZYBEXA based on exclusion (“Fuzzy set covering with FuzzyBexa”, [Cloete and van Zyl, 2004b]);
7. The algorithm FUZZCONRI that induces rules in FuzzyCAL (“An inductive algorithm for learning conjunctive fuzzy rules”, [van Zyl and Cloete, 2004d], “FuzzConRI - a fuzzy conjunctive rule inducer”, [van Zyl and Cloete, 2004a]);
8. The algorithm FUZZYPRISM that uses fuzzy information gain (“FuzzyPRISM: a specialization model for the FuzzyBexa framework”, [van Zyl and Cloete, 2004b]);
9. Encoding FuzzyAL rules as prior knowledge in a neural network (“Prior knowledge for fuzzy knowledge-based artificial neural networks from fuzzy set covering”, [van Zyl and Cloete, 2004e]).