
• Although IBLStreams has achieved satisfactory results in terms of performance and recovery, IBL methods still have considerable potential for improvement. One may think of constructing a hybrid approach in which the case base contains not only instances but also induced rules; such an approach has the potential to improve IBL methods for two reasons: (i) rules can better summarize densely populated regions of the instance space, thus removing the need to keep numerous redundant examples from the same region, and (ii) rules can be used as a temporary substitute for the case base when drifts occur and a large portion of examples is removed from the case base.

• Evolving fuzzy pattern trees, proposed in this thesis, focus only on binary classification problems; this calls for an extension that learns from streams of regression or multiclass classification problems. Motivated by the modification of the pattern tree’s induction method proposed by Senge and Hüllermeier [147], we plan to induce an ensemble of trees, a forest, in parallel using techniques similar to the Hoeffding race [115], while also trying to enforce the diversity of the induced trees. In this way, drifts can be handled adaptively through the removal and addition of trees; the trees in this ensemble remain interpretable when inspected individually.

• The proposed recovery analysis shows the potential for discovering hidden resemblances between different learning methods based on their recovery patterns; such a similarity is discovered when comparing AMRules with FIMTDD, which can only be explained by the equivalence between the tree induced by FIMTDD and the rules induced by AMRules. This equivalence becomes more obvious when considering that both methods use Hoeffding’s bound and the Page-Hinkley test in the same way during the induction. We recommend applying the recovery analysis to more learning paradigms and methods in order to discover the strengths and weaknesses of each approach in its ability to adapt and recover. Recovery analysis also shows the potential to be extended to unsupervised learning methods in general, and clustering in particular, in order to evaluate the capability of a clustering method to recover after a change in the data-generating distribution.

• The proposed survival analysis approach assumes a fixed set of parallel event streams, i.e., events are emitted from a fixed set of objects; this restriction causes the streams of events to contain recurrent events only.

One may consider the case where the stream-emitting objects are allowed to be removed after an event, or to be censored after being lost. In such a setting, the risk set changes over time and events are no longer restricted to be recurrent. This requires a new formulation of the likelihood function that allows the risk set to change dynamically.

Another extension of survival analysis on data streams may consider the characteristic properties of the events, e.g., the magnitude of an earthquake; utilizing these properties in survival analysis may have a positive effect on finding prognostic factors.

Appendix A

Methods

In this thesis, we compare different adaptive learning algorithms for different learning tasks; each of these methods belongs to a paradigm that exhibits unique learning and recovery patterns (see Table A.1 for a brief summary of the studied methods). All approaches used are implemented in and provided by the MOA framework, which is described in Appendix B, except FLEXFIS, which is implemented using the fuzzy logic toolbox provided in Matlab. IBLStreams and eFPT can be downloaded as extensions for the framework, whereas the remaining methods are available in the 2013.11 release of MOA.

A.1 Adaptive Hoeffding Tree

The Hoeffding tree [56] is an incremental decision tree approach, tailored for classification on data streams. Upon the arrival of a new training example, the algorithm examines each inner node of the tree and decides whether the current split (attribute) is still optimal, or whether an alternative split appears to be advantageous. The choice of the optimal splitting attribute is based on statistical hypothesis testing. More specifically, Hoeffding’s inequality [81] is used to check whether the information gain of an alternative attribute is significantly higher than the gain of the currently chosen attribute.

Hoeffding’s inequality states that, with probability $1 - \delta$, the difference between the observed mean and the true mean of a random variable $r$ with range $R$ does not exceed $\epsilon$ after $n$ observations, where

$$\epsilon = \sqrt{\frac{R^2 \ln(1/\delta)}{2n}}$$
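As a worked illustration, the standard form of the bound, $\epsilon = \sqrt{R^2 \ln(1/\delta)/(2n)}$, can be solved for the number of observations $n$ needed to reach a desired $\epsilon$. The numerical values below ($R = 1$, $\delta = 10^{-7}$, $\epsilon = 0.05$) are assumptions chosen only for illustration and are not taken from the experiments in this thesis.

```latex
% Solving the Hoeffding bound for n
\[
\epsilon = \sqrt{\frac{R^2 \ln(1/\delta)}{2n}}
\quad\Longleftrightarrow\quad
n = \frac{R^2 \ln(1/\delta)}{2\epsilon^2}
\]
% Illustrative values: R = 1, \delta = 10^{-7}, \epsilon = 0.05
\[
n = \frac{1^2 \cdot \ln(10^{7})}{2 \cdot 0.05^2}
  \approx \frac{16.118}{0.005}
  \approx 3223.6,
\qquad\text{so } n \geq 3224 .
\]
```

Since $n$ grows with $1/\epsilon^2$, halving the tolerated deviation roughly quadruples the number of examples a node has to observe.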

The Hoeffding tree uses this bound to compare the information gains $\bar{G}(X_a)$ and $\bar{G}(X_b)$ of the two best splitting attributes $X_a$ and $X_b$, respectively. Assume that the attribute $X_a$ is better than $X_b$, with $\Delta\bar{G} = \bar{G}(X_a) - \bar{G}(X_b) > 0$. On the arrival of new data samples, Hoeffding’s bound guarantees that the true difference (in information gain) $\Delta G$ and the empirical difference $\Delta\bar{G}$ satisfy the inequality $\Delta G > \Delta\bar{G} - \epsilon$ with probability $1 - \delta$. Observing $\Delta\bar{G} > \epsilon$ at any time means that $\Delta G > 0$; thus, selecting the attribute $X_a$ is now guaranteed, with probability $1 - \delta$, to yield the largest information gain.
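The split test described above can be sketched in a few lines of Python. This is a minimal sketch, not the MOA implementation: the function names `hoeffding_bound` and `choose_split`, the dictionary of per-attribute gains, and the example values are all hypothetical, and the bound is assumed to have the standard form $\epsilon = \sqrt{R^2 \ln(1/\delta)/(2n)}$.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2 * n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def choose_split(gains, value_range, delta, n):
    """Return the best attribute if its lead is significant, else None.

    `gains` maps attribute names to empirical information gains (hypothetical
    input format) and must contain at least two attributes.
    """
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    (best_attr, best_gain), (_, second_gain) = ranked[0], ranked[1]
    epsilon = hoeffding_bound(value_range, delta, n)
    # Split only when the empirical gain difference exceeds epsilon,
    # which implies a positive true difference with probability 1 - delta.
    if best_gain - second_gain > epsilon:
        return best_attr
    return None


if __name__ == "__main__":
    gains = {"a": 0.30, "b": 0.22, "c": 0.05}  # illustrative values only
    for n in (1000, 2000):
        print(n, choose_split(gains, value_range=1.0, delta=1e-7, n=n))
```

With these illustrative gains the difference of 0.08 is not yet significant after 1000 examples, but becomes significant after 2000, because $\epsilon$ shrinks as $n$ grows.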

An adaptive version of the Hoeffding tree (AdpHoef) has been presented in [23]. This algorithm maintains a drift detection statistic in each node to judge the compatibility of the current tree or subtree with the recently received data. For each of these nodes, an alternative subtree is maintained and learned on the recent data only; this alternative subtree replaces the initial subtree, rooted at that node, whenever the node’s drift detector signals a change. This variant of Hoeffding trees uses the ADWIN [22] technique, a parameter-free method for detecting the rate of change in data streams. In this thesis, the AdpHoef algorithm is applied to binary and multiclass classification problems.
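The replace-on-drift mechanism can be illustrated with a heavily simplified sketch. Everything in it is an assumption made for illustration: `TwoHalvesDetector` is a toy stand-in that compares error rates in two halves of a fixed window and is not ADWIN, `MajorityClassModel` is a trivial placeholder for a subtree, and the policy of restarting the alternative model every `recent_size` examples is just one possible way to keep it focused on recent data.

```python
from collections import Counter, deque


class MajorityClassModel:
    """Hypothetical placeholder for a subtree: predicts the most frequent label."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, label):
        self.counts[label] += 1

    def predict(self):
        return self.counts.most_common(1)[0][0] if self.counts else None


class TwoHalvesDetector:
    """Toy stand-in for ADWIN (not ADWIN itself): signals a change when the
    error rate in the newer half of a fixed window exceeds that of the older
    half by more than `threshold`."""

    def __init__(self, window_size=200, threshold=0.2):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def add(self, error):
        self.window.append(error)
        if len(self.window) < self.window.maxlen:
            return False
        values = list(self.window)
        half = len(values) // 2
        older = sum(values[:half]) / half
        newer = sum(values[half:]) / (len(values) - half)
        return newer - older > self.threshold


class AdaptiveNode:
    """Keeps a main model plus an alternative trained on recent data only."""

    def __init__(self, recent_size=50):
        self.recent_size = recent_size
        self.main = MajorityClassModel()
        self.alternative = MajorityClassModel()
        self.alt_seen = 0
        self.detector = TwoHalvesDetector()

    def predict(self):
        return self.main.predict()

    def learn(self, label):
        # Test-then-train: record the main model's error before updating it.
        error = int(self.main.predict() != label)
        drift = self.detector.add(error)
        self.main.learn(label)
        # Restart the alternative regularly so it reflects recent data only.
        if self.alt_seen >= self.recent_size:
            self.alternative = MajorityClassModel()
            self.alt_seen = 0
        self.alternative.learn(label)
        self.alt_seen += 1
        if drift:
            # The alternative replaces the main model; monitoring starts anew.
            self.main = self.alternative
            self.alternative = MajorityClassModel()
            self.alt_seen = 0
            self.detector = TwoHalvesDetector()
```

On a stream that emits label 0 for a long time and then switches to label 1, a node of this kind switches its prediction shortly after the change, whereas a single majority-class model trained on the whole stream keeps predicting the old label until the new one outnumbers it.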
