Francisco J. Ferrer Troyano
Jesús S. Aguilar, José C. Riquelme
University of Sevilla
ferrer@lsi.us.es
FACIL:
Incremental Rule Learning
from Data Streams
Contents
Motivation
FACIL
The hybrid knowledge model
Three user parameters
IIL updating
The moderate growth d istance
Implicit and explicit forgetting heuristics
The multi-strategy classification method
Experiments
Future work
Motivation
Why Classify? Why Rules?
Heterogeneous Data Sources Æ Noise, Missing Values, Inconsistency High-Speed DataÆ Fast algorithms, approximate answers Open-Ended Data Æ On-line learning… complex models?
FACIL – The hybrid knowledge model
Pres.
0 122
mm / Hg
25 35 75 90
Ritmo
50 210
puls/min
60 80 130 150
e1 a1 e2
a2 a3
e3 Frontera de Decisión
Primer enemigo recibido Amigo obtenido para el 2º enemigo
60 - 150
25-90
Ejemplos Enemigos Ejemplos Amigos
Sop-Pos: 96 % Sop.Rel.=20% Prec.=98.5% Edad 12 - 20
56 Intervalos
Ritmo
50 210
puls/min
e3
a2 a1
Pres.
0 122
mm / Hg
e1 e1
a3
a1 - a3 e1 - e3 21 - 29 30 - 38 39 - 47 48 - 56
e1 a1 e3 a3 a2 e2 12
Border examples inside rules
FACIL – The Algorithm
Three user parameters
Maximum number of rules / label
Æ necessary to bound the computational complexity
Minimum accuracy (purity) / rule: p/(p+n)
Æ anytime,approximate answers
Maximum growth / dimension
FACIL – IIL updating
IIL: Every example Æ 1/3 cases:
1. Success
2. Anomaly
3. Novelty
General IL Approach
Successive Episodes Æ Updating: Mt= f(Vt,Mt-1) Window of examples Æ Vt: New [and] past examples
Éxito
0.7 0.6 0.1
0.4 0.5 0.7
A
A
1.0 0.2
a A
e a
FACIL – IIL / Success
0.7 0.6 0.1
0.4 0.5 0.7
1.0 0.2
X
a1 e1
a3
a2
e3
e2
a4
FACIL – IIL / Anomaly
0.7 0.6 0.1
0.4 0.5 0.7
1.0 0.2
Anomaly
FACIL – Learning Approach / Case 3
Selection:
The rule implying the minimum growth
FACIL – IIL / Novelty: The Growth Distance
0.7 0.9 0.1
0.1 0.5 0.9
1.0
R
X
1.0
d(R,x) = 0.2+0.4
B C
A 0.1 0.4 0.6
D
X
1.0
R R
d(R,x) = 0.25+0.2
0.7 0.6 0.1
0.4 0.5 0.7
1.0 0.2
0.7 0.6 0.1
0.4 0.5 0.7
R
R
1.0 0.2
Max. Dif. = 15%
G(R
a,x)= 0.2
G(R
b,x)= 0.2
X
FACIL – IIL / Novelty: The Moderate Growth Distance
FACIL – Implicit and Explicit Forgetting Heuristics
reglas en t=16
ejemplos t=35
B
A
A B
reglas en t=34
B
A
e16
B
e16, e29e19, e28
e34
último ejemplo
e28 e19
e16 e16
e34
t=[17,33]
e29
e16 e34
multiple windows
A A
A B
B
A
FACIL – Multi-strategy Classification
?
A
? ?
A
B
B
B
Experiments – General Purpose Classifier Evaluation
FACIL as a UCI Repository
10 Folds Cross Validation, 10 times, t-student (0.05)
Implicit Forgetting Heuristic
Synthetic Data Streams
Change Magnitude = ±0.1 for 40% attributes every 104examples INPUT: 105examples - 5% class noise: 900 training / 100 testing Both Implicit and Explicit Forgetting Heuristics
UCI Repository – Accuracy
50 60 70 80 90 100
Balance Breast C, Glass Heart S, Ionosphere Iris Diabetes Sonar Vehicle Vowel Waveform Wine C4.5Rules FACIL
Numerical Attributes
Num. & Symbolic Atts.
40 50 60 70 80 90 100
Anne al
Auto s
Clev elan
d Cred
it Rati ng
Germ an Cr
edit Hep
atitis Horse
Coli
c Zoo
Audo ilogy
Brea st C
ancer Mush
room Prim
ary T umor
Soyb
ean Vote
C4.5Rules FACIL
UCI Repository – Accuracy
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Balance Breast C, Glass Heart S, Ionosphere Iris Diabetes Sonar Vehicle Vowel Waveform Wine C4.5Rules FACIL
Numerical Attributes
UCI Repository – Number of rules
0 10 20 30 40 50 60 70 80
Anneal Aut
os Clev
eland Cred
it Ra ting
Germ an Cr
edit Hepat
itis Horse
Colic Zoo
Audoi logy
Breas t Ca
ncer Mus
hroom Prim
ary Tum
or Soybean
Vote
C4.5Rules FACIL Num. & Simb. Atts.
UCI Repository – Number of rules
0,05 0,1 0,15 0,2 0,25 0,3 0,35 0,4 0,45
10 20 30 40 50 60 70 80 90 100
Valores límite para los tres parámetros
En función del máximo número de reglas En función del máximo crecimiento permitido En función del umbral de mínima pureza
Time (seconds)
UCI Repository – Parametric Sensitivity
81 81,5 82 82,5 83 83,5 84 84,5 85 85,5 86 86,5 87
10 20 30 40 50 60 70 80 90 100
Valores límite para los tres parámetros
En función del máximo número de reglas En función del máximo crecimiento permitido En función del umbral de mínima pureza
Accuracy (%)
UCI Repository – Parametric Sensitivity
24 29 34 39 44 49 54 59 64 69 74
10 20 30 40 50 60 70 80 90 100
Valores límite para los tres parámetros
En función del máximo número de reglas En función del máximo crecimiento permitido En función del umbral de mínima pureza
Number of rules
UCI Repository – Parametric Sensitivity
Máx. Growth = 50% Máx. Growth = 75%
Nº Attributes Accuracy Time Nº Rules ≤ 25 Accuracy Time Nº Rules ≤ 25
10 96,36 14 10 98.32 20 7
20 93,25 63 9 94.57 35 12
30 91,04 124 3 89.26 53 10
40 89,7 221 2 89.19 67 7
50 83,88 277 4 87.72 77 10
Moving Hyperplane – Accuracy, time, and number of rules
Î>180 examples / second
Î>3500 examples / second Î>2500 examples / second
Î>650 examples / second
0 50 100 150 200 250 300 350 400 450 500 550 600
10 20 30 40 50
Max Crec. = 50 %, Max Nº Reglas = 50 Max Crec. = 75 %, Max Nº Reglas = 100
Time
Moving Hyperplane – Sensitivity to the number of attributes
73 78 83 88 93 98
1000 10000 20000 30000 40000 50000
Max Crec. = 75 %, Max Nº Reglas = 25 Max Crec. = 100 %, Max Nº Reglas = 10
Accuracy
Moving Hyperplane – Sensitivity to the number of examples
Sensitivity to attribute
Removing / Recovering attributes To adapt the Growth distance:
Changing attribute weights Æ improve the selection of rules
Reordering attributes according influence Æ speed-up the updating process
Processing streams with a variable number of attributes
To evaluate alternative distances between a rule and an example