Introduction to Machine Learning
Learning: Basic Concepts
What is learning?
` Ability to use percepts from the outside world not only for
  reacting, but for improving actions in future events.
` Implies that we know when and how to use this new knowledge.
  ` When: a pattern is detected
What is machine learning?
` Example:
  ` Imagine a supermarket chain with hundreds of stores selling
    groceries to millions of customers.
  ` Each sale generates a lot of data that can be analyzed and
    converted into information.
  ` This information can be used to give people suggestions when
    buying.
` If we knew who would buy an item, we would just write code for
  the computer to remind them.
` Because we do not know, we collect data and hope to extract
  useful patterns from it.
What is ML? …
` The computer algorithm should be able to:
  ` Identify patterns in the data (When)
  ` Construct a good and useful approximation of the solution (How)
What is ML? …
` “Machine learning is programming computers to optimize a
  performance criterion using example data or past experience.”
  Alpaydin, E. (2004)
` A model is defined with some parameters.
` Learning is the execution of a computer program to optimize the
  parameters of the model using training data or past experience.
` Two types of models:
  ` Predictive model: makes predictions about the future.
  ` Descriptive model: gains knowledge from the data.
What is ML? …
` “A computer program is said to learn from experience E with
  respect to some class of tasks T and performance measure P, if
  its performance at tasks in T, as measured by P, improves with
  experience E.”
  Mitchell, T. (1997)
` Example: handwriting recognition
  ` Task T: recognizing and classifying handwritten words within
    images.
  ` Performance measure P: percent of words correctly classified.
  ` Training experience E: a database of handwritten words with
    given classifications.
Design of a learning element
1. Which components of the performance element should be learned?
2. What feedback is available to learn these components?
Components of the performance element
` What can be learned?
  ` Direct mapping from conditions on the current state to actions.
  ` Means to infer relevant properties of the world from the
    percept sequence.
  ` Information about the way the world evolves and the results of
    possible actions the agent can take.
  ` Utility information indicating the desirability of world states.
  ` Action-value information indicating the desirability of actions.
  ` Goals that describe classes of states whose achievement
    maximizes the agent's utility.
Feedback
` Components can be learned from appropriate feedback.
  ` Example: training Tae Kwon Do, driving a taxi.
` Type of feedback:
  ` The most important factor in determining the nature of the
    learning problem.
  ` Three cases:
    1. Supervised learning
    2. Unsupervised learning
    3. Reinforcement learning
Supervised Learning
` Learning a function from examples of its inputs and outputs.
` There is an input X, an output Y, and the task is to learn the
  mapping from input to output.
` Output values can be provided:
  ` By a supervisor: someone feeds the output.
  ` By the environment: detected by sensors.
` Examples:
  ` Learn a condition-action rule for punching.
  ` Learn to differentiate between a dog and a cat.
  ` Regression
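The dog-vs-cat example can be sketched as a tiny supervised learner. Everything here is illustrative: the features (weight in kg, ear length in cm) and their values are made up, and a 1-nearest-neighbor rule stands in for any supervised method that learns the mapping X → Y from labeled pairs.

```python
# A minimal sketch of supervised learning: labeled (input, output) pairs
# are given, and the learner predicts the output for a new input.
# Features are (weight in kg, ear length in cm); all values hypothetical.
examples = [
    ((30.0, 10.0), "dog"),
    ((25.0, 9.0),  "dog"),
    ((4.0,  6.0),  "cat"),
    ((5.0,  7.0),  "cat"),
]

def classify(x):
    """Predict the label of the closest training example (1-NN)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(examples, key=lambda ex: dist2(ex[0], x))[1]

print(classify((28.0, 9.5)))  # prints "dog": closest to the dog examples
```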
Unsupervised learning
` Learning patterns in the input when no specific output values
  are supplied.
` Aim: to find regularities in the input.
` Example:
  ` Learn when it might rain.
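Finding regularities without supplied outputs can be sketched with a tiny two-means clustering step; the 1-D readings (hypothetical humidity values) carry no labels, and the algorithm discovers the two groups on its own.

```python
# A minimal sketch of unsupervised learning: two-means clustering of
# unlabeled 1-D readings (hypothetical humidity values). No outputs are
# supplied; the algorithm finds the regularities (two clusters) itself.
data = [10.0, 12.0, 11.0, 85.0, 90.0, 88.0]

def two_means(xs, iters=10):
    c1, c2 = min(xs), max(xs)                       # start from extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1 = sum(g1) / len(g1)                      # move centers to means
        c2 = sum(g2) / len(g2)
    return c1, c2

print(two_means(data))  # two cluster centers, one low and one high
```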
Reinforcement learning
` The output of the system is a sequence of actions.
` These actions are part of a policy.
  ` A single action is not important.
  ` The policy is what must be learned.
` The agent must learn from reinforcement which actions are best,
  i.e. the policy.
` Examples:
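The idea of learning a policy from reinforcement can be sketched with tabular Q-learning (a standard reinforcement learning algorithm, not taken from the slides) on a hypothetical corridor world: states 0..3, actions -1/+1, and a reward of +1 only for reaching cell 3. No single action is rewarded directly; the learner must discover the whole policy "always go right".

```python
# A minimal Q-learning sketch on a hypothetical 4-cell corridor.
import random

random.seed(0)
Q = {(s, a): 0.0 for s in range(4) for a in (-1, 1)}
alpha, gamma, eps = 0.5, 0.9, 0.2       # learning rate, discount, exploration

for _ in range(200):                     # episodes
    s = 0
    while s != 3:
        if random.random() < eps:        # explore occasionally
            a = random.choice((-1, 1))
        else:                            # otherwise act greedily
            a = max((-1, 1), key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), 3)       # deterministic move, clamped
        r = 1.0 if s2 == 3 else 0.0      # reinforcement only at the goal
        best_next = max(Q[(s2, -1)], Q[(s2, 1)])
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = {s: max((-1, 1), key=lambda act: Q[(s, act)]) for s in range(3)}
print(policy)  # the learned policy: move right in every cell
```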
Representation of the learned information
` Polynomials
` Propositional logic
` Predicate calculus
` Bayesian networks
` Neural networks
Applications of machine learning
` Learning associations
  ` Learn how people associate elements (ex. buying groceries)
` Classification
  ` Learn to classify elements in different categories
` Prediction
  ` Learn to predict if some action will happen
` Pattern recognition
  ` Learn to find familiar patterns (characters, faces, objects, etc.)
` Knowledge extraction
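Learning associations from grocery purchases can be sketched as estimating a conditional probability from transaction data; the basket contents here are hypothetical.

```python
# A minimal sketch of learning associations: estimate P(chips | beer)
# from (hypothetical) supermarket transactions, i.e. how often customers
# who buy beer also buy chips.
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"beer", "bread"},
    {"milk", "bread"},
]

def assoc(x, y):
    """P(y is bought | x is bought), estimated from the transactions."""
    with_x = [t for t in transactions if x in t]
    return sum(y in t for t in with_x) / len(with_x)

print(assoc("beer", "chips"))  # 2 of the 3 beer baskets contain chips
```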
Applications of machine learning …
` Outlier detection
  ` Data that does not belong to a class
` Regression problems
ML is multidisciplinary
` Artificial intelligence
` Bayesian methods
` Computational complexity theory
` Control theory
` Information theory
` Philosophy
` Psychology and neurobiology
Designing a learning system
1. Choosing the training experience
2. Choosing the target function
3. Choosing a representation for the target function
4. Choosing a function approximation algorithm
Choosing the training experience …
` How is the feedback?
  ` Direct: provide a state and its correct solution.
    ` Feedback is clear and direct.
  ` Indirect: provide a sequence of states and the final outcome.
    ` Correctness must be inferred.
    ` Credit assignment problem: the degree to which each state
      deserves credit or blame for the final outcome.
Choosing the training experience …
` How is the control of the sequence of training examples?
  ` Selection of a state is made by a supervisor.
  ` Selection of a state is made by the learner and a solution is
    provided.
  ` Selection of a sequence of states is made by the learner and
    only the final outcome is provided.
Choosing the training experience …
` How well does the distribution of training examples represent
  the distribution over which performance will be measured?
  ` Most reliable: training examples follow a distribution similar
    to that of future test examples.
  ` It may be necessary to learn from a distribution of examples
    somewhat different from that of the final evaluation.
  ` Most work in ML relies on the assumption that the distribution
    of training examples is identical to that of test examples.
Choosing the training experience …
` Example: learn to play checkers for a world tournament
  ` Task T: playing checkers
  ` Performance measure P: percent of games won
  ` Training experience E: games played against itself
    ` Indirect feedback.
Choosing the target function …
` What type of knowledge will be learned and how will it be used?
  ` Key design choice.
  ` Reduces the problem of improving performance P at task T to
    the problem of learning a target function.
` Example: given a generator of legal moves, select the best move.
  ` Target function:
    ChooseMove : B → M
    B = board state
    M = set of legal moves
Choosing the target function …
` Example …
  ` Target function:
    V : B → ℜ
    B = board state
    ℜ = set of real numbers
    Higher scores are assigned to better states.
  ` V(b) is defined as:
    1. If b is a final state that is won, then V(b) = 100.
    2. If b is a final state that is lost, then V(b) = -100.
    3. If b is a final state that is drawn, then V(b) = 0.
    4. If b is not a final state, then V(b) = V(b′), where b′ is the
       best final state reachable from b when playing optimally.
  ` Case 4 is nonoperational, as it is not efficiently computable.
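The four cases of the ideal target function can be written down directly for a toy game tree (the states "a".."f" and their moves are hypothetical), with a minimax recursion standing in for "played optimally". The recursion in case 4 is exactly what makes V nonoperational: for real checkers it would have to search to the end of the game.

```python
# A sketch of the ideal target function V on a toy game tree.
FINAL = {"d": "win", "e": "loss", "f": "draw"}           # final states
MOVES = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"]}   # legal successors

def V(b, our_turn=True):
    if b in FINAL:                       # cases 1-3: final states
        return {"win": 100, "loss": -100, "draw": 0}[FINAL[b]]
    values = [V(s, not our_turn) for s in MOVES[b]]
    # case 4: value of the best final state reachable under optimal
    # play by both sides (we maximize, the opponent minimizes)
    return max(values) if our_turn else min(values)

print(V("a"))  # prints 0: optimal play from "a" ends in the draw "f"
```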
Choosing the target function …
` Goal: discover an operational description of the ideal target
  function V.
` V is very difficult to learn exactly.
` Learning algorithms therefore acquire an approximation of V,
  denoted V̂.
Choosing a representation for the target function
` The characteristics of the learning problem will determine the
  representation:
  ` Table with values
  ` Collection of rules
  ` Neural network
  ` Polynomial function
` Ideal: a very expressive representation.
` Drawback: the more expressive the representation, the more
  training data is required.
Choosing a representation for the target function …
` Example:
  ` x1: number of black pieces on the board
  ` x2: number of red pieces on the board
  ` x3: number of black kings on the board
  ` x4: number of red kings on the board
  ` x5: number of black pieces threatened by red
  ` x6: number of red pieces threatened by black

  V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
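The linear representation is a weighted sum of the six board features, which a short function makes concrete; the weights and the feature vector below are arbitrary placeholders, since learning has not yet adjusted them.

```python
# A minimal sketch of the linear representation Vhat(b) = w0 + sum(wi*xi).
def v_hat(weights, features):
    """weights = [w0..w6]; features = [x1..x6]."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

w = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]   # placeholder weights
b = [12, 11, 1, 0, 0, 2]                      # hypothetical board features
print(v_hat(w, b))  # prints 5.0
```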
Choosing a function approximation algorithm
1. Estimating training values
   ` To learn, a set of training examples is required.
   ` Each example must have the form [b, Vtrain(b)]:
     ` State b
     ` Training value Vtrain(b)
   ` Example of a training example:
     ` [[x1 = 0, x2 = 5, x3 = 0, x4 = 1, x5 = 0, x6 = 0], 100]
   ` Training values for end states are easy to assign.
Choosing a function approximation algorithm …
` A final outcome does not give information on the quality of
  intermediate states.
` A very successful rule for estimating training values is:

  Vtrain(b) ← V̂(Successor(b))

  ` Very accurate near final states.
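The estimation rule can be sketched over one game trace: each non-final state is labeled with the current V̂ of its successor, and the final state gets its true value. The weights and the three feature vectors below are hypothetical; Successor(b) is taken to be the next state in the trace.

```python
# A sketch of Vtrain(b) <- Vhat(Successor(b)) over a hypothetical trace.
def v_hat(weights, features):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

w = [0.0, 0.5, -0.5, 1.0, -1.0, -0.2, 0.2]   # current (hypothetical) weights
trace = [                                     # learner's states, in order
    [12, 12, 0, 0, 1, 1],
    [11, 12, 0, 0, 2, 0],
    [11, 11, 1, 0, 0, 0],                     # final state: a win
]

training = []
for b, successor in zip(trace, trace[1:]):
    # non-final state: bootstrap its training value from the successor
    training.append((b, v_hat(w, successor)))
training.append((trace[-1], 100))             # final winning state: +100
print(training)
```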
Choosing a function approximation algorithm …
2. Adjusting the weights
   ` The learning algorithm must find the weights wi that best fit
     the training examples.
   ` A common measure of best fit is the squared error:

     E ≡ Σ⟨b, Vtrain(b)⟩ ∈ training_examples (Vtrain(b) − V̂(b))²

   ` E is the error between the training values and the values
     predicted by V̂.
   ` So, we seek the weights that minimize E for the training
     examples.
Choosing a function approximation algorithm …
` The LMS (least mean squares) algorithm:
  ` Incrementally refines the weights.
  ` Is robust to errors in the estimated training values.
  ` For each observed training example it adjusts the weights a
    small amount in the direction that reduces the error.
` For each training example [b, Vtrain(b)]:
  ` Use the current weights to calculate V̂(b).
  ` For each weight wi, update it as:

    wi ← wi + η (Vtrain(b) − V̂(b)) xi

    ` η is a small constant that moderates the size of the weight
      update.
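The LMS update rule can be sketched end to end: repeatedly apply wi ← wi + η(Vtrain(b) − V̂(b))xi and watch the squared error E shrink. The two training examples are hypothetical, and x0 = 1 is prepended so the same rule also updates the constant weight w0.

```python
# A minimal LMS sketch: per-example weight updates that reduce the error.
def v_hat(w, x):
    return sum(wi * xi for wi, xi in zip(w, [1.0] + list(x)))

def lms_step(w, examples, eta=0.001):
    # one pass: after each example, nudge every weight toward less error
    for x, v_train in examples:
        err = v_train - v_hat(w, x)
        w = [wi + eta * err * xi for wi, xi in zip(w, [1.0] + list(x))]
    return w

def squared_error(w, examples):
    return sum((v - v_hat(w, x)) ** 2 for x, v in examples)

examples = [([11, 11, 1, 0, 0, 0], 100.0),   # features of a won board
            ([0, 5, 0, 1, 0, 0], -100.0)]    # features of a lost board
w = [0.0] * 7
before = squared_error(w, examples)
for _ in range(500):
    w = lms_step(w, examples)
print(squared_error(w, examples) < before)  # prints True: the fit improves
```

Note that η must be small relative to the feature magnitudes; a larger η (say 0.01 here) makes the per-example updates overshoot and diverge.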
Designing a learning system - summary
1. Choosing the training experience
   1. Feedback
   2. Control of sequence of examples
   3. Distribution of examples
2. Choosing the target function
   1. Function that is operational
3. Choosing a representation for the target function
   1. Expressive representation
4. Choosing a function approximation algorithm
Final design
[Figure: the final design as four interacting modules. The Experiment
Generator proposes a new problem (an initial game board) to the
Performance System; the Performance System plays the game using the
current hypothesis V̂ and passes the solution trace (game history) to
the Critic; the Critic extracts training examples from the trace; the
Generalizer uses them to produce a new hypothesis V̂.]
[Figure: summary of design choices. Determine type of training
experience: games against experts, games against self, table of correct
moves, … Determine target function: Board → value, Board → move, table
of correct moves, … Determine representation of learned function:
polynomial, linear function of six features, artificial neural
network, … Determine learning algorithm: …]
Issues in machine learning
` What algorithms exist for learning general target functions from
  specific training examples?
` How much training data is sufficient?
` When and how can prior knowledge guide the process of
  generalizing from examples?
` What is the best strategy for choosing a useful training
  experience?
` What is the best way to reduce the learning task to one or more
  function approximation problems?
Exercises
The following exercises must be handed in at the end of the class.
They will be graded as part of the “Actividades Analíticas”.

For each exercise define:
1. Choosing the training experience
   1. Feedback
   2. Control of sequence of examples
   3. Distribution of examples
   ` In particular, remember to define task T, performance measure
     P, and training experience E.
2. Choosing the target function
   1. Function that is operational
   ` In particular, define the target function.
3. Choosing a representation for the target function
   1. Expressive representation
   ` In particular, define the representation for the target
     function.
4. Choosing a function approximation algorithm
   1. Estimating training values
Exercises
` Tic-tac-toe
  ` A game for two players, O and X, who take turns marking the
    spaces in a 3×3 grid, usually with X going first.
  ` The player who succeeds in placing three respective marks in a
    horizontal, vertical, or diagonal row wins the game.
Exercises …
` Reversi (Othello)
  ` Involves play by two parties on an eight-by-eight square grid
    with pieces that have two distinct sides.
  ` The goal for each player is to make pieces of their color
    constitute a majority of the pieces on the board at the end of
    the game.
Exercises …
` Connect Four (also known as Plot Four, Four in a Row, and Four
  in a Line)
  ` A two-player game.
  ` Players take turns dropping alternating colored discs into a
    seven-column, six-row vertically-suspended grid.
  ` The object of the game is to connect four discs of one's own
    color in a row, vertically, horizontally, or diagonally, before
    the opponent does.