Introduction to Machine Learning
Learning: Basic Concepts
What is learning?
` Ability to use percepts from the outside world not only for
  reacting, but for improving actions in future events.
` Implies that we know when and how to use this new knowledge.
  ` When: a pattern is detected
What is machine learning?
` Example:
  ` Imagine a supermarket chain with hundreds of stores selling
    groceries to millions of customers.
  ` Each sale generates a lot of data that can be analyzed and
    converted into information.
  ` This information can be used to give people suggestions when
    buying.
` If we knew who would buy an item, we would just write code for
  the computer to remind them.
` Because we do not know, we collect data and hope to extract
  useful patterns from it.
What is ML? …
` The computer algorithm should be able to:
  ` Identify patterns in the data (When)
  ` Construct a good and useful approximation of the solution (How)
What is ML? …
` “Machine learning is programming computers to optimize a
  performance criterion using example data or past experience.”
  Alpaydin, E. (2004)
` A model is defined with some parameters.
` Learning is the execution of a computer program to optimize the
  parameters of the model using training data or past experience.
` Two types of models:
  ` Predictive model: makes predictions about the future.
  ` Descriptive model: gains knowledge from the data.
What is ML? …
` “A computer program is said to learn from experience E with
  respect to some class of tasks T and performance measure P, if
  its performance at tasks in T, as measured by P, improves with
  experience E.”
  Mitchell, T. (1997)
` Example: handwriting recognition
  ` Task T: recognizing and classifying handwritten words within
    images.
  ` Performance measure P: percent of words correctly classified.
  ` Training experience E: a database of handwritten words with
    given classifications.
Design of a learning element
1. Which components of the performance element should be learned?
2. What feedback is available to learn these components?
Components of the performance element
` What can be learned?
  ` Direct mapping from conditions on the current state to actions.
  ` Means to infer relevant properties of the world from the
    percept sequence.
  ` Information about the way the world evolves and the results of
    possible actions the agent can take.
  ` Utility information indicating the desirability of world states.
  ` Action-value information indicating the desirability of actions.
  ` Goals that describe classes of states whose achievement
    maximizes the agent's utility.
Feedback
` Components can be learned from appropriate feedback.
  ` Example: training Tae Kwon Do, driving a taxi.
` Type of feedback:
  ` The most important factor in determining the nature of the
    learning problem.
  ` Three cases:
    1. Supervised learning
    2. Unsupervised learning
    3. Reinforcement learning
Supervised Learning
` Learning a function from examples of its inputs and outputs.
` There is an input X, an output Y, and the task is to learn the
  mapping from input to output.
` Output values can be provided:
  ` By a supervisor: someone feeds the output.
  ` By the environment: detected by sensors.
` Examples:
  ` Learn a condition-action rule for punching.
  ` Learn to differentiate between a dog and a cat.
  ` Regression
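The dog-vs-cat example can be sketched as a tiny supervised learner. Everything here is illustrative: the features (weight in kg, ear length in cm) and their values are made up, and a 1-nearest-neighbor rule stands in for any supervised method that learns the mapping X → Y from labeled pairs.

```python
# A minimal sketch of supervised learning: labeled (input, output) pairs
# are given, and the learner predicts the output for a new input.
# Features are (weight in kg, ear length in cm); all values hypothetical.
examples = [
    ((30.0, 10.0), "dog"),
    ((25.0, 9.0),  "dog"),
    ((4.0,  6.0),  "cat"),
    ((5.0,  7.0),  "cat"),
]

def classify(x):
    """Predict the label of the closest training example (1-NN)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(examples, key=lambda ex: dist2(ex[0], x))[1]

print(classify((28.0, 9.5)))  # prints "dog": closest to the dog examples
```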
Unsupervised learning
` Learning patterns in the input when no specific output values
  are supplied.
` Aim: to find regularities in the input.
` Example:
  ` Learn when it might rain.
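Finding regularities without supplied outputs can be sketched with a tiny two-means clustering step; the 1-D readings (hypothetical humidity values) carry no labels, and the algorithm discovers the two groups on its own.

```python
# A minimal sketch of unsupervised learning: two-means clustering of
# unlabeled 1-D readings (hypothetical humidity values). No outputs are
# supplied; the algorithm finds the regularities (two clusters) itself.
data = [10.0, 12.0, 11.0, 85.0, 90.0, 88.0]

def two_means(xs, iters=10):
    c1, c2 = min(xs), max(xs)                       # start from extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1 = sum(g1) / len(g1)                      # move centers to means
        c2 = sum(g2) / len(g2)
    return c1, c2

print(two_means(data))  # two cluster centers, one low and one high
```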
Reinforcement learning
` The output of the system is a sequence of actions.
` These actions are part of a policy.
  ` A single action is not important.
  ` The policy is what must be learned.
` The agent must learn from reinforcement which actions are best,
  i.e. the policy.
` Examples:
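The idea of learning a policy from reinforcement can be sketched with tabular Q-learning (a standard reinforcement learning algorithm, not taken from the slides) on a hypothetical corridor world: states 0..3, actions -1/+1, and a reward of +1 only for reaching cell 3. No single action is rewarded directly; the learner must discover the whole policy "always go right".

```python
# A minimal Q-learning sketch on a hypothetical 4-cell corridor.
import random

random.seed(0)
Q = {(s, a): 0.0 for s in range(4) for a in (-1, 1)}
alpha, gamma, eps = 0.5, 0.9, 0.2       # learning rate, discount, exploration

for _ in range(200):                     # episodes
    s = 0
    while s != 3:
        if random.random() < eps:        # explore occasionally
            a = random.choice((-1, 1))
        else:                            # otherwise act greedily
            a = max((-1, 1), key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), 3)       # deterministic move, clamped
        r = 1.0 if s2 == 3 else 0.0      # reinforcement only at the goal
        best_next = max(Q[(s2, -1)], Q[(s2, 1)])
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = {s: max((-1, 1), key=lambda act: Q[(s, act)]) for s in range(3)}
print(policy)  # the learned policy: move right in every cell
```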
Representation of the learned information
` Polynomials
` Propositional logic
` Predicate calculus
` Bayesian networks
` Neural networks
Applications of machine learning
` Learning associations
  ` Learn how people associate elements (ex. buying groceries)
` Classification
  ` Learn to classify elements in different categories
` Prediction
  ` Learn to predict if some action will happen
` Pattern recognition
  ` Learn to find familiar patterns (characters, faces, objects, etc.)
` Knowledge extraction
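Learning associations from grocery purchases can be sketched as estimating a conditional probability from transaction data; the basket contents here are hypothetical.

```python
# A minimal sketch of learning associations: estimate P(chips | beer)
# from (hypothetical) supermarket transactions, i.e. how often customers
# who buy beer also buy chips.
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"beer", "bread"},
    {"milk", "bread"},
]

def assoc(x, y):
    """P(y is bought | x is bought), estimated from the transactions."""
    with_x = [t for t in transactions if x in t]
    return sum(y in t for t in with_x) / len(with_x)

print(assoc("beer", "chips"))  # 2 of the 3 beer baskets contain chips
```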
Applications of machine learning …
` Outlier detection
  ` Data that does not belong to a class
` Regression problems
ML is multidisciplinary
` Artificial intelligence
` Bayesian methods
` Computational complexity theory
` Control theory
` Information theory
` Philosophy
` Psychology and neurobiology
Designing a learning system
1. Choosing the training experience
2. Choosing the target function
3. Choosing a representation for the target function
4. Choosing a function approximation algorithm
Choosing the training experience …
` How is the feedback?
  ` Direct: provide a state and its correct solution.
    ` Feedback is clear and direct.
  ` Indirect: provide a sequence of states and the final outcome.
    ` Correctness must be inferred.
    ` Credit assignment problem: the degree to which each state
      deserves credit or blame for the final outcome.
Choosing the training experience …
` How is the control of the sequence of training examples?
  ` Selection of a state is made by a supervisor.
  ` Selection of a state is made by the learner and a solution is
    provided.
  ` Selection of a sequence of states is made by the learner and
    only the final outcome is provided.
Choosing the training experience …
` How well does the distribution of training examples represent
  the distribution over which performance will be measured?
  ` Most reliable: training examples follow a distribution similar
    to that of future test examples.
  ` It may be necessary to learn from a distribution of examples
    somewhat different from that of the final evaluation.
  ` Most work in ML relies on the assumption that the distribution
    of training examples is identical to that of test examples.
Choosing the training experience …
` Example: learn to play checkers for a world tournament
  ` Task T: playing checkers
  ` Performance measure P: percent of games won
  ` Training experience E: games played against itself
    ` Indirect feedback.
Choosing the target function …
` What type of knowledge will be learned and how will it be used?
  ` Key design choice.
  ` Reduces the problem of improving performance P at task T to
    the problem of learning a target function.
` Example: given a generator of legal moves, select the best move.
  ` Target function:
    ChooseMove : B → M
    B = board state
    M = set of legal moves
Choosing the target function …
` Example …
  ` Target function:
    V : B → ℜ
    B = board state
    ℜ = set of real numbers
    Higher scores are assigned to better states.
  ` V(b) is defined as:
    1. If b is a final state that is won, then V(b) = 100.
    2. If b is a final state that is lost, then V(b) = -100.
    3. If b is a final state that is drawn, then V(b) = 0.
    4. If b is not a final state, then V(b) = V(b′), where b′ is the
       best final state reachable from b when playing optimally.
  ` Case 4 is nonoperational, as it is not efficiently computable.
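The four cases of the ideal target function can be written down directly for a toy game tree (the states "a".."f" and their moves are hypothetical), with a minimax recursion standing in for "played optimally". The recursion in case 4 is exactly what makes V nonoperational: for real checkers it would have to search to the end of the game.

```python
# A sketch of the ideal target function V on a toy game tree.
FINAL = {"d": "win", "e": "loss", "f": "draw"}           # final states
MOVES = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"]}   # legal successors

def V(b, our_turn=True):
    if b in FINAL:                       # cases 1-3: final states
        return {"win": 100, "loss": -100, "draw": 0}[FINAL[b]]
    values = [V(s, not our_turn) for s in MOVES[b]]
    # case 4: value of the best final state reachable under optimal
    # play by both sides (we maximize, the opponent minimizes)
    return max(values) if our_turn else min(values)

print(V("a"))  # prints 0: optimal play from "a" ends in the draw "f"
```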
Choosing the target function …
` Goal: discover an operational description of the ideal target
  function V.
` V is very difficult to learn exactly.
` Learning algorithms therefore acquire an approximation of V,
  denoted V̂.
Choosing a representation for the target function
` The characteristics of the learning problem will determine the
  representation:
  ` Table with values
  ` Collection of rules
  ` Neural network
  ` Polynomial function
` Ideal: a very expressive representation.
` Drawback: the more expressive the representation, the more
  training data is required.
Choosing a representation for the target function …
` Example:
  ` x1: number of black pieces on the board
  ` x2: number of red pieces on the board
  ` x3: number of black kings on the board
  ` x4: number of red kings on the board
  ` x5: number of black pieces threatened by red
  ` x6: number of red pieces threatened by black

  V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
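The linear representation is a weighted sum of the six board features, which a short function makes concrete; the weights and the feature vector below are arbitrary placeholders, since learning has not yet adjusted them.

```python
# A minimal sketch of the linear representation Vhat(b) = w0 + sum(wi*xi).
def v_hat(weights, features):
    """weights = [w0..w6]; features = [x1..x6]."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

w = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]   # placeholder weights
b = [12, 11, 1, 0, 0, 2]                      # hypothetical board features
print(v_hat(w, b))  # prints 5.0
```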
Choosing a function approximation algorithm
1. Estimating training values
   ` To learn, a set of training examples is required.
   ` Each example must have the form [b, Vtrain(b)]:
     ` State b
     ` Training value Vtrain(b)
   ` Example of a training example:
     ` [[x1 = 0, x2 = 5, x3 = 0, x4 = 1, x5 = 0, x6 = 0], 100]
   ` Training values for end states are easy to assign.
Choosing a function approximation algorithm …
` A final outcome does not give information on the quality of
  intermediate states.
` A very successful rule for estimating training values is:

  Vtrain(b) ← V̂(Successor(b))

  ` Very accurate near final states.
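The estimation rule can be sketched over one game trace: each non-final state is labeled with the current V̂ of its successor, and the final state gets its true value. The weights and the three feature vectors below are hypothetical; Successor(b) is taken to be the next state in the trace.

```python
# A sketch of Vtrain(b) <- Vhat(Successor(b)) over a hypothetical trace.
def v_hat(weights, features):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

w = [0.0, 0.5, -0.5, 1.0, -1.0, -0.2, 0.2]   # current (hypothetical) weights
trace = [                                     # learner's states, in order
    [12, 12, 0, 0, 1, 1],
    [11, 12, 0, 0, 2, 0],
    [11, 11, 1, 0, 0, 0],                     # final state: a win
]

training = []
for b, successor in zip(trace, trace[1:]):
    # non-final state: bootstrap its training value from the successor
    training.append((b, v_hat(w, successor)))
training.append((trace[-1], 100))             # final winning state: +100
print(training)
```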
Choosing a function approximation algorithm …
2. Adjusting the weights
   ` The learning algorithm must find the weights wi that best fit
     the training examples.
   ` A common measure of best fit is the squared error:

     E ≡ Σ⟨b, Vtrain(b)⟩ ∈ training_examples (Vtrain(b) − V̂(b))²

   ` E is the error between the training values and the values
     predicted by V̂.
   ` So, we seek the weights that minimize E for the training
     examples.
Choosing a function approximation algorithm …
` The LMS (least mean squares) algorithm:
  ` Incrementally refines the weights.
  ` Is robust to errors in the estimated training values.
  ` For each observed training example it adjusts the weights a
    small amount in the direction that reduces the error.
` For each training example [b, Vtrain(b)]:
  ` Use the current weights to calculate V̂(b).
  ` For each weight wi, update it as:

    wi ← wi + η (Vtrain(b) − V̂(b)) xi

    ` η is a small constant that moderates the size of the weight
      update.
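The LMS update rule can be sketched end to end: repeatedly apply wi ← wi + η(Vtrain(b) − V̂(b))xi and watch the squared error E shrink. The two training examples are hypothetical, and x0 = 1 is prepended so the same rule also updates the constant weight w0.

```python
# A minimal LMS sketch: per-example weight updates that reduce the error.
def v_hat(w, x):
    return sum(wi * xi for wi, xi in zip(w, [1.0] + list(x)))

def lms_step(w, examples, eta=0.001):
    # one pass: after each example, nudge every weight toward less error
    for x, v_train in examples:
        err = v_train - v_hat(w, x)
        w = [wi + eta * err * xi for wi, xi in zip(w, [1.0] + list(x))]
    return w

def squared_error(w, examples):
    return sum((v - v_hat(w, x)) ** 2 for x, v in examples)

examples = [([11, 11, 1, 0, 0, 0], 100.0),   # features of a won board
            ([0, 5, 0, 1, 0, 0], -100.0)]    # features of a lost board
w = [0.0] * 7
before = squared_error(w, examples)
for _ in range(500):
    w = lms_step(w, examples)
print(squared_error(w, examples) < before)  # prints True: the fit improves
```

Note that η must be small relative to the feature magnitudes; a larger η (say 0.01 here) makes the per-example updates overshoot and diverge.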
Designing a learning system - summary
1. Choosing the training experience
   1. Feedback
   2. Control of sequence of examples
   3. Distribution of examples
2. Choosing the target function
   1. Function that is operational
3. Choosing a representation for the target function
   1. Expressive representation
4. Choosing a function approximation algorithm
Final design
[Figure: the final design as four interacting modules. The Experiment
Generator proposes a new problem (an initial game board) to the
Performance System; the Performance System plays the game using the
current hypothesis V̂ and passes the solution trace (game history) to
the Critic; the Critic extracts training examples from the trace; the
Generalizer uses them to produce a new hypothesis V̂.]
[Figure: summary of design choices. Determine type of training
experience: games against experts, games against self, table of correct
moves, … Determine target function: Board → value, Board → move, table
of correct moves, … Determine representation of learned function:
polynomial, linear function of six features, artificial neural
network, … Determine learning algorithm: …]
Issues in machine learning
` What algorithms exist for learning general target functions from
  specific training examples?
` How much training data is sufficient?
` When and how can prior knowledge guide the process of
  generalizing from examples?
` What is the best strategy for choosing a useful training
  experience?
` What is the best way to reduce the learning task to one or more
  function approximation problems?
Exercises
The following exercises must be handed in at the end of the class.
They will be graded as part of the “Actividades Analíticas”.

For each exercise define:
1. Choosing the training experience
   1. Feedback
   2. Control of sequence of examples
   3. Distribution of examples
   ` In particular, remember to define task T, performance measure
     P, and training experience E.
2. Choosing the target function
   1. Function that is operational
   ` In particular, define the target function.
3. Choosing a representation for the target function
   1. Expressive representation
   ` In particular, define the representation for the target
     function.
4. Choosing a function approximation algorithm
   1. Estimating training values
Exercises
` Tic-tac-toe
  ` A game for two players, O and X, who take turns marking the
    spaces in a 3×3 grid, usually with X going first.
  ` The player who succeeds in placing three respective marks in a
    horizontal, vertical, or diagonal row wins the game.
Exercises …
` Reversi (Othello)
  ` Involves play by two parties on an eight-by-eight square grid
    with pieces that have two distinct sides.
  ` The goal for each player is to make pieces of their color
    constitute a majority of the pieces on the board at the end of
    the game.
Exercises …
` Connect Four (also known as Plot Four, Four in a Row, and Four
  in a Line)
  ` A two-player game.
  ` Players take turns dropping alternating colored discs into a
    seven-column, six-row vertically-suspended grid.
  ` The object of the game is to connect four discs of one's own
    color in a row, vertically, horizontally, or diagonally, before
    the opponent does.