• No se han encontrado resultados

Steady state

N/A
N/A
Protected

Academic year: 2023

Share "Steady state"

Copied!
13
0
0

Texto completo

(1)

MIT Department of Brain and Cognitive Sciences 9.641J, Spring 2005 - Introduction to Neural Networks Instructor: Professor Sebastian Seung

Backprop for recurrent

networks

(2)

Steady state

• Reward is an explicit function of x.

• The steady state of a recurrent network.

x

i

= f W

ij

x

j

+ b

i

j

 ∑

  

 

max

W ,b

R x ( )

(3)

Recurrent backpropagation

• Find steady state

x = f ( Wx + b )

• Calculate slopes

D = diag { f ( Wx + b ) }

R

−1

W

T

) s =

• Solve for s

( D

x

• Weight update

ΔW = η sx

T

(4)

Sensitivity lemma

R R

= x

W

ij

b

i j

(5)

Input as a function of output

• What input b is required to make x a steady state?

b

i

= f

−1

( x

i

) W x

ij j

j

• This is unique, even when output is not a unique function of the input!

(6)

Jacobian matrix

b

i

= f

−1

( x

i

) W x

ij j

j

b

i

x

= f

−1

′ ( ) δ W

x

j i ij ij

= ( D

−1

W )

ij

(7)

Chain rule

R

= ∑ R b

i

x

j i

b

i

x

j

= ∑ R ( D

−1

W )

b

ij

i i

R

R

= ( D

−1

W

T

)

x b

(8)

Trajectories

• Initialize at x(0)

• Iterate for T time steps

x

i

( ) t = f W

ij

x

j

( t 1 ) + b

i

j

 ∑

  

 

max

W ,b

R x ( ( ) 1 ,K , x T ( ) )

(9)

Unfold time into space

• Multilayer perceptron

– Same number of neurons in each layer – Same weights and biases in each layer

(weight-sharing)

( )

W ,b

( )

W ,b W ,b

( )

x 0 ⎯ ⎯→ x 1 ⎯ ⎯→L ⎯ ⎯→ x T

(10)

Backpropagation through time

• Initial condition x(0)

x ( t ) = f ( Wx ( t 1 ) + b ( t ) )

• Compute

R / ∂ x ( t )

• Final condition s(T+1)=0

R

( ) = D t W

T

s t + 1 ) + D t

s t ( ) ( ( )

x t ( )

Δ W = η ∑ s t ( ) x t ( 1 )

T

Δ b = ηs t ( )

t t

(11)

Input as a function of output

x ( t ) = f ( Wx ( t 1 ) + b ( t ) )

( ) = f

−1

( ( ) ) Wx t 1 )

b t x t (

x ( 1 ) , x ( 2 ) ,K , x ( T 1 ) , x ( T )

b ( 1 ) , b ( 2 ) ,K , b ( T 1 ) , b ( T )

(12)

Jacobian matrix

( ) = f

−1

( x t (

t

b t ( ) ) Wx t 1 )

−1

t

b

i

( )

= δ

tt'

( D ( ) ) W δ

x

j

( ) t '

ij ij t−1,t'

( ) = diag { f Wx t 1 ) + b t

D t ( ( ( ) ) }

(13)

Chain rule

R

x

j

( ) t =

R

b

i

( ) t

b

i

( ) t

x

j

( ) t

i,t

= s

i

( ) t

i

∑ ( D

−1

( ) t )

ij

s

i

( t ' +1 ) W

ij

i

R

x t ( ) = D

−1

( ) t s t ( ) W

T

s t ( +1 )

Referencias

Documento similar

ROMANELLI – OBRADOR ARTESANO DE PASTA FRESCA Fº J, SANJOAQUIN TOLON DOCUMENTO 1: MEMORIA ingeniero técnico agrícola.. Página 6