• No se han encontrado resultados

Principal component analysis

N/A
N/A
Protected

Academic year: 2023

Share "Principal component analysis"

Copied!
25
0
0

Texto completo

(1)

Principal component analysis

MIT Department of Brain and Cognitive Sciences 9.641J, Spring 2005 - Introduction to Neural Networks Instructor: Professor Sebastian Seung

(2)

Hypothesis: Hebbian synaptic plasticity enables perceptrons

to perform principal

component analysis

(3)

Outline

• Variance and covariance

• Principal components

• Maximizing parallel variance

• Minimizing perpendicular variance

• Neural implementation

– covariance rule

(4)

Principal component

• direction of maximum variance in the input space

• principal eigenvector of the covariance matrix

• goal: relate these two definitions

(5)

Variance

• A random variable fluctuating about its mean value.

• Average of the square of the fluctuations.

δ x = xx

δ x

( )

2

= x

2

x

2

(6)

Covariance

• Pair of random variables, each fluctuating about its mean value.

• Average of product of fluctuations.

δ x

1

= x

1

x

1

δ x

2

= x

2

x

2

δ x

1

δ x

2

= x

1

x

2

x

1

x

2

(7)

Covariance examples

(8)

Covariance matrix

N random variables

NxN symmetric matrix

• Diagonal elements are variances

C

ij

= x

i

x

j

x

i

x

j

(9)

Principal components

• eigenvectors with k largest eigenvalues

• Now you can calculate them, but what do they mean?

(10)

Handwritten digits

(11)

Principal component in 2d

(12)

One-dimensional projection

(13)

Covariance to variance

• From the covariance, the variance of any projection can be calculated.

• Let w be a unit vector

w

T

x

( )

2

w

T

x

2

= w

T

Cw

= w

i

C

ij

w

j

ij

(14)

Maximizing parallel variance

• Principal eigenvector of C

– the one with the largest eigenvalue.

w

*

= arg max

w:w =1

w

T

Cw

λ

max

( ) C = max

w:w =1

w

T

Cw

= w

*T

Cw

*

(15)

Orthogonal decomposition

(16)

Total variance is conserved

• Maximizing parallel variance =

Minimizing perpendicular variance

(17)

Rubber band computer

(18)

Correlation/covariance rule

• presynaptic activity x

• postsynaptic activity y

• Hebbian

w = η yx

w = η ( yy ) ( x x )

(19)

Stochastic gradient ascent

• Assume data has zero mean

• Linear perceptron

y = w

T

xw = η xx

T

w E = 1

2 w

T

Cw C = xx

T

w = η∂ E

w

(20)

Preventing divergence

• Bound constraints

(21)

Oja’s rule

• Converges to principal component

• Normalized to unit vector

w = η ( yxy

2

w )

(22)

Multiple principal components

• Project out principal component

• Find principal component of remaining variance

(23)

clustering vs. PCA

• Hebb: output x input

• binary output

– first-order statistics

• linear output

– second-order statistics

(24)

Data vectors

xa means ath data vector

ath column of the matrix X.

Xia means matrix element Xia

ith component of xa

x is a generic data vector

xi means ith component of x

(25)

Correlation matrix

• Generalization of second moment

x

i

x

j

= 1

m X

ia

a=1

m

X

ja

xx

T

= 1

m XX

T

Referencias

Documento similar

Para explicar los contornos más salientes que caracterizan a la modalidad del servicio prepago de telefonía celular, resulta ineludible referirse primero al proceso