
Some intuition for how the simple curve basis is derived can be gained from its relationship to PCA. PCA orders orthogonal directions by the amount of variation explained. The directions are found by performing an eigendecomposition of the covariance matrix, $\hat{\Sigma}_{P_N}$, which summarizes the covariance structure of the subspace. For the simple curve basis analysis, we order orthogonal directions by their simplicity scores. The directions are found by performing an eigendecomposition of the simplicity matrix, $\hat{F}_{P_N}$, which summarizes the simplicity structure of the subspace.

Before describing the methods for finding the PCA basis and simple curve basis of a subspace, we first define the covariance matrix and simplicity matrix of the full space, because the methods depend heavily on them. The covariance matrix of the full space is the empirical covariance matrix of the data

$$\hat{\Sigma} = \frac{1}{n-1}(X - \bar{X})(X - \bar{X})^T,$$

where $X$ is a $d \times n$ data matrix and $\bar{X}$ is a $d \times n$ matrix in which every column is the vector of row means of $X$.
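The definition above can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated data; the function name `empirical_covariance` and the dimensions are illustrative choices, not part of the original analysis.

```python
import numpy as np

def empirical_covariance(X):
    """Empirical covariance of a d x n data matrix whose columns are observations."""
    n = X.shape[1]
    # X_bar holds the vector of row means; broadcasting repeats it in every column.
    X_bar = X.mean(axis=1, keepdims=True)
    centered = X - X_bar
    return centered @ centered.T / (n - 1)

rng = np.random.default_rng(0)      # illustrative data only
X = rng.normal(size=(5, 20))        # d = 5 variables, n = 20 observations
Sigma_hat = empirical_covariance(X)
```

With rows as variables and columns as observations, this should agree with `np.cov(X)`, which also divides by $n-1$ by default.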

The simplicity matrix of the full space is

$$F_{\mathrm{full}} = 4I_d - DD^T,$$

where $D$ is the difference matrix introduced in Section 3.2 and $I_d$ is the identity matrix of size $d \times d$. Some intuition for why $F_{\mathrm{full}}$ has this form is gained by extending the definition of the simplicity score to multiple directions of a basis.

is a basis of the full space. This leaves the simplicity measure for multiple directions of the full space as $DD^T$. For interpretability, however, we would like simple directions to be associated with high simplicity scores. The analogue of subtracting $m_\beta$ from 4 is to subtract $DD^T$ from $4I_d$. These steps therefore produce the simplicity matrix $F_{\mathrm{full}}$, which is the analogue of the simplicity score for multiple directions.
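As a small numerical illustration of the full-space simplicity matrix, the sketch below builds $4I_d - DD^T$ and scores two unit vectors. It assumes that $D$ is the $d \times (d-1)$ first-difference matrix; Section 3.2 is not reproduced here, so this form, together with the helper names `difference_matrix` and `simplicity_matrix_full`, is an assumption made for illustration only.

```python
import numpy as np

def difference_matrix(d):
    """Assumed form of D: d x (d-1), so that (beta @ D)[i] = beta[i+1] - beta[i]."""
    D = np.zeros((d, d - 1))
    idx = np.arange(d - 1)
    D[idx, idx] = -1.0
    D[idx + 1, idx] = 1.0
    return D

def simplicity_matrix_full(d):
    """Simplicity matrix of the full space, 4 I_d - D D^T."""
    D = difference_matrix(d)
    return 4.0 * np.eye(d) - D @ D.T

d = 6
F_full = simplicity_matrix_full(d)

flat = np.ones(d) / np.sqrt(d)                  # constant unit vector (no variation)
zigzag = (-1.0) ** np.arange(d) / np.sqrt(d)    # alternating unit vector (maximal variation)

score_flat = flat @ F_full @ flat
score_zigzag = zigzag @ F_full @ zigzag
```

Under this assumed $D$, the constant direction attains the maximal score of 4, while the alternating direction scores $4/d$, which matches the intent that simple directions receive high simplicity scores.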

To find the PCA basis of the full space, an eigendecomposition of $\hat{\Sigma}$ is performed. To find the simple curve basis of the full space, an eigendecomposition of $F_{\mathrm{full}}$ is performed. We would also like to find the PCA basis and simple curve basis of a subspace. These are found by eigendecompositions of $\hat{\Sigma}_{P_N} = \hat{P}_N \hat{\Sigma} \hat{P}_N$ and $\hat{F}_{P_N} = \hat{P}_N F_{\mathrm{full}} \hat{P}_N$, respectively.

Insight into the form of $\hat{\Sigma}_{P_N}$ is gained by thinking of PCA of projected data. Let $P_N$ be the projection matrix of the nearly null space. The centered data, i.e. $X - \bar{X}$, projected onto the nearly null space is then $P_N(X - \bar{X})$. PCA of the nearly null space is then PCA using this projected data. The covariance matrix of the projected data is

$$\Sigma_{P_N} = \frac{1}{n-1} P_N (X - \bar{X}) [P_N (X - \bar{X})]^T = P_N \hat{\Sigma} P_N,$$

which is the covariance matrix of the full space pre- and post-multiplied by the projection matrix of the nearly null space. Since the nearly null space is usually not known, the PCA basis of the estimated nearly null space is found instead. To do this, $P_N$ is replaced by the projection matrix of the estimated nearly null space, $\hat{P}_N$. This implies that an eigendecomposition of

$$\hat{\Sigma}_{P_N} = \hat{P}_N \hat{\Sigma} \hat{P}_N$$

yields the PCA basis of the estimated nearly null space. The PCA basis of the estimated nearly null space consists of the eigendirections of $\hat{\Sigma}_{P_N}$ that correspond to the $d_N$ smallest eigenvalues, where $d_N$ is the dimension of the subspace onto which $\hat{P}_N$ projects.

To find the simple curve basis, an eigendecomposition is performed on

$$F_{P_N} = P_N F_{\mathrm{full}} P_N,$$

which is the simplicity matrix of the full space pre- and post-multiplied by the projection matrix of the nearly null space. To find the simple curve basis of the estimated nearly null space, the eigendecomposition of

$$\hat{F}_{P_N} = \hat{P}_N F_{\mathrm{full}} \hat{P}_N$$

is performed. The simple curve basis of the estimated nearly null space consists of the eigendirections of $\hat{F}_{P_N}$ that correspond to the $d_N$ largest eigenvalues, where $d_N$ is the dimension of the subspace onto which $\hat{P}_N$ projects. For a more mathematical derivation of $F_{P_N}$, see Section 3.3.2.
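The procedure for the estimated nearly null space can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the implementation used for the analysis: $D$ is taken to be the $d \times (d-1)$ first-difference matrix (Section 3.2 is not reproduced here), the estimated nearly null space is a placeholder spanned by a random orthonormal matrix `B_hat`, and the function names are hypothetical.

```python
import numpy as np

def difference_matrix(d):
    """Assumed form of D: d x (d-1) first-difference matrix."""
    D = np.zeros((d, d - 1))
    idx = np.arange(d - 1)
    D[idx, idx] = -1.0
    D[idx + 1, idx] = 1.0
    return D

def simple_curve_basis(P_hat, d_N):
    """Eigendirections of P_hat F_full P_hat for the d_N largest eigenvalues."""
    d = P_hat.shape[0]
    D = difference_matrix(d)
    F_full = 4.0 * np.eye(d) - D @ D.T
    F_hat = P_hat @ F_full @ P_hat
    F_hat = (F_hat + F_hat.T) / 2.0           # remove floating-point asymmetry
    eigvals, eigvecs = np.linalg.eigh(F_hat)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d_N]   # indices of the d_N largest
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(1)
d, d_N = 8, 3
# Placeholder subspace: orthonormal columns from a reduced QR factorization.
B_hat, _ = np.linalg.qr(rng.normal(size=(d, d_N)))
P_hat = B_hat @ B_hat.T

scores, basis = simple_curve_basis(P_hat, d_N)
```

Because $4I_d - DD^T$ is positive definite under the assumed form of $D$, the $d_N$ largest eigenvalues are the nonzero ones, so the returned columns lie in the estimated subspace and each eigenvalue equals the simplicity score of its direction.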

Further investigation of $\hat{\Sigma}_{P_N}$ and $\hat{F}_{P_N}$ reveals an interesting property of the simplicity matrix across samples. If the same nearly null space is estimated for multiple samples, then $\hat{P}_N$ is the same for each of the samples. This implies that $\hat{F}_{P_N}$ is exactly the same for those samples, since $F_{\mathrm{full}}$ is the same for every sample. $F_{\mathrm{full}}$ is the same for every sample because the projection matrix of the full space is always $I_d$. Thus the simple curve basis is the same. For PCA, however, the matrix $\hat{\Sigma}_{P_N}$ is not necessarily the same, because $\hat{\Sigma}$ can differ between samples. Thus the PCA basis of the nearly null space can differ between samples, even though the estimated nearly null space is the same.

3.3.2 Mathematical Derivation of $F_{P_N}$

Section 3.3 described how to derive the simple curve basis of the nearly null space through its relation to PCA, in order to give an intuitive understanding of the procedure. This section provides a more mathematical derivation of the method.

To derive the simplicity matrix, a calculation similar to Equation 3.1 is performed. Ideally we would simply replace $\beta$ by the basis matrix $B$, but more care is needed in handling the subtraction from 4. Before the simplicity matrix is defined, another characterization of $s_\beta$ is needed. To obtain it, first note that

$$s_\beta = 4 - (\beta^T D)(\beta^T D)^T = 4 - \beta^T D D^T \beta.$$

Next we would like to replace $DD^T$ by another matrix that produces the simplicity score without having to subtract from 4. This is done by using the characterization

$$s_\beta = 4 - \beta^T D D^T \beta = \beta^T (4I_d - DD^T) \beta,$$

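where $I_d$ is the identity matrix of size $d \times d$. The second equality uses $\beta^T \beta = 1$, i.e. that $\beta$ is a unit vector, so that $\beta^T (4I_d) \beta = 4$.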
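A small worked case may help to see that the two forms of $s_\beta$ agree. The example below takes $d = 2$ and assumes that $D$ is the $2 \times 1$ first-difference matrix; Section 3.2 is not reproduced here, so this form of $D$ is an assumption made for illustration.

```latex
% Assumption: d = 2 and D is the 2 x 1 first-difference matrix.
\[
D = \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \qquad
DD^T = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
4I_2 - DD^T = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}.
\]
% Case 1: beta = (1, 0)^T, so beta^T D = -1.
\[
s_\beta = 4 - (\beta^T D)(\beta^T D)^T = 4 - (-1)^2 = 3,
\qquad
\beta^T (4I_2 - DD^T)\beta = 3.
\]
% Case 2: beta = (1, 1)^T / sqrt(2), so beta^T D = 0.
\[
s_\beta = 4 - 0 = 4,
\qquad
\beta^T (4I_2 - DD^T)\beta = \tfrac{1}{2}(3 + 1 + 1 + 3) = 4.
\]
```

In both cases $\beta$ has unit length, and the constant direction receives the larger score.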

Based on this characterization of $s_\beta$, the simplicity matrix can be defined by simply replacing $\beta$ by the basis matrix $B$. Therefore the simplicity matrix is

$$F_B = B^T (4I_d - DD^T) B,$$

where $B = [b_1, b_2, \ldots, b_{d_N}]$ is any $d \times d_N$ basis matrix. Notice that the diagonal contains the simplicity scores of the individual basis directions, and that the matrix is $d_N \times d_N$. An eigendecomposition of this matrix therefore yields eigenvectors of size $d_N \times 1$. The eigenvector corresponding to the largest eigenvalue gives the linear combination of the basis directions with the highest simplicity score. This result is best understood in the object space, for which a vector of size $d \times 1$ is needed. This vector is found by post-multiplying $B$ by this eigenvector of $F_B$. The eigenvector corresponding to the second largest eigenvalue gives the linear combination of the basis directions, orthogonal to the first, with the next highest simplicity score, and so on.

The eigendecomposition of $F_B$ thus orders orthogonal directions of the nearly null space by their simplicity score, but the basis $B$ must be multiplied by the eigenvectors of $F_B$ to get interpretable results. The same results come directly from an eigendecomposition of $F_B$ pre-multiplied by $B$ and post-multiplied by $B^T$, i.e.

$$F_{P_N} = B F_B B^T = BB^T (4I_d - DD^T) BB^T = P_N (4I_d - DD^T) P_N,$$

where $P_N = BB^T$ is the projection matrix onto the nearly null space, which requires the columns of $B$ to be orthonormal. The simplest direction of the nearly null space is the eigenvector of $F_{P_N}$ corresponding to the largest eigenvalue. The next simplest direction orthogonal to the first is the eigenvector corresponding to the second largest eigenvalue of $F_{P_N}$, and so on. The matrix $F_{P_N}$ has $d_N$ eigenvalues larger than 0. The simple curve basis consists of the eigenvectors that correspond to these eigenvalues.
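The equivalence between the two routes (eigenvectors of $F_B$ mapped through $B$, and eigenvectors of $F_{P_N}$ directly) can be checked numerically. The sketch below is illustrative only: it assumes $D$ is the $d \times (d-1)$ first-difference matrix, uses a random orthonormal $B$ as a stand-in for a basis of the nearly null space, and all variable names are hypothetical.

```python
import numpy as np

def difference_matrix(d):
    """Assumed form of D: d x (d-1) first-difference matrix."""
    D = np.zeros((d, d - 1))
    idx = np.arange(d - 1)
    D[idx, idx] = -1.0
    D[idx + 1, idx] = 1.0
    return D

rng = np.random.default_rng(3)
d, d_N = 7, 3
B, _ = np.linalg.qr(rng.normal(size=(d, d_N)))   # orthonormal stand-in basis
D = difference_matrix(d)
F_full = 4.0 * np.eye(d) - D @ D.T

# Route 1: work in basis coordinates, then map back to the object space.
F_B = B.T @ F_full @ B                            # d_N x d_N
vals_B, vecs_B = np.linalg.eigh(F_B)              # ascending eigenvalues
directions_B = B @ vecs_B[:, ::-1]                # d x d_N, largest score first

# Route 2: eigendecompose the projected simplicity matrix directly.
P_N = B @ B.T
F_PN = P_N @ F_full @ P_N
vals_P, vecs_P = np.linalg.eigh((F_PN + F_PN.T) / 2.0)
directions_P = vecs_P[:, ::-1][:, :d_N]           # top d_N eigenvectors

# Absolute inner products between matching directions (1 means equal up to sign).
agreement = np.abs(np.sum(directions_B * directions_P, axis=0))
```

Eigenvectors are only determined up to sign, which is why the comparison uses absolute inner products rather than direct equality.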

Since the nearly null space is usually not known, the analysis consists of an eigendecomposition of an estimate of $F_{P_N}$. The matrix $F_{P_N}$ is estimated by

$$\hat{F}_{P_N} = \hat{P}_N (4I_d - DD^T) \hat{P}_N,$$

where $\hat{P}_N$ is the projection matrix of the estimated nearly null space, $\hat{S}_N$.

Also notice that $F_B$ depends on the basis, whereas $F_{P_N}$ depends only on the projection matrix. If different bases are used to define a subspace, $F_B$ will therefore be different for each basis. But $P_N$ is always the same for a given subspace, so $F_{P_N}$ is always the same. It is therefore mathematically advantageous to use $F_{P_N}$ as the definition of the

3.4

Simple Curve Basis for Unevenly Spaced Envi-
