CHAPTER 1. CONTEXTUALIZATION
1.2. The INeDITHOS project
1.2.2. Interventions carried out
1.2.2.1. Interventions with R
The EANAT objective uses a hybrid decomposition scheme that borrows ideas from SPCA and ICA and augments them with neuroanatomically specific penalty terms. EANAT seeks to represent each component image with a set of neuroanatomical coordinates that are connected, smooth and defined by non-negative weights. Although this latter constraint can be relaxed, non-negativity improves our ability to interpret data by preventing weights from being both positive and negative within the same eigenanatomy component. Also, non-negativity means that the projections of eigenanatomy into subject space are simple weighted averages of the input data (e.g. cortical thickness values) for each subject.
$$\arg\min_{U,V} \; \|X - UV^\top\|_2^2 + \lambda_1 S(U) + \lambda_2 S(V), \tag{5.6}$$
where $\lambda_1$ controls the contribution of the ICA (sparse coding) sparsity component of the penalty and $\lambda_2$ controls the sparsity of the SPCA component of the penalty.
$$S(V) = \sum_{i=1}^{k} \|G \times v_i\|_1, \tag{5.7}$$
$$\forall i \neq j:\ \langle v_j, v_i \rangle = 0, \quad v_i \succeq 0, \quad \|S(v)\|_1 = \gamma$$
where $\gamma$ is a user-defined sparsity parameter which controls the number of non-zero entries in the solution. $G$ is a kernel matrix which enforces smoothness and connectedness among the different EANAT components along the anatomical manifold and is similar in spirit to the wavelet or discrete cosine basis transform (Becker et al. 2011). When the $G$ operator is equal to $I$ (the identity matrix), the penalty reduces to a simple $\ell_1$ penalty.
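To make the role of the kernel concrete, the following is a minimal sketch of the penalty in Equation 5.7, assuming the components are stored as columns of a matrix. The function name `penalty_S`, the toy matrix `V` and the neighbour-averaging kernel `G_smooth` are hypothetical and chosen only for illustration; the kernel actually used for anatomical data is not specified here.

```python
import numpy as np

def penalty_S(V, G):
    """S(V) = sum_i ||G v_i||_1, with the components v_i as columns of V."""
    return float(np.abs(G @ V).sum())

p = 5
# Toy non-negative components (hypothetical values), one per column.
V = np.array([[0.0, 0.2],
              [0.5, 0.0],
              [0.5, 0.0],
              [0.0, 0.8],
              [0.0, 0.0]])

# With G = I the penalty is the plain l1 norm of the components.
G_identity = np.eye(p)

# Hypothetical smoothing kernel: each row averages a coordinate with its
# immediate neighbours on a 1-D grid (an assumption for illustration only).
G_smooth = np.zeros((p, p))
for i in range(p):
    lo, hi = max(i - 1, 0), min(i + 1, p - 1)
    G_smooth[i, lo:hi + 1] = 1.0 / (hi - lo + 1)

l1_value = penalty_S(V, G_identity)
smooth_value = penalty_S(V, G_smooth)
```

With the identity kernel, `l1_value` equals the sum of the absolute entries of `V`, which is the reduction to the $\ell_1$ penalty described above.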
It is worth noting a few further points about the objective:
• We enforce orthogonality between the various components. In other words, $\forall\, 1 \le i, j \le p$ with $i \neq j$, $u_i \perp u_j$ and $v_i \perp v_j$, but unlike standard PCA, $u_i \neq v_i \cdot x$.
• Non-negativity of the components means that the projections of eigenanatomy into subject space are simply weighted averages of the input data (e.g. cortical thickness values) for each subject. Although this constraint can be relaxed, non-negativity improves our ability to interpret data by preventing weights from being both positive and negative within the same eigenanatomy component. As such, one may compute effect sizes and interpret statistics directly, for example, “reductions in posterior cingulate cortical thickness reduce performance on memory-related psychometrics.”
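The weighted-average interpretation can be illustrated with a small sketch. The data values and the weight vector below are hypothetical, and we assume the weights are rescaled to sum to one so that the projection is a proper weighted average.

```python
import numpy as np

# Hypothetical cortical thickness values: 2 subjects (rows) x 4 regions (columns).
X = np.array([[2.0, 2.5, 3.0, 1.0],
              [1.5, 2.0, 2.5, 1.2]])

# A sparse, non-negative component that selects regions 1 and 2.
v = np.array([0.0, 0.5, 1.5, 0.0])

# Assumption: rescale the weights to sum to one.
w = v / v.sum()

# Each subject's projection is a weighted average of the selected regions.
projection = X @ w
```

Because the weights are non-negative and sum to one, each entry of `projection` lies between the smallest and largest thickness value among the selected regions for that subject, so it stays in the units of the input data.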
To the best of our knowledge, directly exploring the interaction between sparseness, orthogonality and non-negativity for automated parcellation of the brain is novel. This penalty set gives us anatomically reasonable results, as we show in the experiments section.
5.2.1 Optimization
There are a variety of ways that one could optimize the above objective. Mairal et al. (2010) formulate a convex alternative for the above objective which uses an elastic-net-type penalty on $V$. However, we propose an alternating optimization approach, also called an analysis-synthesis loop (Murphy 2012). As a broader template, we optimize $U$ keeping $V$ fixed. Next, we deflate the $X$ matrix using the optimized $U$s, and then optimize $V$ with $U$ fixed. This alternating procedure is repeated until convergence. Each of our sparse optimizations for $U$ and $V$ is performed via iterative soft-thresholding on the conjugate gradient of the Rayleigh quotient.
Iterative soft-thresholding ($\mathrm{soft}(a, \delta) \triangleq \mathrm{sign}(a)(|a| - \delta)_+$ with $x_+ = \max(x, 0)$) falls in the class of proximal gradient methods and has been shown to have better convergence (Bredies and Lorenz 2008) and scalability properties compared to other sparse optimization algorithms such as Least Angle Regression (LARS) (Yang et al. 2010). Furthermore, deflation has been shown to give better sparse PCA solutions (Mackey 2008), so adding a deflation step between the alternating optimizations helps us get better solutions.
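The soft-thresholding operator defined above can be sketched element-wise as follows. The function names are our own, and the non-negative variant is an assumption about how the clipping could be combined with the shrinkage, not a statement of the exact implementation.

```python
import numpy as np

def soft_threshold(a, delta):
    """Element-wise soft-thresholding: sign(a) * max(|a| - delta, 0)."""
    a = np.asarray(a, dtype=float)
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def soft_threshold_nonneg(a, delta):
    """Hypothetical non-negative variant: shrink by delta, then clip at zero."""
    a = np.asarray(a, dtype=float)
    return np.maximum(a - delta, 0.0)

values = [-2.0, -0.5, 0.0, 0.3, 1.5]
shrunk = soft_threshold(values, 0.5)
shrunk_nonneg = soft_threshold_nonneg(values, 0.5)
```

Entries whose magnitude is at most `delta` are set to zero, which is how the threshold controls the number of non-zero entries; larger entries are shrunk toward zero by `delta`.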
5.2.2 Implementation Details
We know that the best rank-$k$ reconstruction of a matrix, i.e. $\arg\min_{\hat{X}} \|X - \hat{X}\|_F^2$, is provided by its first $k$ eigenvectors (Eckart and Young 1936), i.e. $\hat{X} = \sum_{i=1}^{k} d_i u_i v_i^\top$.
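Hence, the best rank-1 approximation of $X$, i.e. the $n \times 1$ and $p \times 1$ vectors $\tilde{u}$, $\tilde{v}$ such that
$$\tilde{u}^*, \tilde{v}^* = \arg\min_{\tilde{u},\tilde{v}} \|X - \tilde{u}\tilde{v}^\top\|_F^2 \tag{5.8}$$
is given by $d_1 u_1 v_1^\top$, where $u_1$, $v_1$ and $d_1$ are the first left and right eigenvectors and the eigenvalue, respectively, of the $X$ matrix. Proceeding this way, $d_2 u_2 v_2^\top$ provides the best rank-1 approximation of the “deflated” matrix $X - d_1 u_1 v_1^\top$, and so on.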
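The rank-$k$ reconstruction and the deflation argument can be checked numerically with a short sketch using NumPy's singular value decomposition. The random test matrix and the helper name `rank_k` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))  # hypothetical data matrix

# Thin SVD: X = U diag(d) Vt, with d sorted in decreasing order.
U, d, Vt = np.linalg.svd(X, full_matrices=False)

def rank_k(U, d, Vt, k):
    """Sum of the first k terms d_i * u_i * v_i^T."""
    return (U[:, :k] * d[:k]) @ Vt[:k, :]

# The Frobenius error of the rank-k reconstruction equals the norm of
# the discarded singular values.
k = 2
err = np.linalg.norm(X - rank_k(U, d, Vt, k), "fro")
expected_err = np.sqrt(np.sum(d[k:] ** 2))

# Deflate by the first component, then take the leading component of
# the deflated matrix: it matches the second component of X.
X_deflated = X - d[0] * np.outer(U[:, 0], Vt[0])
U2, d2, Vt2 = np.linalg.svd(X_deflated, full_matrices=False)
second_from_deflation = d2[0] * np.outer(U2[:, 0], Vt2[0])
second_direct = d[1] * np.outer(U[:, 1], Vt[1])
```

Comparing the outer products rather than the individual vectors avoids the sign ambiguity of singular vectors.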
As pointed out by Shen and Huang (2008a), with $\tilde{v}$ fixed, the above optimization over $\tilde{u}$ is equivalent to a least squares regression of $X$ on $\tilde{v}$. However, in our case, we have sparsity on $\tilde{u}$ also, so it becomes a sparse optimization problem (Equation 5.9). Similarly, with $\tilde{u}$ fixed, the optimization over $\tilde{v}$ is also a sparse optimization problem (Equation 5.10). As mentioned in the last section, we solve both of these by iterative soft-thresholding on the conjugate gradient of the Rayleigh quotient.
As described earlier, our implementation alternates between optimization of Equations 5.9 and 5.10 (shown below for iteration number $m$) until convergence.
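$$U^*_m = \arg\min_{U,\ \|U\|=1,\ u_i^\top u_j = 0,\ i \neq j} \left(X - U V_{m-1}^\top\right)^2 + \lambda_1 \|GU\|_1 \tag{5.9}$$
$$\{v_i^*\}_m = \arg\min_{v_i,\ \|v_i\|=1,\ v_i^\top v_j = 0,\ i \neq j} \left(X_{\setminus i} - U_m v_i^\top\right)^2 + \lambda \|G v_i\|_1 \tag{5.10}$$
where $X_{\setminus i} \triangleq X - \sum_{j=1, j \neq i}^{k} \tilde{u}_j \tilde{v}_j^\top$ is the “deflated” $X$ matrix.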
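The alternating scheme with deflation can be sketched as follows. This is a deliberately simplified illustration and not the algorithm described here: it replaces the conjugate gradient step on the Rayleigh quotient with a plain soft-thresholded power iteration, takes $G = I$, applies sparsity and non-negativity to $v$ only, sequentially deflates by the components found so far, and does not enforce orthogonality explicitly. All function names and the toy matrix are hypothetical.

```python
import numpy as np

def soft_threshold_nonneg(a, delta):
    """Shrink by delta and clip at zero (sparse, non-negative weights)."""
    return np.maximum(a - delta, 0.0)

def sparse_rank1(R, delta, n_iter=200, tol=1e-10):
    """One sparse, non-negative component of R by alternating updates."""
    # Start from the leading right singular vector, sign-flipped so that
    # its largest-magnitude entry is positive.
    _, _, Vt = np.linalg.svd(R, full_matrices=False)
    v = Vt[0] * np.sign(Vt[0, np.argmax(np.abs(Vt[0]))])
    u = np.zeros(R.shape[0])
    for _ in range(n_iter):
        # u-step with v fixed: least-squares fit of R on v, then normalise.
        u = R @ v
        u /= max(np.linalg.norm(u), 1e-12)
        # v-step with u fixed: soft-threshold, clip, then normalise.
        v_new = soft_threshold_nonneg(R.T @ u, delta)
        v_new /= max(np.linalg.norm(v_new), 1e-12)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v

def sparse_components(X, k, delta):
    """Extract k components, deflating the matrix after each one."""
    R = X.astype(float).copy()
    components = []
    for _ in range(k):
        u, v = sparse_rank1(R, delta)
        d = float(u @ R @ v)
        R = R - d * np.outer(u, v)  # deflation step
        components.append(v)
    return np.column_stack(components)

# Toy matrix built from two components with disjoint support.
a = np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2)
b = np.array([0.0, 0.0, 1.0, 1.0]) / np.sqrt(2)
X_demo = 3.0 * np.outer(a, a) + 2.0 * np.outer(b, b)
V_demo = sparse_components(X_demo, k=2, delta=0.1)
```

On this noise-free toy matrix the two recovered columns of `V_demo` coincide with `a` and `b`, so they are non-negative, sparse and mutually orthogonal. On real data the threshold `delta` would have to be tuned to reach a target sparsity, and orthogonality would need to be enforced explicitly.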
The sparseness, smoothness and non-negativity are enforced as discussed in the previous section.
The details of our algorithm can be found in Algorithms 6, 7.