
To illustrate the algorithms used to compute the MLE in our local marginal and local conditional models, together with their computational complexity, we use the Ising model with binary data as an example.

Figure 4.3: A small example for one-hop and two-hop local models (node labels in the figure: v, N_1, N_2)

The graph above illustrates the one-hop and two-hop local models of node $v$. We assume each node takes binary values in $\{0, 1\}$. We use $N_1$ to denote the one-hop neighbours of $v$, $N_2$ to denote the neighbours of the neighbours of $v$, and $N_1 \cup N_2$ to denote the two-hop neighbours of $v$. Let $p = |N_1|$ and $q = |N_2|$, so that $p + q = |N_1 \cup N_2|$.
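The neighbourhood sets defined above can be read off an adjacency list. The following is a minimal sketch on a hypothetical toy graph (the node names `a`, `b`, `c`, `d` are illustrative and not taken from Figure 4.3). It assumes that N2 excludes v and N1, which is what the identity p + q = |N1 ∪ N2| requires.

```python
def hop_neighbourhoods(adj, v):
    """Return (N1, N2) for node v.

    N1: one-hop neighbours of v.
    N2: neighbours of nodes in N1, excluding v and N1 (assumed disjoint from N1).
    """
    n1 = set(adj[v])
    n2 = set()
    for u in n1:
        n2.update(adj[u])
    n2 -= n1 | {v}
    return n1, n2


# Hypothetical toy graph: v has two neighbours, each with one further neighbour.
adj = {
    "v": ["a", "b"],
    "a": ["v", "c"],
    "b": ["v", "d"],
    "c": ["a"],
    "d": ["b"],
}
n1, n2 = hop_neighbourhoods(adj, "v")
p, q = len(n1), len(n2)
```

With this graph the one-hop model of v involves p + 1 nodes and the two-hop model involves 1 + p + q nodes.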

4.4.1 One-hop Local Conditional Model

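In the one-hop conditional model, the probability density function of $X_v$ given its one-hop neighbours $X_{N_1}$ is

$$f(x_v \mid x_{N_1}, \theta) = \frac{\exp(x_v\theta_v + x_v x_{N_1}\theta_{v,N_1})}{1 + \exp(\theta_v + x_{N_1}\theta_{v,N_1})},$$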

where $\theta_v$ is a scalar and $\theta_{v,N_1} \in \mathbb{R}^p$, so the model has $p + 1$ parameters. Given $N$ sample points, we can write the negative pseudo-log-likelihood function as follows:

$$l(\theta) = \sum_{i=1}^{N} \left[ \log\left(1 + \exp(\theta_v + x^i_{N_1}\theta_{v,N_1})\right) - x^i_v\theta_v - x^i_v x^i_{N_1}\theta_{v,N_1} \right].$$

We use the limited-memory BFGS (L-BFGS) algorithm from the Matlab package "minFunc" of Schmidt (2005) to compute the pseudo-likelihood estimates for each local conditional model. See Nocedal (1980) and Schmidt et al. (2009) for details of the algorithm. L-BFGS is a quasi-Newton method: it approximates Newton's method without evaluating the Hessian matrix, but it does require the gradient of the negative log-likelihood. The gradient can be computed as follows:

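$$\frac{\partial l(\theta)}{\partial \theta_v} = \sum_{i=1}^{N} \left[ \frac{\exp(\theta_v + x^i_{N_1}\theta_{v,N_1})}{1 + \exp(\theta_v + x^i_{N_1}\theta_{v,N_1})} - x^i_v \right], \qquad \frac{\partial l(\theta)}{\partial \theta_{v,N_1}} = \sum_{i=1}^{N} \left[ \frac{\exp(\theta_v + x^i_{N_1}\theta_{v,N_1})}{1 + \exp(\theta_v + x^i_{N_1}\theta_{v,N_1})} - x^i_v \right] x^i_{N_1}.$$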
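As an illustration, the negative pseudo-log-likelihood and its gradient can be evaluated in a single pass over the data. The sketch below is plain Python rather than the Matlab/minFunc code used in this work; the function names and the toy data are hypothetical. The edge-parameter gradient multiplies the residual by the neighbour values, which is what differentiating the log term with respect to θ_{v,N1} yields.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function exp(z) / (1 + exp(z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll_and_grad(theta_v, theta_vn, xv, xn):
    """Negative pseudo-log-likelihood of the one-hop conditional model.

    theta_v : node parameter (float)
    theta_vn: edge parameters, list of p floats
    xv      : N binary observations of node v
    xn      : N lists of p binary observations of the neighbours N1
    Returns (nll, d nll / d theta_v, d nll / d theta_vn).
    """
    p = len(theta_vn)
    nll, g_v, g_vn = 0.0, 0.0, [0.0] * p
    for xi_v, xi_n in zip(xv, xn):
        eta = theta_v + sum(t * x for t, x in zip(theta_vn, xi_n))
        # log(1 + exp(eta)) computed without overflow, minus x_v * eta
        nll += max(eta, 0.0) + math.log1p(math.exp(-abs(eta))) - xi_v * eta
        resid = sigmoid(eta) - xi_v
        g_v += resid
        for k in range(p):
            g_vn[k] += resid * xi_n[k]
    return nll, g_v, g_vn


# Hypothetical toy data: N = 4 samples, p = 2 neighbours.
xv = [1, 0, 1, 1]
xn = [[1, 0], [0, 1], [1, 1], [0, 0]]
nll0, gv0, gvn0 = nll_and_grad(0.0, [0.0, 0.0], xv, xn)
```

Each call touches every sample once and every parameter once per sample, so its cost is proportional to N(p + 1).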

The cost of evaluating the negative log-likelihood function and its gradient is linear in the number of parameters times the sample size: $O(N(p + 1))$. As shown in Nocedal (1980) and Schmidt et al. (2009), the cost per iteration of the L-BFGS method is $O(m(p + 1))$, where $m$ is a small constant chosen by the user and $p + 1$ is the number of parameters in the log-likelihood function. To reach an accuracy of $\epsilon$ under standard assumptions, one needs $O(\log(1/\epsilon))$ iterations. Therefore, the total cost of computing the MLE of a one-hop local conditional model is $O(\log(1/\epsilon)(m + N)(p + 1))$, which is linear in the number of parameters.

4.4.2 Two-hop Local Conditional Model

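In the two-hop local conditional model shown in the previous example, there are node parameters $\theta_v \in \mathbb{R}$ and $\theta_{N_1} \in \mathbb{R}^p$, and edge parameters $\theta_{(v,N_1)} = \{\theta_{ij} \mid i = v, j \in N_1\} \in \mathbb{R}^p$ and $\theta_{(N_1,N_2)} = \{\theta_{ij} \mid i \in N_1, j \in N_2\} \in \mathbb{R}^q$. The parameter set is therefore $\Theta = \{\theta_v, \theta_{N_1}, \theta_{(v,N_1)}, \theta_{(N_1,N_2)}\}$, a vector in $\mathbb{R}^{1+2p+q}$. The probability density function of $X_{v \cup N_1}$ given $X_{N_2}$ is

$$f(x_{v \cup N_1} \mid x_{N_2}, \theta) = \frac{\exp(x_v\theta_v + x_{N_1}\theta_{N_1} + x_v x_{N_1}\theta_{(v,N_1)} + x_{N_1}x_{N_2}\theta_{(N_1,N_2)})}{\sum_{x_{v \cup N_1} \in I_{v \cup N_1}} \exp(x_v\theta_v + x_{N_1}\theta_{N_1} + x_v x_{N_1}\theta_{(v,N_1)} + x_{N_1}x_{N_2}\theta_{(N_1,N_2)})}.$$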

Given N sample points, we can write the negative log-likelihood function:

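$$l(\theta) = \sum_{i=1}^{N} \left[ \log\left( \sum_{x_{v \cup N_1} \in I_{v \cup N_1}} \exp(x_v\theta_v + x_{N_1}\theta_{N_1} + x_v x_{N_1}\theta_{(v,N_1)} + x_{N_1}x^i_{N_2}\theta_{(N_1,N_2)}) \right) - \left( x^i_v\theta_v + x^i_{N_1}\theta_{N_1} + x^i_v x^i_{N_1}\theta_{(v,N_1)} + x^i_{N_1}x^i_{N_2}\theta_{(N_1,N_2)} \right) \right] \tag{4.4.1}$$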

We use the same algorithm to compute the MLE as for the one-hop local conditional model. Evaluating the negative log-likelihood function is, however, much more expensive. The cost of computing the logarithm term in the log-likelihood function is exponential in the size of $v \cup N_1$: $O(2^{p+1})$. Since this term must be computed in both the negative log-likelihood function and its gradient, the cost for one data point is $O((1 + 2p + q)2^{p+1})$, and $O(N(1 + 2p + q)2^{p+1})$ for $N$ sample points. As in the one-hop case, the total cost of computing the MLE of a two-hop local conditional model is $O(\log(1/\epsilon)[m(1 + 2p + q) + N(1 + 2p + q)2^{p+1}])$, which is exponential in the size of $M_1 = v \cup N_1$.
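The source of the exponential cost is the sum over all configurations of v ∪ N1 inside the logarithm. The following brute-force sketch evaluates the negative log-likelihood (4.4.1) by explicit enumeration. It is an illustration only: the function name, the edge-list representation `edges12` of the N1 to N2 edges, and the toy data are assumptions, not the implementation used in this work.

```python
import math
from itertools import product


def two_hop_nll(theta_v, theta_n1, theta_vn1, edges12, theta_12, data):
    """Negative log-likelihood of the two-hop conditional model by enumeration.

    theta_n1, theta_vn1: lists of p floats (node and v-N1 edge parameters)
    edges12 : list of (a, b) pairs, a indexing N1 and b indexing N2
    theta_12: one float per pair in edges12
    data    : list of samples (x_v, x_N1, x_N2) with binary entries
    """
    p = len(theta_n1)

    def energy(xv, x1, x2):
        e = xv * theta_v
        e += sum(t * x for t, x in zip(theta_n1, x1))
        e += xv * sum(t * x for t, x in zip(theta_vn1, x1))
        e += sum(t * x1[a] * x2[b] for t, (a, b) in zip(theta_12, edges12))
        return e

    nll = 0.0
    for xv, x1, x2 in data:
        # Log term: all 2^(p+1) configurations of (x_v, x_N1), with x_N2 fixed.
        energies = [energy(c[0], c[1:], x2) for c in product((0, 1), repeat=p + 1)]
        top = max(energies)
        log_z = top + math.log(sum(math.exp(e - top) for e in energies))
        nll += log_z - energy(xv, x1, x2)
    return nll


# Hypothetical toy setting: p = 2, q = 2, one N2 node attached to each N1 node.
edges12 = [(0, 0), (1, 1)]
data = [(1, (1, 0), (0, 1)), (0, (1, 1), (1, 1)), (1, (0, 0), (1, 0))]
nll_zero = two_hop_nll(0.0, [0.0, 0.0], [0.0, 0.0], edges12, [0.0, 0.0], data)
```

With all parameters at zero every configuration is equally likely, so each sample contributes (p + 1) log 2 to the total.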

4.4.3 One-hop Local Marginal Model

Recall that when we complete the buffer set of each local marginal model, the number of parameters increases exponentially with the number of nodes in the buffer set, but the number of cliques in each local marginal model increases by only one. Therefore, using the IPF algorithm designed by Jirousek and Preucil (1995) to compute the MLE of the local marginal model turns out to be much more efficient than maximizing the likelihood function directly. Once we have the expected values of the marginal contingency table, we can apply formula (2.15) of Letac et al. (2012) to obtain the MLE of the natural parameters $\theta$:

$$\theta_j = \sum_{j' \triangleleft j} (-1)^{|S(j) - S(j')|} \log\frac{p(j')}{p(0)}.$$

We do not need to compute all the parameters in the local marginal model, since we only need the parameters $\{\theta_j : v \in S(j)\}$. In our example we only need to compute $\theta_v$ and $\theta_{(v,N_1)}$, which costs $O(p + 1)$.
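For binary data the formula above reduces to an alternating sum over subsets of the support S(j). The sketch below assumes that reading: each cell is identified with the set of nodes equal to 1, and the sum runs over all subsets of S(j). The function name and the two-node example are hypothetical.

```python
import math
from itertools import combinations


def theta_from_cells(prob, support):
    """Natural parameter for the cell whose support is `support`.

    prob maps frozenset(nodes equal to 1) -> cell probability.
    Assumes binary variables, so the cells summed over are exactly the
    subsets of the support.
    """
    s = tuple(support)
    p0 = prob[frozenset()]
    total = 0.0
    for r in range(len(s) + 1):
        sign = (-1) ** (len(s) - r)
        for sub in combinations(s, r):
            total += sign * math.log(prob[frozenset(sub)] / p0)
    return total


# Hypothetical two-node table built from known parameters
# theta_v = 0.4, theta_a = -0.2, theta_(v,a) = 0.7.
weights = {
    frozenset(): 1.0,
    frozenset({"v"}): math.exp(0.4),
    frozenset({"a"}): math.exp(-0.2),
    frozenset({"v", "a"}): math.exp(0.4 - 0.2 + 0.7),
}
z = sum(weights.values())
prob = {cell: w / z for cell, w in weights.items()}
theta_v = theta_from_cells(prob, ["v"])
theta_va = theta_from_cells(prob, ["v", "a"])
```

In this example the function recovers the node and edge parameters used to build the table, and each of them requires only the cells with support inside a single node or edge.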

Recall that $M_1 = v \cup N_1$, and let $I_{M_1}$ denote the set of cells in the $M_1$-marginal contingency table of the one-hop local marginal model of node $v$, so that

$$|I_{M_1}| = 2^{p+1}.$$

We need to update all the cell counts in the $M_1$-marginal contingency table. Therefore the total cost of the IPF algorithm is $O(2^{p+1})$, which is exponential in $|M_1|$.
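To make the cell-update cost concrete, here is a minimal sketch of classical iterative proportional fitting on a full binary table. It is not the Jirousek and Preucil (1995) variant used in this work, and the function names, the three-variable example and its parameter values are hypothetical. Every clique update rescales all 2^n cells of the table.

```python
import math
from itertools import product


def marginal(table, clique):
    """Marginal of `table` on the variables indexed by `clique`."""
    out = {}
    for cell, val in table.items():
        key = tuple(cell[k] for k in clique)
        out[key] = out.get(key, 0.0) + val
    return out


def ipf(n_vars, cliques, targets, sweeps=50):
    """Classical IPF over n_vars binary variables, starting from uniform.

    cliques: list of tuples of variable indices
    targets: list of dicts, clique configuration -> target marginal probability
    """
    cells = list(product((0, 1), repeat=n_vars))
    table = {c: 1.0 / len(cells) for c in cells}
    for _ in range(sweeps):
        for clique, target in zip(cliques, targets):
            current = marginal(table, clique)
            # Rescale every cell so the clique marginal matches its target.
            for c in cells:
                key = tuple(c[k] for k in clique)
                table[c] *= target[key] / current[key]
    return table


# Hypothetical example: variable 0 is v, variables 1 and 2 are its neighbours.
w = {
    c: math.exp(0.3 * c[0] - 0.2 * c[1] + 0.5 * c[2] + 0.8 * c[0] * c[1] - 0.6 * c[0] * c[2])
    for c in product((0, 1), repeat=3)
}
z = sum(w.values())
true_table = {c: val / z for c, val in w.items()}
cliques = [(0, 1), (0, 2)]
targets = [marginal(true_table, cl) for cl in cliques]
fitted = ipf(3, cliques, targets)
```

Because the example distribution has only the two edge interactions, the fitted table reproduces it; one sweep costs the number of cliques times the number of cells.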

4.4.4 Two-hop Local Marginal Model

The two-hop local marginal model is almost the same as the one-hop model, except that it has $1 + p + q$ nodes and more cliques. In our experiments, we use the IPF algorithm to obtain the expected values of the two-hop marginal contingency table, and then compute the canonical parameters $\theta_v$ and $\theta_{(v,N_1)}$. The computational complexity is exponential in the number of nodes in the local marginal model, $|M_2|$.

We took advantage of Matlab's matrix operations to avoid multiple "for" loops. This allows us to update the contingency table $m$ quickly, and the computation time grows linearly with the number of cliques.