Building on Xie (2008) and Xie et al. (2008), we propose a penalized parametric approach for collective anomaly detection that extends the fixed-background model. Unlike those methods, our approach relaxes the constraints on the component covariance matrices, which may be arbitrary positive definite matrices. For simplicity, and since our semi-supervised approach to estimating $f_{SB}$ first requires estimating $f_B$ as a standard model-based clustering problem, the proposed penalisation is first illustrated in the unsupervised setting.

The proposed procedure makes use of two penalty functions: one for the component mean parameters and one for the component covariance matrix eigenvalues. Hereafter, the method is referred to as Mean and Eigenvalue Shrinkage Penalization (MESP).

Chapter 3 - Penalized anomaly detection

The first penalty of the MESP is

\[
p_1(\boldsymbol{\theta}) = \sum_{p=1}^{P} \sqrt{\sum_{k=1}^{K} \pi_k \mu_{kp}^2}. \tag{3.9}
\]

It borrows the idea of grouped shrinkage proposed by Xie et al. (2008) and takes advantage of the simultaneous shrinkage of component parameters (as illustrated in Equation 3.8). For the problem at hand, however, the approach must remain sensitive to a precise estimation of the parameters of infrequent components. According to Bühlmann and Van De Geer (2011), if the true component proportions differ substantially, the penalty function should be weighted to balance the influence of the unequal proportions. For this reason, the proposed penalty (3.9) is also a function of the component proportions $\pi_k$, which serve as weights. As a consequence, shrinkage is spread appropriately across the penalised parameters rather than falling mostly on those of the rare components.
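To make the role of the weights concrete, the penalty in Equation 3.9 can be evaluated directly. The following is a minimal sketch, not the authors' implementation; the function name `mean_penalty` and the list-based layout of `pi` and `mu` are assumptions made for illustration.

```python
import math

def mean_penalty(pi, mu):
    """Evaluate p1 = sum_p sqrt(sum_k pi_k * mu_kp^2), as in Equation 3.9.

    pi: list of K mixing proportions (hypothetical layout).
    mu: list of K lists, each holding the P means of one component.
    """
    n_dims = len(mu[0])
    return sum(
        math.sqrt(sum(pi_k * mu_k[p] ** 2 for pi_k, mu_k in zip(pi, mu)))
        for p in range(n_dims)
    )

# Two components with unequal proportions; the first variable has
# zero mean in every component and so contributes nothing to p1.
pi = [0.9, 0.1]
mu = [[0.0, 3.0], [0.0, 4.0]]
print(mean_penalty(pi, mu))
```

Because the square root is taken over all components of a variable at once, a variable contributes zero only when its mean is zero in every component, which is the grouped behaviour described above.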

In Xie (2008) the covariance matrices are constrained to be component-specific diagonal, and the proposed penalty is a function of the diagonal terms. The second penalty of the MESP instead depends on the eigenvalues of the component covariance matrices, so that the matrices are only required to be component-specific positive definite. One alternative would be a low-rank approximation obtained by shrinking the smallest eigenvalues to 0. However, a matrix with null eigenvalues is not positive definite, and writing the likelihood of the mixture model would then require the generalised Gaussian distribution and pseudo-determinants; optimization of such an objective function tends to be unstable and burdensome to perform. To circumvent this problem, we propose to shrink the eigenvalues to a component-specific small positive value $\epsilon_k > 0$. In this way, the intended regularisation is achieved, the likelihood can be written explicitly, and the EM algorithm is prevented from running into the likelihood singularities. With this approach, if the $L_k$ smallest eigenvalues of the $k$th component are shrunk to $\epsilon_k$, then the number of model parameters decreases by $\sum_{k=1}^{K} (L_k - 1)(L_k + 2)/2$.
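As a quick check of the parameter count, the reduction can be computed per component and summed. This is only an illustrative sketch; the function name `parameter_reduction` is hypothetical. Note that $(L_k - 1)(L_k + 2)/2 = L_k(L_k + 1)/2 - 1$, so a component with $L_k = 1$ loses no parameters.

```python
def parameter_reduction(L):
    """Sum over components of (L_k - 1)(L_k + 2) / 2.

    L: list with one entry per component, the number of smallest
    eigenvalues treated as shrunk in that component.
    (L_k - 1)(L_k + 2) is always even, so integer division is exact.
    """
    return sum((L_k - 1) * (L_k + 2) // 2 for L_k in L)

# Example: three shrunk eigenvalues in the first component, one in the second.
print(parameter_reduction([3, 1]))  # 5 + 0 = 5
```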

Consider the eigenvalue decomposition of the $k$th component covariance matrix, $\Sigma_k = Q_k D_k Q_k'$, where $D_k$ is a diagonal matrix of eigenvalues and $Q_k$ is composed of orthonormal eigenvectors. Denote by $\delta_{kp}$ the $p$th largest diagonal element of $D_k$. The second penalty of the MESP is formulated as:

\[
p_2(\boldsymbol{\theta}) = \sum_{k=1}^{K} \sum_{p=1}^{P} \max(\delta_{kp}, \epsilon_k). \tag{3.10}
\]
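The second penalty is simple to evaluate once the eigenvalues of each component covariance matrix are available. The sketch below assumes the eigenvalues have already been computed and are passed in as plain lists; the function name `eigenvalue_penalty` and the argument layout are hypothetical.

```python
def eigenvalue_penalty(eigvals, eps):
    """Evaluate p2 = sum_k sum_p max(delta_kp, eps_k), as in Equation 3.10.

    eigvals: list of K lists, the eigenvalues of each component covariance.
    eps: list of K floor values, one small positive number per component.
    """
    return sum(
        max(delta, eps_k)
        for deltas, eps_k in zip(eigvals, eps)
        for delta in deltas
    )

eigvals = [[4.0, 1.0, 0.01], [2.0, 0.5, 0.2]]
eps = [0.05, 0.3]
# The eigenvalues 0.01 and 0.2 lie below their floors and are counted as
# 0.05 and 0.3, giving 4 + 1 + 0.05 + 2 + 0.5 + 0.3.
print(eigenvalue_penalty(eigvals, eps))
```

Every eigenvalue at or below its floor contributes the same amount, so the penalty stops decreasing once an eigenvalue reaches the floor.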

Section 3.5 - A penalized approach in mixture models

Selection of $\epsilon_k$ is based on the asymptotic distribution of the eigenvalues (Eaton, 2007). Assuming that the $L$ smallest eigenvalues of the population covariance matrix $\Sigma_k$ are all equal to $\delta_{\mathrm{const}}$, the asymptotic distribution of the $L$ smallest unsorted eigenvalues $\delta_{kl}$ of the sample covariance matrix is normal with mean $\delta_{\mathrm{const}}$ and variance $\frac{2\delta_{\mathrm{const}}^2}{nL}$. The mean of the $L$ smallest eigenvalues of the respective sample covariance matrix, $\hat{\epsilon}_k = \frac{1}{L}\sum_{p=P-L+1}^{P} \delta_{kp}$, is then an unbiased estimator of $\delta_{\mathrm{const}}$.
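The estimator described here is the mean of the $L$ smallest sample eigenvalues of one component. A minimal sketch follows, assuming the eigenvalues are supplied as a plain list; the function name `estimate_epsilon` is hypothetical.

```python
def estimate_epsilon(eigvals, L):
    """Return the mean of the L smallest eigenvalues of one component.

    eigvals: eigenvalues of the component's sample covariance matrix,
    in any order.
    L: number of smallest eigenvalues assumed to share a common value.
    """
    smallest = sorted(eigvals)[:L]
    return sum(smallest) / L

# Three small eigenvalues clustered around 0.1, two clearly larger ones.
print(estimate_epsilon([5.0, 2.0, 0.11, 0.09, 0.10], L=3))
```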

The parameter $L_k$ is selected by sequential tests. The tests partially reuse the same data between iterations, hence a Bonferroni-like correction is applied to control the type I error (Bonferroni, 1936). Denote by $\bar{\delta}_{k,h}$ the average of the $L_k = P - h$ smallest eigenvalues of the $k$th component sample covariance matrix. Starting from $h = 0$, we test in sequence whether the $L_k = P - h$ smallest eigenvalues are all equal to $\bar{\delta}_{k,h}$ against a general alternative (at least one eigenvalue is different). The rejection regions for the tested hypothesis are

\[
\frac{\delta_{kh}}{\bar{\delta}_{k,h}} > 1 - \sqrt{\frac{2}{n}}\, z_{\frac{\alpha}{2(h+1)}} \quad \vee \quad \frac{\delta_{kP}}{\bar{\delta}_{k,h}} < 1 + \sqrt{\frac{2}{n}}\, z_{\frac{\alpha}{2(h+1)}},
\]

where $z_{\frac{\alpha}{2(h+1)}}$ is the $\frac{\alpha}{2(h+1)}$ quantile of the standard Gaussian distribution. In short, the null hypothesis is rejected if the ratio of the largest eigenvalue to the mean of the eigenvalues is too large, or if the ratio of the smallest eigenvalue to the mean is too low. If there is no reason to reject the null, the parameter $L_k$ is set to $P - h$. Otherwise, we accept the alternative hypothesis that the eigenvalues differ. In the next iteration it is then assumed that the largest eigenvalue is the one that is too large, and the test is performed again with $h$ increased by 1. With each iteration the quantile becomes more extreme, so the thresholds move further from 1, in line with the type I error correction of the sequential test. The iterations are repeated until there is no reason to reject the null.
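The sequential procedure can be sketched as a short loop. This is an illustrative reading of the text rather than the authors' code, and it rests on several assumptions: the function name `select_L` is hypothetical, `n` is taken to be the sample size behind the covariance estimate, the quantile is the lower-tail one (so it is negative and the two thresholds sit on either side of 1), and at step `h` the "largest eigenvalue" is the largest among the `P - h` eigenvalues currently under test.

```python
import math
from statistics import NormalDist

def select_L(eigvals, n, alpha=0.05):
    """Return L_k = P - h for the first h at which the null is not rejected.

    eigvals: eigenvalues of one component's sample covariance matrix.
    n: sample size used in the sqrt(2 / n) scaling (assumed meaning).
    alpha: overall type I error level before the Bonferroni-like split.
    """
    deltas = sorted(eigvals, reverse=True)  # largest first
    P = len(deltas)
    for h in range(P):
        tail = deltas[h:]                   # the P - h smallest eigenvalues
        mean = sum(tail) / len(tail)
        # Lower-tail quantile at level alpha / (2(h+1)); negative for alpha < 1.
        z = NormalDist().inv_cdf(alpha / (2 * (h + 1)))
        margin = math.sqrt(2.0 / n) * z
        too_large = tail[0] / mean > 1 - margin
        too_small = tail[-1] / mean < 1 + margin
        if not (too_large or too_small):
            return P - h
    return 1  # fallback; for alpha < 1 the loop returns by h = P - 1

# Two dominant eigenvalues followed by three of similar size.
print(select_L([10.0, 5.0, 0.11, 0.10, 0.09], n=500))
```

In this example the first two tests reject because the leading eigenvalue is far above the mean of the tested group, and the third test does not reject, so the three smallest eigenvalues are treated as equal.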

In summary, the MESP approach makes use of the two penalties given in Equations 3.9 and 3.10. Parameter estimation is performed via optimization of the following penalized log-likelihood, with the specific adjustments for parameter selection:

\[
\log L_p(\boldsymbol{\theta}) = \sum_{i=1}^{n} \log\left[ \sum_{k=1}^{K} \pi_k \phi_k(\boldsymbol{x}_i \mid \boldsymbol{\mu}_k, \Sigma_k) \right] - \gamma_1 \sum_{p=1}^{P} \sqrt{\sum_{k=1}^{K} \pi_k \mu_{kp}^2} - \gamma_2 \sum_{k=1}^{K} \sum_{p=1}^{P} \max(\delta_{kp}, \epsilon_k). \tag{3.11}
\]
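The objective in Equation 3.11 combines the mixture log-likelihood with the two penalties. The sketch below evaluates it from precomputed component log-densities so that it stays free of any linear algebra library; the function name `penalized_loglik`, the argument layout, and the assumption that all proportions are strictly positive are illustrative choices, not part of the original method description.

```python
import math

def penalized_loglik(log_dens, pi, mu, eigvals, eps, gamma1, gamma2):
    """Evaluate Equation 3.11 from precomputed component log-densities.

    log_dens[i][k]: log phi_k(x_i | mu_k, Sigma_k), computed elsewhere.
    pi: K mixing proportions (assumed strictly positive).
    mu: K lists of P component means.
    eigvals: K lists of covariance eigenvalues; eps: K floor values.
    gamma1, gamma2: penalty weights.
    """
    loglik = 0.0
    for row in log_dens:
        terms = [math.log(p) + ld for p, ld in zip(pi, row)]
        m = max(terms)  # log-sum-exp shift for numerical stability
        loglik += m + math.log(sum(math.exp(t - m) for t in terms))
    # Mean penalty (3.9) and eigenvalue penalty (3.10).
    p1 = sum(
        math.sqrt(sum(p * m_k[j] ** 2 for p, m_k in zip(pi, mu)))
        for j in range(len(mu[0]))
    )
    p2 = sum(max(d, e) for ds, e in zip(eigvals, eps) for d in ds)
    return loglik - gamma1 * p1 - gamma2 * p2

# Single-component toy case: log-likelihood -3, p1 = 3 + 4, p2 = 2 + 0.1.
value = penalized_loglik(
    log_dens=[[-1.0], [-2.0]], pi=[1.0], mu=[[3.0, 4.0]],
    eigvals=[[2.0, 0.01]], eps=[0.1], gamma1=0.5, gamma2=2.0,
)
print(value)
```

Setting both weights to zero recovers the unpenalized mixture log-likelihood, which is a convenient sanity check when experimenting with the penalties.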

The solution of Equation 3.11 is obtained via a suitable modification of the EM algorithm. With respect to the unpenalized approach, the maximisation step for the penalised parameters is changed. Due to the shrinkage, these estimates are shifted from the MLE toward the fixed values (0 in the case of the means and $\epsilon_k$ for the eigenvalues). [...] in Appendix A.
