mean estimator of $x^*$ is just the complex conjugate of the conditional mean estimator of $x$. Thus it makes no sense to consider the estimation of $x$ from $y$ and $y^*$, since this brings no refinement to the conditional mean estimator. As we shall see, this statement is decidedly false when the estimator is constrained to be a (widely) linear or (widely) linear-quadratic estimator.

5.3 Linear MMSE estimation

We shall treat the problem of linearly or widely linearly estimating a signal from a measurement as a virtual two-channel estimation problem, wherein the underlying experiment consists of generating a composite vector of signal and measurement, only one of which is observed. Our point of view is that the augmented covariance matrix for this composite vector then encapsulates all of the second-order information that can be extracted. In order to introduce our methods, we shall begin here with the Hermitian case, wherein complementary covariances are ignored, and then extend our methods to include Hermitian and complementary covariances in the next section.

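Let us begin with a signal (or source) $x\colon \Omega \to \mathbb{C}^n$ and a measurement vector $y\colon \Omega \to \mathbb{C}^m$. There is no requirement that the signal dimension $n$ be smaller than the measurement dimension $m$. We first assume that the signal and measurement have zero mean, but remove this restriction later on. Their composite covariance matrix is the matrix

$$\underline{R}_{xy} = E\left\{ \begin{bmatrix} x \\ y \end{bmatrix} \begin{bmatrix} x^H & y^H \end{bmatrix} \right\} = \begin{bmatrix} R_{xx} & R_{xy} \\ R_{xy}^H & R_{yy} \end{bmatrix}. \tag{5.13}$$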

The error between the signal $x$ and the linear estimator $\hat{x} = Wy$ is $e = \hat{x} - x$, and the error covariance matrix is $Q = E[(\hat{x} - x)(\hat{x} - x)^H]$. This error covariance is

$$Q = E[(Wy - x)(Wy - x)^H] = W R_{yy} W^H - R_{xy} W^H - W R_{xy}^H + R_{xx}. \tag{5.14}$$

After completing the square, this may be written

$$Q = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H + (W - R_{xy} R_{yy}^{-1}) R_{yy} (W - R_{xy} R_{yy}^{-1})^H. \tag{5.15}$$

The last term, a quadratic form in $W$, is positive semidefinite, so $Q \geq R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H$ with equality for

$$W = R_{xy} R_{yy}^{-1} \quad \text{and} \quad Q = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H. \tag{5.16}$$

The solution for $W$ may be written as the solution to the normal equations $W R_{yy} - R_{xy} = 0$, or more insightfully as

$$E[(Wy - x) y^H] = 0. \tag{5.17}$$

Thus the orthogonality principle is at work: the estimator error $e = Wy - x$ is orthogonal to the measurement $y$, as illustrated in Fig. 5.1. There $H(y)$ denotes the Hilbert space of measurement vectors $y$, and orthogonality means $E[(Wy - x)_i \, y_j^*] = 0$ for all pairs $(i, j)$.
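As a numerical illustration of (5.14) to (5.17), the following sketch builds an arbitrary composite covariance matrix, forms the LMMSE filter, and checks both the orthogonality condition and the fact that a perturbed filter cannot do better. The dimensions, the random covariance, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2, 3  # signal and measurement dimensions (arbitrary)

# Hypothetical Hermitian positive-definite composite covariance of [x; y].
A = rng.standard_normal((n + m, n + m)) + 1j * rng.standard_normal((n + m, n + m))
R = A @ A.conj().T + np.eye(n + m)
Rxx, Rxy, Ryy = R[:n, :n], R[:n, n:], R[n:, n:]


def error_cov(W):
    """Error covariance Q as a function of the filter W, as in (5.14)."""
    return W @ Ryy @ W.conj().T - Rxy @ W.conj().T - W @ Rxy.conj().T + Rxx


# LMMSE filter W = Rxy Ryy^{-1} from (5.16), computed with a linear solve
# (Ryy is Hermitian, so W^H solves Ryy W^H = Rxy^H).
W = np.linalg.solve(Ryy, Rxy.conj().T).conj().T
Q = error_cov(W)

# Orthogonality (5.17): E[(Wy - x) y^H] = W Ryy - Rxy should vanish.
orthogonality_residual = W @ Ryy - Rxy

# A perturbed filter should give an error covariance that exceeds Q
# by a positive semidefinite matrix.
W_other = W + 0.1 * (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m)))
excess = error_cov(W_other) - Q
min_excess_eig = np.linalg.eigvalsh((excess + excess.conj().T) / 2).min()

print(np.abs(orthogonality_residual).max(), min_excess_eig)
```

The linear solve is used instead of an explicit inverse only as a matter of numerical habit; either gives the same filter here.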

Moreover, the linear operator $W$ is a projection operator, since the LMMSE estimator of $x$ from $Wy$ remains $Wy$. The LMMSE estimator is sometimes referred to as the (discrete) Wiener filter, which explains our choice of notation $W$. However, the LMMSE estimator predates Wiener's work on causal LMMSE prediction and smoothing of time series.

Figure 5.1 Orthogonality between the error $e = Wy - x$ and $H(y)$ in LMMSE estimation.

Figure 5.2 The signal-plus-noise channel model: (a) channel, (b) synthesis, and (c) analysis.

5.3.1 The signal-plus-noise channel model

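We intend to argue that it is as if the source and measurement vectors were drawn from a linear channel model $y = Hx + n$ that draws a source vector $x$, linearly filters it with a channel filter $H$, and adds an uncorrelated noise vector $n$ to produce the measurement vector. This scheme is illustrated with the signal-plus-noise channel model of Fig. 5.2(a). According to this channel model, the source and measurement have the synthesis and analysis representations of Figs. 5.2(b) and (c):

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} x \\ n \end{bmatrix}, \tag{5.18}$$

$$\begin{bmatrix} x \\ n \end{bmatrix} = \begin{bmatrix} I & 0 \\ -H & I \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \tag{5.19}$$

The synthesis model of Fig. 5.2(b) and the analysis model of Fig. 5.2(c) produce these block LDU (Lower triangular–Diagonal–Upper triangular) Cholesky factorizations:

$$\underline{R}_{xy} = \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} R_{xx} & 0 \\ 0 & R_{nn} \end{bmatrix} \begin{bmatrix} I & H^H \\ 0 & I \end{bmatrix}, \tag{5.20}$$

$$\begin{bmatrix} I & 0 \\ -H & I \end{bmatrix} \underline{R}_{xy} \begin{bmatrix} I & -H^H \\ 0 & I \end{bmatrix} = \begin{bmatrix} R_{xx} & 0 \\ 0 & R_{nn} \end{bmatrix}. \tag{5.21}$$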


For this channel model and its corresponding analysis and synthesis models to work, we must choose

$$H = R_{xy}^H R_{xx}^{-1} \quad \text{and} \quad R_{nn} = R_{yy} - R_{xy}^H R_{xx}^{-1} R_{xy}. \tag{5.22}$$

The noise covariance matrix $R_{nn}$ is the Schur complement of $R_{xx}$ within the composite covariance matrix $\underline{R}_{xy}$. Thus, up to second order, every virtual two-channel estimation problem is a problem of representing the measurement as a noisy and linearly filtered version of the signal, which is to say that the LDU Cholesky factorization of the composite covariance matrix $\underline{R}_{xy}$ has actually produced a channel model, a synthesis model, and an analysis model for the source and measurement.

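From these block Cholesky factorizations we can also extract block UDL (Upper triangular–Diagonal–Lower triangular) Cholesky factorizations of inverses:

$$\underline{R}_{xy}^{-1} = \begin{bmatrix} I & -H^H \\ 0 & I \end{bmatrix} \begin{bmatrix} R_{xx}^{-1} & 0 \\ 0 & R_{nn}^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ -H & I \end{bmatrix}, \tag{5.23}$$

$$\begin{bmatrix} I & H^H \\ 0 & I \end{bmatrix} \underline{R}_{xy}^{-1} \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} = \begin{bmatrix} R_{xx}^{-1} & 0 \\ 0 & R_{nn}^{-1} \end{bmatrix}. \tag{5.24}$$

It follows that $R_{nn}^{-1}$ is the southeast block of $\underline{R}_{xy}^{-1}$. The block Cholesky factors, in turn, produce these factorizations of determinants:

$$\det \underline{R}_{xy} = \det R_{xx} \, \det R_{nn}, \tag{5.25}$$

$$\det \underline{R}_{xy}^{-1} = \det R_{xx}^{-1} \, \det R_{nn}^{-1}. \tag{5.26}$$

Let us summarize. From the synthesis model we say that every composite source and measurement vector $[x^T, y^T]^T$ may be modeled as if it were synthesized in a virtual two-channel experiment, wherein the source is the unobserved channel and the measurement is the observed channel $y = Hx + n$. Of course, the filter and the noise covariance must be chosen just right. This model then decomposes the composite covariance matrix for the source and measurement, and its inverse, into block Cholesky factors.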
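The signal-plus-noise factorization can be checked numerically. The sketch below assumes an arbitrary random composite covariance, computes $H$ and $R_{nn}$ from (5.22), and rebuilds the composite covariance from a unit lower-triangular block factor and a block-diagonal factor. Variable names and dimensions are illustrative choices, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2, 3  # signal and measurement dimensions (arbitrary)

# Hypothetical composite covariance of [x; y].
A = rng.standard_normal((n + m, n + m)) + 1j * rng.standard_normal((n + m, n + m))
R = A @ A.conj().T + np.eye(n + m)
Rxx, Rxy, Ryy = R[:n, :n], R[:n, n:], R[n:, n:]

# Channel parameters from (5.22).
H = Rxy.conj().T @ np.linalg.inv(Rxx)
Rnn = Ryy - H @ Rxy  # Schur complement of Rxx

# Block LDU factors: unit lower triangular L and block-diagonal D.
L = np.block([[np.eye(n), np.zeros((n, m))], [H, np.eye(m)]])
D = np.block([[Rxx, np.zeros((n, m))], [np.zeros((m, n)), Rnn]])
R_rebuilt = L @ D @ L.conj().T

# Southeast block of the inverse should equal Rnn^{-1}.
se_block = np.linalg.inv(R)[n:, n:]

print(np.abs(R_rebuilt - R).max())
```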

5.3.2 The measurement-plus-error channel model

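We now interchange the roles of the signal and measurement in order to obtain the measurement-plus-error channel model, wherein the source $x$ is produced as a noisy measurement of a linearly filtered version of the measurement $y$. That is, $x = Wy - e$, as illustrated in Fig. 5.3(a). It will soon become clear that our choice of $W$ to denote this filter is not accidental. It will turn out to be the LMMSE filter.

According to this channel model, the source and measurement have the representations of Figs. 5.3(b) and (c),

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} \begin{bmatrix} e \\ y \end{bmatrix}, \tag{5.27}$$

$$\begin{bmatrix} e \\ y \end{bmatrix} = \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \tag{5.28}$$

where the error $e$ and the measurement $y$ are uncorrelated. The synthesis model of Fig. 5.3(b) and the analysis model of Fig. 5.3(c) produce these block UDL Cholesky factorizations:

$$\underline{R}_{xy} = \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} \begin{bmatrix} Q & 0 \\ 0 & R_{yy} \end{bmatrix} \begin{bmatrix} -I & 0 \\ W^H & I \end{bmatrix}, \tag{5.29}$$

$$\begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} \underline{R}_{xy} \begin{bmatrix} -I & 0 \\ W^H & I \end{bmatrix} = \begin{bmatrix} Q & 0 \\ 0 & R_{yy} \end{bmatrix}. \tag{5.30}$$

Figure 5.3 The measurement-plus-error channel model: (a) channel, (b) synthesis, and (c) analysis.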

For this channel model and its corresponding analysis and synthesis models to work, we must choose $W$ as the LMMSE filter and $Q = E[ee^H]$ as the corresponding error covariance matrix:

$$W = R_{xy} R_{yy}^{-1} \quad \text{and} \quad Q = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H. \tag{5.31}$$

The error covariance matrix $Q$ is the Schur complement of $R_{yy}$ within $\underline{R}_{xy}$. Thus, up to second order, every virtual two-channel estimation problem is a problem of representing the signal as a noisy and linearly filtered version of the measurement, which is to say that the block UDL Cholesky factorization of the composite covariance matrix $\underline{R}_{xy}$ has actually produced a channel model, a synthesis model, and an analysis model for the source and measurement. The orthogonality between the estimator error and the measurement is expressed by the northeast and southwest zeros of the composite covariance matrix for the error $e$ and the measurement $y$.

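From these block Cholesky factorizations of the composite covariance matrix $\underline{R}_{xy}$ we can also extract block LDU Cholesky factorizations of inverses:

$$\underline{R}_{xy}^{-1} = \begin{bmatrix} -I & 0 \\ W^H & I \end{bmatrix} \begin{bmatrix} Q^{-1} & 0 \\ 0 & R_{yy}^{-1} \end{bmatrix} \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix}, \tag{5.32}$$

$$\begin{bmatrix} -I & 0 \\ W^H & I \end{bmatrix} \underline{R}_{xy}^{-1} \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} = \begin{bmatrix} Q^{-1} & 0 \\ 0 & R_{yy}^{-1} \end{bmatrix}. \tag{5.33}$$

It follows that $Q^{-1}$ is the northwest block of $\underline{R}_{xy}^{-1}$. The block Cholesky factors produce the following factorizations of determinants:

$$\det \underline{R}_{xy} = \det Q \, \det R_{yy}, \tag{5.34}$$

$$\det \underline{R}_{xy}^{-1} = \det Q^{-1} \, \det R_{yy}^{-1}. \tag{5.35}$$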


Let us summarize. From the analysis model we say that every composite signal and measurement vector $[x^T, y^T]^T$ may be modeled as if it were a virtual two-channel experiment, wherein the signal is subtracted from a linearly filtered measurement to produce an error that is orthogonal to the measurement. Of course, the filter and the error covariance must be chosen just right. This model then decomposes the composite covariance matrix for the signal and measurement, and its inverse, into block Cholesky factors.
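The measurement-plus-error model admits the same kind of numerical check. With $e = Wy - x$, the map from $[x^T, y^T]^T$ to $[e^T, y^T]^T$ is the block upper-triangular matrix with rows $[-I, W]$ and $[0, I]$, which is its own inverse. The sketch below, under an assumed random composite covariance and with illustrative names, verifies that this map decorrelates error and measurement and that $Q^{-1}$ appears as the northwest block of the inverse.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 2, 3  # signal and measurement dimensions (arbitrary)

# Hypothetical composite covariance of [x; y].
A = rng.standard_normal((n + m, n + m)) + 1j * rng.standard_normal((n + m, n + m))
R = A @ A.conj().T + np.eye(n + m)
Rxx, Rxy, Ryy = R[:n, :n], R[:n, n:], R[n:, n:]

W = Rxy @ np.linalg.inv(Ryy)   # LMMSE filter
Q = Rxx - W @ Rxy.conj().T     # Schur complement of Ryy

# Map [x; y] -> [e; y] with e = Wy - x; this matrix is its own inverse.
U = np.block([[-np.eye(n), W], [np.zeros((m, n)), np.eye(m)]])
D = np.block([[Q, np.zeros((n, m))], [np.zeros((m, n)), Ryy]])

R_ey = U @ R @ U.conj().T             # covariance of [e; y]
R_rebuilt = U @ D @ U.conj().T        # block UDL factorization of R
nw_block = np.linalg.inv(R)[:n, :n]   # should equal Q^{-1}

print(np.abs(R_ey - D).max())
```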

5.3.3 Filtering models

We might say that the models we have developed give us two alternative parameterizations:

• the signal-plus-noise channel model $(R_{xx}, H, R_{nn})$ with $H = R_{xy}^H R_{xx}^{-1}$ and $R_{nn} = R_{yy} - R_{xy}^H R_{xx}^{-1} R_{xy}$; and

• the measurement-plus-error channel model $(R_{yy}, W, Q)$ with $W = R_{xy} R_{yy}^{-1}$ and $Q = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H$.

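These correspond to the two factorizations (A1.2) and (A1.1) of $\underline{R}_{xy}$, and $R_{nn}$ and $Q$ are the Schur complements of $R_{xx}$ and $R_{yy}$, respectively, within $\underline{R}_{xy}$. Let's mix the synthesis equation for the signal-plus-noise channel model with the analysis equation of the measurement-plus-error channel model to solve for the filter $W$ and error covariance matrix $Q$ in terms of the channel parameters $H$ and $R_{nn}$:

$$\begin{bmatrix} e \\ y \end{bmatrix} = \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} x \\ n \end{bmatrix}. \tag{5.36}$$

This composition of maps produces these factorizations:

$$\begin{bmatrix} Q & 0 \\ 0 & R_{yy} \end{bmatrix} = \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix} \begin{bmatrix} I & 0 \\ H & I \end{bmatrix} \begin{bmatrix} R_{xx} & 0 \\ 0 & R_{nn} \end{bmatrix} \begin{bmatrix} I & H^H \\ 0 & I \end{bmatrix} \begin{bmatrix} -I & 0 \\ W^H & I \end{bmatrix}, \tag{5.37}$$

$$\begin{bmatrix} Q^{-1} & 0 \\ 0 & R_{yy}^{-1} \end{bmatrix} = \begin{bmatrix} -I & 0 \\ W^H & I \end{bmatrix} \begin{bmatrix} I & -H^H \\ 0 & I \end{bmatrix} \begin{bmatrix} R_{xx}^{-1} & 0 \\ 0 & R_{nn}^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ -H & I \end{bmatrix} \begin{bmatrix} -I & W \\ 0 & I \end{bmatrix}. \tag{5.38}$$

We now evaluate the northeast block of (5.37) and the southwest block of (5.38) to obtain two formulae for the filter $W$: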

$$W = R_{xx} H^H (H R_{xx} H^H + R_{nn})^{-1} = (R_{xx}^{-1} + H^H R_{nn}^{-1} H)^{-1} H^H R_{nn}^{-1}. \tag{5.39}$$

In a similar fashion, we evaluate the northwest blocks of both (5.37) and (5.38) to get two formulae for the error covariance $Q$:

$$Q = R_{xx} - R_{xx} H^H (H R_{xx} H^H + R_{nn})^{-1} H R_{xx} = (R_{xx}^{-1} + H^H R_{nn}^{-1} H)^{-1}. \tag{5.40}$$

These equations are Woodbury identities, (A1.43) and (A1.44). They determine the LMMSE inversion of the measurement $y$ for the signal $x$. In the absence of noise, if the signal and measurement were of the same dimension, then $W$ would be $H^{-1}$, assuming that this inverse existed. However, generally $WH \neq I$, but is approximately so if $R_{nn}$ is small compared with $H R_{xx} H^H$. The LMMSE estimator may be written as

$$Wy = W(Hx + n) = WHx + Wn. \tag{5.41}$$

So, the LMMSE estimator $W$, sometimes called a deconvolution filter, decidedly does not equalize $H$ to produce $WH = I$. Rather, $WH$ approximates $I$ so that the error $e = Wy - x = (WH - I)x + Wn$, with covariance matrix $Q = (WH - I) R_{xx} (WH - I)^H + W R_{nn} W^H$, provides the best tradeoff between model-bias-squared $(WH - I) R_{xx} (WH - I)^H$ and filtered noise variance $W R_{nn} W^H$ to minimize the error covariance $Q$. The importance of these results cannot be overstated.
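The two forms of the filter and of the error covariance, and the split of $Q$ into model bias and filtered noise, can be compared numerically. The following is a minimal sketch under assumed random channel parameters $(R_{xx}, H, R_{nn})$; the helper `random_cov` and all dimensions are hypothetical and serve only the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 2, 4  # signal and measurement dimensions (arbitrary)


def random_cov(k):
    """Hypothetical Hermitian positive-definite k x k covariance."""
    B = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return B @ B.conj().T + np.eye(k)


Rxx, Rnn = random_cov(n), random_cov(m)
H = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
inv = np.linalg.inv

# First forms: measurement-domain inverse (m x m).
Ryy = H @ Rxx @ H.conj().T + Rnn
W1 = Rxx @ H.conj().T @ inv(Ryy)
Q1 = Rxx - W1 @ H @ Rxx

# Second forms: signal-domain inverse (n x n).
Q2 = inv(inv(Rxx) + H.conj().T @ inv(Rnn) @ H)
W2 = Q2 @ H.conj().T @ inv(Rnn)

# Bias-plus-noise decomposition of the error covariance.
bias_map = W1 @ H - np.eye(n)
Q_split = bias_map @ Rxx @ bias_map.conj().T + W1 @ Rnn @ W1.conj().T

print(np.abs(W1 - W2).max(), np.abs(Q1 - Q2).max(), np.abs(Q_split - Q1).max())
```

Which form is cheaper depends on the dimensions: the first inverts an $m \times m$ matrix, the second an $n \times n$ matrix plus $R_{nn}$.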

Let us summarize. Every problem of LMMSE estimation requiring only first- and second-order moments can be phrased as a problem of estimating one unobserved channel from another observed channel in a virtual two-channel experiment. In this virtual two-channel experiment, there are two different channel models.

The first channel model says it is as if the measurement were a linear combination of filtered source and uncorrelated additive noise. The second channel model says it is as if the source vector were a linear combination of the filtered measurement and the estimator error.

The first channel model produces a block LDU Cholesky factorization for the composite covariance matrix $\underline{R}_{xy}$ and a block UDL Cholesky factorization for its inverse.

The second channel model produces a block UDL Cholesky factorization for the composite covariance matrix $\underline{R}_{xy}$ and a block LDU Cholesky factorization for its inverse.

By mixing these two factorizations, two different solutions are found for the LMMSE estimator and its error covariance.

Example 5.2. In many problems of signal estimation in communication, radar, and sonar, or imaging in geophysics and radio astronomy, the measurement model may be taken to be the rank-one linear model

$$y = \psi x + n, \tag{5.42}$$

where $\psi \in \mathbb{C}^m$ is the channel vector that carries the unknown complex signal amplitude $x$ to the measurement, $x$ is a zero-mean random variable with variance $\sigma_x^2$, and $R_{nn}$ is the covariance matrix of the noise vector $n$. Using the results of the previous subsections, we may write down the following formulae for the LMMSE estimator of $x$ and its mean-squared error:

$$\hat{x} = \sigma_x^2 \psi^H (\sigma_x^2 \psi \psi^H + R_{nn})^{-1} y = (\sigma_x^{-2} + \psi^H R_{nn}^{-1} \psi)^{-1} \psi^H R_{nn}^{-1} y, \tag{5.43}$$

$$Q = \sigma_x^2 - \sigma_x^4 \psi^H (\sigma_x^2 \psi \psi^H + R_{nn})^{-1} \psi = \frac{1}{\sigma_x^{-2} + \psi^H R_{nn}^{-1} \psi}. \tag{5.44}$$

In both forms the estimator consists of a matrix inverse operator, followed by a correlator, followed by a scalar multiplication. When the channel filter is a vector's worth of a geometric sequence, $\psi = [1, e^{j\theta}, \ldots, e^{j(m-1)\theta}]^T$, as in a uniformly sampled complex exponential time series or a uniformly sampled complex exponential plane wave, then the correlation step is a correlation with a discrete-time Fourier-transform (DTFT) vector.
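The rank-one example can be worked numerically as follows. This sketch assumes a complex exponential channel vector with hypothetical values for $m$, $\theta$, and $\sigma_x^2$, and an arbitrary random noise covariance; it compares the two forms of the estimator weights and of the mean-squared error.

```python
import numpy as np

rng = np.random.default_rng(4)
m, theta, sigma_x2 = 8, 0.7, 2.0  # hypothetical values

# Channel vector [1, e^{j theta}, ..., e^{j (m-1) theta}].
psi = np.exp(1j * theta * np.arange(m))

# Hypothetical Hermitian positive-definite noise covariance.
B = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Rnn = B @ B.conj().T + np.eye(m)

# First form: sigma_x^2 psi^H (sigma_x^2 psi psi^H + Rnn)^{-1}.
Ryy = sigma_x2 * np.outer(psi, psi.conj()) + Rnn
Ryy_inv = np.linalg.inv(Ryy)
w_first = sigma_x2 * psi.conj() @ Ryy_inv
Q_first = (sigma_x2 - sigma_x2**2 * psi.conj() @ Ryy_inv @ psi).real

# Second form: (sigma_x^{-2} + psi^H Rnn^{-1} psi)^{-1} psi^H Rnn^{-1}.
Rnn_inv_psi = np.linalg.solve(Rnn, psi)
quad = (psi.conj() @ Rnn_inv_psi).real  # psi^H Rnn^{-1} psi is real and positive
w_second = Rnn_inv_psi.conj() / (1.0 / sigma_x2 + quad)
Q_second = 1.0 / (1.0 / sigma_x2 + quad)

# The estimate for a measurement vector y would be x_hat = w_first @ y.
print(Q_first, Q_second)
```

The second form needs only one solve with $R_{nn}$, which can be reused for every channel vector of interest if the noise covariance is fixed.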

5.3.4 Nonzero means

We have assumed until now that the signal and measurement have zero means. What if the signal has known mean $\mu_x$ and the measurement has known mean $\mu_y$? How do these filtering formulae change? The centered signal and measurement $x - \mu_x$ and $y - \mu_y$ then share the composite covariance matrix $\underline{R}_{xy}$. So the LMMSE estimator of $x - \mu_x$ from $y - \mu_y$ should obey all of the equations already derived. That is,

$$\hat{x} - \mu_x = W(y - \mu_y) \iff \hat{x} = W(y - \mu_y) + \mu_x. \tag{5.45}$$

But what about the orthogonality principle, which says that the error between the estimator and the signal is orthogonal to the measurement? This is still so due to

$$E[(\hat{x} - x) y^H] = E\{[\hat{x} - \mu_x - (x - \mu_x)](y - \mu_y)^H\} + E\{[\hat{x} - \mu_x - (x - \mu_x)] \mu_y^H\} = 0. \tag{5.46}$$

The first term on the right is zero due to the orthogonality principle already established for zero-mean LMMSE estimators. The second term is zero because $\hat{x}$ is an unbiased estimator of $x$, i.e., $E[\hat{x}] = \mu_x$.
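A data-driven sketch of the nonzero-mean estimator is given below. It assumes that sample means and sample covariances stand in for the true moments, which is an assumption of this illustration and not part of the derivation above. With that substitution, unbiasedness and orthogonality hold exactly in the sample sense. The data model, the helper `cnoise`, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, N = 2, 3, 500  # dimensions and number of samples (arbitrary)


def cnoise(*shape):
    """Complex noise with independent standard normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Hypothetical data with nonzero means; columns are realizations.
mu_x_true = np.array([1.0 + 2.0j, -0.5j])
X = mu_x_true[:, None] + cnoise(n, N)
Y = cnoise(m, n) @ X + 0.5 * cnoise(m, N) + 3.0

# Sample moments stand in for the true means and covariances.
mu_x = X.mean(axis=1, keepdims=True)
mu_y = Y.mean(axis=1, keepdims=True)
Xc, Yc = X - mu_x, Y - mu_y
Sxy = Xc @ Yc.conj().T / N
Syy = Yc @ Yc.conj().T / N

W = Sxy @ np.linalg.inv(Syy)
X_hat = W @ (Y - mu_y) + mu_x  # estimator of the form (5.45)

err = X_hat - X
bias = err.mean(axis=1)          # sample version of E[x_hat - x]
cross = err @ Y.conj().T / N     # sample version of E[(x_hat - x) y^H]

print(np.abs(bias).max(), np.abs(cross).max())
```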

5.3.5 Concentration ellipsoids

Let's call $B_{xx} = \{x : x^H R_{xx}^{-1} x = 1\}$ the concentration ellipsoid for the signal vector $x$. For scalar $x$, this is the circle $B_{xx} = \{x : |x|^2 = R_{xx}\}$. The concentration ellipsoid for the error vector $e$ is $B_{ee} = \{e : e^H Q^{-1} e = 1\}$. From the equation $Q = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H$ we know that $R_{xx} \geq Q$ and hence $R_{xx}^{-1} \leq Q^{-1}$. The posterior concentration ellipsoid $B_{ee}$ is therefore smaller than, and completely embedded within, the prior concentration ellipsoid $B_{xx}$. When the signal and measurement are jointly Gaussian, these ellipsoids are probability-density contour lines (level curves).

Among measures of effectiveness for LMMSE are relative values for the trace and determinant of $Q$ and $R_{xx}$. Various forms of these may be derived from the matrix identities obtained from the channel models. They may be given insightful forms in canonical coordinate systems, as we will see in Section 5.5. One particularly illuminating formula is the gain of LMMSE filtering, which is closely related to mutual information (cf. Section 5.5.2):

$$\frac{\det R_{xx}}{\det Q} = \frac{\det R_{yy}}{\det R_{nn}}. \tag{5.47}$$

So the gain of LMMSE estimation is the ratio of the volume of the measurement concentration ellipsoid to the volume of the noise concentration ellipsoid.
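The gain formula and the ordering of prior and posterior covariances can be checked with a small sketch. It assumes an arbitrary random composite covariance; the helper `real_det` and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 2, 3  # signal and measurement dimensions (arbitrary)

# Hypothetical composite covariance of [x; y].
A = rng.standard_normal((n + m, n + m)) + 1j * rng.standard_normal((n + m, n + m))
R = A @ A.conj().T + np.eye(n + m)
Rxx, Rxy, Ryy = R[:n, :n], R[:n, n:], R[n:, n:]

# The two Schur complements: error covariance Q and noise covariance Rnn.
Q = Rxx - Rxy @ np.linalg.inv(Ryy) @ Rxy.conj().T
Rnn = Ryy - Rxy.conj().T @ np.linalg.inv(Rxx) @ Rxy


def real_det(M):
    """Determinant of a Hermitian matrix (real up to rounding)."""
    return np.linalg.det(M).real


gain_from_signal = real_det(Rxx) / real_det(Q)
gain_from_measurement = real_det(Ryy) / real_det(Rnn)

# Rxx - Q should be positive semidefinite (prior ellipsoid contains posterior).
diff = Rxx - Q
min_eig = np.linalg.eigvalsh((diff + diff.conj().T) / 2).min()

print(gain_from_signal, gain_from_measurement, min_eig)
```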


Example 5.3. Let's consider the problem of linearly estimating the zero-mean signal $x$ from the zero-mean measurement $y$ when the measurement is in fact the complex conjugate of the signal, namely $y = x^*$. The MMSE estimator is obviously the widely linear estimator $\hat{x} = y^*$ and its error covariance matrix is $Q = 0$.

The composite covariance matrix for $x$ and $x^*$ is the augmented covariance matrix $\underline{R}_{xx}$. From the structure of this matrix, namely

$$\underline{R}_{xx} = \begin{bmatrix} R_{xx} & \tilde{R}_{xx} \\ \tilde{R}_{xx}^* & R_{xx}^* \end{bmatrix}, \qquad \tilde{R}_{xx} = E[x x^T],$$

[…] good at all. But, for improper signals, the error concentration ellipsoid for the error $e = \hat{x} - x$ lies inside the concentration ellipsoid for $x$.
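For a scalar signal the contrast between proper and improper cases in this example can be made concrete. In the sketch below, `r_xx` denotes the variance $E[|x|^2]$ and `r_tilde` the complementary variance $E[x^2]$; with $y = x^*$ the cross-covariance is $E[x y^*] = E[x^2]$ and the measurement variance is $E[|y|^2] = E[|x|^2]$, so the strictly linear LMMSE formulas $W = R_{xy} R_{yy}^{-1}$ and $Q = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H$ reduce to the expressions in the function. The function name and the numerical values are hypothetical.

```python
def linear_estimate_from_conjugate(r_xx, r_tilde):
    """Strictly linear LMMSE estimate of scalar x from y = conj(x).

    r_xx = E[|x|^2] is the variance and r_tilde = E[x^2] is the
    complementary variance, so Rxy = r_tilde and Ryy = r_xx.
    Returns the filter weight w and the error variance q.
    """
    w = r_tilde / r_xx
    q = r_xx - abs(r_tilde) ** 2 / r_xx
    return w, q


# Proper signal: complementary variance is zero, so the filter is zero
# and the error variance equals the prior variance.
w_proper, q_proper = linear_estimate_from_conjugate(2.0, 0.0)

# Real-valued signal: maximally improper, y = x, and the error vanishes.
w_real, q_real = linear_estimate_from_conjugate(2.0, 2.0)

# Partially improper signal: the error variance lies strictly in between.
w_mid, q_mid = linear_estimate_from_conjugate(2.0, 1.0 + 1.0j)

print(q_proper, q_real, q_mid)
```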

5.3.6 Special cases

Two special cases of LMMSE estimation warrant particular attention: the pure signal-plus-noise case and the Gaussian case.

Signal plus noise

The pure signal-plus-noise problem is $y = x + n$, with $H = I$ and $x$ and $n$ uncorrelated. As a consequence, $R_{xy} = R_{xx}$, $R_{yy} = R_{xx} + R_{nn}$, and the composite covariance matrix is

$$\underline{R}_{xy} = \begin{bmatrix} R_{xx} & R_{xx} \\ R_{xx} & R_{xx} + R_{nn} \end{bmatrix}.$$

The prior signal-plus-noise covariance is the series formula $R_{yy} = R_{xx} + R_{nn}$, but the posterior error covariance is the parallel formula $Q = (R_{xx}^{-1} + R_{nn}^{-1})^{-1}$. The gain of LMMSE filtering is

$$\frac{\det R_{xx}}{\det Q} = \frac{\det(R_{xx} + R_{nn})}{\det R_{nn}} = \det(I + R_{nn}^{-1/2} R_{xx} R_{nn}^{-H/2}). \tag{5.49}$$

The matrix $R_{nn}^{-1/2} R_{xx} R_{nn}^{-H/2}$ can reasonably be called a signal-to-noise-ratio matrix.

These formulae are characteristic of LMMSE problems.
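The series and parallel formulas, and the three expressions for the gain, can be compared in a short sketch. It assumes random signal and noise covariances of equal dimension and uses the Cholesky factor of $R_{nn}$ as one possible choice of square root $R_{nn}^{1/2}$; that choice, the helper `random_cov`, and all names are assumptions of the illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3  # common dimension of signal, noise, and measurement (arbitrary)


def random_cov(k):
    """Hypothetical Hermitian positive-definite k x k covariance."""
    B = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return B @ B.conj().T + np.eye(k)


Rxx, Rnn = random_cov(n), random_cov(n)
inv = np.linalg.inv

# Parallel formula versus the Schur-complement form with H = I.
Q_parallel = inv(inv(Rxx) + inv(Rnn))
Q_schur = Rxx - Rxx @ inv(Rxx + Rnn) @ Rxx

# Signal-to-noise-ratio matrix, with Rnn = L L^H so that L acts as Rnn^{1/2}.
L = np.linalg.cholesky(Rnn)
L_inv = inv(L)
snr = L_inv @ Rxx @ L_inv.conj().T

gain_a = np.linalg.det(Rxx).real / np.linalg.det(Q_parallel).real
gain_b = np.linalg.det(Rxx + Rnn).real / np.linalg.det(Rnn).real
gain_c = np.linalg.det(np.eye(n) + snr).real

print(gain_a, gain_b, gain_c)
```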

The Gaussian case

Suppose the signal and measurement are multivariate Gaussian, meaning that the composite vector $[x^T, y^T]^T$ is multivariate normal with zero mean and composite covariance matrix $\underline{R}_{xy}$. Then the conditional pdf for $x$, given $y$, is

p(x|y) = p(x, y)
