Consider a discrete time series $\{u(t_1), u(t_2), \ldots, u(t_N)\}$ comprising N observations of a stochastic process. The joint probability distribution function (pdf) of $\{u(t_k)\}_{k=1}^{N}$ describes the probability that the random variable U takes on the values $U = u(t_k)$ jointly in a sequence of samples.
A stochastic process is said to be stationary, in the strict sense, if the joint pdf associated with the N observations taken at $t_1, t_2, \ldots, t_N$ is identical to the joint pdf associated with another set of N observations taken at times $t_1 + k, t_2 + k, \ldots, t_N + k$, for any integers N and k. We will assume in this book that we deal with second order stationary processes, which means that the mean and covariance, to be explained below, do not depend on time. We will refer to such processes simply as stationary.
We will now present some important properties of a stationary time series. The first property that is of interest in a stationary stochastic series is the mean. The mean of a stationary process is defined as
$$\mu_u = E(u) = \int_{-\infty}^{\infty} u\, p(u)\, du \tag{6.20}$$
where p stands for the probability distribution function. We will refer to the above integral as the statistical average. To calculate the mean using the above formula, a lot of information, such as the pdf p(u), is required. This, in turn, assumes that several realizations of the random signal are available. But in reality, we usually have only one realization of the random signal. In view of this, one works with an estimate of the mean, given by the following equation:
$$m_u = \frac{1}{2N+1} \sum_{n=-N}^{N} u(n) \tag{6.21}$$
Thus, the estimate of the mean is just the average. In other words, we estimate a statistical average with a time average. The estimate becomes more accurate when N is chosen large.
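The time average of Eq. 6.21 can be sketched in a few lines of Python. This is only an illustration: the function name `mean_estimate` and the sample values are assumptions made here, not part of the text.

```python
def mean_estimate(u):
    """Time-average estimate of the mean (Eq. 6.21): sum of samples / number of samples."""
    if not u:
        raise ValueError("need at least one sample")
    return sum(u) / len(u)

# Hypothetical record u(-N), ..., u(N) with N = 2, i.e. 2N + 1 = 5 observations.
samples = [1.0, 3.0, 2.0, 4.0, 5.0]
m_u = mean_estimate(samples)
print(m_u)  # 3.0
```

Only one realization of the signal is needed here, in contrast to the statistical average of Eq. 6.20, which needs the pdf.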
The next property of interest is variance, which gives a measure of variation of data from the mean. The variance of a stationary process is defined as
$$\sigma_u^2 = E\left((u(k) - \mu_u)^2\right) = \int_{-\infty}^{\infty} (u(k) - \mu_u)^2\, p(u)\, du \tag{6.22}$$
As mentioned above, when knowledge of the pdf is not available, one looks for a simpler way to calculate it. The estimate of the variance, given by
$$\hat{\sigma}_u^2 = \frac{1}{2N} \sum_{k=-N}^{N} (u(k) - m_u)^2 \tag{6.23}$$
comes to the rescue. Note that $2N$ is one less than the number of terms being summed.
We will next discuss the auto covariance function (ACF) that helps us understand the interdependence of samples of a time series. The ACF for a general stochastic time series is defined as $\gamma(k, j) = E\left((u(k) - \mu_{u(k)})(u(j) - \mu_{u(j)})\right)$. For a stationary time series, the mean is constant and the dependence is only a function of the lag $l = k - j$. In view of this, for a stationary stochastic process, we obtain
$$\gamma_{uu}(k, j) = \gamma_{uu}(l) = E\left((u(k) - \mu_u)(u(k-l) - \mu_u)\right) \tag{6.24}$$
As in the case of the mean, the estimate of the ACF is given by
$$r_{uu}(l) = \frac{1}{2N} \sum_{k=-N}^{N} (u(k) - m_u)(u(k-l) - m_u) \tag{6.25}$$
Note that we need only one realization to calculate the above sum. The ACF is used in detecting the underlying process, i.e., whether it is periodic, integrating, independent, etc. We present a detailed study of a periodic process in Sec. 6.3.3. In Sec. 6.3.4, we show that the ACF takes the largest value at lag l = 0.
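As a quick consistency check between the estimates introduced so far, setting $l = 0$ in Eq. 6.25 should recover the variance estimate of Eq. 6.23. The short derivation below uses only the symbols already defined above:

```latex
\begin{aligned}
r_{uu}(0) &= \frac{1}{2N} \sum_{k=-N}^{N} (u(k) - m_u)(u(k - 0) - m_u) \\
          &= \frac{1}{2N} \sum_{k=-N}^{N} (u(k) - m_u)^2 \\
          &= \hat{\sigma}_u^2
\end{aligned}
```

In other words, the estimated ACF at zero lag is the estimated variance of the series.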
In practice, a normalized function, known as the auto correlation function,
$$\rho_{uu}(l) = \frac{\gamma_{uu}(l)}{\gamma_{uu}(0)} \tag{6.26}$$
is used. We abbreviate the auto correlation function also as ACF; the context will make clear which one is intended.
It is easy to verify the following symmetry properties:
$$\begin{aligned}
\gamma_{uu}(l) &= \gamma_{uu}(-l) \\
r_{uu}(l) &= r_{uu}(-l) \\
\rho_{uu}(l) &= \rho_{uu}(-l)
\end{aligned} \tag{6.27}$$
Now we illustrate these ideas with a finite length sequence.
Example 6.5 Find the ACF of the sequence $\{u(n)\} = \{1, 2\}$.
Because there are only two nonzero elements, we obtain N = 2. First, we determine an estimate of the mean, given by Eq. 6.21:
$$m_u = \frac{1}{2} \sum_{k=0}^{1} u(k) = \frac{1}{2}(u(0) + u(1)) = 1.5$$
Next, using Eq. 6.25, we calculate the ACF for every lag:
$$\begin{aligned}
r_{uu}(0) &= \sum_{k=0}^{1} (u(k) - 1.5)^2 = (-0.5)^2 + 0.5^2 = 0.5 \\
r_{uu}(1) &= \sum_{k=0}^{1} (u(k) - 1.5)(u(k-1) - 1.5) = (u(1) - 1.5)(u(0) - 1.5) = 0.5 \times (-0.5) = -0.25 \\
r_{uu}(-1) &= \sum_{k=0}^{1} (u(k) - 1.5)(u(k+1) - 1.5) = (u(0) - 1.5)(u(1) - 1.5) = (-0.5) \times 0.5 = -0.25
\end{aligned}$$
Thus, we see that $r_{uu}(n) = \{-0.25, 0.5, -0.25\}$, where the starting value of $-0.25$ corresponds to $n = -1$. The Matlab command xcov carries out these calculations. We can scale this sequence by dividing by $r_{uu}(0)$ to arrive at $\rho_{uu}(n) = \{-0.5, 1, -0.5\}$.
We can also obtain this result using the Matlab command xcov with the optional parameter coeff enabled. M 6.2 carries out this calculation. It can be seen that the symmetry properties of both r and ρ are satisfied.
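For readers without access to Matlab, the hand calculation of Example 6.5 can be reproduced with a minimal Python sketch. The function names `autocovariance` and `autocorrelation` are illustrative choices, not the book's M 6.2. The sketch uses unscaled sums, as in the worked example, and assumes that samples outside the recorded range contribute nothing.

```python
def autocovariance(u, lag):
    """Unscaled sum over k of (u(k) - m_u)(u(k - lag) - m_u).

    Terms whose index k - lag falls outside the record are skipped."""
    n = len(u)
    m_u = sum(u) / n
    total = 0.0
    for k in range(n):
        j = k - lag
        if 0 <= j < n:
            total += (u[k] - m_u) * (u[j] - m_u)
    return total

def autocorrelation(u, lag):
    """Autocovariance normalized by its value at lag 0."""
    return autocovariance(u, lag) / autocovariance(u, 0)

u = [1.0, 2.0]
lags = [-1, 0, 1]
r = [autocovariance(u, l) for l in lags]
rho = [autocorrelation(u, l) for l in lags]
print(r)    # [-0.25, 0.5, -0.25]
print(rho)  # [-0.5, 1.0, -0.5]
```

Both lists are symmetric about lag 0, in agreement with Eq. 6.27.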
The calculations are easier if we work with sequences of zero mean. The zero mean equivalent of a sequence is obtained by subtracting the mean from every entry of the sequence. For example, by subtracting the mean of u discussed in the above example from every entry, we obtain $\{-0.5, 0.5\}$. The ACF of this sequence is identical to that obtained in the above example. Problem 6.5 shows that working with zero mean sequences could help prevent some mistakes.²
Now we present the concept of cross covariance function (CCF). It is a measure of dependence between samples of two time series. The CCF of two stationary time series u and y is given by
$$\gamma_{uy}(l) = E\left((u(k) - \mu_u)(y(k-l) - \mu_y)\right) \tag{6.28}$$
while its estimate is given by
$$r_{uy}(l) = \frac{1}{2N} \sum_{k=-N}^{N} (u(k) - m_u)(y(k-l) - m_y) \tag{6.29}$$
Thus, to calculate ruy(l), l > 0, we need to shift y by l time points to the right, or equivalently, introduce l zeros in front, multiply the corresponding elements and add. For l < 0, we need to shift y to the left.
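The shift, multiply and add recipe can be sketched as follows. The helper names `shift` and `cross_covariance` and the two short sequences are assumptions made for illustration. The sketch removes the means first, so that the zeros introduced by shifting are consistent with zero mean data, and it scales by one less than the number of samples, following the pattern of Eq. 6.29.

```python
def shift(seq, lag):
    """Shift seq right by lag samples (left if lag < 0), padding with zeros
    and keeping the original length, so that shifted[k] = seq[k - lag]."""
    n = len(seq)
    if lag >= 0:
        return ([0.0] * lag + list(seq))[:n]
    return (list(seq)[-lag:] + [0.0] * (-lag))[:n]

def cross_covariance(u, y, lag):
    """Estimate r_uy(lag): scaled sum over k of (u(k) - m_u)(y(k - lag) - m_y)."""
    n = len(u)
    if n != len(y) or n < 2:
        raise ValueError("u and y must have the same length, at least 2")
    m_u = sum(u) / n
    m_y = sum(y) / n
    u0 = [a - m_u for a in u]
    y0 = shift([b - m_y for b in y], lag)   # y0[k] = y(k - lag) - m_y
    return sum(a * b for a, b in zip(u0, y0)) / (n - 1)

# Hypothetical zero mean data: y is u delayed by one sample.
u = [1.0, -1.0, 0.0]
y = [0.0, 1.0, -1.0]
print([cross_covariance(u, y, l) for l in (-1, 0, 1)])  # [1.0, -0.5, 0.0]
```

For this data, the estimate is largest at the lag that lines the two sequences up again.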
When the CCF between two sequences is large, we say that they are correlated; when it is small, we say that they are less correlated; and when it is zero, we say that they are uncorrelated. When two signals u and y are completely uncorrelated, we obtain
$$\begin{aligned}
\gamma_{uy}(l) &= 0, \quad \forall l \\
r_{uy}(l) &= 0, \quad \forall l
\end{aligned} \tag{6.30}$$
We would also say that u and y are independent, although usually this word refers to a stricter condition. The CCF assumes the largest value when two time series have the strongest correlation. This has wide application in determining the time delay of a system, an important step in identification, see Sec. 6.3.4.
For negative arguments, the CCF result is somewhat different from that of the ACF, given by Eq. 6.27. It is easy to verify the following property of the CCF:
$$\begin{aligned}
\gamma_{uy}(l) &= \gamma_{yu}(-l) \\
r_{uy}(l) &= r_{yu}(-l)
\end{aligned} \tag{6.31}$$
² In view of this observation, we will assume all sequences to be of zero mean in the rest of this
The largest value occurs at the lag where the dependency is strongest. Suppose that u and y refer to the input to and the output from an I/O LTI system. If the system is causal, the current output cannot be correlated with a future input. As a result, we obtain the following relationship:
$$\begin{aligned}
\gamma_{uy}(l) &= \gamma_{yu}(-l) = 0, \quad \forall l > 0 \\
r_{uy}(l) &= r_{yu}(-l) = 0, \quad \forall l > 0
\end{aligned}$$
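The causality relationship and the use of the CCF for delay detection can be illustrated numerically. The sketch below is a toy experiment under stated assumptions, not a method from the text: the system is a pure delay, $y(k) = u(k - d)$, the input is a seeded pseudo-random sequence of $\pm 1$ values, and the record length, delay and function name are hypothetical.

```python
import random

def cross_covariance(u, y, lag):
    """Scaled sum over k of (u(k) - m_u)(y(k - lag) - m_y).

    Terms whose index k - lag falls outside the record are skipped."""
    n = len(u)
    m_u = sum(u) / n
    m_y = sum(y) / n
    total = 0.0
    for k in range(n):
        j = k - lag
        if 0 <= j < n:
            total += (u[k] - m_u) * (y[j] - m_y)
    return total / (n - 1)

rng = random.Random(0)          # fixed seed so the run is repeatable
n, delay = 500, 3               # hypothetical record length and delay
u = [rng.choice([-1.0, 1.0]) for _ in range(n)]
y = [0.0] * delay + u[:n - delay]   # causal pure delay: y(k) = u(k - delay)

lags = range(-6, 7)
r_uy = {l: cross_covariance(u, y, l) for l in lags}
peak_lag = max(lags, key=lambda l: abs(r_uy[l]))
print(peak_lag)                 # expected to equal -delay
print(max(abs(r_uy[l]) for l in lags if l > 0))  # expected to be small
```

With the lag convention of Eq. 6.29, the estimate should peak at $l = -d$, and the values at positive lags should be close to zero. They are not exactly zero, because only a finite record of a random input is used.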