
CHAPTER III. METHODOLOGICAL ASPECTS

3.1 THE APPROACH TO THE OBJECT OF STUDY

One of the basic laws of physics, the second law of thermodynamics, states that the entropy of an isolated system is nondecreasing. We now explore the relationship between the second law and the entropy function that we defined earlier in this chapter.

In statistical thermodynamics, entropy is often defined as the log of the number of microstates in the system. This corresponds exactly to our notion of entropy if all the states are equally likely. But why does entropy increase?
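The correspondence between the two definitions can be checked numerically. The sketch below assumes a hypothetical system with 8 microstates and an arbitrary nonuniform distribution chosen only for comparison; neither comes from the text.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability mass function given as a list."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Hypothetical system with 8 equally likely microstates.
num_microstates = 8
uniform = [1 / num_microstates] * num_microstates
h_uniform = entropy(uniform)
log_count = math.log2(num_microstates)

# An arbitrary nonuniform distribution over the same 8 states.
skewed = [0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]
h_skewed = entropy(skewed)

print(h_uniform, log_count, h_skewed)
```

For the uniform case the entropy equals the log of the number of microstates; for any other distribution on the same states it is strictly smaller.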

We model the isolated system as a Markov chain with transitions obeying the physical laws governing the system. Implicit in this assumption is the notion of an overall state of the system and the fact that, knowing the present state, the future of the system is independent of the past. In such a system we can find several different interpretations of the second law. It may come as a shock to find that the entropy does not always increase. However, relative entropy always decreases.

1. Relative entropy $D(\mu_n \| \mu'_n)$ decreases with $n$. Let $\mu_n$ and $\mu'_n$ be two probability distributions on the state space of a Markov chain at time $n$, and let $\mu_{n+1}$ and $\mu'_{n+1}$ be the corresponding distributions at time $n+1$. Let the corresponding joint mass functions be denoted by $p$ and $q$. Thus, $p(x_n, x_{n+1}) = p(x_n)\, r(x_{n+1}|x_n)$ and $q(x_n, x_{n+1}) = q(x_n)\, r(x_{n+1}|x_n)$, where $r(\cdot|\cdot)$ is the probability transition function for the Markov chain. Then by the chain rule for relative entropy, we have two expansions:

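\begin{align}
D(p(x_n, x_{n+1}) \| q(x_n, x_{n+1})) &= D(p(x_n) \| q(x_n)) + D(p(x_{n+1}|x_n) \| q(x_{n+1}|x_n)) \notag \\
&= D(p(x_{n+1}) \| q(x_{n+1})) + D(p(x_n|x_{n+1}) \| q(x_n|x_{n+1})). \notag
\end{align}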

Since both $p$ and $q$ are derived from the Markov chain, the conditional probability mass functions $p(x_{n+1}|x_n)$ and $q(x_{n+1}|x_n)$ are both equal to $r(x_{n+1}|x_n)$, and hence $D(p(x_{n+1}|x_n) \| q(x_{n+1}|x_n)) = 0$. Now using the nonnegativity of $D(p(x_n|x_{n+1}) \| q(x_n|x_{n+1}))$ (Corollary to Theorem 2.6.3), we have

$$D(p(x_n) \| q(x_n)) \ge D(p(x_{n+1}) \| q(x_{n+1})) \tag{4.44}$$

or

$$D(\mu_n \| \mu'_n) \ge D(\mu_{n+1} \| \mu'_{n+1}). \tag{4.45}$$

Consequently, the distance between the probability mass functions is decreasing with time $n$ for any Markov chain.

An example of one interpretation of the preceding inequality is to suppose that the tax system for the redistribution of wealth is the same in Canada and in England. Then if $\mu_n$ and $\mu'_n$ represent the distributions of wealth among people in the two countries, this inequality shows that the relative entropy distance between the two distributions decreases with time. The wealth distributions in Canada and England become more similar.
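The inequality can be illustrated numerically by evolving two different starting distributions under the same transition function and tracking their relative entropy. This is only a sketch: the 3-state transition matrix `r` and the two initial distributions are made-up values, not taken from the text.

```python
import math

def step(mu, r):
    """One Markov step: mu_{n+1}(j) = sum_i mu_n(i) * r[i][j]."""
    k = len(mu)
    return [sum(mu[i] * r[i][j] for i in range(k)) for j in range(k)]

def relative_entropy(p, q):
    """D(p||q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

# Hypothetical transition matrix, r[i][j] = r(j | i); each row sums to 1.
r = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]]

mu = [0.9, 0.05, 0.05]       # plays the role of mu_n
mu_prime = [0.1, 0.3, 0.6]   # plays the role of mu'_n

distances = []
for _ in range(10):
    distances.append(relative_entropy(mu, mu_prime))
    # Both distributions evolve under the same chain.
    mu, mu_prime = step(mu, r), step(mu_prime, r)

for n, d in enumerate(distances):
    print(n, d)
```

The printed sequence should never increase from one step to the next, in line with (4.45).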

2. Relative entropy $D(\mu_n \| \mu)$ between a distribution $\mu_n$ on the states at time $n$ and a stationary distribution $\mu$ decreases with $n$. In (4.45), $\mu'_n$ is any distribution on the states at time $n$. If we let $\mu'_n$ be any stationary distribution $\mu$, the distribution $\mu'_{n+1}$ at the next time is also equal to $\mu$. Hence,

$$D(\mu_n \| \mu) \ge D(\mu_{n+1} \| \mu), \tag{4.46}$$

which implies that any state distribution gets closer and closer to each stationary distribution as time passes. The sequence $D(\mu_n \| \mu)$ is a monotonically nonincreasing nonnegative sequence and must therefore have a limit. The limit is zero if the stationary distribution is unique, but this is more difficult to prove.
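A small sketch of (4.46), assuming a hypothetical two-state chain with flip probabilities a = 0.3 and b = 0.1 (values chosen for illustration). For such a chain the unique stationary distribution is (b/(a+b), a/(a+b)) = (0.25, 0.75), so the limit in this particular case should be zero.

```python
import math

def step(mu, P):
    """One Markov step: mu_{n+1}(j) = sum_i mu_n(i) * P[i][j]."""
    k = len(mu)
    return [sum(mu[i] * P[i][j] for i in range(k)) for j in range(k)]

def relative_entropy(p, q):
    """D(p||q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

# Hypothetical two-state chain: leave state 0 w.p. a, leave state 1 w.p. b.
a, b = 0.3, 0.1
P = [[1 - a, a],
     [b, 1 - b]]
stationary = [b / (a + b), a / (a + b)]   # (0.25, 0.75)

mu = [0.99, 0.01]   # arbitrary starting distribution
d_to_stationary = []
for _ in range(30):
    d_to_stationary.append(relative_entropy(mu, stationary))
    mu = step(mu, P)

print(d_to_stationary[0], d_to_stationary[-1])
```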

3. Entropy increases if the stationary distribution is uniform. In general, the fact that the relative entropy decreases does not imply that the entropy increases. A simple counterexample is provided by any Markov chain with a nonuniform stationary distribution. If we start this Markov chain from the uniform distribution, which already is the maximum entropy distribution, the distribution will tend to the stationary distribution, which has a lower entropy than the uniform. Here, the entropy decreases with time.
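The counterexample can be made concrete. The sketch below assumes a hypothetical two-state chain whose stationary distribution is (0.25, 0.75) and starts it from the uniform distribution; the specific transition probabilities are illustrative only.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def step(mu, P):
    """One Markov step: mu_{n+1}(j) = sum_i mu_n(i) * P[i][j]."""
    k = len(mu)
    return [sum(mu[i] * P[i][j] for i in range(k)) for j in range(k)]

# Hypothetical chain with nonuniform stationary distribution (0.25, 0.75).
P = [[0.7, 0.3],
     [0.1, 0.9]]
stationary = [0.25, 0.75]

mu = [0.5, 0.5]   # start at the maximum-entropy (uniform) distribution
entropies = []
for _ in range(20):
    entropies.append(entropy(mu))
    mu = step(mu, P)

print(entropies[0], entropies[-1], entropy(stationary))
```

The entropy starts at 1 bit and falls toward the entropy of the stationary distribution, even though the relative entropy to the stationary distribution is shrinking throughout.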

If, however, the stationary distribution is the uniform distribution, we can express the relative entropy as

$$D(\mu_n \| \mu) = \log|\mathcal{X}| - H(\mu_n) = \log|\mathcal{X}| - H(X_n). \tag{4.47}$$

In this case the monotonic decrease in relative entropy implies a monotonic increase in entropy. This is the explanation that ties in most closely with statistical thermodynamics, where all the microstates are equally likely. We now characterize processes having a uniform stationary distribution.

Definition A probability transition matrix $[P_{ij}]$, $P_{ij} = \Pr\{X_{n+1} = j \mid X_n = i\}$, is called doubly stochastic if

$$\sum_i P_{ij} = 1, \quad j = 1, 2, \ldots \tag{4.48}$$

and

$$\sum_j P_{ij} = 1, \quad i = 1, 2, \ldots. \tag{4.49}$$

Remark The uniform distribution is a stationary distribution of $P$ if and only if the probability transition matrix is doubly stochastic (see Problem 4.1).
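The definition, the remark, and identity (4.47) can be checked together on a small example. The sketch below assumes a hypothetical 3-state circulant matrix, which is doubly stochastic by construction; the helper `is_doubly_stochastic` and the starting distribution are illustrative names and values, not part of the text.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def relative_entropy(p, q):
    """D(p||q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def step(mu, P):
    """One Markov step: mu_{n+1}(j) = sum_i mu_n(i) * P[i][j]."""
    k = len(mu)
    return [sum(mu[i] * P[i][j] for i in range(k)) for j in range(k)]

def is_doubly_stochastic(P, tol=1e-12):
    """Check conditions (4.48) and (4.49): every column and every row sums to 1."""
    k = len(P)
    rows_ok = all(abs(sum(P[i]) - 1) <= tol for i in range(k))
    cols_ok = all(abs(sum(P[i][j] for i in range(k)) - 1) <= tol for j in range(k))
    return rows_ok and cols_ok

# Hypothetical circulant transition matrix (rows and columns both sum to 1).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]

uniform = [1 / 3] * 3
uniform_next = step(uniform, P)   # should again be uniform

mu0 = [0.9, 0.05, 0.05]
d0 = relative_entropy(mu0, uniform)
gap0 = math.log2(3) - entropy(mu0)   # right-hand side of (4.47)

mu = mu0
entropies = []
for _ in range(15):
    entropies.append(entropy(mu))
    mu = step(mu, P)

print(is_doubly_stochastic(P), d0, gap0, entropies[0], entropies[-1])
```

Here the entropy sequence should be nondecreasing and approach log 3 bits, the entropy of the uniform distribution on three states.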

4. The conditional entropy $H(X_n|X_1)$ increases with $n$ for a stationary Markov process. If the Markov process is stationary, $H(X_n)$ is constant. So the entropy is nonincreasing. However, we will prove that $H(X_n|X_1)$ increases with $n$. Thus, the conditional uncertainty of the future increases. We give two alternative proofs of this result. First, we use the properties of entropy,

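\begin{align}
H(X_n|X_1) &\ge H(X_n|X_1, X_2) && \text{(conditioning reduces entropy)} \tag{4.50} \\
&= H(X_n|X_2) && \text{(by Markovity)} \tag{4.51} \\
&= H(X_{n-1}|X_1) && \text{(by stationarity).} \tag{4.52}
\end{align}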

Alternatively, by an application of the data-processing inequality to the Markov chain $X_1 \to X_{n-1} \to X_n$, we have

$$I(X_1; X_{n-1}) \ge I(X_1; X_n). \tag{4.53}$$

Expanding the mutual informations in terms of entropies, we have

$$H(X_{n-1}) - H(X_{n-1}|X_1) \ge H(X_n) - H(X_n|X_1). \tag{4.54}$$

By stationarity, $H(X_{n-1}) = H(X_n)$, and hence we have

$$H(X_{n-1}|X_1) \le H(X_n|X_1). \tag{4.55}$$

[These techniques can also be used to show that $H(X_0|X_n)$ is increasing in $n$ for any Markov chain.]
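Inequality (4.55) can be checked directly on a small stationary chain. The sketch assumes a hypothetical two-state chain started in its stationary distribution (0.25, 0.75), and uses the fact that, given $X_1 = i$, the distribution of $X_n$ is row $i$ of the $(n-1)$-th power of the transition matrix.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    k = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]

# Hypothetical stationary two-state chain.
P = [[0.7, 0.3],
     [0.1, 0.9]]
stationary = [0.25, 0.75]   # distribution of X_1 (and of every X_n)

power = [[1.0, 0.0],
         [0.0, 1.0]]        # P^(n-1), starting with n = 1
cond_entropies = []         # cond_entropies[n-1] = H(X_n | X_1)
for n in range(1, 11):
    h = sum(stationary[i] * entropy(power[i]) for i in range(2))
    cond_entropies.append(h)
    power = mat_mul(power, P)

for n, h in enumerate(cond_entropies, start=1):
    print(n, h)
```

The sequence starts at $H(X_1|X_1) = 0$, never decreases, and stays below the constant marginal entropy $H(X_n)$.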

5. Shuffles increase entropy. If $T$ is a shuffle (permutation) of a deck of cards and $X$ is the initial (random) position of the cards in the deck, and if the choice of the shuffle $T$ is independent of $X$, then

$$H(TX) \ge H(X), \tag{4.56}$$

where $TX$ is the permutation of the deck induced by the shuffle $T$ on the initial permutation $X$. Problem 4.3 outlines a proof.
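As a toy illustration of (4.56), the sketch below uses a hypothetical 3-card deck. The distributions chosen for the initial arrangement $X$ and the shuffle $T$ are arbitrary, and the convention that $TX$ is the composition `t[x[i]]` is an assumption made for this example.

```python
import math
from collections import defaultdict
from itertools import permutations

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def compose(t, x):
    """Apply shuffle t to arrangement x: position i of the result is t[x[i]]."""
    return tuple(t[i] for i in x)

# All 6 arrangements of a hypothetical 3-card deck, with a nonuniform law for X.
perms = list(permutations(range(3)))
x_dist = dict(zip(perms, [0.4, 0.25, 0.15, 0.1, 0.06, 0.04]))

# A random shuffle T, chosen independently of X.
t_dist = {(0, 1, 2): 0.5, (1, 0, 2): 0.3, (1, 2, 0): 0.2}

# Distribution of TX under independence: P(t, x) = P(t) * P(x).
tx_dist = defaultdict(float)
for t, pt in t_dist.items():
    for x, px in x_dist.items():
        tx_dist[compose(t, x)] += pt * px

h_x = entropy(x_dist)
h_tx = entropy(tx_dist)
print(h_x, h_tx)
```

Because each fixed shuffle maps arrangements one-to-one, mixing over several shuffles can only spread the distribution out, so `h_tx` should be at least `h_x`.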

In document Vejez y mercado de trabajo en la ciudad de Monterrey: un análisis a partir de las trayectorias laborales de los adultos mayores (pages 108-110)