3.1 – Innocence and White Ignorance as Tools of Analysis

According to the description of dfwld arithmetic encoding in Section 6.1, for a source text to be encoded and its code word subsequently decoded, the encoder has to read through the entire source text and compute the dfwld r associated with it, and the decoder has to wait until encoding is completed before beginning to decode.

Compare this train of events with the corresponding operation in the case of replacement encoding, in which each occurrence of a source letter is replaced by a binary word. In all forms of such encoding (including the adaptive varieties, which will be described in Chapter 8), encoding can begin as soon as scanning of the source text begins, and decoding can begin as soon as the beginnings of the code text are supplied to the decoder. Clearly there are situations in which it is highly desirable, or even indispensable, that encoding not wait upon the reading of the entire source text, nor decoding upon the delivery of the entire code text.

Can this apparent disadvantage of dfwld arithmetic coding be overcome?

Well, yes; for instance, one could limit the delays by resorting to encoding of blocks of source letters of a pre-set length, as discussed above, at the cost of providing a parallel counter or pointer stream, or an extra source letter, EOB, which would be discarded by the decoder.

Is there any way to take advantage of the partial code words, prefixes of the final code word, supplied to the decoder by the encoder through rescaling? Yes: as noted at the end of Section 6.1.1, the decoder can deduce what the source text is right up to and including the last letter processed by the encoder if the decoder is supplied with the code text shifted out by rescaling together with the number of source letters processed so far. In Exercise 6.1.3 you were asked to struggle with the deduction process. Let us look now at what the decoder has to do.

Suppose that the code text supplied by rescaling after the scanning of the prefix W of the source text, say with N letters, is a binary word u. Recall from the introductory discussion of rescaling that u consists of the part of the initial segments of the binary expansions of the endpoints of the interval A(W) where those binary expansions agree. [In case $A(W) = [1-\ell, 1)$, this statement holds true if you take $1 = (.11\cdots)_2$.] That is, the binary expansion of the lower endpoint of A(W) looks like $(.u0\cdots)_2$, and the binary expansion of the upper endpoint is $(.u1\cdots)_2$. Therefore, provided no interval endpoints are dyadic fractions, $(.u1)_2$ is the dfwld in A(W)! That is, given u and N, the decoder needs only to tack 1 on the end of u and decode normally.

Thus, in Exercise 6.1.3, in which the relative source frequencies make it unlikely that any interval endpoint other than 0 or 1 is a dyadic fraction, you take 010, tack on a 1 to get 0101, and decode normally by any of the methods of Section 6.1.1. [Note that $r = (.0101)_2 = 5/16$ and $N = 7$.] You should get acdcaca.
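The "tack on a 1 and decode normally" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frequencies of Exercise 6.1.3 are not restated in this section, so the sketch assumes the ones quoted for Exercise 6.1.1 (fa = .35, fb = .3, fc = .25, fd = .1), which do reproduce the answer given above. The function names are hypothetical, and exact fractions are used to avoid rounding trouble.

```python
from fractions import Fraction

# Assumed relative source frequencies (those quoted for Exercise 6.1.1).
FREQS = {"a": Fraction(35, 100), "b": Fraction(30, 100),
         "c": Fraction(25, 100), "d": Fraction(10, 100)}


def binary_fraction(bits):
    """Exact value of the binary fraction (.bits)_2."""
    return Fraction(int(bits, 2), 2 ** len(bits)) if bits else Fraction(0)


def decode(bits, n, freqs=FREQS):
    """Decode n source letters from the number (.bits)_2."""
    x = binary_fraction(bits)
    out = []
    for _ in range(n):
        low = Fraction(0)
        for letter, f in freqs.items():
            if low <= x < low + f:
                out.append(letter)
                # Map the letter's subinterval back onto [0, 1).
                x = (x - low) / f
                break
            low += f
        else:
            raise ValueError("number lies outside [0, 1)")
    return "".join(out)


def eager_decode(u, n, freqs=FREQS):
    """Decode from the partial code u and the letter count n by appending a 1."""
    return decode(u + "1", n, freqs)


print(eager_decode("010", 7))  # acdcaca
```

The sketch does not handle the dyadic-endpoint case discussed below; it simply assumes that $(.u1)_2$ lies in A(W).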

You can check that this is correct by encoding acdcaca, with rescaling, to see if 010 is the partial code word provided by rescaling.
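A minimal sketch of that check, again assuming the frequencies quoted for Exercise 6.1.1 and using a hypothetical helper name, `rescaled_prefix`. After each source letter it shifts out a bit whenever the current interval lies entirely in the lower or the upper half of [0, 1), which is one straightforward reading of rescaling as described here, not necessarily the exact procedure of Section 6.1.

```python
from fractions import Fraction

# Assumed relative source frequencies (those quoted for Exercise 6.1.1).
FREQS = {"a": Fraction(35, 100), "b": Fraction(30, 100),
         "c": Fraction(25, 100), "d": Fraction(10, 100)}


def rescaled_prefix(word, freqs=FREQS):
    """Bits shifted out by rescaling while encoding `word`."""
    letters = list(freqs)
    half = Fraction(1, 2)
    low, length = Fraction(0), Fraction(1)
    bits = []
    for letter in word:
        # Narrow the current interval to the subinterval of `letter`.
        k = letters.index(letter)
        offset = sum((freqs[s] for s in letters[:k]), Fraction(0))
        low += length * offset
        length *= freqs[letter]
        # Rescale while both endpoints agree on their next binary digit.
        while True:
            if low + length <= half:      # interval inside [0, 1/2)
                bits.append("0")
                low, length = 2 * low, 2 * length
            elif low >= half:             # interval inside [1/2, 1)
                bits.append("1")
                low, length = 2 * low - 1, 2 * length
            else:
                break
    return "".join(bits)


print(rescaled_prefix("acdcaca"))  # 010
```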

The problem of dyadic fraction interval endpoints is a nuisance, but can be overcome. As mentioned in the footnote on page 145, this would not be a problem if we made our intervals open on the left, closed on the right, and that is one way out. Even with intervals closed on the left, note that if $(.u1)_2$ is not the dfwld in A(W), then either $(.u1)_2$ is in A(W), in which case normal decoding of u1 gives the source word W, or $(.u1)_2$ is the upper endpoint of A(W). This second possibility can be checked for by the decoder, and adjustments can be made.

Although we will not dwell upon it here, this process of decoding from knowing N and the partial code supplied by rescaling can be adapted so that it proceeds right “on the heels” of encoding, with the decoder’s rescaling sweeping away old code and keeping the eager decoder one source letter behind the encoder.

The great impediment to our happiness with this method of eager decoding is the necessity of supplying the source letter count N corresponding to the partial code supplied by rescaling. If N is to be conveyed by some pointer/counter stream parallel to the regular code stream, compression is seriously reduced.

Perhaps there are situations in which code can be delivered to the decoder in conformity with a certain rhythm, so that the decoder gets N by some sort of timing device; barring some such trick, this sort of decoding on the heels of encoding appears infeasible.

Another, more promising, path to allowing decoding to follow soon upon the start of encoding arises from the observation that in the decoding of Section 6.1, we need only decide in which of several large intervals $(r-\alpha)/\ell$ lies. We usually do not need to know $(r-\alpha)/\ell$ exactly, which means that we usually do not need to know r exactly.

What would happen if we replaced r by the approximation $\tilde r$ obtained by truncating the binary expansion of r somewhere – i.e., if we tried to proceed using just the (current) first few bits of the (current) code stream? The approximation $\tilde r$ will be a little less than r, so $(\tilde r-\alpha)/\ell$ will be a little less than $(r-\alpha)/\ell$. If the latter is exactly equal to the lower endpoint of the interval $A(s_k) = \bigl[\sum_{j<k} f_j,\ \sum_{j\le k} f_j\bigr)$ in which it lies (so that we ought to decode $s_k$) then we are doomed: $(\tilde r-\alpha)/\ell$ will be in the next interval down, we will wrongly decode $s_{k-1}$, the next “current interval” will be wrong, and we will be in a world of trouble. The same catastrophe will occur if $(r-\alpha)/\ell$ is not equal to the lower endpoint of $A(s_k)$, but is very close to it, and $\tilde r$ is not close enough to r to put $(\tilde r-\alpha)/\ell$ in $A(s_k)$.
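A small numerical illustration of this failure, under assumptions that are not part of the text: the frequencies are those quoted in Exercise 6.3.1, only the first letter is decoded (so α = 0 and ℓ = 1), and the number r = 45/128 is a hypothetical value chosen to sit just above the lower endpoint .35 of A(b).

```python
from fractions import Fraction

# Assumed frequencies (those quoted in Exercise 6.3.1); illustrative only.
FREQS = {"a": Fraction(35, 100), "b": Fraction(30, 100),
         "c": Fraction(25, 100), "d": Fraction(10, 100)}


def binary_fraction(bits):
    """Exact value of the binary fraction (.bits)_2."""
    return Fraction(int(bits, 2), 2 ** len(bits)) if bits else Fraction(0)


def first_letter(x, freqs=FREQS):
    """Letter whose interval A(s_k) contains x, for x in [0, 1)."""
    low = Fraction(0)
    for letter, f in freqs.items():
        if low <= x < low + f:
            return letter
        low += f
    raise ValueError("x lies outside [0, 1)")


# Hypothetical r = (.0101101)_2 = 45/128, just above the endpoint .35.
r_bits = "0101101"
for n in range(1, len(r_bits) + 1):
    truncated = binary_fraction(r_bits[:n])
    print(n, truncated, first_letter(truncated))
```

Every proper truncation of r falls below .35 and so decodes as a; only the full seven bits give the correct letter b.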

These catastrophes could be avoided at some cost in compression if we roughened the arithmetic coding process by putting some space – an “error zone” – between the intervals into which the current interval is subdivided. That is, the initial intervals $A(s_1), \ldots, A(s_m)$ would not cover [0,1), and subsequent subdivisions would be similar to the first. Then we can proceed to decode fearlessly, replacing r by $\tilde r$, provided we have figured out how far to take the binary expansion of r, to obtain $\tilde r$, so as to ensure that whenever $(r-\alpha)/\ell$ is in $A(s_k)$, $1 < k \le m$, then $(\tilde r-\alpha)/\ell$ will be greater than the upper endpoint of $A(s_{k-1})$.
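As a rough sketch of "how far to take the binary expansion", suppose $\tilde r$ keeps the first n bits of r and every error zone has width at least g > 0, measured in the unit interval after scaling by $1/\ell$. Both n and g are symbols introduced here for illustration; they are not notation from the text.

```latex
% Truncation to n bits loses less than 2^{-n}:
\[
  0 \le r - \tilde r < 2^{-n}
  \quad\Longrightarrow\quad
  0 \le \frac{r-\alpha}{\ell} - \frac{\tilde r-\alpha}{\ell}
      = \frac{r-\tilde r}{\ell} < \frac{2^{-n}}{\ell}.
\]
% If (r-\alpha)/\ell lies in A(s_k), it is at least g above the upper
% endpoint of A(s_{k-1}), so the truncated value stays above that endpoint when
\[
  \frac{2^{-n}}{\ell} \le g
  \quad\Longleftrightarrow\quad
  n \ge \log_2 \frac{1}{g\,\ell}.
\]
```

Under these assumptions the number of bits needed grows as the current interval length $\ell$ shrinks, unless rescaling keeps $\ell$ from becoming small.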

This roughening is somehow, happily, built into the algorithm of Section 6.4, but not explicitly. The algorithm is a discrete simulation of pure dfwld arithmetic coding which corrects all the defects of the pure process that we have discussed here, at a controllable cost.

Exercises 6.3

1. Suppose that S = {a,b,c,d}, and $f_a = .35$, $f_b = .3$, $f_c = .25$, and $f_d = .1$, as in Exercise 6.1.1. Find the dyadic fractions $\tilde f_a$, $\tilde f_b$, $\tilde f_c$, and $\tilde f_d$ with common denominator 16, adding up to 1, such that $(\tilde f_a, \tilde f_b, \tilde f_c, \tilde f_d)$ is as close as possible to $(f_a, f_b, f_c, f_d)$. (Take “as close as possible” to mean that $\sum_{s\in S} |\tilde f_s - f_s|$ is minimized.)

Redo Exercise 6.1.1 using the $\tilde f_s$ as the relative source frequencies. Does rescaling do much to curb the growth of the denominators of the interval lengths?

2. S = {a,b,c,d}, $f_a = .4$, $f_b = .3$, $f_c = .2$, and $f_d = .1$. The encoder, rescaling whenever possible, passes to the decoder the following information, one line at a time (λ stands for the empty string):

Number of source letters processed    New bits added to the code stream by rescaling
1                                     λ
2                                     10
3                                     λ
4                                     1010
5                                     110
6                                     λ
7                                     λ

Decode on the run, on the heels of the encoding, as best you can. (Note that the code string, with N = 7, stands at 101010110, so you can always check your work by decoding 1010101101, with N = 7.)
