Figure 2.7: Inverse lifting operations.
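Lifting scheme for 5/3 filters
$$a^{(0)}(l) = x(2l), \quad d^{(0)}(l) = x(2l+1),$$
$$d(l) = d^{(0)}(l) - \left\lfloor \frac{a^{(0)}(l) + a^{(0)}(l+1)}{2} \right\rfloor, \quad \text{and} \quad a(l) = a^{(0)}(l) + \left\lfloor \frac{d(l) + d(l-1)}{4} \right\rfloor. \tag{2.154}$$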
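The lifting steps of Eq. (2.154) can be sketched in a few lines of integer arithmetic. This is a minimal sketch, not a reference implementation: the function names `lift53_forward` and `lift53_inverse` are hypothetical, and the mirrored treatment of the two boundary samples is an assumption, since the equation does not say how a(0)(l + 1) and d(l − 1) are defined at the ends of the sequence.

```python
def lift53_forward(x):
    """Forward integer 5/3 lifting of an even-length sequence, per Eq. (2.154)."""
    if len(x) == 0 or len(x) % 2:
        raise ValueError("expected a non-empty, even-length sequence")
    a0, d0 = x[0::2], x[1::2]  # a0(l) = x(2l), d0(l) = x(2l + 1)
    last = len(a0) - 1
    # Predict: d(l) = d0(l) - floor((a0(l) + a0(l + 1)) / 2).
    # Assumption: a0(l + 1) beyond the last sample is mirrored to a0(last).
    d = [d0[l] - ((a0[l] + a0[min(l + 1, last)]) >> 1) for l in range(last + 1)]
    # Update: a(l) = a0(l) + floor((d(l) + d(l - 1)) / 4).
    # Assumption: d(-1) is mirrored to d(0).
    a = [a0[l] + ((d[l] + d[max(l - 1, 0)]) >> 2) for l in range(last + 1)]
    return a, d


def lift53_inverse(a, d):
    """Undo the update step, then the predict step, then interleave the samples."""
    last = len(a) - 1
    a0 = [a[l] - ((d[l] + d[max(l - 1, 0)]) >> 2) for l in range(last + 1)]
    d0 = [d[l] + ((a0[l] + a0[min(l + 1, last)]) >> 1) for l in range(last + 1)]
    x = [0] * (2 * len(a))
    x[0::2], x[1::2] = a0, d0
    return x


signal = [3, 7, 1, 8, 2, 9, 4, 4]
approx, detail = lift53_forward(signal)
restored = lift53_inverse(approx, detail)
```

Because each inverse step subtracts exactly the integer that the forward step added, the reconstruction is exact regardless of the rounding inside the floor operations. The right shifts rely on Python's arithmetic shift, which floors negative values as the equation requires.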
2.3 Transforms in 2-D Space
As images are functions in a 2-D space, 2-D transforms are more relevant to our discussion. In fact, all the above transforms in 1-D could be extended to 2-D, if the basis functions are separable in each dimension [52]. This implies that a function b(x, y) can be expressed as a product of two 1-D functions of variables along each dimension (e.g., b(x, y) = φ(x)ψ(y)).
This is also true for discrete transforms. A finite 2-D discrete function f(m, n), 0 ≤ m < M, 0 ≤ n < N, can also be represented by a matrix of dimension M × N. Let Φ = {φi(m) | 0 ≤ i, m < M} and Ψ = {ψj(n) | 0 ≤ j, n < N} be two sets of orthonormal discrete functions in 1-D. It can be shown that Ω = {ωij(m, n) = φi(m)ψj(n) | 0 ≤ i, m < M, 0 ≤ j, n < N} forms an orthonormal basis of the 2-D discrete functions of dimension M × N. Hence, the 2-D transform coefficients for expanding f(m, n) with these basis functions could be computed as follows:
Similar to the 1-D transform (see Eq. (2.44)), the foregoing equation can also be expressed in the form of a matrix transformation. Consider the matrix representation of the function f(m, n) as f of dimension M × N. The transform matrices of the basis functions Φ and Ψ are also represented in the following form.
Φ =
Then the 2-D transform of f is given by Eq. (2.158).
$$F = \Phi f \Psi^{T}. \tag{2.158}$$
Given the property of orthonormal matrices, the inverse transform can be expressed as
$$f = \Phi^{H} F \left(\Psi^{H}\right)^{T}. \tag{2.159}$$
For computing the 2-D transform, we apply 1-D transforms to all rows and then to all columns, as expressed in Eq. (2.160).
$$\begin{aligned} F &= \Phi f \Psi^{T}, \\ &= \Phi \left(\Psi f^{T}\right)^{T}. \end{aligned} \tag{2.160}$$
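The row and column ordering of Eq. (2.160) can be illustrated with plain nested lists. This is a minimal sketch under stated assumptions: the helper names (`matmul`, `transpose`, `transform_2d`, `inverse_2d`) are hypothetical, the 2-point Haar matrix is chosen only as a convenient real orthonormal basis, and the inverse uses the fact that the conjugate transpose reduces to the ordinary transpose for real matrices.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def transform_2d(f, phi, psi):
    """Eq. (2.160): F = Phi (Psi f^T)^T, i.e. 1-D transforms along each dimension in turn."""
    return matmul(phi, transpose(matmul(psi, transpose(f))))


def inverse_2d(F, phi, psi):
    """Eq. (2.159) for real orthonormal matrices: f = Phi^T F Psi."""
    return matmul(matmul(transpose(phi), F), psi)


s = 1.0 / math.sqrt(2.0)
haar = [[s, s], [s, -s]]        # assumed example basis, orthonormal and real
f = [[1.0, 2.0], [3.0, 4.0]]
F = transform_2d(f, haar, haar)
f_back = inverse_2d(F, haar, haar)
```

For this input the coefficients come out as F = [[5, −1], [−2, 0]] up to floating-point rounding, and `f_back` reproduces `f`.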
Definitions of all the transforms discussed in 1-D are trivially extended to 2-D following the framework as presented above. Most of their properties in 1-D also could be extended to 2-D. However, as DCT is used in compression standards, we review some of its properties in 2-D in the following subsection.
2.3.1 2-D Discrete Cosine Transform
In 2-D also, there are eight different types of DCT. Of these, we concentrate only on the type-I and type-II even 2-D DCTs. For an input sequence x(m, n), m = 0, 1, 2, . . . , M; n = 0, 1, 2, . . . , N, these are defined as follows:
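$$X_{I}(k,l) = \frac{2}{N}\,\alpha^{2}(k)\,\alpha^{2}(l) \sum_{m=0}^{M} \sum_{n=0}^{N} x(m,n) \cos\left(\frac{m\pi k}{M}\right) \cos\left(\frac{n\pi l}{N}\right), \quad 0 \le k \le M,\ 0 \le l \le N. \tag{2.161}$$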
$$X_{II}(k,l) = \frac{2}{N}\,\alpha(k)\,\alpha(l) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x(m,n) \cos\left(\frac{(2m+1)\pi k}{2M}\right) \cos\left(\frac{(2n+1)\pi l}{2N}\right), \quad 0 \le k \le M-1,\ 0 \le l \le N-1. \tag{2.162}$$
The type-I 2-D DCT is defined over (M +1)×(N +1) samples, whereas the type-II 2-D DCT is defined over M × N samples. These can also be derived from the 2-D GDFT of symmetrically extended sequences, as in the 1-D case.
We denote the type-I and the type-II 2-D DCTs of x(m, n) by C1e{x(m, n)}
and C2e{x(m, n)}, respectively.
2.3.1.1 Matrix Representation
A 2-D input sequence {x(m, n), 0 ≤ m ≤ M − 1, 0 ≤ n ≤ N − 1} is represented by an M × N matrix x. Its DCT is expressed in the following form:
$$X = DCT(x) = C_{M}\, x\, C_{N}^{T}. \tag{2.163}$$
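The matrix form of Eq. (2.163) can be checked against the double sum of Eq. (2.162) on a small block. The following is a sketch under stated assumptions: α(k) is taken to be 1/√2 for k = 0 and 1 otherwise (its definition lies outside this section), the block is square (M = N) so that the 2/N factor of Eq. (2.162) matches the product of the two matrix normalizations, and all function names are hypothetical.

```python
import math


def alpha(k):
    # Assumed normalization: 1/sqrt(2) for the DC term, 1 otherwise.
    return math.sqrt(0.5) if k == 0 else 1.0


def dct_matrix(n):
    """N-point type-II DCT matrix with entries sqrt(2/N) alpha(k) cos((2j + 1) pi k / 2N)."""
    return [[math.sqrt(2.0 / n) * alpha(k) * math.cos((2 * j + 1) * math.pi * k / (2 * n))
             for j in range(n)] for k in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2_matrix_form(x):
    """Eq. (2.163): X = C_M x C_N^T."""
    return matmul(matmul(dct_matrix(len(x)), x), transpose(dct_matrix(len(x[0]))))


def dct2_direct(x):
    """Double sum of Eq. (2.162) for a square N x N block."""
    n = len(x)
    return [[(2.0 / n) * alpha(k) * alpha(l) * sum(
                x[m][q]
                * math.cos((2 * m + 1) * math.pi * k / (2 * n))
                * math.cos((2 * q + 1) * math.pi * l / (2 * n))
                for m in range(n) for q in range(n))
             for l in range(n)] for k in range(n)]


block = [[float(4 * m + q) for q in range(4)] for m in range(4)]
X_matrix = dct2_matrix_form(block)
X_direct = dct2_direct(block)
```

The two results agree to within floating-point rounding. The matrix form needs two matrix products, whereas the direct sum evaluates N² terms for each of the N² coefficients.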
2.3.1.2 Subband Approximation of the Type-II DCT
The approximate DCT computation scheme discussed earlier, which exploits the subband relationship of type-II DCT coefficients, can also be directly extended to 2-D. Let the low–low subband xLL(m, n) of the image x(m, n) be computed as
$$x_{LL}(m,n) = \frac{1}{4}\left\{x(2m,2n) + x(2m+1,2n) + x(2m,2n+1) + x(2m+1,2n+1)\right\}, \quad 0 \le m, n \le \frac{N}{2} - 1. \tag{2.164}$$
Let XLL(k, l), 0 ≤ k, l ≤ N/2 − 1, be the 2-D DCT of xLL(m, n). Then the subband approximation of the DCT of x(m, n) is given by [72]
$$X(k,l) = \begin{cases} 2\cos\left(\dfrac{\pi k}{2N}\right)\cos\left(\dfrac{\pi l}{2N}\right) X_{LL}(k,l), & k, l = 0, 1, \ldots, \dfrac{N}{2} - 1, \\ 0, & \text{otherwise.} \end{cases} \tag{2.165}$$
Similarly, the low-pass truncated approximation of the DCT is given by
$$X(k,l) = \begin{cases} 2X_{LL}(k,l), & k, l = 0, 1, \ldots, \dfrac{N}{2} - 1, \\ 0, & \text{otherwise.} \end{cases} \tag{2.166}$$
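Equations (2.164) to (2.166) can be tried out on a small image as follows. This is an illustrative sketch, not the implementation of [72]: the function names are hypothetical, the image is assumed square with an even side N, and α(k) is assumed to be 1/√2 for k = 0 and 1 otherwise, as in the type-II DCT of Eq. (2.162).

```python
import math


def dct2(x):
    """Type-II 2-D DCT of a square block, following Eq. (2.162)."""
    n = len(x)

    def alpha(k):
        return math.sqrt(0.5) if k == 0 else 1.0  # assumed normalization

    return [[(2.0 / n) * alpha(k) * alpha(l) * sum(
                x[m][q]
                * math.cos((2 * m + 1) * math.pi * k / (2 * n))
                * math.cos((2 * q + 1) * math.pi * l / (2 * n))
                for m in range(n) for q in range(n))
             for l in range(n)] for k in range(n)]


def low_low_subband(x):
    """Eq. (2.164): average every non-overlapping 2 x 2 neighbourhood."""
    half = len(x) // 2
    return [[(x[2 * m][2 * n] + x[2 * m + 1][2 * n]
              + x[2 * m][2 * n + 1] + x[2 * m + 1][2 * n + 1]) / 4.0
             for n in range(half)] for m in range(half)]


def approximate_dct(x, truncated=False):
    """Eq. (2.165), or Eq. (2.166) when truncated=True."""
    n = len(x)
    half = n // 2
    x_ll = dct2(low_low_subband(x))
    out = [[0.0] * n for _ in range(n)]  # coefficients outside the low band stay zero
    for k in range(half):
        for l in range(half):
            weight = 1.0 if truncated else (
                math.cos(math.pi * k / (2 * n)) * math.cos(math.pi * l / (2 * n)))
            out[k][l] = 2.0 * weight * x_ll[k][l]
    return out


image = [[float(4 * m + q) for q in range(4)] for m in range(4)]
exact = dct2(image)
subband = approximate_dct(image)
lowpass = approximate_dct(image, truncated=True)
```

Under the assumed normalization both approximations reproduce the DC coefficient exactly, because the DC term depends only on the sum of the samples and the 2 × 2 averaging preserves that sum up to the factor absorbed by the leading 2. The remaining low-band coefficients are approximate.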
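2.3.1.3 Composition and Decomposition of the DCT Blocks in 2-D
Following the same analysis as presented for 1-D [67], we express the block composition and decomposition operations in 2-D. Let A(L,N) be the block composition matrix, which combines L N-point DCT blocks into a single LN-point DCT block. Consider L × M DCT blocks of size N × N each in 2-D. Their composition into a single LN × MN block is expressed as
$$X^{(LN \times MN)} = A_{(L,N)} \begin{bmatrix} X_{0,0}^{(N \times N)} & \cdots & X_{0,M-1}^{(N \times N)} \\ \vdots & \ddots & \vdots \\ X_{L-1,0}^{(N \times N)} & \cdots & X_{L-1,M-1}^{(N \times N)} \end{bmatrix} A_{(M,N)}^{T}. \tag{2.167}$$
Similarly, for decomposing a DCT block X(LN×MN) into L × M DCT blocks of size N × N each, the following expression is used:
$$\begin{bmatrix} X_{0,0}^{(N \times N)} & \cdots & X_{0,M-1}^{(N \times N)} \\ \vdots & \ddots & \vdots \\ X_{L-1,0}^{(N \times N)} & \cdots & X_{L-1,M-1}^{(N \times N)} \end{bmatrix} = A_{(L,N)}^{-1}\, X^{(LN \times MN)} \left(A_{(M,N)}^{-1}\right)^{T}. \tag{2.168}$$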
2.3.1.4 Symmetric Convolution and Convolution–Multiplication Properties for 2-D DCT
As in 1-D, similar convolution–multiplication properties also hold here. In particular, operations involving both type-I and type-II even DCTs are stated below [95], where ⊛ denotes the symmetric convolution operation.
$$C_{2e}\{x(m,n) \circledast h(m,n)\} = C_{2e}\{x(m,n)\}\, C_{1e}\{h(m,n)\}, \tag{2.169}$$
$$C_{1e}\{x(m,n) \circledast h(m,n)\} = C_{2e}\{x(m,n)\}\, C_{2e}\{h(m,n)\}. \tag{2.170}$$
Equations (2.169) and (2.170) involve M × N multiplications for performing the convolution operation in the transform domain.
2.3.1.5 Fast DCT Algorithms
As observed earlier, it is not sufficient to propose or design an equivalent algorithm in the compressed domain. We should also consider the merits and demerits of the scheme with respect to alternative spatial domain processing.
A faster DCT computation makes spatial domain processing more attractive, because it reduces the overhead of the forward and inverse transforms.
In this section, let us review a few of the efficient approaches for computing the DCT coefficients. Various algorithms are reported for computing the DCT of a sequence efficiently [3, 24, 25, 26, 37, 38, 41, 74, 88, 146, 157]. A few of them demand special architectures for processing the data. Duhamel and Mida [38] provided a theoretical lower bound on the multiplicative complexity of the 1-D N-point DCT, with N = 2^n, as
$$\mu(C_{N}) = 2^{n+1} - n - 2, \tag{2.171}$$
where µ denotes the minimum number of multiplications required in the computation. Loeffler et al. [88] proposed a fast 1-D 8-point DCT algorithm with 11 multiplications and 29 additions using graph transformations and equivalence relations. This algorithm achieves the theoretical lower bound on the required number of multiplications (that is, 11 for the 1-D 8-point DCT).
In 2-D, algorithms using polynomial transforms [37, 146] are shown to be very efficient. These algorithms require a comparable number of additions and the smallest number of multiplications of all known algorithms, leading to an almost 50% reduction in the multiplications required by the conventional fast separable DCT computation. However, polynomial algorithms are more complex to implement. Cho and Lee [26] introduced a direct 2-D 4 × 4-point DCT algorithm achieving similar computational complexity, while maintaining the simple and regular computational structures typical of the direct row–column and the 2-D vector radix DCT algorithms [158]. This algorithm computes the 2-D DCT from four 1-D 4-point DCTs [25]. For determining the number of multiplications and additions, they considered applying the fast 1-D DCT algorithm by either Lee [82] or Hou [59]. Thus, their algorithm requires 96 multiplications and 466 additions for the 2-D 8 × 8-point DCT.
The theoretical bound on the multiplicative complexity in 2-D was determined by Feig and Winograd [41]. For a 2-D 2^n × 2^n-point DCT, it is given as
$$\mu(C_{N} \otimes C_{N}) = 2^{n}\left(2^{n+1} - n - 2\right), \tag{2.172}$$
where the length of the DCT is N = 2^n, ⊗ is the Kronecker product, and C_N ⊗ C_N represents the 2-D DCT matrix. They proposed an algorithm for computing the 8 × 8 block DCT, which requires 94 multiplications and 454 additions. Subsequently, Wu and Man [157] pointed out that, by using the fast 1-D 8-point DCT algorithm [88] in Cho and Lee's technique for computing the 2-D 8 × 8 DCT [25], we may achieve the optimal number of multiplications. The resulting number of multiplications is 8 × 11 = 88, which is the same as the multiplicative lower bound of Eq. (2.172). The number of additions remains the same as before (466).
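As a quick arithmetic check, the two lower bounds of Eqs. (2.171) and (2.172) can be evaluated for the 8-point case (n = 3). The function names below are hypothetical and serve only this illustration.

```python
def min_mults_1d(n):
    """Eq. (2.171): multiplicative lower bound for a 1-D DCT of length N = 2**n."""
    return 2 ** (n + 1) - n - 2


def min_mults_2d(n):
    """Eq. (2.172): lower bound for a 2-D DCT of size 2**n x 2**n."""
    return 2 ** n * min_mults_1d(n)


bound_1d = min_mults_1d(3)  # 8-point DCT
bound_2d = min_mults_2d(3)  # 8 x 8 DCT
```

For n = 3 this gives 11 and 88, matching the counts quoted for the algorithms of Loeffler et al. [88] and Wu and Man [157].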
However, it is possible to compute the DCT with even fewer multiplications if we use other operations, such as shifts and two's complement operations. For example, in [5] and [78], two such algorithms are reported that require only 5 multiplications, 29 additions, and a few two's complement operations (reportedly 16 in [5] and 12 in [78]) for computing an 8-point DCT. Thus, according to [78], only 80 multiplications, 464 additions, and 192 two's complement operations are required for computing the 2-D 8 × 8-point coefficients.
The complexities of various 8 × 8 DCT computation algorithms are summarized in Table 2.9.
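The 2-D figures quoted for [78] follow from a row–column computation, in which an N × N block needs N row transforms and N column transforms, that is, 2N one-dimensional transforms in total. A minimal sketch of that scaling, with a hypothetical helper name and dictionary keys chosen only for illustration:

```python
def row_column_cost(n, ops_1d):
    """Scale per-transform operation counts by the 2n 1-D transforms of an n x n block."""
    return {name: 2 * n * count for name, count in ops_1d.items()}


# Per 8-point DCT, as reported for [78]: 5 multiplications, 29 additions,
# and 12 two's complement operations.
cost_8x8 = row_column_cost(8, {"mult": 5, "add": 29, "twos_complement": 12})
```

With N = 8 the sixteen 1-D transforms give 80 multiplications, 464 additions, and 192 two's complement operations.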
Table 2.9: Complexity comparison of various algorithms for 8 × 8 DCT (M: multiplications, A: additions)

| Algorithm | Year | 1-D DCT M | 1-D DCT A | 2-D DCT M | 2-D DCT A |
|---|---|---|---|---|---|
| Ahmed et al. [3] | 1974 | 64 | 64 | 1024 | 1024 |
| Chen et al. [24] | 1977 | 16 | 26 | 224 | 416 |
| Kamangar and Rao [74] | 1982 | | | 128 | 430 |
| Vetterli [146] | 1985 | | | 104 | 474 |
| Arai et al. [5]†† | 1988 | 5 | 29 | 80 | 464 |
| Loeffler et al. [88] | 1989 | 11 | 29 | 176 | 464 |
| Feig and Winograd [41] | 1992 | | | 94 | 454 |
| Cho et al. [26] | 1993 | | | 96 | 466 |
| Wu and Man [157] | 1998 | | | 88 | 466 |

††It also requires additional two's complement operations.
Source: From [147] with permission from the author.
2.3.2 2-D Discrete Wavelet Transform
As discussed for separable transforms, the 2-D DWT can also be implemented by applying the 1-D DWT along the rows and columns of an image. The computational steps are described in Figure 2.8, which shows the first level of decomposition (analysis) of an input image. As a result of this processing, four sets of wavelet coefficients are obtained. They are referred to as the approximation (LL), horizontal (HL), vertical (LH), and diagonal (HH) coefficient subbands, respectively. As explained in Chapter 1, in our notation, the first letter L or H corresponds to the application of a low-pass (L) or high-pass (H) filter to the rows, and the second letter refers to the same application to the columns. After filtering, half the samples are dropped by downsampling with a factor of 2, as is done for a 1-D transform. For the next level of dyadic decomposition, the LL subband is decomposed into four subbands in the same way, and the process can be iterated down to the lowest level of resolution for representing the image.

Figure 2.8: 2-D DWT and IDWT.

For reconstructing the image (also known as the synthesis operation), inverse operations are performed in the reverse order. The reconstruction starts from the lowest level of subbands, and from each lower level, the next higher level image representation (the corresponding LL subband) is formed. In this computation, the subbands are first upsampled by a factor of two with zero insertions, in a way similar to the 1-D synthesis process. The upsampled subbands are then filtered by the corresponding synthesis filters, and the results are added to obtain the reconstructed image for the next higher level. In this case also, efficient DWT and IDWT computations can be performed by lifting stages for the individual 1-D transformations.
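One level of the separable analysis and synthesis described above can be sketched with 5/3 lifting as the 1-D transform. This is an illustrative sketch only: the function names are hypothetical, the image is assumed to have even width and height, and the mirrored boundary handling inside the 1-D steps is an assumption. The lifting inverse here takes the place of the explicit upsampling and synthesis filtering.

```python
def fwd53(x):
    """1-D 5/3 lifting: returns (low-pass, high-pass) halves of an even-length list."""
    a0, d0 = x[0::2], x[1::2]
    last = len(a0) - 1
    d = [d0[l] - ((a0[l] + a0[min(l + 1, last)]) >> 1) for l in range(last + 1)]
    a = [a0[l] + ((d[l] + d[max(l - 1, 0)]) >> 2) for l in range(last + 1)]
    return a, d


def inv53(a, d):
    """Inverse of fwd53: undo update, undo predict, interleave."""
    last = len(a) - 1
    a0 = [a[l] - ((d[l] + d[max(l - 1, 0)]) >> 2) for l in range(last + 1)]
    d0 = [d[l] + ((a0[l] + a0[min(l + 1, last)]) >> 1) for l in range(last + 1)]
    x = [0] * (2 * len(a))
    x[0::2], x[1::2] = a0, d0
    return x


def transpose(m):
    return [list(row) for row in zip(*m)]


def dwt2_level(image):
    """One analysis level: 1-D DWT along rows, then along columns."""
    low, high = [], []
    for row in image:
        a, d = fwd53(row)
        low.append(a)
        high.append(d)

    def split_columns(block):
        pairs = [fwd53(col) for col in transpose(block)]
        return transpose([a for a, _ in pairs]), transpose([d for _, d in pairs])

    ll, lh = split_columns(low)    # first letter: row filter, second: column filter
    hl, hh = split_columns(high)
    return ll, lh, hl, hh


def idwt2_level(ll, lh, hl, hh):
    """One synthesis level: inverse along columns, then along rows."""
    def merge_columns(lo, hi):
        return transpose([inv53(a, d) for a, d in zip(transpose(lo), transpose(hi))])

    low, high = merge_columns(ll, lh), merge_columns(hl, hh)
    return [inv53(a, d) for a, d in zip(low, high)]


image = [[3, 7, 1, 8], [2, 9, 4, 4], [6, 5, 0, 1], [8, 8, 2, 3]]
subbands = dwt2_level(image)
restored = idwt2_level(*subbands)
```

Each subband has half the width and half the height of the input, and `restored` equals `image` exactly because the lifting steps are integer-reversible. Further dyadic levels would apply `dwt2_level` again to the LL subband.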
2.3.2.1 Computational Complexity
Let us consider the cost of filtering for the 1-D DWT with the filters h(n) and g(n). Let the corresponding lengths of these filters be |h| and |g|, respectively. For every output sample, the numbers of multiplications and additions in convolving with an FIR filter of length l are (l + 1) and l, respectively. Hence, for every pair of wavelet subband coefficients (a(n), d(n)), we require |h| + |g| + 2 multiplications and |h| + |g| additions. However, if the filters are symmetric, the number of multiplications is reduced to ⌈|h|/2⌉ + ⌈|g|/2⌉ + 2. Hence, for implementing the 5/3 filter, we require 7 multiplications and 8 additions for every two samples. In 2-D, these operations are performed twice for every pixel in the reconstructed image. Thus, the number of per-pixel operations in this case is 7 multiplications and 8 additions (7M + 8A). Similarly, 9/7 filtering can be performed with 13 multiplications and 16 additions for every pixel in 2-D.
These computations can be performed much faster using lifting schemes. Daubechies discussed in [32] that, for even |h| and |g|, the 1-D lifting implementation of these filters involves (|h|/2 + |g|/2 + 2) multiplications and the same number of additions. Typically, the 9/7 lifting implementation requires 6 multiplications and 8 additions for every pair of output samples. It implies that, in 2-D, the per-pixel computation cost becomes 6M + 8A. However, in 5/3 lifting, the multiplications can be performed by right-shift operations. Hence, it requires only 4 additions and 2 shift operations (≈ 4A) for every output pixel in 2-D.
2.4 Summary
Image transforms provide alternative descriptions of images and are useful for image and video compression. There are various techniques for transforming or factorizing an image as a linear combination of orthogonal basis functions.
For a finite discrete image, the number of these discrete basis functions (or basis matrices) is also finite, and transformations are invertible. Some of the properties of these transforms, in particular, properties of discrete Fourier transforms, generalized discrete Fourier transforms, trigonometric transforms, discrete cosine transforms, and discrete wavelet transforms were reviewed in this chapter.