Posproducción - ELABORACIÓN DEL MATERIAL AUDIOVISUAL

CAPITULO 5: ELABORACIÓN DEL MATERIAL AUDIOVISUAL

5.3 Posproducción

3.2. Approximation using the bilateral grid

(a) input signal (b) grid created from signal (c) filtered grid (d) filtered signal

create process slice

Figure 3.2:Illustration of 1D bilateral filtering using the bilateral grid: the signal (a) is embedded in the grid (b), which is processed (c) and sliced to obtain the filtered signal (d). Please seeSection 3.2.1 for details. Adapted fromChen et al.(2007).

3.2.2. Extending the bilateral grid to the dual-cross-bilateral grid

Chen et al. (2007) show that the bilateral grid can also be used for cross-bilateral Cross-bilateral filtering using the bilateral grid filtering. This is achieved by using different images for indexing into and storing

values in the grid: the edge image E(x,y) determines grid coordinates and the other image I(x,y)determines the stored values to be filtered:

Γ x

s_s

, y

s_s

E(x,y) s_r

+= (I(x,y), 1). (3.6) The grid processing remains the same, and the slicing stage accesses the grid accordingly at (x/ss,y/ss,E(x,y)/sr).

Recall that thedual-cross-bilateralcost aggregation smoothes the cost space while The DCB grid preserving edges in both stereo half-images. To implement this using the bilateral

grid, it needs to be extended to take into account both input images as edge images when calculating grid coordinates, and to accumulate cost space values instead of pixel values. This extension is called thedual-cross-bilateral (DCB)grid.

For a pixelp= (x,y)in the left image, and its corresponding pixelp= (x−^d,^y)in DCB grid creation the right image, the DCB grid for a disparity dis created using

Γd

x σ_s

y σ_s

L^?_L(p) σ_r

L^?_R(p) σ_r

+= (C(p,d), 1), (3.7) where the subscriptsLandRindicate the left and right images, respectively.

Instead of image intensities, as per Chen et al., this formulation uses the lightness Limitation to lightness componentL^?of the CIELAB colour space which is perceptually more uniform and

hence more closely models how humans perceive greyscale images. However, this also degrades accuracy compared to a full-colour approach such as the full-kernel DCB aggregation. The trade-off between the number of colour dimensions and the corresponding memory requirements for the bilateral grid is discussed in more detail in the next section.

Finally, the result of slicing the DCB grid is the aggregated cost DCB grid slicing C⁰(p,d) =Γ_d

x σs, y

σs,L^?_L(p)

σr ,L^?_R(p) σr

. (3.8)

For each disparityd, the corresponding DCB gridΓ_dis a four-dimensional array of Implementation

two floating-point numbers. To efficiently implement all grid processing operations, each grid is first flattened into two dimensions. All flattened grids are then tiled into a single2D texture to have fast random access to the grid data and to exploit texture filtering hardware. This is crucial to efficiently perform the grid slicing step.

The bilateral grid downsamples each dimension by a factorσ. For a valuev∈[0,x], Grid flattening

the downsampled range of values are therefore the integers 0, 1, . . . ,[x/σ], a total of [x/σ]+1 values. An image of size w×^h and with lightness L^?∈[0, 100] then has a DCB grid of the dimensions d_w×^dh×^dc×^dc, where d_w= [(w−¹)/σ_s]+1, d_h= [(h−¹)/σs]+1 anddc= [100/σc]+1. As illustrated inFigure3.3, the4D DCB grid is then split into a2D layout ofd_c×^dccomponents of sized_w×^dheach, resulting in a flattened size of(d_w·^dc)×(d_h·^dc).

... ... ... ... ...

· · ·

...

d_ccomponents

dccomponents

Flattened grid

... ... ... ... ... ... ...

· · ·

...

d_wpixels

dhpixels

One component

Figure 3.3:Illustration of flattening the four-dimensional DCB grid into two dimensions.

The quadrilinear interpolation of the slicing stage can be efficiently implemented Quadrilinear

interpolation by using hardware-accelerated bilinear texture filtering to fetch the values stored at the four surroundingd_w×^dhcomponents. Within each component, bilinear texture reads are used, and the four resulting values are bilinearly interpolated.

All the flattened grids, one for each disparity, are then tiled into a2D texture of Grid tiling

float2s. Given a particular number of grids, a rectangular layout of grids is looked for as it makes optimal use of graphics memory without wasting memory. To find a rectangular tiling, all combinations of numbers of rows and columns are examined, starting from a square configuration, and stopping at the first tiling that fits into the maximum texture size of8192×⁸¹⁹²pixels. In the case that no valid rectangular tiling exists (see example below), the grids are laid out in reading order instead, fitting as many in horizontally as possible. Note that this may result in texture space being allocated which goes unused.

TheTeddystereo image has a resolution of 450×375 pixels and60disparity levels.

Example

Assuming default parameters ofσs=10 andσr=10, each DCB grid has a size of 46×³⁸×¹¹×^{11, or}(46·¹¹)×(38·¹¹) = 506×418 when flattened. A rectangular tiling of 10×6 exists, resulting in a (6·⁵⁰⁶)×(10·⁴¹⁸) = 3030×4180 texture.

However, for 61disparities, no rectangular tiling exists, and a layout with 16×⁴ grids is allocated, of which the last three are not used (which wastes about5MB).

3.2. Approximation using the bilateral grid

3.2.3. Regaining accuracy using a dichromatic approach

The dramatic speedup achieved by the DCB grid comes at some loss of quality. Trading off speed for accuracy This is because the underlying bilateral grid only works on greyscale images and

hence does not differentiate colours that have similar greyscale values, as shown in the examples ofFigure3.4.

Colour discriminability can be increased by adding additional colour axes to the Memory requirements grid. Unfortunately, the memory requirements of the bilateral grid are exponential

in the number of dimensions. TheTeddyandConesstereo images, for example, each have a total memory footprint of

60 disparities× ⁴⁵⁰₁₀ ×³⁷⁵₁₀ × 100

10 _k

×^{8 bytes} ⁽³^.⁹⁾ when using the standard parametersσs=10 andσr=10,k total colour dimensions, and two single-precision floating-point numbers per grid cell. For the greyscale DCB grid, where k=2, this amounts to 78MB. However, the best results, with full CIELAB colours in both images (k=6), would require a prohibitive764GB.

Given this constraint, a total of at most k=3 colour dimensions in both images Dichromatic approach can be afforded on current generation graphics cards, resulting in783MB for the

Teddy stereo image. This allows one additional colour axis in one of the stereo half-images, in addition to each image’s greyscale lightness component. The result is a dichromatictechnique which can differentiate colours along two colour axes in one of the two images. This is an interesting trade-off between the common monochromatic and anthropocentric trichromatic stereo approaches, that has not previously been explored.

input image DCB grid input image Dichromatic DCB grid

greyscale greyscale + hue

TsukubaCones

Figure 3.4:Comparison of the mono- and dichromatic DCB grid. The input images are displayed as ‘seen’

by the algorithms. Note that the disparity map of the dichromatic DCB grid visibly improves on the monochromatic DCB grid.

A range of additional colour dimensions are evaluated in Table 3.1. The table Comparison of

colour dimensions compares the following seven candidate colour dimension: HSL hue and saturation, CIELAB chromaticitiesa^? andb^?, and the derived properties hueh_ab, saturations_ab and chromaC^?_ab. The ‘Rank’ column shows the ranking each particular candidate would have achieved in the Middlebury stereo benchmark (Section2.4.5). The best technique, in terms of lowest average rank, is CIELAB hueh_ab.

In follow-up work to mine,Zhu(2011) uses principal component analysis (PCA) to Principal

components find the first two principal colour components of a stereo image instead of using the lightnessL^? and hueh_abas proposed in this chapter, which reportedly improves accuracy even further.

Technique Rank _nonocc^Tsukuba_all _disc _nonocc^Venus_all _disc _nonocc^Teddy_all _disc _nonocc^Cones_all _disc

h_ab=atan2(b^?,a^?) 48.6 4.28 5.44 14.1 1.20 1.80 9.69 9.52 16.4 19.5 4.05 10.4 10.3 HSL saturation 49.0 4.44 5.37 12.9 1.05 1.58 8.29 9.46 16.4 19.4 4.30 10.7 11.3 C^?_ab=^pa^?2+b^?² 49.9 4.97 5.94 16.7 1.15 1.75 8.65 9.55 16.4 19.9 4.00 10.410.5 s_ab=C_ab^?/L^? 50.0 4.36 5.45 12.9 1.19 1.86 9.32 9.41 16.3 19.2 4.41 10.8 11.6 b^? 50.8 4.79 5.83 16.2 1.25 1.84 10.10 9.53 16.319.6 4.28 10.7 11.6 a^? 52.0 5.36 6.49 18.3 1.24 1.84 9.13 9.62 16.5 19.9 4.28 10.5 11.3 HSL hue 51.2 4.62 5.85 14.9 1.30 1.87 10.40 9.83 16.6 20.2 4.18 10.7 11.1

Table 3.1:Accuracy comparison of the dichromatic DCB grid with various colour properties using the Middlebury stereo benchmark (Section 2.4.5), to 3 significant digits.

In document Producción de un (1) documental de televisión respecto a la historia de Víctor Hugo Arias Benavides, alcalde del cantón Zamora en el periodo 1978 - 1983, durante el retorno a la vida democrática del Ecuador (página 70-92)