This section describes the preparation of the visual stimuli used in the experiments of this thesis. It begins with the selection of an image quality metric, followed by a generic cost function for estimating the theoretical cost of higher-quality visual cues. Finally, a methodology based on the VDP computational model is described for discarding quality levels that elicit no perceptible visual difference for the viewer.
5.3.1 Quality metric for visual cues
For visuals, resolution was chosen as the variable that modifies quality. It was preferred over other quality metrics because it is the most straightforward, and the visual differences it produces are distinct and easy to understand. A number of standards already exist for resolution, and it can be abstracted from the underlying algorithm used for image synthesis. Furthermore, the computational cost of rendering an image is, for many rendering algorithms, a linear function of the image resolution.
All the images were computed using path tracing [Kaj86] due to the accuracy and straightforward nature of the computation. The renderer was implemented from the ground up and no existing rendering software was used. The CAD models for the experimental studies of this thesis were designed using Autodesk Maya, and photos of real scenes captured by the author were used for designing the models. The machine used for rendering these images was an Intel Xeon E7450 at 2.40 GHz with a total of 24 physical cores and 64 GB of RAM. Figure 7.1 depicts all the rendered images. Both the models and the rendered images were made specifically for the experimental studies of this thesis.
The objective was to create realistic scenes that present physically correct illumination and material properties and that may be representative of future rendering systems. All the images were rendered to convergence so that this work generalises to any possible algorithm. The experimentation framework included the computation of 240 images, which varied in resolution from 16×9 to Ultra High Definition (UHD), or 3840×2160, using a fixed aspect ratio of 16:9 for all the images (i.e. 16×9, 32×18, ..., 3840×2160). Figure 5.4 depicts the process of generating all the images. The lowest resolution was chosen to reflect the level at which humans find it difficult to identify images [Tor09].
Figure 5.4: Images rendered at different resolutions and scaled to the highest resolution displayable by the visual display (3840×2160).
In order not to introduce any bias due to the presented size, all the different quality images were resized to the same resolution, namely the UHD resolution used by the display hardware. This resizing process was implemented using a bi-cubic interpolation kernel, while anti-aliasing and colour dithering filters were used in order to keep the quantisation error to a minimum. Bi-cubic interpolation was preferred over other image scaling methods because it produces smooth images and its application has no significant computational cost relative to the overall rendering costs.
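The resizing step can be illustrated with a minimal sketch of separable bi-cubic interpolation. It is only an illustration under stated assumptions: the Keys cubic convolution kernel with a = -0.5, the centre-aligned sample mapping, the edge replication at the borders, and the function names `keys_kernel`, `resize_row` and `resize_image` are all choices made here, not details taken from the resizing software used for the experiments. The anti-aliasing and dithering filters mentioned above are not reproduced.

```python
import math


def keys_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5 is an assumed, common choice)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def resize_row(samples, new_len):
    """Resample a 1-D list of intensities to new_len samples."""
    old_len = len(samples)
    scale = old_len / new_len
    out = []
    for i in range(new_len):
        # Centre-aligned mapping from the output index to a source coordinate.
        src = (i + 0.5) * scale - 0.5
        base = math.floor(src)
        value = 0.0
        # The kernel has a support of four neighbouring samples.
        for j in range(base - 1, base + 3):
            clamped = min(max(j, 0), old_len - 1)  # replicate border samples
            value += samples[clamped] * keys_kernel(src - j)
        out.append(value)
    return out


def resize_image(image, new_w, new_h):
    """Separable bi-cubic resize of a 2-D list (rows of grey values)."""
    rows = [resize_row(row, new_w) for row in image]          # horizontal pass
    cols = [resize_row(list(col), new_h) for col in zip(*rows)]  # vertical pass
    return [list(row) for row in zip(*cols)]
```

Because the four kernel weights always sum to one, a uniformly coloured image stays uniform after resizing, which is a quick sanity check for any implementation of this kind.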
5.3.2 Cost estimation for visual cues
The computation time needed for obtaining an image of the sequence of 240 images can be generally estimated as follows:
$$C^{V}_{k} \approx P_{k} \cdot L, \quad k = 1, 2, \dots, 240, \tag{5.8}$$
where $P_{k} = 16 \cdot 9 \cdot k^{2}$ is the number of pixels of the $k$-th image and $L = C^{V}_{240} / P_{240}$ is the time needed for the computation of an individual pixel. It is worth mentioning that this estimate depends strongly on the available hardware, the algorithm used and the optimisations applied at an algorithmic and software level. Other parameters that affect the computation time include, but are not limited to, the scene complexity and the material and texture properties. However, in order for the results of the experimental studies E1 and E3 proposed in the methodology to be generalised, the computational cost is assumed to vary linearly with resolution.
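The estimate of equation 5.8 can be sketched in a few lines of Python. The function names and the reference time of 10 hours are hypothetical placeholders introduced only for illustration; no measured rendering time is implied.

```python
def resolution(k):
    """Width and height of the k-th image (16:9 aspect ratio, k = 1..240)."""
    return 16 * k, 9 * k


def pixel_count(k):
    """P_k = 16 * 9 * k^2, the number of pixels of the k-th image."""
    width, height = resolution(k)
    return width * height


def estimated_time(k, full_res_time):
    """Equation 5.8: C_k ~ P_k * L, with L = C_240 / P_240."""
    per_pixel = full_res_time / pixel_count(240)  # L, time per pixel
    return pixel_count(k) * per_pixel


# Hypothetical reference time for the UHD image (placeholder, not measured).
HYPOTHETICAL_UHD_HOURS = 10.0

fhd_hours = estimated_time(120, HYPOTHETICAL_UHD_HOURS)  # k = 120 is 1920x1080
ratio = fhd_hours / estimated_time(240, HYPOTHETICAL_UHD_HOURS)
```

Under the linear assumption, the ratio between the cost of any level and the cost of the UHD level depends only on k and not on the reference time, so changing the placeholder value leaves `ratio` unchanged.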
Moreover, the problem is decoupled from real time measurement and is considered in terms of a normalised cost, which can be obtained from equation 5.8 using the normalisation factor $1 / C^{V}_{240}$. This results in visual levels with costs independent of the underlying algorithm used for the computation. Therefore, in what follows, the term visual cost is used to denote the quantity given by:
$$C^{NV}_{k} = \left(\frac{k}{240}\right)^{2}, \quad k = 1, 2, \dots, 240. \tag{5.9}$$
This formula yields no real cost but a theoretical cost that is used to assess the different computational requirements of the 240 visual quality levels in a consistent way, independent of the underlying algorithm used for rendering the visual cues.
5.3.3 Visual selection criteria
Because the HVS tends to notice visual differences mainly at lower resolutions, only those visual stimuli that are perceptually distinguishable were considered. For the correct implementation of the experimental procedure, only the subset of images that elicit visual differences at a perceptual level was needed. Images that elicit the same perceptual response but have different costs might lead to false conclusions in the resource allocation framework. In order to obtain a set of perceptually distinguishable images, a visual perceptual metric was used.
The latest version of the HDR-VDP computational model (HDR-VDP-2.2) [NMDSLC15] was used for perceptually assessing the rendered images. This model is a widely used objective metric for detecting perceptual differences between High Dynamic Range (HDR) or Low Dynamic Range (LDR) image pairs. The model provides the Q correlation measure, a numeric score that ranges between 0 and 100. Low Q scores indicate apparent visible differences between the input images, while two images with a high Q value are considered perceptually indistinguishable. For the selection of distinguishable visual stimuli, pairwise comparisons between the highest resolution image (UHD) and the 239 other rescaled images were performed using the LDR mode of the metric. The averaged Q scores for all six scenarios used in experiment E1 are depicted in Figure 5.5. As expected, the results follow a logarithmic trend, indicating that at higher levels participants struggle to find apparent differences.
Figure 5.5: Average values of the Q correlation results obtained from the VDP model for all six scenarios that were considered for the E1 experimental study.
In their study, Varshney and Sun [VS13] show that the internal representation of a stimulus scales in a logarithmic fashion for increasing stimulus magnitudes. The authors argue that any range of physical stimulus intensities is mapped through a log-curve to a finite set of perceptual points that constitute distinct sensory levels for the human observer. This mechanism resembles signal quantisation, where the internal perception points are uniformly spaced and very high intensity values have little or no perceptual impact on the internal representation (upper quantisation boundary). This argument holds not only for the visual domain but also for other human sensory systems [VS13].
This fundamental psychophysics result reported by Varshney and Sun was used to discretise the Q values into 80 discrete levels that elicit noticeable perceptual differences for the user. This is the minimum number of levels for which the resulting set of images includes the majority of the common resolution standards that are frequently used in High Definition Television (HDTV) and Standard Definition Television (SDTV) applications. These are:
• Quarter High Definition (QHD or 960×540)
• High Definition (HD or 1280×720)
• Full High Definition (FHD or 1920×1080)
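One possible way to discretise the Q scores into perceptual levels is sketched below. This is a minimal illustration under stated assumptions, not the exact procedure used for the experiments: the Q range is split into uniformly spaced bins, and the cheapest image (lowest k) in each occupied bin is kept. The function name `select_distinct_levels` and the logarithmic `toy_q` curve are hypothetical; the toy curve only mimics the trend of Figure 5.5 and is not measured HDR-VDP output.

```python
import math


def select_distinct_levels(q_scores, n_levels):
    """Return 1-based image indices k, one per occupied Q bin.

    q_scores[k - 1] is the Q score of image k against the reference image.
    Images whose scores fall in the same bin are treated as perceptually
    equivalent, and only the cheapest one (lowest k) is kept.
    """
    q_min, q_max = min(q_scores), max(q_scores)
    if q_max == q_min:
        return [1]  # every image falls into a single perceptual level
    step = (q_max - q_min) / n_levels
    chosen = {}
    for k, q in enumerate(q_scores, start=1):
        # Clamp so that the maximum score lands in the last bin.
        bin_index = min(int((q - q_min) / step), n_levels - 1)
        chosen.setdefault(bin_index, k)  # the first k seen is the cheapest
    return sorted(chosen.values())


# Synthetic, purely illustrative logarithmic curve (not thesis data).
toy_q = [100.0 * math.log(k + 1) / math.log(241) for k in range(1, 241)]
selected = select_distinct_levels(toy_q, 80)
```

With a logarithmic curve of this kind, neighbouring high-resolution images share bins and are mostly discarded, whereas low-resolution images tend to occupy separate bins. Whether a given standard such as 1920×1080 (k = 120) survives depends on the actual Q scores and on the number of levels chosen.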
5.4 Summary
This chapter explained how the visual stimuli of the experimental studies E1 and E3 were rendered. The computation of these images was based on path tracing, a widely used algorithmic technique for obtaining global illumination features. Cost estimation for visuals is based on formula 5.9, which yields a higher theoretical cost $C^{NV}_{k}$ for higher image resolution levels ($k$). Finally, visual selection criteria using the VDP model were discussed. These criteria were used to discard those visual stimuli that are perceptually indistinguishable but have different theoretical costs.