2. EcosimPro Software
2.4. Description of the renewable energy library and its components
A few situations that might arise in the imaging process are not covered by the generative model, and we conclude this chapter by considering these.
The biggest shortfall of the imaging model used so far is that many imaging conditions cannot be described adequately using the planar projective transform. However, many other transforms can be described as locally affine or locally projective, so this is not as restrictive as it might appear. For very non-rigid scenes, and those with multiple motions, it is possible to use optic flow, but the size of the latent registration space increases considerably, because motion and occlusion information for each pixel needs to be stored [44]. However, the registration itself can be accounted for in the construction of the W(k) matrices if desired, and while the estimation of the registration space becomes correspondingly harder, the treatment of the other parts of super-resolution, such as the photometric registration and the high-resolution image priors discussed above, need not change much.
Image datasets captured using modern digital cameras are likely to have had a significant amount of pre-processing applied before being saved out to image files.
These steps include white balance and gamma correction, adjustment for optimal dynamic range, and lossy compression so that the images fit more efficiently onto the storage media.
Lossy compression is a problem because methods like the JPEG algorithm destroy exactly the high-frequency information in the input images that we want to use in super-resolution image reconstruction. The compression algorithm works by taking 8×8-pixel blocks in the image, finding the Discrete Cosine Transform (DCT) of each block, and quantizing the 64 coefficients. The granularity of the quantization depends on how sensitive we as humans are to each spatial frequency, so that frequencies we are less aware of are quantized more coarsely, therefore requiring fewer bits to store.
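The block-wise step described above can be sketched in a few lines. This is a minimal illustration rather than the JPEG codec itself: the quantization table built by `illustrative_quant_table` is a hypothetical one whose step size simply grows with spatial frequency, not the table from the JPEG standard, and the function names are our own.

```python
import numpy as np

N = 8  # JPEG operates on 8x8-pixel blocks


def dct_matrix(n=N):
    """Orthonormal DCT-II basis as an n x n matrix C, so that C @ C.T = I."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # constant (DC) basis vector
    return c


def illustrative_quant_table(step=4.0, n=N):
    """Hypothetical table: coarser steps at higher frequencies (u + v).

    Assumption for illustration only; this is NOT the JPEG standard table.
    """
    u, v = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return 1.0 + step * (u + v)


def compress_block(block, q):
    """Take the 2D DCT of a block, quantize the coefficients, reconstruct."""
    c = dct_matrix(block.shape[0])
    coeffs = c @ block @ c.T           # 2D DCT of the block
    quantized = np.round(coeffs / q)   # the lossy step: integer levels
    recon = c.T @ (quantized * q) @ c  # dequantize and invert the DCT
    return recon, quantized


rng = np.random.default_rng(0)
block = rng.uniform(0.0, 255.0, size=(N, N))  # stand-in for image data
q = illustrative_quant_table()
recon, quantized = compress_block(block, q)
print("RMS reconstruction error:", np.sqrt(np.mean((recon - block) ** 2)))
```

Because the DCT basis is orthonormal, the reconstruction error is bounded by half the quantization step in each coefficient, so the coarse steps assigned to high frequencies are exactly where the fine image detail is lost.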
Figure 3.20 shows a selection of images from the Frog and Keble synthetic datasets which have been saved as lossy JPEG images with various quality levels using Matlab’s imwrite routine.
For each JPEG quality level (from 5% to 100% in increments of 5%, giving 20 datasets per ground truth image), the standard deviation of the pixel-wise additive error induced by the compression artifacts was calculated. Control image sets were created using i.i.d. Gaussian image noise at each of these standard deviations. For each dataset, the best reconstruction was found (across all settings of the Huber prior strength, with α = 0.05), and these are plotted in Figure 3.21. While the errors are comparable for the high quality (less lossy) JPEG images, as the quality decreases, the JPEG image reconstructions are significantly worse than the noise model would predict based on i.i.d. Gaussian noise of the same standard deviation. It is interesting to note that the optimal prior ratio for each of the four cases follows approximately the same curve, even though the minimal errors for each of
[Figure 3.20 panels: image 1 and image 2 of each dataset at JPEG quality levels of 10%, 40% and 70%.]
Figure 3.20: The “Frog” and “Keble” datasets. The first two images of the Frog and Keble datasets, saved as lossy JPEG images with quality levels of 10% (poorest quality of those shown), 40% and 70% (best quality of those shown) respectively. For ground truth images, see Figures 3.4 and 3.17.
[Figure 3.21 plots. Left, "Error vs Compression": RMSE wrt ground truth against JPEG compression ratio. Right, "Prior Strength vs Compression": log10(ν/β) against JPEG compression ratio. Curves in both plots: Frog JPEG, Keble JPEG, Frog control, Keble control.]
Figure 3.21: The underperformance of super-resolution images on JPEG-compressed image data. The solid curves on the left-hand plot show the errors achieved using JPEG input images as the compression ratio is changed. The dotted curves indicate the errors achieved using datasets with equivalent i.i.d. Gaussian noise (which hence obey the forward model); these are lower in all cases. The right-hand plot shows the strength ratio log10(ν/β) for each of the four groups of datasets. While the quality of the image results varies considerably, the prior strengths all match each other remarkably well.
(a) from JPEG (75%)
(b) control (75%)
(c) from JPEG (75%)
(d) control (75%)
Figure 3.22: Reconstructing images corrupted with non-Gaussian noise.
(a) The Frog image, reconstructed using 16 images from the Frog dataset which had been saved as JPEG images at a quality setting of 75%. (b) The corresponding Frog image reconstruction where Gaussian noise of the same standard deviation is added instead; the image quality is significantly better, especially on the tree surface and the specularity in the frog’s eye. (c) The Keble image, reconstructed using 16 images from the Keble dataset which had been saved as JPEG images at a quality setting of 75%. (d) The corresponding Keble image reconstruction, using i.i.d. Gaussian noise of the same standard deviation. Again, this is significantly better than the JPEG image case, in particular in capturing the window leading and the brick texture.
the four groups of datasets are very different. Finally, Figure 3.22 shows the super-resolution images from the JPEG and control sets at the 75% quality level, so that the image quality can be inspected visually. In both cases, even at this relatively high quality level, the details missing from the JPEG case but preserved in the Gaussian noise case are clearly visible.
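The construction of the control sets used in this comparison can be sketched as follows, under stated assumptions. The function names are our own, and the coarse rounding of pixel values is only a hypothetical stand-in for a JPEG round trip (the experiments themselves used Matlab's imwrite); the point is simply that the control image receives i.i.d. Gaussian noise whose standard deviation is matched to the measured compression error.

```python
import numpy as np


def compression_error_std(original, compressed):
    """Standard deviation of the pixel-wise additive error."""
    err = np.asarray(compressed, dtype=float) - np.asarray(original, dtype=float)
    return err.std()


def make_control_image(original, sigma, rng):
    """Add i.i.d. zero-mean Gaussian noise of standard deviation sigma."""
    original = np.asarray(original, dtype=float)
    return original + rng.normal(0.0, sigma, size=original.shape)


def rmse(a, b):
    """Root-mean-square error between two images."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.sqrt(np.mean(diff ** 2))


rng = np.random.default_rng(1)
original = rng.uniform(0.0, 255.0, size=(64, 64))  # stand-in low-res image

# Hypothetical stand-in for lossy compression: coarse pixel-value rounding.
compressed = 16.0 * np.round(original / 16.0)

sigma = compression_error_std(original, compressed)
control = make_control_image(original, sigma, rng)

print("compression error std:", sigma)
print("control noise std:    ", (control - original).std())
```

The two image sets then have additive errors of the same magnitude, but only the control set has errors that are independent across pixels, which is what the forward model assumes.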
These observations are important because the same type of image compression used in JPEG images also forms part of the MPEG video compression algorithm, and many video datasets are compressed this way; if the model were not capable of handling such an input source, this would indeed be a problem. It is also possible to extend the forward model to include the degradation due to the JPEG compression. However, the quantization is a nonlinear operation, so it prevents us from representing the low-resolution inputs as linear functions of the super-resolution image we want to recover, and makes the system harder to solve. In the plots of Figure 3.21, we have seen that the changed noise type does not greatly change the behaviour of the rest of the model with respect to the corresponding best prior strength setting, so even when using JPEG-compressed images, we chose to accept the limitations of the generative model as described in Section 3.1.
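The nonlinearity of quantization mentioned above can be checked numerically. The sketch below is a toy illustration with made-up values and a hypothetical 2×3 matrix `W` standing in for a linear imaging operator; it is not the forward model of Section 3.1. A linear map satisfies superposition, whereas rounding to a fixed step does not.

```python
import numpy as np


def quantize(x, step=1.0):
    """Uniform quantization: snap each value to the nearest multiple of step."""
    return step * np.round(np.asarray(x, dtype=float) / step)


# Toy signals (values chosen to stay away from rounding ties).
a = np.array([0.6, 1.4, 2.6])
b = np.array([0.6, 0.2, 0.6])

# A hypothetical linear operator obeys superposition.
W = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])
linear_ok = np.allclose(W @ (a + b), W @ a + W @ b)

# Quantization does not: Q(a + b) differs from Q(a) + Q(b).
q_of_sum = quantize(a + b)
sum_of_q = quantize(a) + quantize(b)

print("linear map obeys superposition:", linear_ok)
print("Q(a + b)    =", q_of_sum)
print("Q(a) + Q(b) =", sum_of_q)
```

Since no single matrix can reproduce the rounding step for all inputs, including it in the forward model would mean the low-resolution images are no longer linear functions of the high-resolution image.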