This thesis have dealt in the first place with video coding, based on motion-compensated wavelet transform, but also by using the well-known standard H.264. Then, transmissions of video over noisy channels have been studied, especially coding techniques as multiple description coding and distributed video coding.
8.1 Video coding 8.1.1 Contributions
In the general framework of video coding, motion-compensated based video coding has first been explored. Some improvements of a wavelet-based video coder have been proposed. This coder is fully scalable, and based on a lifted motion-compensated wavelet transform. The bit-rate al-location between the wavelet subbands uses an optimal algorithm which requires the knowledge of the rate-distortion curves of each subband. The JPEG2000/EBCOT coder is used for the coding of the spatio-temporal sub-bands, in order to be JPEG2000-compliant.
More precisely, the main improvement concerns the motion vectors en-coding. The cost of the motion information could become too much signif-icant in the total bit-rate, especially at low and very-low bit-rates. In this thesis, a method to optimally allocate resources between motion information and wavelet subbands in the rate-distortion sense has been proposed. To this way, motion vectors of high precision have been quantized with losses in a scalable way. A uniform scalar quantization, performed in open-loop, is used. Indeed, in order to preserve good properties for the high frequency subbands, full precision motion vectors must be used at the encoder side for the motion compensation and the computation of the wavelet transform.
The quantization step controls the rate-distortion trade-off of the motion vectors.
In order to evaluate the impact of this lossy coding of the motion in-formation, a theoretical distortion model of the motion coding error has been established. This model is simply function of the frame powers and
151
of the frame autocorrelation functions. The subbands quantization noise has been included to this model, which has been generalized at several lev-els of temporal decomposition. This distortion model allows to perform a model-based bit-rate allocation between the motion vectors and the wavelet subbands, in order to optimally dispatch the resources in the total bit-rate.
This approach allows to improve the whole coder performances, especially at low bit-rates.
Some improvements of the lifting scheme have also been done. The in-fluence of some badly estimated motion vectors on the motion-compensated wavelet transform has been minimized, by proposing a novel and adaptive method for the implementation of the lifting scheme. Indeed, the lifting steps have been closely adapted to the motion vectors norm. This method allows to increase the coder performances, and, moreover, no loss in bit-rate is carried out, since all the useful data, i.e. the motion vectors, is already transmitted to the decoder side.
In the framework of an industrial contract with the French national Tele-com operator, Orange labs, the approach of lossy coding of motion vectors has been applied to the actual video coding standard H.264. A new H.264 coding mode has thus been defined. Indeed, a more flexible motion cod-ing could improve the general performances. Some problems regardcod-ing the choice of the quantization step and the encoding of quantized MVs have been solved, and also about the high precision and the prediction of the motion vectors, especially in the 8x8 case. The new coding mode allows to improve the performances compared to the classical modes of H.264.
8.1.2 Perspectives
Of course, this work opens a lot of perspectives. First, in terms of general performances, a more efficient motion estimator could widely improve the wavelet-based video coder, for example with a motion described by blocks of variable-length. However, the JPEG2000 compatibility would be more difficult to reach.
For further researches on the motion-adapted weighted lifting scheme, it could be possible to extend the proposed approach to other criteria, as, for example, the correlation between the motion vectors.
Some improvements can also be performed for the new coding mode inte-grated in the H.264 standard. Further improvements are expected when the proposed technique will be extended to cover the cases where motion infor-mation is even more important, as it happens when sub-blocks of 4x4 pixels are enabled. It could also be interesting to perform the motion estimation on a real grid, that is to say a grid not anymore fixed at 1/4-pel or 1/8-pel precisions. In the same time, a gain in performances could be achieved if any arbitrary real value can be chosen as MV quantization step. As already said, the Lloyd-Max algorithm to represent the quantization step values could be
8.2. Transmissions over noisy channels 153 used. A relationship could maybe be found between the optimal quantiza-tion step for the moquantiza-tion vectors and the current quantizaquantiza-tion step for the coefficients. Different strategies for finding the best motion quantization step could be tested, more efficient than “Oracle” or “Minsum”, as the use of quad-trees to divide the MB. Moreover, the open loop structure could be used to implement an efficient MV-scalable video coder. Finally, the last version of H.264, KTA, has to be used to obtain better performances.
8.2 Transmissions over noisy channels
8.2.1 Contributions
Video transmissions over noisy channels have been studied in the frame-work of the national research project ESSOR. In particular, in the general framework of multiple description coding, two approaches of optimal de-coding have been proposed. The MD coder considered here is based on the motion-compensated wavelet transform coder presented in the first part of this work. Redundancy is introduced before quantization, and at the encoder side, a bit allocation based on the characteristics of the channel produces the balanced descriptions. The descriptions are then transmitted over noisy channels. In order to optimally decode the central description, two approaches have been implemented: the first one tries to estimate the two generated descriptions from the received channel outputs, whereas the second one focuses on the direct estimation of the source from the two noisy descriptions. These approaches are based on the knowledge of the density of the source and on the noise probabilities. Even with an important channel noise, the quality of the central descriptions reconstructed by both of the approaches are close to the one of the original signal.
In the framework of distributed video coding, in collaboration with other French laboratories, an improvement of the actual methods of frame inter-polation has been performed, in order to increase the quality of the side information, and thus of the coding performances. The proposed method is based on a bidirectional motion compensation using pixelwise estimation and allowing overlapped motion vectors. It allows to better improve the quality of the side information compared to some state-of-the-art methods.
8.2.2 Perspectives
With the increase of multimedia communications, transmissions over noisy channels will be of great interest in the next years. Thus, to increase the compression efficiency of the proposed MDC scheme, a product code can be used to code the descriptions in a fixed-length way. Then, after indexation, an entropy-coding step may be considered. Dependence between the coef-ficients would be to consider for the MAP algorithms. Iterative decoding
techniques, such as those presented in [LWK06], may then be applied. In order to reduce the errors on the position of the vectors of the product code, transcoding of these vectors could be used. Finally, the noise corrupting the motion vectors has also to be taken into account. Of course, it could be interesting to introduce in the MD coder the method of lossy coding of the motion vectors previously proposed.
For what concerns the proposed frame interpolation in the framework of DVC, it could be interesting to implement shot detection methods in order to improve the performance of the interpolation method.
For further works, it would be interesting to integrate the proposed MDC scheme in a general DVC scheme, in order to make it robust to channel fail-ures.
Appendix A