4 REFERENCE FRAMEWORK
4.1 THEORETICAL FRAMEWORK
4.1.5 TRADITIONAL GAMES
1.9. Summary and Outline of the Thesis
The thesis outline is illustrated in Figure 1.6. The thesis can be divided into two main parts: multiple view geometry problems, and convex optimisation as a solution tool. What connects these parts is the main objective of the thesis: to estimate motion robustly and optimally. In addition, the thesis is structured as a progressive research presentation, in which the solution developed in each chapter builds on the solutions presented in earlier chapters. These dependencies are shown by red arrows in Figure 1.6.
Our scenario consists of a monocular vision system, in which a vehicle equipped with a single camera takes a sequence of images as it moves. We wish to estimate the camera rotations and translations relying exclusively on visual inputs. In addition to image analysis, our scenario poses a further challenge: the scale ambiguity caused by projection effects.
In the next two chapters, we present an overview of the theoretical basics relevant to the thesis. Chapter 2 introduces the basic ideas of multiple view geometry, focusing on camera geometry and the projective camera model. We give the mathematical background of the multiple-view geometry problems, with particular attention to the estimation of the fundamental and essential matrices and its mathematical formulation. Three-view geometry is also considered in this chapter, in which a scene is captured sequentially by a moving camera from three different positions.
In Chapter 3, we provide details on convex optimisation, robust convex optimisation and quasi-convex optimisation, along with some notions of least-squares optimisation, linear optimisation, iterative optimisation, and robust and recursive filtering. Furthermore, we introduce convex optimisation tools such as second-order cone programming (SOCP). Details on global optimisation methods (branch and bound in particular) are given as well. This chapter can be considered a preamble to the deployment of optimisation in motion estimation.
Using the tools introduced in Chapter 2 and Chapter 3, in Chapter 4 we present the first contribution of this thesis. Using monocular systems makes motion estimation challenging because of the absolute scale ambiguity caused by projective effects. To address this, we use robust tools to estimate both the trajectory of a moving object and the unknown absolute scale ratio between consecutive image pairs. The novel approach presented in this chapter is thus a two-stage solution for efficiently solving the monocular visual odometry problem. The first stage uses convex optimisation with the L∞ norm for motion estimation. For the second stage, we propose two methods to solve the scale estimation problem: the recursive least squares (RLS) algorithm and the more robust H∞ filter. Both techniques follow naturally from the first stage and are capable of dealing with system ambiguities in frame-to-frame absolute scale estimation. The proposed solution takes as input only the images provided by a single camera mounted on the roof of a ground vehicle.
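The recursive least squares idea mentioned above can be illustrated with a minimal sketch. The measurement model here is an assumption made for illustration only, not the formulation used in Chapter 4: a scalar scale s is estimated from pairs (x, y) that satisfy y ≈ s·x, where x stands for an unscaled quantity and y for the corresponding quantity at the reference scale. The function name and the parameters `s0`, `p0` and `lam` are hypothetical.

```python
def rls_scale(pairs, s0=1.0, p0=1000.0, lam=1.0):
    """Recursively estimate a scalar s in the assumed model y = s * x.

    s0  : initial guess for the scale (hypothetical default)
    p0  : initial variance of the guess; a large value means "little trust"
    lam : forgetting factor in (0, 1]; 1.0 weights all samples equally
    """
    s, p = s0, p0
    for x, y in pairs:
        k = p * x / (lam + x * p * x)   # gain for this sample
        s = s + k * (y - s * x)         # correct the estimate with the innovation
        p = (p - k * x * p) / lam       # update the estimate variance
    return s


# Noise-free toy data generated with a true scale of 2.5.
pairs = [(1.0, 2.5), (2.0, 5.0), (0.5, 1.25), (3.0, 7.5)]
estimate = rls_scale(pairs)
```

With a large `p0`, the estimate is pulled away from the initial guess after the first sample and then refined by each new pair, which is the behaviour that makes a recursive estimator attractive for frame-to-frame processing.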
Typically, the camera pose is recovered from the available point correspondences between two or more views and the camera calibration parameters. These correspondences are used to estimate the fundamental matrix, which is the key to any motion estimation. In general, the fundamental matrix is the critical link that represents the geometry between two views in the pinhole camera model. Thus, in Chapter
[Figure 1.6: Thesis outline, listing each chapter and the venue of the associated publication.]
Chapter 1: Introduction
Chapter 2: Image Geometry
Chapter 3: Convex Optimisation
Chapter 4: Convex Optimisation and the H∞ Filtering for Motion Estimation (CAIP 2013)
Chapter 5: Robust Motion Estimation Using Covariance Intersection (MED 2014)
Chapter 6: Robust Convex Optimisation for Motion Estimation (Robotica Jour. 2014)
Chapter 7: Loop Closure Detection For Motion Estimation (IROS 2014)
Chapter 8: Robust Convex Pose-Graph Optimisation for Motion Estimation (Aerospace Engin. Jour. 2014)
Chapter 9: Robust L∞ Cooperative Motion Estimation (Jour. Field Robotics 2015)
Chapter 10: Discussion and Conclusion
matrix. In most vision applications, colour images are first converted to grey-level images, leading to a serious loss of information. In our solution, however, each RGB channel of a colour image is processed separately, and a fusion mechanism is then employed to combine the information. After the uncertainties in the feature locations have been estimated in each channel, covariance intersection is used as the fusion solution. This results in a considerable decrease in the measurement errors, which in turn leads to a more accurate estimate of the fundamental matrix.
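As a rough illustration of the fusion step, the sketch below applies the standard covariance intersection rule to two 2D estimates with unknown cross-correlation: the fused information matrix is a convex combination of the two input information matrices. It is a sketch under assumptions, not the implementation of Chapter 5. The helper names, the grid search over the weight `w`, and the choice of the trace as the selection criterion are all assumptions; fusing three colour channels would require applying the rule repeatedly or using a multi-weight variant.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def add2(m, n):
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]


def scale2(m, w):
    return [[w * m[i][j] for j in range(2)] for i in range(2)]


def matvec2(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def covariance_intersection(xa, Pa, xb, Pb, steps=100):
    """Fuse (xa, Pa) and (xb, Pb); returns (w, x, P).

    P^-1 = w * Pa^-1 + (1 - w) * Pb^-1, with w chosen on a grid
    to minimise trace(P) (an assumed criterion).
    """
    Ia, Ib = inv2(Pa), inv2(Pb)
    best = None
    for k in range(steps + 1):
        w = k / steps
        P = inv2(add2(scale2(Ia, w), scale2(Ib, 1.0 - w)))
        trace = P[0][0] + P[1][1]
        if best is None or trace < best[0]:
            va = matvec2(scale2(Ia, w), xa)
            vb = matvec2(scale2(Ib, 1.0 - w), xb)
            x = matvec2(P, [va[0] + vb[0], va[1] + vb[1]])
            best = (trace, w, x, P)
    return best[1], best[2], best[3]


# Two hypothetical feature-location estimates with complementary uncertainty.
w, x, P = covariance_intersection(
    [1.0, 0.0], [[4.0, 0.0], [0.0, 1.0]],
    [0.0, 1.0], [[1.0, 0.0], [0.0, 4.0]],
)
```

In this symmetric toy case the search settles on equal weights. The fused covariance is deliberately conservative, since covariance intersection makes no assumption about how the two inputs are correlated.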
Exploiting the uncertainty estimation techniques from Chapter 5 and the convex optimisation tools from Chapter 4, in Chapter 6 we present a robust convex optimisation solution for monocular motion estimation systems. Critical components of computer vision systems rely on robust feature extraction, matching and tracking. Because of the way features are extracted, the accuracy of image feature locations depends heavily on the variation in intensity within their neighbourhoods, from which their uncertainties are estimated. In this chapter, we incorporate the uncertainty in feature positions, estimated via the SIFT and Harris derivative approaches, along with its propagation.
In practice, for any navigation system, errors in position estimates grow continuously because of the integration of noisy measurements over time and imperfect computational techniques. This unavoidable drift in motion estimation, which is also due to the inherent inaccuracy of the devices, needs to be corrected. Providing additional correction tools therefore has a crucial impact on the final estimates of the navigation solution. Indeed, after a long navigation in an unknown environment, detecting that the vehicle has returned to a previously visited location offers the opportunity to correct the vehicle motion estimates and to increase their accuracy and consistency. In computer vision, this is known as loop-closure detection. In Chapter 7, we present a novel appearance-based technique for visual loop-closure detection. The widely used techniques based on the Bag-of-Words image representation have shown some limitations, especially with the perceptual aliasing problem. Our solution, however, uses both local invariant features and colour features. Moreover, the proposed solution combines Gaussian mixture modelling (GMM) with the KD-tree data structure. In doing so, it takes advantage of the robustness of the KD-tree data structure and the efficiency of the Gaussian mixture modelling representation. Experimental validation has been conducted using datasets from different environments. We show that, owing to their efficiency and complementarity, a combination of KD-trees with GMM could be an alternative for real-time loop-closure detection in mobile robot navigation.
In Chapter 8, we present a new robust convex pose-graph optimisation solution for UAV monocular motion estimation systems. The pose-graph formulation is an intuitive way to address the pose estimation problem. The nodes of the graph represent the vehicle's poses, and the edges encode measurements that constrain the connected poses. Solving the pose-graph problem involves finding the configuration of the nodes that best satisfies the constraints. Most methods in the literature use standard approaches such as the Gauss-Newton or Levenberg-Marquardt algorithms. However, these methods offer no guarantee of convergence to the global minimum, and they could lead to an infeasible solution. As such, they are also very dependent on good initialisation. Alternatively, the proposed solution recovers the optimal position configuration by using convex optimisation with more robust norms such as the L∞ norm. Furthermore, uncertainty estimates, based on the SIFT derivative approach and their propagation through multi-view geometry algorithms, are included in this solution. Once a loop closure is detected using the technique presented in Chapter 7, the convex pose-graph optimisation solution corrects any drift that has occurred during the monocular motion estimation.
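The pose-graph structure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the formulation of Chapter 8: poses are reduced to 2D positions without orientation, the names `poses`, `edges`, `edge_residual`, `linf_cost` and `l2_cost` are hypothetical, and the code only evaluates the cost of a given configuration. Minimising the L∞ cost would require a convex solver (for example an SOCP solver), which is not shown.

```python
import math

# Nodes: pose index -> estimated 2D position (orientation omitted for brevity).
poses = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.1, 1.0)}

# Edges: (i, j, (dx, dy)) states that pose j should sit at pose i + (dx, dy).
edges = [
    (0, 1, (1.0, 0.0)),
    (1, 2, (0.0, 1.0)),
    (2, 3, (-1.0, 0.0)),
    (3, 0, (0.0, -1.0)),  # loop-closure edge back to the starting pose
]


def edge_residual(poses, edge):
    """Euclidean distance between the measured and the implied displacement."""
    i, j, (dx, dy) = edge
    xi, yi = poses[i]
    xj, yj = poses[j]
    return math.hypot(xj - xi - dx, yj - yi - dy)


def linf_cost(poses, edges):
    """L-infinity cost: the worst residual over all edges."""
    return max(edge_residual(poses, e) for e in edges)


def l2_cost(poses, edges):
    """Sum of squared residuals, the usual least-squares cost, for comparison."""
    return sum(edge_residual(poses, e) ** 2 for e in edges)


worst = linf_cost(poses, edges)
total = l2_cost(poses, edges)
```

Here pose 3 has drifted by 0.1 along x, so the two edges touching it carry the error. The L∞ cost reports the largest single violation, whereas the least-squares cost accumulates all of them.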
After developing robust solutions for visual navigation systems, in which an autonomous vehicle can estimate its own location independently, a need for cooperative solutions has arisen. Thus, Chapter 9 deals with cooperative navigation using convex optimisation. In this chapter, a system for cooperative monocular visual motion estimation with multiple aerial vehicles is proposed. Distributing the system across vehicles allows efficient processing in terms of both computational time and estimation accuracy. The global cooperative motion estimation employs state-of-the-art approaches for optimisation, individual motion estimation and registration. Three-view geometry algorithms in a convex optimisation framework are deployed on board the monocular vision system of each vehicle. In addition, vehicle-to-vehicle relative pose estimation is performed with a novel robust registration solution in a global optimisation framework. In parallel, and as a complementary solution for the relative pose, a robust non-linear H∞ filter is designed to fuse measurements from the UAVs' on-board inertial sensors with the visual estimates.
This thesis is concluded in Chapter 10 where we summarise the obtained results and achievements, and finally point out future directions.
In addition, this thesis includes three appendices. Appendix A presents a review of some multiple-view geometry problems that can be solved using convex optimisation. Particular attention is given to the problems of triangulation, camera resectioning and homography estimation. These tasks were implemented in this thesis using second-order cone programming (SOCP) on benchmark datasets familiar to the computer vision community.