The term uncertainty refers to doubt and, in the measurement framework, it stands for doubt about the validity of the measurement results. This measurement uncertainty characterises the dispersion of the values that could be attributed to the quantity being measured (the measurand) [91]. In effect, this uncertainty reflects the lack of knowledge of the value of the measurand. More precisely, the uncertainty indicates the upper and lower values that an uncertain variable may assume after all systematic biases have been corrected.
Estimating the uncertainty alone may not be sufficient in some vision applications. One of the most important tasks is to evaluate its propagation through a particular model. To illustrate this, let us consider a system with inputs $X = (X_1, \dots, X_n)^\top$ and outputs $Y = (Y_1, \dots, Y_m)^\top$, such that:
\[
Y = f(X), \qquad f = (f_1, f_2, \dots, f_m)^\top \tag{6.4}
\]
where $f$ is the measurement model. Given an estimate $x$ of $X$, an estimate of $Y$ is:
\[
y = f(x) \tag{6.5}
\]
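The $m \times m$ covariance matrix of the output $y$ is:
\[
\Lambda_y =
\begin{bmatrix}
u(y_1, y_1) & \cdots & u(y_1, y_m) \\
\vdots & \ddots & \vdots \\
u(y_m, y_1) & \cdots & u(y_m, y_m)
\end{bmatrix} \tag{6.6}
\]
and is given by:
\[
\Lambda_y = J_y \Lambda_x J_y^\top \tag{6.7}
\]
where $\Lambda_x$ is the covariance matrix of the input estimate $x$ and $J_y$ is the Jacobian matrix of $f$ with respect to the inputs, also called the sensitivity matrix, of dimension $m \times n$, given by:
\[
J_y =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial X_1} & \cdots & \dfrac{\partial f_1}{\partial X_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial X_1} & \cdots & \dfrac{\partial f_m}{\partial X_n}
\end{bmatrix} \tag{6.8}
\]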
The uncertainty of the output given in (6.7) is in fact estimated through a first-order Taylor series approximation [50]:
\[
u_y^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \frac{\partial f}{\partial X_i} \right) \left( \frac{\partial f}{\partial X_j} \right) u(x_i, x_j) \tag{6.9}
\]
where $u(x_i, x_j)$ is the covariance of $x_i$ and $x_j$ and, when $i = j$, $\sqrt{u(x_i, x_i)} = u_{x_i}$ is the uncertainty of $x_i$. Equation (6.9) can be written in a more general form:
\[
\Lambda_y = J_y \Lambda_x J_y^\top \tag{6.10}
\]
where $\Lambda_x$ is the input covariance matrix, and $J_y$ is the Jacobian matrix, namely the matrix of partial derivatives. Equation (6.10) is known as the propagation property of the uncertainty through non-linear systems [50].
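The first-order propagation rule can be illustrated numerically. The following is a minimal sketch only, assuming NumPy and a central-difference Jacobian; the function names (`numerical_jacobian`, `propagate_covariance`) and the polar-to-Cartesian measurement model are hypothetical choices made for illustration and are not taken from the implementation used in this work.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference approximation of the m x n Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x), dtype=float)
    J = np.zeros((y0.size, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        J[:, k] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2.0 * eps)
    return J

def propagate_covariance(f, x, cov_x):
    """First-order propagation: cov_y = J cov_x J^T, with J evaluated at x."""
    J = numerical_jacobian(f, x)
    return J @ np.asarray(cov_x, dtype=float) @ J.T

def polar_to_cartesian(p):
    """Illustrative non-linear model (an assumption): (r, theta) -> (x, y)."""
    r, theta = p
    return np.array([r * np.cos(theta), r * np.sin(theta)])

# Input estimate and its covariance (illustrative values only).
x = np.array([2.0, np.pi / 4])
cov_x = np.diag([0.01, 0.0004])

cov_y = propagate_covariance(polar_to_cartesian, x, cov_x)
print(cov_y)
```

Because the approximation is linear, it is only reliable when the input uncertainties are small relative to the curvature of the model around the estimate.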
6.5 Feature location uncertainty
Selecting the most appropriate feature extraction technique for a given vision application is not straightforward. A relatively large variety of feature detectors is used in the literature for such applications, and the final performance of these solutions varies from one feature extraction technique to another. As with the previous solution, in this chapter we investigate the performance of our solution using the Harris corner detector and the SIFT extractor. Since the detected feature points have some uncertainty regardless of the nature of the detector, the solution proposed in this chapter estimates the motion using a robust convex optimisation scheme based on those uncertainties and their propagation.
In Chapter 5 (Section 5.4, page 105) we introduced techniques that can be used to estimate the uncertainty in feature positions from the Harris corner detector and from the SIFT extractor. The derivative approach is used in this solution as well, where the covariance matrix is recovered as the inverse of the Hessian matrix.
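The idea of recovering a covariance matrix as the inverse of a Hessian can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Chapter 5: it assumes that a scalar cost surface has been sampled on a 3 × 3 pixel neighbourhood centred on the detected feature, and it estimates the Hessian by central differences. The function names and the quadratic test surface are hypothetical.

```python
import numpy as np

def hessian_3x3(patch):
    """Central-difference Hessian at the centre of a 3x3 patch.

    The patch is indexed as patch[row, col] = patch[y, x], with unit pixel spacing.
    """
    p = np.asarray(patch, dtype=float)
    dxx = p[1, 2] - 2.0 * p[1, 1] + p[1, 0]
    dyy = p[2, 1] - 2.0 * p[1, 1] + p[0, 1]
    dxy = (p[2, 2] - p[2, 0] - p[0, 2] + p[0, 0]) / 4.0
    return np.array([[dxx, dxy], [dxy, dyy]])

def location_covariance(patch):
    """Covariance of the feature location taken as the inverse of the Hessian."""
    return np.linalg.inv(hessian_3x3(patch))

# Assumed quadratic cost surface c(x, y) = 0.5 * [x y] A [x y]^T around the feature.
A = np.array([[4.0, 1.0], [1.0, 2.0]])
offsets = (-1, 0, 1)
patch = np.array([[0.5 * np.array([x, y]) @ A @ np.array([x, y]) for x in offsets]
                  for y in offsets])

cov = location_covariance(patch)
print(cov)
```

A sharply peaked surface (large curvature) yields a small covariance, whereas a flat surface yields a large one, which is the behaviour expected of a localisation uncertainty.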
These techniques are implemented on three challenging datasets, and the results are shown in Figure 6.2 and Figure 6.3. The first dataset was collected in our laboratory using a Pioneer P3-DX platform with a fully calibrated forward-looking camera (Section 1.6, Chapter 1, page 8). The second was gathered from a vehicle travelling in an urban environment with a forward-pointing calibrated camera mounted on it [68]. The third is a collection of data from a Mars/Moon analogue site at Devon Island, Nunavut [64].
Figure 6.4 and Figure 6.5 show that the feature point localisation uncertainties obtained with the Harris corner detector are smaller than those estimated with SIFT in all environments. The results are also summarised in Table 6.1. In the urban environment, for example, the average error for the Harris corner detector is of the order of 0.04 pixels, whereas it reaches 0.15 pixels with the SIFT extractor. In the Moon/Mars analogue environment, owing to its nature, these uncertainties increase markedly (0.05 pixels for the Harris corner detector and 0.25 pixels for the SIFT extractor), which directly affects the subsequent motion estimates. The same pattern is recorded for the indoor environment, where the average error is 0.03 pixels for the Harris detector and about 0.12 pixels for the SIFT extractor. Overall, however, the errors here are lower than in the two other environments. This is due to the nature of the environment, where features are relatively closer.
[Figure 6.2, panels: (a) Indoor environment; (b) Urban environment; (c) Moon/Mars analogue environment]
[Figure 6.3, panels: (a) Indoor environment; (b) Urban environment; (c) Moon/Mars analogue environment]
Fig. 6.4 Average feature points localisation errors using Harris. [Plot: average errors [pixels] against time-step.]
[Fig. 6.5: plot of average errors [pixels] against time-step; legend: indoor environment, urban environment, Moon/Mars analogue environment.]
Table 6.1 Average localisation errors [pixels] of the extracted feature points using Harris and SIFT.

Environment                        Harris   SIFT
Indoor environment                 0.03     0.12
Urban environment                  0.04     0.15
Moon/Mars analogue environment     0.05     0.25
Matching in the Harris corner detector algorithm is performed using the cross-correlation between local image patches, which means that only features that correlate most strongly with each other in both directions are accepted. As a serious drawback, the matching accuracy and robustness of this algorithm therefore depend entirely on the actual transformation between views. The SIFT extractor, on the other hand, uses the Euclidean distance between two feature vectors as the similarity criterion for two keypoints and matches them with the nearest-neighbour algorithm, which significantly increases its accuracy. Even though matching with the Harris corner detector can be performed at low computational cost, its accuracy is compromised compared with the accurate and robust matching provided by the SIFT algorithm.
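The two matching strategies described above can be sketched as follows. This is a simplified illustration assuming NumPy, equally sized grey-level patches and fixed-length descriptor vectors. The function names are hypothetical, and the sketch omits refinements often used in practice, such as a distance-ratio test.

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation between two equally sized patches."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def mutual_ncc_matches(patches1, patches2):
    """Keep pairs (i, j) that are each other's best correlation in both directions."""
    scores = np.array([[ncc(p, q) for q in patches2] for p in patches1])
    best12 = scores.argmax(axis=1)  # best candidate in view 2 for each patch of view 1
    best21 = scores.argmax(axis=0)  # best candidate in view 1 for each patch of view 2
    return [(i, int(j)) for i, j in enumerate(best12) if best21[j] == i]

def nearest_neighbour_matches(desc1, desc2):
    """Match each descriptor of view 1 to its Euclidean nearest neighbour in view 2."""
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    return [(i, int(j)) for i, j in enumerate(d.argmin(axis=1))]

# Synthetic data (illustrative only): view 2 is a shuffled copy of view 1.
rng = np.random.default_rng(0)
perm = np.array([2, 0, 3, 1])
patches1 = rng.random((4, 5, 5))
patches2 = patches1[perm]
desc1 = rng.random((4, 128))
desc2 = desc1[perm] + 1e-3 * rng.standard_normal((4, 128))

print(mutual_ncc_matches(patches1, patches2))
print(nearest_neighbour_matches(desc1, desc2))
```

The mutual check discards any feature whose best match is not reciprocated, whereas the nearest-neighbour rule always returns one candidate per descriptor.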
Analysing the results shown in Figure 6.4 and Figure 6.5 for the SIFT extractor and the Harris corner detector in all environments, the SIFT extractor appears to provide relatively more stable and conservative uncertainty estimates. This property is needed to avoid underestimated uncertainties, such as those obtained with the Harris detector, which could affect the performance of the motion estimation. This, together with the matching accuracy of the SIFT extractor, justifies our use of the SIFT extractor, along with the covariance intersection of its uncertainties in each RGB channel, in our monocular motion estimation algorithm.
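Covariance intersection fuses several estimates of the same quantity without requiring their cross-correlations. A minimal sketch is given below, assuming NumPy and one feature position estimate per colour channel. The uniform weights are an assumption made purely for illustration, since the weight selection is not specified in this section; a common alternative is to choose the weights that minimise the trace or the determinant of the fused covariance. The function name and numerical values are hypothetical.

```python
import numpy as np

def covariance_intersection(means, covs, weights=None):
    """Fuse estimates (mean_i, cov_i) with convex weights that sum to one.

    fused_cov^-1 = sum_i w_i cov_i^-1
    fused_mean   = fused_cov @ sum_i w_i cov_i^-1 mean_i
    """
    means = [np.asarray(m, dtype=float) for m in means]
    covs = [np.asarray(c, dtype=float) for c in covs]
    if weights is None:
        # Assumption: uniform weights over the available estimates.
        weights = np.full(len(covs), 1.0 / len(covs))
    weights = np.asarray(weights, dtype=float)
    info = sum(w * np.linalg.inv(c) for w, c in zip(weights, covs))
    fused_cov = np.linalg.inv(info)
    fused_mean = fused_cov @ sum(
        w * np.linalg.inv(c) @ m for w, c, m in zip(weights, covs, means)
    )
    return fused_mean, fused_cov

# Illustrative per-channel estimates of one feature position (pixels).
means_rgb = [np.array([10.0, 20.0]), np.array([10.1, 19.9]), np.array([9.9, 20.1])]
covs_rgb = [np.diag([0.04, 0.09]), np.diag([0.09, 0.04]), np.diag([0.06, 0.06])]

fused_mean, fused_cov = covariance_intersection(means_rgb, covs_rgb)
print(fused_mean)
print(fused_cov)
```

With convex weights the fused covariance is never optimistic about correlations between the channels that are unknown, which is consistent with the preference for conservative uncertainty estimates expressed above.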
In addition, one of the main novel aspects of our solution in this chapter is the estimation of the uncertainties propagated from the feature positions to the rotation matrices, the translation vectors and the 3D scene points. In the following two sections, we introduce the techniques used to estimate these propagated uncertainties.
The uncertainties in the rotation matrices and in the translation vectors are estimated by propagating the feature position uncertainties through the eight-point algorithm and the singular value decomposition (SVD) algorithm. These new uncertainties in the rotations and translations, together with the original uncertainties in the feature positions, are then propagated further to the 3D scene points through the triangulation algorithm.