To avoid running BA unnecessarily often, we use the linear camera pose estimation technique proposed by Jiang et al. [JCT13]. This linear (and therefore fast) algorithm solves for global camera rotations and positions in a least squares sense, based on pairwise camera rotation constraints and triplet camera translation constraints. Since it is not available as an out-of-the-box implementation, we describe it here briefly.
Input and Output
To solve for the global camera rotations, relative rotations between pairs of camera poses have to be supplied such that all poses are connected transitively by these constraints.
To solve for the global camera positions, constraints on triplets of camera poses have to be given. Each constraint contains the ratios of the distances between the camera positions in the triplet and the direction vectors from each camera's position to the other two (see Figure 4.3).
It should be clear that both rotation and translation constraints can be extracted from a BA reconstruction with at least three camera poses. According to [JCT13], an (over-)determined problem can be obtained when all camera poses are connected transitively by triplets which have at least two camera poses in common (to transport scale information throughout the whole optimization problem). In fact, we use all possible camera pose pairs and triplets from our BA reconstructions to generate constraints for this linear method.
4.3 Method
Figure 4.4: Illustration of the variables used for recovering globally consistent camera positions.
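The connectivity requirement on the triplets can be checked before the linear system is built. The following is a minimal sketch, not part of [JCT13]: the function name `triplets_are_scale_connected` and the representation of poses as integer indices are assumptions made for illustration. It treats two triplets as linked when they share at least two camera poses and tests whether all triplets form a single linked group that covers every pose.

```python
from itertools import combinations

def triplets_are_scale_connected(num_poses, triplets):
    """Return True if every pose occurs in some triplet and all triplets are
    linked through chains of triplets sharing at least two poses."""
    triplets = [frozenset(t) for t in triplets]
    if not triplets or set().union(*triplets) != set(range(num_poses)):
        return False

    # Union-find over triplet indices.
    parent = list(range(len(triplets)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(range(len(triplets)), 2):
        if len(triplets[a] & triplets[b]) >= 2:
            parent[find(a)] = find(b)

    return len({find(x) for x in range(len(triplets))}) == 1
```

For example, the triplets (0, 1, 2) and (1, 2, 3) share two poses and pass the check, whereas (0, 1, 2) and (2, 3, 4) share only pose 2, so scale cannot be transported between them.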
The result of Jiang et al.'s method is a globally consistent rotation matrix $R_i$ and a translation vector $t_i$ per camera pose, optimized in a least squares sense.
Recovering Rotation
For recovering globally consistent rotations, the method of Martinec and Pajdla [MP07] is employed: given the relative rotations $R_{ij}$, the following must hold for every column vector $r_i^k$ of a resulting global rotation matrix $R_i = [r_i^1 \; r_i^2 \; r_i^3]$:
$$ r_j^k - R_{ij}\, r_i^k = 0 \tag{4.1} $$
By concatenating all such constraints from all the camera pose pairs given as input, one can obtain an overdetermined linear system of equations which can be solved for all $r_i^k$ according to Section 2.3.1. Since orthonormality of the recovered $R_i$ matrices is not enforced in the linear equations, the resulting matrices have to be projected to the space of orthonormal matrices. This can be done in a least squares sense by using an SVD as shown in Equation 3.7.
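The rotation step can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the implementation used here: the homogeneous system is solved by taking the three right singular vectors with the smallest singular values (Section 2.3.1 is not reproduced, so this choice is assumed), the convention $R_j = R_{ij} R_i$ is read off Equation 4.1, and the gauge is fixed by setting the first rotation to the identity. The names `recover_global_rotations`, `project_to_rotation` and `axis_angle` are hypothetical.

```python
import numpy as np

def project_to_rotation(M):
    """Nearest rotation matrix to M in the Frobenius sense, via SVD."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

def recover_global_rotations(num_poses, relative_rotations):
    """relative_rotations maps (i, j) -> R_ij with R_j = R_ij @ R_i."""
    blocks = []
    for (i, j), R_ij in relative_rotations.items():
        block = np.zeros((3, 3 * num_poses))
        block[:, 3 * j:3 * j + 3] = np.eye(3)   # coefficient of r_j^k
        block[:, 3 * i:3 * i + 3] = -R_ij       # coefficient of r_i^k
        blocks.append(block)
    A = np.vstack(blocks)

    # The last three right singular vectors span the least-squares solution
    # for the stacked column vectors r_i^1, r_i^2, r_i^3.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-3:].T                               # shape (3 * num_poses, 3)
    if np.linalg.det(X[:3]) < 0:                # remove a global reflection
        X = -X

    rotations = [project_to_rotation(X[3 * i:3 * i + 3])
                 for i in range(num_poses)]
    R0_inv = rotations[0].T                     # gauge: first rotation = I
    return [R @ R0_inv for R in rotations]

def axis_angle(axis, angle):
    """Rodrigues' formula, used only to build synthetic test rotations."""
    axis = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Synthetic example: four poses and five noise-free pairwise constraints.
gt = [axis_angle([1, 2, 3], 0.4 * i) @ axis_angle([0, 1, -1], 0.3)
      for i in range(4)]
pairs = [(0, 1), (1, 2), (2, 3), (0, 2), (0, 3)]
rel = {(i, j): gt[j] @ gt[i].T for i, j in pairs}
est = recover_global_rotations(4, rel)
```

Because the global orientation is unobservable from relative rotations alone, the estimates in `est` agree with the ground truth only up to the common rotation that maps the first pose to the identity.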
Recovering Translation
Given the distance ratios and direction vectors between a triplet of camera poses as shown in Figure 4.3, we can express one of the three camera translation vectors $t_k$ based on the other two, $t_i$ and $t_j$, as illustrated in Figure 4.4:
$$ t_k \approx \frac{1}{2}\Big( \big(t_i + R_i(\theta_i')\, s_{ij}^{ik}\, (t_j - t_i)\big) + \big(t_j + R_j(-\theta_j')\, s_{ij}^{jk}\, (t_i - t_j)\big) \Big) \tag{4.2} $$
Here, $R_i(\cdot)$ is a rotation matrix around the axis $c_{ij} \times c_{ik}$ and can be obtained from the
Figure 4.5: Example of a partially rotated camera pose reconstruction caused by underconstrained linking of scene parts. The blue and gray parts can rotate against each other around the weak link's axis.
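$t_{jk}$ are free from error, one can define the following three linear constraints:
$$ 2t_k - t_i - t_j = R_i(\theta_i')\, s_{ij}^{ik}\, (t_j - t_i) + R_j(-\theta_j')\, s_{ij}^{jk}\, (t_i - t_j) \tag{4.3} $$
$$ 2t_j - t_i - t_k = R_i(-\theta_i')\, s_{ik}^{ij}\, (t_k - t_i) + R_k(\theta_k')\, s_{ik}^{jk}\, (t_i - t_k) \tag{4.4} $$
$$ 2t_i - t_j - t_k = R_j(\theta_j')\, s_{jk}^{ij}\, (t_k - t_j) + R_k(-\theta_k')\, s_{jk}^{ik}\, (t_j - t_k) \tag{4.5} $$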
Given the constraints, a linear system $At = 0$ can be set up, where $t$ is the concatenation of all camera translations and $A$ contains all coefficients from the three linear equations for all camera pose triplets. The solution is obtained as the eigenvector associated with the fourth smallest eigenvalue of $A^\top A$. The eigenvectors belonging to the three zero eigenvalues correspond to the three degrees of freedom of the origin of the coordinate system.
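A minimal NumPy sketch of the translation step is given below. It makes several assumptions that are not taken from [JCT13]: each triplet is supplied as a dictionary `local` of pairwise vectors between its poses, expressed in the global orientation but in an arbitrary per-triplet scale, so that it carries both the direction vectors and the distance ratios. The product of rotation and distance ratio in Equations 4.3 to 4.5 is built directly from two such vectors (hypothetical helper `scaled_rotation`) instead of from explicit angles. All function names are illustrative.

```python
from itertools import combinations, permutations
import numpy as np

def scaled_rotation(v_from, v_to):
    """Matrix s * R mapping v_from onto v_to: R rotates about v_from x v_to,
    s = |v_to| / |v_from|. The two vectors must not be collinear."""
    a = v_from / np.linalg.norm(v_from)
    b = v_to / np.linalg.norm(v_to)
    v = np.cross(a, b)
    K = np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])
    R = np.eye(3) + K + (K @ K) * (1.0 - a @ b) / (v @ v)
    return R * np.linalg.norm(v_to) / np.linalg.norm(v_from)

def triplet_rows(num_poses, a, b, c, local):
    """Rows of A for 2 t_c - t_a - t_b = M_a (t_b - t_a) + M_b (t_a - t_b)."""
    M_a = scaled_rotation(local[(a, b)], local[(a, c)])
    M_b = scaled_rotation(local[(b, a)], local[(b, c)])
    rows = np.zeros((3, 3 * num_poses))
    rows[:, 3 * c:3 * c + 3] = 2.0 * np.eye(3)
    rows[:, 3 * a:3 * a + 3] = -np.eye(3) + M_a - M_b
    rows[:, 3 * b:3 * b + 3] = -np.eye(3) - M_a + M_b
    return rows

def recover_translations(num_poses, triplets):
    """triplets: list of ((i, j, k), local). Returns a (num_poses, 3) array
    that is defined up to global scale, sign and origin."""
    blocks = []
    for (i, j, k), local in triplets:
        # Role assignments matching Equations 4.3, 4.4 and 4.5.
        for a, b, c in ((i, j, k), (i, k, j), (j, k, i)):
            blocks.append(triplet_rows(num_poses, a, b, c, local))
    A = np.vstack(blocks)
    _, eigvecs = np.linalg.eigh(A.T @ A)        # eigenvalues in ascending order
    return eigvecs[:, 3].reshape(num_poses, 3)  # fourth smallest eigenvalue

# Synthetic example: four non-coplanar poses, each triplet in its own scale.
rng = np.random.default_rng(0)
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
               [0.2, 1.1, 0.0], [0.4, 0.3, 0.9]])
triplets = []
for scale, ids in zip((1.0, 2.0, 0.5, 3.0), combinations(range(4), 3)):
    local = {(p, q): scale * (gt[q] - gt[p]) + 1e-3 * rng.standard_normal(3)
             for p, q in permutations(ids, 2)}
    triplets.append((ids, local))
est = recover_translations(4, triplets)
```

The example adds slight noise on purpose. With perfectly consistent input the fourth eigenvalue is also zero, so the fourth eigenvector is no longer distinguishable from the three translational ones. The result `est` has unit norm and zero mean and therefore has to be compared with the ground truth after aligning scale, sign and origin.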
Issues
Besides several ambiguities described in [JCT13], we observed two issues with the algorithm that needed special treatment:
Partial Rotation. The scene reconstruction may be partially rotated. This happens when the camera poses to be recovered fall into two subsets and all triplets that overlap both subsets contain the same "weak" link. As illustrated in Figure 4.5, false reconstructions that satisfy all the applied angular and ratio constraints are possible in this case.
Such configurations must be avoided, e.g. by using only BA reconstructions with at least four camera poses as input for the linear camera pose estimation and extracting all possible triplets from them.
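The triplet extraction suggested above can be sketched in a few lines. The function name `triplets_from_reconstructions` and the representation of a BA reconstruction as a list of pose indices are assumptions made for illustration only.

```python
from itertools import combinations

def triplets_from_reconstructions(reconstructions, min_poses=4):
    """Collect all pose triplets from reconstructions that contain at least
    min_poses distinct camera poses; smaller reconstructions are skipped."""
    triplets = set()
    for poses in reconstructions:
        unique = sorted(set(poses))
        if len(unique) >= min_poses:
            triplets.update(combinations(unique, 3))
    return sorted(triplets)

# The second reconstruction has only three poses and contributes nothing.
example = triplets_from_reconstructions([[0, 1, 2, 3], [3, 4, 5]])
```

A reconstruction with four poses yields four triplets, and any two of them share two poses, so no single weak link can connect its parts.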
Numerical Instabilities. Due to inaccuracies in the input and numerical instabilities, small eigenvalues of the matrix $A^\top A$ may swap order, which leads to selecting the wrong eigenvector as the solution to the camera translation problem.
To compensate for that, we take the eigenvectors corresponding to a range of eigenvalues and re-check the constraints (Equations 4.3, 4.4 and 4.5). Finally, we pick the eigenvector that fits the constraints best.