This section focuses on the geometry of multiple-view camera systems. The projection properties of the pinhole camera model, derived in Section 2.1.1, impose geometric constraints on corresponding points across viewpoints. Different views are obtained either from sequential frames of a single camera or from frames captured simultaneously by a set of stereo cameras. Without loss of generality, this section focuses only on the stereo camera setup, as the derivations apply verbatim to the single camera case.
CHAPTER 2. CAMERA GEOMETRY 18
2.2.1 Epipolar Geometry
Epipolar geometry describes the projective geometry across two camera views [62, p. 239]. Figure 2.3 illustrates a 3D point, $x_c$, projected to two image points, $x_L$ and $x_R$, by a stereo pair depicted by their camera centres, $c_L$ and $c_R$. The straight line joining the two camera centres is known as the baseline, $B$. The camera centres, the image points and $x_c$ all lie in a common plane known as the epipolar plane. This plane, denoted by $\pi$, constrains the location of corresponding points in different viewpoints. Suppose, for the stereo rig in Figure 2.3, that the coordinates of a single image point, $x_L$, are given but $x_c$ and $x_R$ are unknown. The baseline and the ray from $c_L$ through $x_L$ determine the epipolar plane, $\pi$. The 3D point, $x_c$, must lie along this ray, and the corresponding ray from $c_R$ through $x_R$ must also lie in $\pi$. Consequently, $x_R$ must lie on the line $l_R$ in which $\pi$ intersects the right image plane. This line is known as the epipolar line of $x_L$, and by the same token there exists an epipolar line in the left image frame, $l_L$. This is significant, as a search for corresponding points is now limited to epipolar lines instead of the full image plane. The points of intersection between the baseline and the image planes are called epipoles; each epipole is the projection of the other camera's centre onto that image plane.

Figure 2.3: Epipolar geometry of a stereo pair with camera centres indicated by $c_L$ and $c_R$. The 3D point, $x_c$, the corresponding image points, $x_L$ and $x_R$, and the camera centres lie in a common plane, $\pi$. The projections of the rays from $c_L$ and $c_R$ to $x_c$, on the left and right image planes, are given as $l_L$ and $l_R$ respectively. Adapted from Hartley and Zisserman [62, Figure 9.1].
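The geometric argument above can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it assumes the projection convention x ~ K R (X - c), and the calibration matrix, poses and helper names (`rot_y`, `project`, `pixel_distance_to_line`) are hypothetical values chosen for illustration. It builds the epipolar line in the right image as the line through the right epipole and the projection of one point on the left ray, then checks that the true match lies on that line.

```python
import numpy as np

def rot_y(angle):
    """Rotation matrix for a rotation of `angle` radians about the Y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(K, R, c, X):
    """Homogeneous pinhole projection x ~ K R (X - c); not normalised."""
    return K @ R @ (X - c)

def pixel_distance_to_line(line, x):
    """Perpendicular pixel distance from homogeneous point x to a line."""
    x = x / x[2]
    return abs(line @ x) / np.hypot(line[0], line[1])

# Hypothetical calibration and poses (illustrative values only).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R_L, c_L = rot_y(0.05), np.array([0.0, 0.0, 0.0])
R_R, c_R = rot_y(-0.05), np.array([0.3, 0.0, 0.0])

# A 3D point and its two image points.
x_c = np.array([0.2, -0.1, 4.0])
x_L = project(K, R_L, c_L, x_c)
x_L = x_L / x_L[2]
x_R = project(K, R_R, c_R, x_c)

# Right epipole: projection of the left camera centre into the right view.
e_R = project(K, R_R, c_R, c_L)

# Back-project x_L to a ray from c_L. With x_L normalised, ray_dir has unit
# depth, so c_L + ray_dir is a point on the ray that differs from x_c.
ray_dir = R_L.T @ np.linalg.inv(K) @ x_L
ray_point_R = project(K, R_R, c_R, c_L + ray_dir)

# Epipolar line of x_L: the line through the epipole and the projected point.
l_R = np.cross(e_R, ray_point_R)

# Distance of the true match from the epipolar line (zero up to rounding).
dist = pixel_distance_to_line(l_R, x_R)
```

Only `x_L` and the camera parameters are used to construct `l_R`; the depth of the 3D point never enters, which is why a correspondence search can be restricted to this line.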
The constraints introduced by epipolar geometry are represented algebraically by a matrix, $F$, called the fundamental matrix. It has already been stated, for the camera system in Figure 2.3, that there is a relationship between an image point and epipolar lines in other image views. In Multiple View Geometry [62, pp. 242-243] it is shown geometrically that this relationship is linear and given by
\[
l_R = F x_L, \tag{2.16}
\]
where $F$ is a $3 \times 3$ homogeneous matrix of rank two. Furthermore, because $x_R$ lies on $l_R$ (that is, $x_R^{\top} l_R = 0$), $F$ satisfies the condition
\[
x_R^{\top} F x_L = 0, \tag{2.17}
\]
for a pair of matched points between image frames. This relationship allows $F$ to be determined from a set of matched features (at least seven matches), without reference to the individual camera matrices. The essential matrix, mentioned in Section 1.2.1, is calculated from
\[
E = K^{\top} F K, \tag{2.18}
\]
where $K$ is the camera calibration matrix defined in Equation 2.7. The essential matrix is a specialisation of the fundamental matrix that assumes normalised image coordinates. This assumption introduces additional constraints and reduces the degrees of freedom, but requires that the cameras have been calibrated and that $K$ is known. Once the essential matrix has been obtained, the relative pose of the cameras between viewpoints can be determined [62, pp. 258-259].
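Equations (2.17) and (2.18) can be checked on synthetic data. The sketch below is illustrative only. It assumes both cameras share one hypothetical calibration matrix `K`, that the relative pose is expressed as X_R = R X_L + t, and that pose is recovered with the standard SVD-based factorisation of the essential matrix, which is taken here to correspond to the method cited from [62]. All numeric values and helper names are assumptions.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot_y(angle):
    """Rotation matrix for a rotation of `angle` radians about the Y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def decompose_essential(E):
    """Return the four (R, t) candidates encoded by E; t has unit length."""
    U, _, Vt = np.linalg.svd(E)
    # Force det = +1 so the candidates are proper rotations; E is only
    # defined up to scale, so a sign flip is harmless.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R_a, R_b = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]
    return [(R_a, t), (R_a, -t), (R_b, t), (R_b, -t)]

# Hypothetical calibration and relative pose (illustrative values only).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R_true = rot_y(-0.1)
t_true = np.array([-0.3, 0.0, 0.02])

# Ground-truth matrices for this rig: E = [t]_x R and F = K^-T E K^-1.
E_true = skew(t_true) @ R_true
K_inv = np.linalg.inv(K)
F = K_inv.T @ E_true @ K_inv

# One matched pair of image points generated from a 3D point.
X_L = np.array([0.2, -0.1, 4.0])
X_R = R_true @ X_L + t_true
x_L = K @ X_L / X_L[2]
x_R = K @ X_R / X_R[2]

residual = x_R @ F @ x_L          # Equation (2.17), zero up to rounding
E = K.T @ F @ K                   # Equation (2.18)

# Pick the candidate closest to the known pose; t is compared as a direction
# because translation is only recoverable up to scale.
t_unit = t_true / np.linalg.norm(t_true)
R_best, t_best = min(
    decompose_essential(E),
    key=lambda Rt: np.linalg.norm(Rt[0] - R_true) + np.linalg.norm(Rt[1] - t_unit),
)
```

In practice the ground truth is unknown, so the correct candidate is usually selected by requiring triangulated points to lie in front of both cameras; that test is omitted here for brevity.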
2.2.2 Rectification of Stereo Images
In the previous section, it was shown how the relative pose between two cameras can be determined from a set of 2D-to-2D correspondences. As mentioned in Section 1.2.1, there are two alternative motion estimation approaches that make use of 3D-to-3D and 3D-to-2D feature correspondences. It is therefore necessary to calculate the 3D position of a feature given its image coordinates. A process known as image rectification is usually performed before 3D reconstruction takes place. The goal of image rectification is to transform the image planes so that they are coplanar and parallel to the baseline of the camera system.
Figure 2.4: Image rectification of a stereo camera pair. (a) An unrectified stereo camera pair. (b) The resulting camera system after image rectification has been performed, where the epipolar lines of both image planes are parallel to the baseline.

Consider the unrectified stereo camera pair of Figure 2.4a with projection matrices
\[
\begin{aligned}
P_L &= K_L R_L [\, I \mid -c_L \,] \\
P_R &= K_R R_R [\, I \mid -c_R \,],
\end{aligned}
\tag{2.19}
\]
describing their pose relative to a robot coordinate frame. Projective transformations, $H_L$ and $H_R$, are required such that the epipolar lines of the two image planes are transformed to be parallel to the baseline, as in Figure 2.4b. The resulting rectified camera pair can then be expressed as
\[
\begin{aligned}
P_L &= K_{\mathrm{rect}} R_{\mathrm{rect}} [\, I \mid -c_L \,] \\
P_R &= K_{\mathrm{rect}} R_{\mathrm{rect}} [\, I \mid -c_R \,],
\end{aligned}
\tag{2.20}
\]
where both cameras are modelled as having the same camera calibration matrix, $K_{\mathrm{rect}}$, and the same orientation, described by $R_{\mathrm{rect}}$. The choice of $K_{\mathrm{rect}}$ is arbitrary, and it is taken to be the average of $K_L$ and $K_R$ such that
\[
K_{\mathrm{rect}} = \frac{1}{2}(K_L + K_R). \tag{2.21}
\]
As shown by Brink et al. [86], each row of $R_{\mathrm{rect}}$ can be determined separately. The first row of $R_{\mathrm{rect}}$, which is equivalent to the vector pointing in the direction of the transformed X-axis, must be parallel to the baseline such that
\[
r_1 = \frac{c_R - c_L}{\lVert c_R - c_L \rVert}. \tag{2.22}
\]
Furthermore, the transformed Y-axis is orthogonal to $r_1$ and is chosen to be orthogonal to the viewing axis of the left camera [51],
\[
r_2 = \frac{u \times r_1}{\lVert u \times r_1 \rVert}, \tag{2.23}
\]
where $u$ is the unit vector in the direction of the left camera's viewing axis. The third row must be orthogonal to both $r_1$ and $r_2$ such that
\[
r_3 = r_1 \times r_2, \tag{2.24}
\]
where $R_{\mathrm{rect}}$ is given by
\[
R_{\mathrm{rect}} = \begin{bmatrix} r_1 & r_2 & r_3 \end{bmatrix}^{\top}. \tag{2.25}
\]
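It follows that the projective transformations are given by
\[
\begin{aligned}
H_L &= K_{\mathrm{rect}} R_{\mathrm{rect}} R_L^{\top} K_L^{-1} \\
H_R &= K_{\mathrm{rect}} R_{\mathrm{rect}} R_R^{\top} K_R^{-1}.
\end{aligned}
\tag{2.26}
\]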
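As a brief check of these transformations, the following sketch applies $H_L$ to a left image point. The notation is an assumption not fixed by the text: $x_r$ denotes the 3D point expressed in the robot frame (as labelled in Figure 2.4), and $\simeq$ denotes equality up to a non-zero scale factor.

```latex
\begin{aligned}
x_L &\simeq K_L R_L (x_r - c_L)
    && \text{projection by } P_L \text{ in (2.19)} \\
H_L x_L &\simeq K_{\mathrm{rect}} R_{\mathrm{rect}} R_L^{\top} K_L^{-1}
           \, K_L R_L (x_r - c_L) \\
        &= K_{\mathrm{rect}} R_{\mathrm{rect}} (x_r - c_L)
    && \text{since } K_L^{-1} K_L = I \text{ and } R_L^{\top} R_L = I
\end{aligned}
```

The final expression is the projection of $x_r$ by the rectified left camera of (2.20), so the transformation maps each original image point to the point the rectified camera would have observed. The same argument applies to $H_R$ with the subscript $L$ replaced by $R$.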
The rectified image planes are now determined by re-sampling image points by
\[
\begin{aligned}
x_L^{\mathrm{rect}} &= H_L x_L \\
x_R^{\mathrm{rect}} &= H_R x_R,
\end{aligned}
\tag{2.27}
\]
where $x_L^{\mathrm{rect}}$ and $x_R^{\mathrm{rect}}$ are the rectified image points of the left and right image plane respectively. Consequently, disparities in pixel coordinates between the two image frames exist in the X-direction only, with no disparity in the Y-direction. The advantages of working with a rectified stereo pair are twofold. First, searching for point correspondences is more efficient due to the simplified epipolar structure. Second, rectifying images considerably simplifies the triangulation process, which is discussed in the next section.
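The rectification steps of Equations (2.21) to (2.27) can be sketched in a few lines. This is an illustrative example under stated assumptions, not the implementation used in this work: the calibration matrices, poses and the function name `rectifying_homographies` are hypothetical, the calibration matrices are taken to have zero skew, and the left camera's viewing axis $u$ is assumed to be its Z-axis, which in the robot frame is the third row of $R_L$.

```python
import numpy as np

def rot_y(angle):
    """Rotation matrix for a rotation of `angle` radians about the Y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rectifying_homographies(K_L, R_L, c_L, K_R, R_R, c_R):
    """Build K_rect, R_rect and the transformations H_L, H_R."""
    K_rect = 0.5 * (K_L + K_R)                      # Equation (2.21)
    baseline = c_R - c_L
    r1 = baseline / np.linalg.norm(baseline)        # Equation (2.22)
    u = R_L[2]                                      # assumed viewing axis
    r2 = np.cross(u, r1)
    r2 = r2 / np.linalg.norm(r2)                    # Equation (2.23)
    r3 = np.cross(r1, r2)                           # Equation (2.24)
    R_rect = np.vstack([r1, r2, r3])                # Equation (2.25)
    H_L = K_rect @ R_rect @ R_L.T @ np.linalg.inv(K_L)   # Equation (2.26)
    H_R = K_rect @ R_rect @ R_R.T @ np.linalg.inv(K_R)
    return K_rect, R_rect, H_L, H_R

# Hypothetical, slightly misaligned stereo rig (illustrative values only).
K_L = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K_R = np.array([[810.0, 0.0, 325.0], [0.0, 805.0, 238.0], [0.0, 0.0, 1.0]])
R_L, c_L = rot_y(0.05), np.array([0.0, 0.0, 0.0])
R_R, c_R = rot_y(-0.03), np.array([0.3, 0.02, -0.01])

K_rect, R_rect, H_L, H_R = rectifying_homographies(K_L, R_L, c_L, K_R, R_R, c_R)

# Project a 3D point (robot frame) into both unrectified views.
x_r = np.array([0.2, -0.1, 4.0])
x_L = K_L @ R_L @ (x_r - c_L)
x_R = K_R @ R_R @ (x_r - c_R)

# Equation (2.27), followed by normalisation to pixel coordinates.
x_L_rect = H_L @ x_L
x_L_rect = x_L_rect / x_L_rect[2]
x_R_rect = H_R @ x_R
x_R_rect = x_R_rect / x_R_rect[2]

y_difference = x_L_rect[1] - x_R_rect[1]   # zero up to rounding
disparity = x_L_rect[0] - x_R_rect[0]      # horizontal offset only

# With zero skew, the disparity equals focal length * baseline / depth.
depth = (R_rect @ (x_r - c_L))[2]
expected_disparity = K_rect[0, 0] * np.linalg.norm(c_R - c_L) / depth
```

Only the point transformation is shown; re-sampling a full image would additionally require interpolating pixel intensities at the transformed coordinates.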