3.1 BASIC PRINCIPLES OF PROGRAMMING
3.1.3 SYMBOLS AND DESCRIPTION OF ALL ELEMENTS
This chapter has reviewed several camera calibration techniques that determine the intrinsic and extrinsic parameters, only the intrinsic parameters, or a subset of the intrinsic parameters. Each paper claims good results. Furthermore, all methods generally rely on accurately extracted points and/or lines.
For a full intrinsic and extrinsic calibration, at least 6 pairs of corresponding 3D and 2D points are required. Therefore, if 6 pairs of points could be accurately extracted, registration would be feasible. Unfortunately, it is difficult to identify human anatomical landmarks accurately and automatically. In addition, intrinsic and extrinsic parameters are known to be closely coupled, i.e. inaccurately deduced intrinsic parameters lead to inaccurate extrinsic parameters. Furthermore, most algorithms recommend using many points for calibration, e.g. 60 [Tsai, 1987]. Extracting that many points from a medical scene is impractical, if not impossible. Therefore, an initial strategy would be to use a calibration procedure to determine the intrinsic parameters. This can be done before any registration task, using existing software and an accurately machined calibration object.
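The 6-point minimum follows from counting unknowns: a 3x4 projection matrix has 11 degrees of freedom (12 entries up to scale), and each 3D/2D pair contributes two linear equations. The sketch below illustrates this with a basic direct linear transform (DLT) in Python with NumPy. It is not the method of any paper cited here; the function names `estimate_projection_matrix` and `project` are hypothetical, the points are assumed to be non-coplanar, and lens distortion and coordinate normalisation are ignored.

```python
import numpy as np

def estimate_projection_matrix(points_3d, points_2d):
    """Estimate a 3x4 projection matrix P (up to scale) from n >= 6 point pairs.

    Assumes an ideal pinhole camera (no lens distortion) and non-coplanar
    3D points. Each pair contributes two rows to a homogeneous system A p = 0.
    """
    points_3d = np.asarray(points_3d, dtype=float)
    points_2d = np.asarray(points_2d, dtype=float)
    if len(points_3d) != len(points_2d) or len(points_3d) < 6:
        raise ValueError("need at least 6 matched 3D/2D point pairs")

    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        Xh = [X, Y, Z, 1.0]
        # u * (p3 . Xh) = p1 . Xh   and   v * (p3 . Xh) = p2 . Xh
        rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
        rows.append([0.0] * 4 + Xh + [-v * c for c in Xh])

    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 4)

def project(P, points_3d):
    """Project 3D points to pixel coordinates with projection matrix P."""
    pts = np.asarray(points_3d, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    projected = homogeneous @ P.T
    return projected[:, :2] / projected[:, 2:3]
```

With noise-free synthetic data the recovered matrix reprojects unseen points correctly. With real landmarks, small localisation errors propagate into both the intrinsic and extrinsic parameters, which is the coupling described above.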
Tsai defines the 'radius of ambiguity zone' as an error measure. For a given 2D calibration point, a line is projected into 3D space, and the distance to the corresponding 3D point is measured in the plane of the test object. Tsai calibrates a Fairchild CCD 3000 camera with a Fuji 25 mm lens, and reports an average error of 0.0178 mm and a maximum of 0.0331 mm using a single set of coplanar points. He then uses a second camera to provide stereo reconstruction through triangulation, calibrates both cameras using multiple sets of coplanar points, and measures the error as 0.0198 mm. The computational time is 9 seconds for the latter case.
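The error measure described above can be sketched as a ray-plane intersection. This is an illustration under stated assumptions rather than Tsai's implementation: the camera is an ideal pinhole with intrinsic matrix `K` and world-to-camera pose `R`, `t` (so that X_cam = R X_world + t), lens distortion is ignored, and the test-object plane is taken to be the plane of constant world z through the 3D point. The function name `ambiguity_radius` is hypothetical.

```python
import numpy as np

def ambiguity_radius(K, R, t, pixel, point_3d):
    """Distance, in the test-object plane, between a back-projected ray and
    the 3D calibration point it should pass through.

    Assumes a pinhole camera with X_cam = R @ X_world + t and a test-object
    plane of constant world z (the z coordinate of point_3d).
    """
    K = np.asarray(K, dtype=float)
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float).reshape(3)
    point_3d = np.asarray(point_3d, dtype=float)

    # Camera centre and ray direction, both expressed in world coordinates.
    centre = -R.T @ t
    direction = R.T @ np.linalg.solve(K, np.array([pixel[0], pixel[1], 1.0]))
    if abs(direction[2]) < 1e-12:
        raise ValueError("ray is parallel to the test-object plane")

    # Intersect the ray with the plane z = point_3d[2].
    scale = (point_3d[2] - centre[2]) / direction[2]
    hit = centre + scale * direction
    return float(np.linalg.norm(hit - point_3d))
```

For example, with a focal length of 800 pixels and a plane 5 units in front of the camera, a one-pixel error in the extracted 2D point gives a radius of 5/800 = 0.00625 units in the plane.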
Weng uses different error measures, but reports an accuracy of 0.437 mm for reconstructed 3D points using two Cosmicar 25 mm tele-lenses. This error is higher than Tsai's, but the two cannot be compared directly because the experimental setups differ. Computing the calibration matrix and then extracting parameters has been performed
3.6 Comparison Of Algorithms
by Strat [Strat, 1984], Ganapathy [Ganapathy, 1984] and Faugeras [Faugeras, 1993], but they do not evaluate the accuracy of the calibration in terms of metrics like the 'radius of ambiguity zone' or the accuracy with which 3D points can be reconstructed. Faugeras [Faugeras, 1993] and Robert [Robert, 1996] both demonstrate the variation in recovered camera parameters when noise is added, but this depends on the experimental setup and does not indicate an absolute measure. King et al. [King et al., 1999] use Tsai's camera calibration to calibrate an operating microscope, with an accuracy of 0.26 mm at the focal plane for fixed zoom and focus, and 0.3-0.4 mm for a variable zoom and focus calibration.
In summary, a full point-based registration will be extremely difficult to do. However, the intrinsic parameters can be retrieved through calibration using existing methods. Tsai's method is widely cited, often used as a benchmark and freely available at http://www.cs.cmu.edu/~rgw/. Therefore, in general, Tsai's method will be used throughout this thesis.
3.6.2 Pose Estimation
If the intrinsic parameters are recovered through calibration, then the registration problem reduces to one of pose estimation, i.e. estimating the extrinsic parameters. However, the problem of pose estimation in computer vision is markedly different from the medical registration problem considered in this thesis. Many of the published pose estimation algorithms rely on being able to extract points [Fishler and Bolles, 1981; Wolfe et al., 1991; Haralick et al., 1994; DeMenthon and Davis, 1992b], points and lines [Liu et al., 1990; Phong et al., 1995] or angles [Wu et al., 1994]. For this type of problem, pose estimation is widely studied. However, for the same reason as above, such easily identifiable points, lines or features are unlikely to be present in a medical scene.
View-based pose estimation compares a test image against a database of pre-stored images. Although this does not require an explicit 3D model, and hence does not require feature extraction, it does require that the subject of interest be available before analysis takes place. These methods have practical limitations in terms of memory/disk usage, pre-processing time and feasible accuracy. Sufficient images must be captured to describe the likely poses and illuminations. This could involve many hundreds of images and many gigabytes of disk space. Next, principal component (eigenvector) analysis must be performed, which fortunately reduces the required disk space. When a test image is acquired, it is compared with those in the database. Results show that pose estimation
may be accurate to 0.5 to 1.0 degrees [Murase and Nayar, 1995b], but in that paper the methods estimated only one pose parameter, a rotation. In principle these methods could be extended to recover all six extrinsic parameters, but at increased inconvenience in data collection and storage, and increased computational cost.
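The view-based approach can be sketched as follows, as a simplified illustration rather than the method of [Murase and Nayar, 1995b]. It assumes a single pose parameter, computes the principal components with a singular value decomposition, and replaces any interpolation between stored views with a plain nearest-neighbour lookup. The names `build_eigenspace` and `estimate_pose` are hypothetical.

```python
import numpy as np

def build_eigenspace(images, n_components):
    """Compress a set of training images into a low-dimensional eigenspace.

    Returns the mean image, the principal axes (one per row) and the
    coordinates of every training image in that eigenspace.
    """
    data = np.asarray(images, dtype=float).reshape(len(images), -1)
    mean = data.mean(axis=0)
    # Rows of vt are the principal components of the centred image set.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    basis = vt[:n_components]
    coords = (data - mean) @ basis.T
    return mean, basis, coords

def estimate_pose(test_image, mean, basis, coords, poses):
    """Return the stored pose whose eigenspace coordinates are closest to
    those of the test image (nearest neighbour, no interpolation)."""
    query = (np.asarray(test_image, dtype=float).ravel() - mean) @ basis.T
    nearest = int(np.argmin(np.linalg.norm(coords - query, axis=1)))
    return poses[nearest]
```

Only the mean, the basis and the per-image coordinates need to be kept, which is where the saving in disk space comes from. The accuracy of this simplified lookup is limited by the spacing of the stored poses, and covering all six extrinsic parameters would multiply the number of training images required.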
These methods were classified as unsuitable for four reasons: it is difficult to collect enough images to describe all possible poses and illuminations; the patient may not be available before an operation; the surface that needs registering may not be visible before an operation; and the surface may be occluded, or its appearance may change, during an operation.
3.6.3 Tracking
In section 3.3, tracking-related algorithms were reviewed. It was stated that algorithms for computing 'structure from motion' were not applicable to this registration task. One method used a whole image sequence to reconstruct a set of points that matched the information in a series of video images. Another method computed motion estimates from optical flow. Optical flow produces a dense approximation of the true motion field, but again requires two images to compute the change in intensity pattern over time. Thus these methods are not relevant for registering a model to a single frame. The most relevant tracking algorithms are those which match a model with an image. In general, these methods calculate the extrinsic camera parameters using iterative procedures such as Newton's method to minimise a cost function, and are essentially similar to the pose estimation algorithms, except that the emphasis is on speed. For the same reasons that most computer vision algorithms are not relevant to this medical application, neither are the tracking algorithms.
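The iterative minimisation used by model-based trackers can be sketched as below. This is a generic illustration, not any specific tracker from section 3.3, and it makes several assumptions: known intrinsics `K`, a point reprojection cost (which presumes the point correspondences that are hard to obtain in a medical scene), a rotation-vector pose parameterisation, and a Gauss-Newton variant with a finite-difference Jacobian in place of a full Newton step. All function names are hypothetical.

```python
import numpy as np

def rodrigues(rvec):
    """Rotation matrix from a rotation vector (axis times angle in radians)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    Kx = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * (Kx @ Kx)

def project_model(pose, K, model_points):
    """Project model points with pose = (rx, ry, rz, tx, ty, tz)."""
    cam = np.asarray(model_points, dtype=float) @ rodrigues(pose[:3]).T + pose[3:]
    proj = cam @ np.asarray(K, dtype=float).T
    return proj[:, :2] / proj[:, 2:3]

def residuals(pose, K, model_points, image_points):
    """Stacked reprojection errors in pixels."""
    return (project_model(pose, K, model_points) - image_points).ravel()

def refine_pose(pose0, K, model_points, image_points, iterations=20, step=1e-6):
    """Gauss-Newton refinement of the six extrinsic parameters."""
    pose = np.asarray(pose0, dtype=float).copy()
    image_points = np.asarray(image_points, dtype=float)
    for _ in range(iterations):
        r = residuals(pose, K, model_points, image_points)
        # Forward-difference Jacobian, one column per pose parameter.
        J = np.empty((r.size, 6))
        for j in range(6):
            d = np.zeros(6)
            d[j] = step
            J[:, j] = (residuals(pose + d, K, model_points, image_points) - r) / step
        delta = np.linalg.lstsq(J, -r, rcond=None)[0]
        pose = pose + delta
        if np.linalg.norm(delta) < 1e-10:
            break
    return pose
```

Like the trackers described above, this sketch needs a starting pose close to the true one, which a tracker obtains from the previous frame, and it has no safeguard against diverging from a poor initial estimate.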