Chapter 3: Analysis of the Results.
3.3. Content Analysis of the Comprehensive Assessments.
In general endoscopic surgery, augmented reality (AR) is also used to enhance the view of the surgeon. The methods that are applied to other body parts may also be applicable to the sinuses. This section describes the state of the art of the AR methods used in endoscopic surgeries other than sinus surgery.
State of the art of 3D augmented reality techniques in general endoscopic surgery
The information needed for augmented reality is mostly acquired as 2D endoscopic images, sometimes combined with pre-operative CT or MRI data. The surgical scene is usually shown as 2D endoscopic video, and the images are sometimes converted to a 3D view. This is all very similar to acquisition and visualization for sinus surgery. An important difference is that the structures of the sinus cavities are rigid, while the size and volume of other body parts can change over time. In [36] a very broad outline of the steps needed to create an AR view is given. That paper states that current techniques are advanced enough to be applied in medicine, but that improvement is still necessary for surgeons to benefit more from AR display methods. The outline of this section is the same as that of the previous section on sinus surgery, since the process of enhancing the surgeon's view is more or less the same.
Calibration and registration
Camera calibration is normally easily performed using a checkerboard pattern. However, due to vibration and refocusing of an endoscopic camera, re-calibration during surgery might be necessary. In [37] this challenge is addressed by a framework that continuously checks the performance of the calibration. When the update criteria are met, a particle filter is activated to re-calibrate, with the epipolar constraints used as the weighting function.
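To illustrate how an epipolar constraint can act as a particle weight, the sketch below scores candidate calibrations by how far matched points lie from their epipolar lines. It is only an illustration and not the implementation of [37]: representing each particle as a 3x3 fundamental matrix, the Gaussian likelihood, the `sigma` value and the function names are all assumptions made for this example.

```python
import math

def epipolar_distance(F, x1, x2):
    """Pixel distance from x2 to the epipolar line F @ x1 in the second image."""
    p1 = (x1[0], x1[1], 1.0)
    p2 = (x2[0], x2[1], 1.0)
    line = [sum(F[r][c] * p1[c] for c in range(3)) for r in range(3)]
    norm = math.hypot(line[0], line[1])
    if norm == 0.0:
        return float("inf")  # degenerate line: treat as a very poor fit
    return abs(sum(line[i] * p2[i] for i in range(3))) / norm

def particle_weights(candidates, matches, sigma=1.0):
    """Normalized weights for candidate fundamental matrices (hypothetical particles).

    Assumption: a Gaussian likelihood on the mean epipolar distance, with
    sigma given in pixels.
    """
    raw = []
    for F in candidates:
        err = sum(epipolar_distance(F, a, b) for a, b in matches) / len(matches)
        raw.append(math.exp(-0.5 * (err / sigma) ** 2))
    total = sum(raw)
    if total == 0.0:
        return [1.0 / len(raw)] * len(raw)  # no candidate fits: fall back to uniform
    return [w / total for w in raw]
```

A particle whose calibration is consistent with the matched features receives most of the weight, so resampling concentrates on calibrations that satisfy the epipolar geometry.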
In [38] a mapping algorithm to register 2D images to 3D volumes is given. It targets 2D information presented in slices, such as CT or MRI images, so it can be called “slice-to-volume registration”. The method is capable of capturing in-plane transformations, such as a heartbeat for images of a heart. It makes use of Markov random fields (an explanation of this mathematical method can be found in [39]), is based on intensity and is independent of the metric. It reaches a precision comparable to that of other state-of-the-art methods.
Acquisition
3D one. There are twelve different depth cues identified, and the main conclusion is that combining them gives the most accurate results. Also, using multiple images to acquire depth cues gives more accurate results than using only one image.
In [41] a comparative validation study of different 3D reconstruction techniques for laparoscopic surgery is presented. The focus is on “single-shot” techniques, so no movement of the endoscope is needed. The validated methods are stereoscopy, structured light and time of flight. The conclusion is that stereoscopy is currently the only feasible real-time solution for obtaining 3D information during surgery. All methods had robustness problems when the lens was contaminated by smoke or blood.
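As a brief illustration of why stereoscopy yields depth from a single shot, the sketch below applies the standard relation Z = f · B / d for a rectified pinhole stereo pair. The function name and the numeric values are hypothetical and are not taken from [41].

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Depth in mm for a rectified stereo pair: Z = f * B / d.

    Assumptions: pinhole cameras, rectified images, focal length in pixels,
    baseline in millimetres. Returns None where the disparity is not positive
    (no valid match, or a point at infinity).
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_mm / disparity_px

# Hypothetical stereo endoscope: 800 px focal length, 5 mm baseline.
depths = [depth_from_disparity(d, 800.0, 5.0) for d in (40.0, 20.0, 0.0)]
```

Because depth is inversely proportional to disparity, halving the disparity doubles the estimated depth, which is why a small baseline makes distant structures harder to resolve.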
In methods for body parts other than the sinuses, a technique that is often used is shape from shading (SfS). This method is based on the fact that the amount of light reflected by a surface is determined by the orientation of the surface, the position and type of the light source, and the position of the observer or camera. The technique is most often applied densely, to the whole image.
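The dependence of reflected light on surface orientation and light position can be sketched with a simple forward model. The example below assumes a Lambertian (perfectly diffuse) surface and a point light with inverse-square falloff; these are simplifying assumptions for illustration, and under them the observer position drops out. The function name is hypothetical. SfS methods invert a model of this kind to recover orientation from observed intensity.

```python
import math

def lambertian_intensity(normal, surface_point, light_pos, albedo=1.0, light_power=1.0):
    """Reflected intensity of a Lambertian surface patch lit by a nearby point light.

    Assumptions: diffuse reflection only, inverse-square falloff, no shadows.
    """
    to_light = [l - p for l, p in zip(light_pos, surface_point)]
    dist = math.sqrt(sum(c * c for c in to_light))
    n_len = math.sqrt(sum(c * c for c in normal))
    if dist == 0.0 or n_len == 0.0:
        raise ValueError("light must be off the surface and normal must be non-zero")
    # Cosine of the angle between the unit normal and the unit light direction.
    cos_theta = sum((n / n_len) * (t / dist) for n, t in zip(normal, to_light))
    return albedo * light_power * max(0.0, cos_theta) / dist ** 2

# A patch facing the light is brighter than one tilted by 45 degrees.
facing = lambertian_intensity((0, 0, 1), (0, 0, 0), (0, 0, 2))
tilted = lambertian_intensity((0, 1, 1), (0, 0, 0), (0, 0, 2))
```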
The shape from shading algorithm is used in [42]. To obtain the absolute grey gradient field, which is necessary for the SfS algorithm, a technique called “optical flow” is used. A grey card is introduced, which can be used to calibrate the relationship between the light source intensity and the camera response function. With this relationship, the inverse of the image intensity can be obtained to balance the intensity of the original image. This compensated image can then be used to obtain the absolute grey gradient. However, this is done with only one image; it is not yet applicable to a video stream in real time.
Another paper about the shape from shading algorithm is [43]. In this article, a very fast version of the algorithm is presented, which makes use of a new image irradiance equation, based on better assumptions than usual. A perspective projection from the camera is assumed, as well as a light source which is close to the object instead of a single point at infinity. The results indicate that this method is faster and more accurate than existing techniques.
The shape from shading method serves as a comparison for the method in [44]. This paper presents a method for 3D image synthesis from 2D laparoscopic images. The depth information from CT data is incorporated in the 2D laparoscopic image. A number of algorithms for feature detection and matching, as well as camera tracking, are used. The resulting depth map is compared to that of the shape from shading method. A stereo image, which creates a 3D effect, is generated from the depth map and the video image.
Different algorithms are combined in [45] to build a 3D panorama visualization of the scene from an endoscopic video. SfS is used to obtain a depth map, Speeded-Up Robust Features (SURF) for feature detection, Binary Robust Independent Elementary Features (BRIEF) to find matching features, and the Iterative Closest Point (ICP) algorithm to stitch consecutive frames together. The ICP algorithm is also used in [46] to align different frames. In that paper, Simultaneous Localization and Mapping (SLAM) is used with extra information from the endoscopic camera in order to reduce the difficulty of the estimation.
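The alignment step performed by ICP can be sketched in a few lines. The example below is a minimal 2D point-to-point variant with brute-force nearest-neighbour search, written only to show the idea; the pipelines in [45] and [46] work on 3D data and their exact variants are not described here, so the function names and the 2D simplification are assumptions.

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form least-squares rotation and translation for paired 2D points."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    s_dot = 0.0
    s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def apply_rigid(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(src, dst, iterations=10):
    """Alternate nearest-neighbour matching and rigid alignment."""
    current = list(src)
    for _ in range(iterations):
        nearest = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in current]
        theta, tx, ty = best_rigid_2d(current, nearest)
        current = apply_rigid(current, theta, tx, ty)
    return current
```

ICP only converges to the correct alignment when the initial misalignment is small relative to the spacing of the points, which is one reason feature matching is used first to bring consecutive frames close together.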
Sometimes, most often in surgery on body parts that can undergo drastic anatomical changes, such as the abdomen, an intra-operative CT image is made using a cone-beam CT arm. The transformation of the organs between the original data and the new data is then obtained. However, roll, translation along the image axis of the endoscope, zoom and focus can still cause errors in the registration of the CT image to the endoscopic images. The authors of [47] aim to solve these problems by introducing a local formulation of shading constraints; they claim that a dense shape from shading algorithm cannot be used because the albedo is not constant throughout the scene. The method makes use of piecewise constant albedo and light intensity on different patches, which are separated from the original images using a watershed algorithm. In this paper, the position of the endoscope is estimated rather than the shape of the organs, which leads the authors to introduce the term “pose-from-shading”.
In [48] a method for 3D non-rigid pose estimation for kidney surgery is presented. This method can segment multiple important structures and estimate their pose in 3D space intra-operatively. It can correct for non-rigid deformations and for changing camera parameters, such as zoom or focus. The pose estimation does not need any correspondence between points from the pre-operative and intra-operative data; the user can simply click on the structures of interest. The method works with a local optimization framework, which makes it rather dependent on the initialization. Since there is large variability between patients, the first frames of the video data were used to train random forest models for the specific patient. This limits real-time applicability, since training the models is time-consuming, but the method is highly parallelizable, which could make it faster.
Visualization
The way the augmented reality view is visualized strongly influences the usefulness of the system for the surgeon. In [49] four different techniques are described and their advantages and disadvantages compared. The techniques are transparent overlay, virtual window, random dot mask and the ghosting method. A random dot mask is a pattern of small dots through which the underlying structure can be seen. The ghosting method calculates the importance of the camera image and then determines the transparency of each pixel accordingly. The table below gives the advantages and disadvantages of the different systems.
Table 4: Comparison of different AR visualization modes in [49]
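The difference between a transparent overlay and the ghosting method can be sketched per pixel. In the example below the overlay uses one fixed opacity, while the ghosting variant lowers the opacity of the virtual content where the camera image is judged important. The linear mapping from importance to opacity, the `max_alpha` value and the function names are assumptions for illustration; how [49] computes importance is not reproduced here.

```python
def blend_pixel(camera_rgb, virtual_rgb, alpha):
    """Alpha-blend one virtual pixel over one camera pixel (alpha = overlay opacity)."""
    return tuple(round((1.0 - alpha) * c + alpha * v)
                 for c, v in zip(camera_rgb, virtual_rgb))

def ghosting_alpha(importance, max_alpha=0.8):
    """Overlay opacity for the ghosting idea: important camera pixels stay visible.

    Assumption: importance lies in [0, 1] and maps linearly to opacity.
    """
    importance = min(1.0, max(0.0, importance))
    return max_alpha * (1.0 - importance)

camera = (100, 100, 100)
virtual = (200, 0, 50)

# Transparent overlay: the same opacity everywhere.
overlay = blend_pixel(camera, virtual, 0.5)

# Ghosting: a highly important camera pixel is left untouched.
ghosted = blend_pixel(camera, virtual, ghosting_alpha(1.0))
```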
A virtual reality (VR) view and an augmented reality view are synchronized in [50] in order to obtain depth information about objects within the view of the endoscope. The VR and AR views are overlaid on the endoscopic image. The endoscope and other surgical instruments are also displayed in the virtual reality view.