The image acquisition process is the same as that described in Section 7.2, except that the bulky black velvet background is no longer needed. The camera is set up pointing at an object placed at the centre of the turntable. In our examples the camera is a Canon 450D DSLR, set to aperture priority with an F value of 5.6. This enables images with a relatively low DoF to be captured. The camera is autofocused on the object on the turntable and the autofocus is then turned off, i.e., the camera parameters are fixed. An image of the calibration pattern is then captured with these fixed parameters in order to calibrate the camera. This is followed by the capture of 60 images of the object, which form the dataset for that object, with the turntable rotated by approximately 6 degrees between consecutive captures. The direction of rotation is noted for calibration purposes.
The main difficulty with using the low DoF segmentation method to generate the silhouettes is that regions of the turntable will also lie within the DoF. Figure 7.3 illustrates this problem. In column (a), part of the strongly contrasting turntable edge is included in the segmentation. In column (b), both the turntable and the OoI have been segmented because both are sufficiently in focus compared to the background. Finally, in column (c), dust on the turntable has given it enough texture to return a focus value within the DoF, and it is therefore segmented along with the model house.
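The dust effect in column (c) can be illustrated with a toy focus measure. The following is a minimal sketch, assuming a squared-Laplacian focus value; the focus measure actually used by the method is the one defined in Chapter 5 and may differ. The patch values and all function names are hypothetical.

```python
def laplacian_energy(img, r, c):
    """Squared 4-neighbour Laplacian response at interior pixel (r, c)."""
    lap = (img[r - 1][c] + img[r + 1][c] + img[r][c - 1] + img[r][c + 1]
           - 4 * img[r][c])
    return lap * lap


def focus_map(img):
    """Per-pixel focus value; border pixels are left at zero."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = float(laplacian_energy(img, r, c))
    return out


def mean_focus(img):
    """Average focus value over the interior of the patch."""
    fmap = focus_map(img)
    interior = [v for row in fmap[1:-1] for v in row[1:-1]]
    return sum(interior) / len(interior)


# Hypothetical 6x6 greyscale patches of the turntable surface.
clean_velvet = [[10] * 6 for _ in range(6)]      # homogeneous black velvet
dusty_velvet = [row[:] for row in clean_velvet]
dusty_velvet[2][3] = 200                         # bright dust speck
dusty_velvet[3][1] = 180                         # second speck

clean_score = mean_focus(clean_velvet)
dusty_score = mean_focus(dusty_velvet)
```

Under this assumed measure the homogeneous patch scores zero, whereas the dusty patch scores well above it, which is why in-focus dust can be mistaken for part of the OoI.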
To address this difficulty, the system is set up such that the turntable is treated as out of focus. This is relatively easy to achieve, as the turntable is fairly homogeneous, being made entirely of black velvet. With the DoF kept sufficiently low, a relatively flat perspective (in relation to the OoI and the horizontal plane) is used during image capture. This counters the dust effect. To counter the other two effects, the camera is positioned and zoomed in such that the front edge of the turntable is not included in the image. The flat angle and the low DoF ensure that the homogeneous turntable does not return focus values high enough to be segmented, and that the rear edge of the turntable is out of focus and not sharp enough to return a high focus value.
7.3.2 Image Segmentation
For the low DoF video object segmentation method presented in Chapter 5, two assumptions were made: that there is no change of scene within the video sequence, and that there are no large discrepancies in image composition from one frame to the next. Both assumptions hold for the sequence of images of the OoI captured every 6 degrees on the turntable. Thus, the 60 sequential images of the OoI are treated as a video sequence of 60 frames and segmented in exactly the same way as described in Chapter 5. Namely, the active contour for the first frame (in this case, the first image in the sequence) is initialised using the focus intensity maps generated from the first, third and fifth images, and each subsequent initial contour is generated from the binary dilation of the previous frame's final segmentation. This allows a fast and robust segmentation of the 60-image dataset, giving the silhouettes of the object. The remainder of the 3D object reconstruction process follows that described in Section 7.2.
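The frame-to-frame propagation can be sketched as follows. This is only an illustration under stated assumptions: `refine` is a hypothetical stand-in for the active contour step of Chapter 5, the first initial mask is supplied by the caller rather than derived from focus intensity maps, and the 3x3 structuring element and toy frames are choices made for this example.

```python
def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square structuring element."""
    rows, cols = len(mask), len(mask[0])
    for _ in range(iterations):
        grown = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if mask[r][c]:
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if 0 <= rr < rows and 0 <= cc < cols:
                                grown[rr][cc] = 1
        mask = grown
    return mask


def segment_sequence(frames, first_init, refine, grow=1):
    """Segment frames in order, seeding each from the previous result."""
    silhouettes = []
    init = first_init
    for frame in frames:
        final = refine(frame, init)       # hypothetical contour refinement
        silhouettes.append(final)
        init = dilate(final, grow)        # initial region for the next frame
    return silhouettes


def toy_refine(frame, init):
    """Toy stand-in: keep in-focus pixels that lie inside the initial region."""
    return [[1 if (i and f) else 0 for f, i in zip(frow, irow)]
            for frow, irow in zip(frame, init)]


def make_frame(col, rows=5, cols=7):
    """Toy binary 'focus' image: a 2x2 object whose left column is `col`."""
    return [[1 if (1 <= r <= 2 and col <= c <= col + 1) else 0
             for c in range(cols)] for r in range(rows)]


# The object shifts by one pixel per frame, mimicking a small rotation step.
frames = [make_frame(1), make_frame(2), make_frame(3)]
first_init = [[1] * 7 for _ in range(5)]
silhouettes = segment_sequence(frames, first_init, toy_refine)
```

Because consecutive views differ only slightly, a small dilation of the previous silhouette is enough to enclose the object in the next view, which is the property the second assumption above relies on.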
7.4 Results
Some examples of automatically generated silhouettes and the reconstructed 3D models are presented in this section. The first object for which a 3D object reconstruction is performed is a model house. Figure 7.4 shows the dataset from which the 3D reconstruction is performed. Figure 7.5 shows the automatically generated binary segmentations for each of the 60 viewpoints. To provide some context for the binary segmentations, the segmented object from each of the different viewpoints is shown in Figure 7.6.
Figure 7.4: Greyscale images of a model house, taken every 6 degrees of rotation of the turntable.
Figure 7.5: Binary segmentations of a model house generated for every 6 degrees of rotation of the turntable.
Figure 7.6: Segmentations of a model house generated for every 6 degrees of rotation of the turntable.
Figure 7.7: 3D reconstruction of a model house: (a) the octree representation; (b) the reconstructed 3D surface; (c) the octree representation with the estimated object surface colour; and (d) the 3D surface model with added colour.
The binary segmentations shown in Figure 7.5 are then used in the SfS method to generate a 3D model of the house. This is illustrated in Figure 7.7, where (a) is the octree representation of the object, (b) shows the 3D surface extracted from the octree representation, and (c) and (d) respectively show the octree and surface representations with colours from the original image projected onto them.
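The core idea of SfS, keeping only the volume that projects inside every silhouette, can be sketched as below. This is a simplified illustration and not the system of Section 7.2: it assumes an orthographic camera rotating about the vertical axis instead of the calibrated camera, uses a flat voxel grid rather than an octree, and replaces real silhouettes with a synthetic cylinder. All names and sizes are hypothetical.

```python
import math


def inside(sil, angle_deg, x, y, z, n):
    """True if voxel (x, y, z) projects onto a silhouette pixel in this view."""
    t = math.radians(angle_deg)
    u = round(x * math.cos(t) + z * math.sin(t))  # horizontal image coordinate
    v = y                                         # vertical axis is unchanged
    if abs(u) > n:
        return False                              # projects outside the image
    return bool(sil[v + n][u + n])


def carve(silhouettes, step_deg, n):
    """Keep voxels in [-n, n]^3 that fall inside every silhouette."""
    kept = set()
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            for z in range(-n, n + 1):
                if all(inside(sil, k * step_deg, x, y, z, n)
                       for k, sil in enumerate(silhouettes)):
                    kept.add((x, y, z))
    return kept


def cylinder_silhouette(radius, half_height, n):
    """Silhouette of an upright cylinder on the axis: identical in every view."""
    return [[1 if (abs(u) <= radius and abs(v) <= half_height) else 0
             for u in range(-n, n + 1)] for v in range(-n, n + 1)]


N = 6                                             # grid spans [-6, 6] per axis
views = [cylinder_silhouette(3, 4, N) for _ in range(60)]
hull = carve(views, 6, N)                         # 60 views, 6 degrees apart
```

A single view cannot remove the voxel at (3, 0, 3), since it projects inside the rectangle when seen head-on; it is only carved away by views taken at intermediate angles, which is why many closely spaced viewpoints are used.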
Two more examples of object reconstructions are presented, with a sample of original images and segmentations. Figure 7.8 shows the reconstruction of a model tank, whilst Figure 7.9 shows that of another model house. The reconstructed object models are represented in octree format.
Figure 7.8: Segmentation and 3D reconstruction of a model tank: (a) the acquired data; (b) the binary segmentations; (c) the segmented objects; and (d) the resultant octree and coloured octree.
Figure 7.9: Segmentation and 3D reconstruction of a model house: (a) the acquired data; (b) the binary segmentations; (c) the segmented objects; and (d) the resultant octree representation.
7.5 Conclusion
In this chapter, an existing SfS-based 3D object reconstruction system has been modified to remove the need for human input. In addition, a bulky black backdrop is no longer needed for the silhouette generation process. The method treats the dataset of 60 images, taken at every 6 degrees of rotation of the turntable, as frames in a video sequence and applies the unsupervised video segmentation method presented in Chapter 5 to automatically generate a binary segmentation for each viewpoint. The silhouettes generated are shown to be accurate and are used to construct 3D models of a variety of objects.