
Many modern 3D sensing technologies can provide the user with a depth map and a colour image of the same scene. While the completion process on its own is focused on the depth image, there is valuable information contained within the accompanying colour image that can significantly improve the quality of the results. There are approaches that take advantage of the object boundaries and edges of the colour image to preserve and align the structures in the depth [41,145,148]. Even so, it has been pointed out that this can still lead to undesirable artefacts around edges and object boundaries since colour and depth edges are characteristically different [144]. Some other approaches have taken to using the colour image as a means to segment the scene before depth completion takes place [43,129,144], which can provide the completion process with semantically valid scene objects to sample homogeneous depth information from.

| Input Image Required | Advantages | Disadvantages | Examples of Filling Techniques |
|---|---|---|---|
| Depth and Colour Images | more processing information; more accurate results | possible lack of colour input; more computationally intensive | [37], [129], [128], [131], [173], [148], [139], [140], [41], [40], [133], [120], [122], [150], [157], [36] |
| Depth Image Only | no dependence on extra inputs; more efficient processing | less information for processing; lower quality outputs | [138], [141], [142], [171], [170], [143], [42], [144], [126], [44], [151], [152], [45], [132], [184], [68] |

Table 2.2: Examples of depth completion approaches categorised according to the type of images required as their input.
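As a rough illustration of the segmentation-guided idea mentioned above, the following minimal sketch fills each missing depth value with the median of the valid depths in the same segment. It is not the method of any of the cited works. The function name `fill_holes_by_segment`, the use of `0.0` as the hole marker and the `labels` map (standing in for a segmentation derived from the colour image) are all assumptions made for this example, and plain Python lists are used only to keep it dependency-free.

```python
from statistics import median


def fill_holes_by_segment(depth, labels, hole=0.0):
    """Fill missing depth values with the median valid depth of the same segment.

    depth  -- 2D list of floats; `hole` marks a missing measurement (assumed convention)
    labels -- 2D list of ints of the same shape; a hypothetical segmentation
              that would be derived from the aligned colour image
    """
    # Collect the valid depth samples of every segment.
    samples = {}
    for depth_row, label_row in zip(depth, labels):
        for d, s in zip(depth_row, label_row):
            if d != hole:
                samples.setdefault(s, []).append(d)

    # One representative depth per segment; segments without valid samples are absent.
    fill_value = {s: median(values) for s, values in samples.items()}

    # Copy the map, replacing holes wherever the segment has a representative depth.
    return [
        [fill_value.get(s, hole) if d == hole else d
         for d, s in zip(depth_row, label_row)]
        for depth_row, label_row in zip(depth, labels)
    ]


# Toy scene: a near object (segment 0) next to a far object (segment 1).
depth = [
    [1.0, 1.0, 0.0, 4.0],
    [1.0, 0.0, 4.0, 4.0],
]
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
filled = fill_holes_by_segment(depth, labels)
print(filled)
```

In this toy case each hole takes the depth of the object it belongs to rather than a blend of the two objects. A segment with no valid samples is left unfilled, which is a deliberate choice of the sketch.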

However, despite the advantages the colour information can offer, not all depth acquisition technologies produce an aligned or easily alignable colour image and requirements of the application may not always allow for the additional computation that comes with the colour image processing. In these situations, a depth completion approach that is fully dependent on the colour image as a secondary guidance image may not be desirable.

As such, we provide a simple overview of depth completion approaches by categorising them based on their use of the colour image to guide the depth completion process. Table 2.2 presents this split over the depth completion techniques commonly used within the literature. Moreover, Figure 2.11 provides a taxonomy of the literature based on the requirements of the approaches in terms of their dependence on a secondary input image and the information domain used for the completion process.

Discussion: Among depth completion approaches, some rely heavily on the colour view of the scene to guide the depth completion process. While this can positively affect the outcome in terms of quality and consistency, certain limitations ensue. Aside from the colour image not being available at all times, computational requirements can create issues when the application demands lightweight, real-time processing. As seen in Table 2.2 and Figure 2.11, a variety of approaches exist in both categories, providing a wide range of opportunities to select the appropriate depth completion technique.

2.2.4 Texture, Boundaries and Smoothing

Four simple rules were proposed in [58] to provide a set of guidelines for generating more plausible and realistic results when attempting to solve the problem of colour image completion (Section 2.1). While not all of these rules apply to depth images (depth images obviously do not contain any colour information), preserving texture, relief and clear object boundaries or smoothing can be important factors in a depth completion approach depending on the circumstances under which it needs to operate.

Accurate Structure and Smooth Surfaces: [126], [74], [73], [45], [138], [139], [68], [133], [143], [125], [146]

Table 2.3: Examples of depth completion approaches categorised according to the main focus of the approach (structure vs. texture and accurate boundaries).

In certain downstream applications, fine-grained texture and relief over surfaces and a clear separation between objects within the depth image are of utmost importance [140], whereas smooth and consistent scene depth [125,142] can satisfy the requirements of other systems.

It is important to note that preserving fine relief within the depth information of a scene object is a difficult task. Additionally, depth completion is an inherently ill-posed problem. As a result, if texture and relief generation is unnecessarily carried out based on insufficient information, the resulting output can contain more outliers and invalid depth information, which is a hindrance on its own.

Table 2.3 presents a list of depth completion approaches categorised according to their main objectives. Some techniques concentrate on providing very accurate texture and object boundaries, while others generate overly smooth depth in the output with their focus on the structural integrity of the scene depth.
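The trade-off between smooth surfaces and sharp object boundaries can be illustrated on a single depth scanline containing a step between two objects. The sketch below is only an illustration and is not drawn from any cited approach. It assumes a mean filter as a stand-in for a smoothing-oriented method and a median filter as a stand-in for a boundary-preserving one, and the helper `filter_profile` is hypothetical.

```python
from statistics import mean, median


def filter_profile(profile, reducer, radius=1):
    """Apply `reducer` over a sliding window, truncating the window at the borders."""
    n = len(profile)
    return [
        reducer(profile[max(0, i - radius):min(n, i + radius + 1)])
        for i in range(n)
    ]


# Depth along one scanline: a near surface (1.0) followed by a far surface (5.0).
profile = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]

smoothed = filter_profile(profile, mean)     # blurs the depth discontinuity
edge_kept = filter_profile(profile, median)  # keeps the discontinuity sharp

print(smoothed)
print(edge_kept)
```

The mean filter introduces intermediate depth values at the boundary that belong to neither object, while the median filter leaves the step intact. Which behaviour is preferable depends on the downstream application.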

Discussion: The exact characteristics of a depth image depend on its purpose. In certain applications such as object recognition [186,187] or detection [188], accurate boundaries and relief of an object in the depth image can play an important role in the semantic value of that object within the scene. However, other applications such as localisation and mapping [118,119] do not require fine texture and relief for each individual scene object; accurate structure within the scene depth is sufficient. As seen in Table 2.3, different depth completion techniques exist that can generate complete depth either with fine relief or smoothed object surfaces.

2.3 Monocular Depth Estimation

Over the past few years, research into monocular depth estimation, i.e. predicting complete scene depth from a single RGB image, has significantly escalated [46,47,189–192]. Using off-line model training based on ground truth depth data, monocular depth prediction has been made possible [46,189,190,193,194], sometimes with results surpassing those of more classical depth estimation techniques. Ground truth depth, however, is extremely difficult and expensive to acquire, and when it is obtained it is often sparse and flawed, constraining the practical use of monocular depth estimation in real-world applications. Solutions to this problem of data scarcity include the possibility of using synthetic data containing sharp pixel-perfect scene depth [195] for training, or completely dispensing with ground truth depth and instead utilising a secondary supervisory signal during training which indirectly results in producing the desired depth [47,191,192,196].
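The text above does not state which secondary supervisory signal is used. One commonly used choice, assumed here purely for illustration, is image reconstruction between the two views of a rectified stereo pair: a predicted disparity is used to warp one view into the other, and the reconstruction error serves as the training loss. The sketch below works on a single scanline with integer disparities, and the names `reconstruct_left` and `photometric_loss` are hypothetical.

```python
def reconstruct_left(right_row, disparity_row):
    """Warp a right-view scanline into the left view using integer disparities.

    Pixel x of the left view is sampled from position x - disparity in the
    right view; samples falling outside the image are returned as None.
    """
    width = len(right_row)
    reconstructed = []
    for x, d in enumerate(disparity_row):
        src = x - d
        reconstructed.append(right_row[src] if 0 <= src < width else None)
    return reconstructed


def photometric_loss(left_row, reconstructed_row):
    """Mean absolute intensity error over the pixels that could be reconstructed."""
    errors = [
        abs(l - r)
        for l, r in zip(left_row, reconstructed_row)
        if r is not None
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Toy scanlines: the left view equals the right view shifted by two pixels.
right = [10, 20, 30, 40, 50, 60]
left = [0, 0, 10, 20, 30, 40]

good_disparity = [2] * 6  # matches the true shift
bad_disparity = [0] * 6   # assumes no shift at all

loss_good = photometric_loss(left, reconstruct_left(right, good_disparity))
loss_bad = photometric_loss(left, reconstruct_left(right, bad_disparity))
print(loss_good, loss_bad)
```

The correct disparity yields a lower reconstruction error than the incorrect one, so minimising this error pushes a model towards plausible disparity without any ground truth depth. For a rectified pair, depth can then be recovered as focal length times baseline divided by disparity.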

As a portion of the work done for this thesis is within this area (Chapter 6), the following presents a brief description of monocular depth estimation techniques within three relevant areas: approaches utilising hand-crafted features based on monocular cues within the RGB input image, approaches based on graphical models, and finally techniques using deep neural networks trained in various ways to estimate depth from a single image.
