
[Figure panels: RGB Image, Depth Frame, Segmentation]

Figure 4.9: Failure examples of glass detection on our RGBD Glass dataset. See text for details.

One key contribution of our work is to reason jointly about boundary and region. Doing so allows us to combine local features from the boundary and superpixel perspectives simultaneously. More importantly, the boundary and region graph can capture more detailed local interactions, such as the interplay between boundary orientation and neighboring regions. In Figure 4.7 and Table 4.3, we showed quantitatively that this results in better glass segmentation performance; in this section we show some qualitative examples to justify our design choice. We look at two aspects of our joint inference: the unary terms and the iterative inference process.

Figure 4.10 shows some examples of boundary and region unary classifier outputs. In the first three examples, the boundary classifiers generally do a better job of identifying local glass boundaries, while the region classifiers give some spurious protrusions and erosions. If we followed the region classifiers, incompatible boundary orientations would be derived, and penalties in our energy terms would apply to these configurations. In the remaining examples, however, the region classifiers are more reliable and can guide us to find the correct boundary configuration.

Figure 4.11 compares the boundary unary classifier output, the boundary marginals from the initial LBP inference involving boundary potentials only, and the boundary marginals from joint prediction after 5 iterations. Although the initial boundary inference helps strengthen some weak glass boundaries, it is not powerful enough to identify true glass boundaries, particularly near noisy predictions. In the last two examples, it also leaves spurious glass boundary detections well outside the glass region. Joint inference helps suppress these boundaries, mainly because it is otherwise difficult to find a valid configuration under our constraints on both boundary and region.

4.4 Conclusion

In this chapter, we have proposed a novel approach to glass segmentation with consumer RGBD cameras. By setting up an MRF that jointly encodes boundary fragment and superpixel properties and constraints, we proposed a global optimization procedure for glass detection, segmentation, and recovery of the noisy depth maps. We validated the efficacy of this approach on our new RGBD Glass dataset, on which our method shows superior performance.

[Figure panels: RGB Image, Boundary Unary, Region Unary]

Figure 4.10: Examples of boundary and region unary terms (magnified, the viewing window is marked as a red bounding box in the RGB images). The boundary orientation is shown as a red arrow pointing towards glass regions. Local boundary and region classifiers provide

[Figure panels: RGB Image, Boundary Unary, Boundary Inference, Joint Inference]

Figure 4.11: Examples of iterative joint inference. While the initial boundary inference smooths the unary classifier output, we obtain much cleaner boundary inference results with

Glass Object Segmentation by Label Transfer on Joint Depth and Appearance Manifolds

5.1 Introduction

In this chapter, we continue our effort to localize glass objects with RGBD images. In Chapter 4 we proposed a joint inference algorithm for glass object segmentation. We exploited the missing-vs-nonmissing pattern in the depth channel, which can be used as an effective feature to approximately localize glass objects. Although that method can produce high-quality segmentation from the local estimates through constraints on the joint configurations of boundary and region, it has difficulty handling glass objects with weak RGB cues or strong local deformation of the depth-missing patterns, as shown in Figures 4.9 and 5.6. One main issue in these cases is that the local estimates are too noisy because of the very large appearance variations at glass boundaries, as shown in a few image patch examples in Figure 5.1. Although relative features that focus on the difference between image patches on the two sides of the boundary can reduce feature variation, it is still difficult to train a generic classifier, because glass overlays can introduce many different effects such as blurring, highlights, texture distortion, and missing depth. The local effects on an individual object instance may be selective and depend on a number of factors, including the glass material, illumination, and viewpoint. It is therefore difficult to single out each effect and extract more expressive features associated with it.

[Figure panels: training data, feature manifold]

Figure 5.1: Top: Illustration of feature manifold based glass boundary classification. We use a learned feature manifold to match boundary fragments in a test scene (shown as image patches) to a training set in order to predict their labels. Bottom: Large variation on glass boundaries: patch examples.

As a result, we move our focus to methods that are able to deal with large feature variations. In particular, we propose an image-adaptive approach to predicting glass boundaries. Our focus is still on the scenario in which inputs are captured with an RGBD camera. The main idea of our method is to generate boundary proposals based on a nonparametric feature model. Our model is represented by a joint depth and appearance feature manifold, on which each point is the glass boundary feature of an image patch pair. The boundary label of any pair of neighboring patches is predicted by a weighted vote of its nearest neighbors on the feature manifold. The distance metric on the manifold is learned in a supervised manner.
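The weighted nearest-neighbor vote described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this chapter: the learned metric is assumed to be a linear map `L` (so distances are Euclidean after projection), the vote weights are assumed to come from a Gaussian kernel with a hypothetical `bandwidth`, and the function names are made up for this example.

```python
import math


def transform(L, x):
    """Project feature vector x with a linear map L (assumed form of the learned metric)."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def knn_boundary_score(query, train_feats, train_labels, L, k=3, bandwidth=1.0):
    """Return a score in [0, 1]: the weighted vote of the k nearest training pairs.

    train_labels holds 1 for a glass boundary and 0 otherwise.
    The Gaussian weighting is an assumption made for illustration.
    """
    q = transform(L, query)
    scored = []
    for feat, label in zip(train_feats, train_labels):
        t = transform(L, feat)
        d2 = sum((a - b) ** 2 for a, b in zip(q, t))  # squared distance under L
        scored.append((d2, label))
    scored.sort(key=lambda pair: pair[0])
    neighbors = scored[:k]
    weights = [math.exp(-d2 / (2.0 * bandwidth ** 2)) for d2, _ in neighbors]
    total = sum(weights)
    if total == 0.0:
        return 0.5  # all weights underflowed; no evidence either way
    return sum(w * label for w, (_, label) in zip(weights, neighbors)) / total


# Toy data: two clusters of pair features, identity metric.
feats = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
labels = [0, 0, 1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]
score_near_glass = knn_boundary_score([1.0, 1.0], feats, labels, identity, k=2)
score_near_background = knn_boundary_score([0.0, 0.0], feats, labels, identity, k=2)
```

In this toy setup, a query close to the glass-boundary cluster receives a high score and a query close to the other cluster receives a low one; a learned `L` would replace the identity matrix.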

We then integrate the locally adapted glass boundary predictor into a superpixel-based pairwise Markov Random Field (MRF) for glass object detection and segmentation. The MRF labels every superpixel as glass or non-glass, and our boundary prediction is used to modulate the smoothing terms of the random field. As we will show in the experiments, our approach generates more accurate glass boundary predictions, which simplifies the overall model structure and the inference algorithm.
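To make the role of the boundary prediction concrete, the sketch below shows one way a pairwise MRF over superpixels could have its smoothing term modulated by a boundary probability. The energy form, the `smooth_weight` parameter, and the use of iterated conditional modes (ICM) for inference are all assumptions chosen for brevity; they are not taken from this chapter.

```python
def mrf_energy(labels, unary, edges, boundary_prob, smooth_weight=1.0):
    """Energy of a binary labeling (1 = glass, 0 = non-glass).

    unary[i][l] is the cost of giving superpixel i label l.
    A label change across edge (i, j) costs less when the predicted
    boundary probability on that edge is high (assumed modulation).
    """
    energy = sum(unary[i][labels[i]] for i in range(len(labels)))
    for (i, j), b in zip(edges, boundary_prob):
        if labels[i] != labels[j]:
            energy += smooth_weight * (1.0 - b)
    return energy


def icm(unary, edges, boundary_prob, smooth_weight=1.0, max_sweeps=10):
    """Greedy coordinate descent (ICM), a simple stand-in for the real inference."""
    labels = [0 if u[0] <= u[1] else 1 for u in unary]  # start from the unary optimum
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            current = mrf_energy(labels, unary, edges, boundary_prob, smooth_weight)
            flipped = labels[:i] + [1 - labels[i]] + labels[i + 1:]
            if mrf_energy(flipped, unary, edges, boundary_prob, smooth_weight) < current:
                labels = flipped  # accept only strict improvements
                changed = True
        if not changed:
            break
    return labels


# Chain of three superpixels; the middle one has a weak, ambiguous unary term.
unary = [[0.0, 2.0], [1.0, 0.8], [2.0, 0.0]]
edges = [(0, 1), (1, 2)]
boundary_on_first_edge = icm(unary, edges, [0.9, 0.1])
boundary_on_second_edge = icm(unary, edges, [0.1, 0.9])
```

In this example the ambiguous middle superpixel joins whichever neighbor it is not separated from by a confident boundary, which is the intended effect of modulating the smoothing term.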

Our work is inspired by recent progress in nonparametric, data-driven approaches to label transfer and propagation (e.g., [199, 115]). These methods first retrieve a subset of training images based on global image statistics, and then use the retrieved images for label transfer at the superpixel level for dense image parsing. In particular, Fathi et al. [44] take a semi-supervised learning approach to learn a metric for label propagation in videos.

Our contributions in this chapter are threefold. First, we propose novel features for glass object segmentation and a flexible feature pool for improving performance. Second, our work is the first to explore nonparametric label transfer in the context of glass detection, and to exploit a joint depth-appearance manifold for transductive learning. Third, we integrate our locally adapted glass boundary detector into an MRF framework for glass object detection and segmentation, achieving a clear improvement over the state of the art on a challenging RGBD Glass dataset in terms of both accuracy and speed.

The rest of this chapter is organized as follows. We describe the proposed approach in detail in Section 5.2, followed by experimental evaluation and analysis in Section 5.3 and a brief conclusion in Section 5.4.

5.2 Our approach

The main idea of our method is to treat every pair of neighboring superpixels as a data unit, and to build a feature manifold of such pairs for transferring boundary labels. We design a relative feature for the superpixel pairs in a joint appearance and depth feature space to capture the difference caused by glass overlay. The transferred boundary label predictions are then integrated into a pairwise MRF to generate spatially coherent glass object segmentation.
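As a rough illustration of what a relative feature for a superpixel pair might look like, the sketch below compares two hypothetical statistics across the pair: the mean color and the fraction of pixels with missing depth. These particular statistics, the pixel representation, and the function names are assumptions made for this example; the actual features are defined later in the chapter.

```python
def region_stats(pixels):
    """Summarize a superpixel given as a list of (r, g, b, depth) tuples.

    depth is None where the sensor returned no reading (assumed encoding).
    Returns the mean RGB color and the fraction of missing-depth pixels.
    """
    n = len(pixels)
    mean_rgb = [sum(p[c] for p in pixels) / n for c in range(3)]
    missing_ratio = sum(1 for p in pixels if p[3] is None) / n
    return mean_rgb, missing_ratio


def relative_pair_feature(pixels_a, pixels_b):
    """Relative feature: absolute differences between the two superpixels' statistics."""
    rgb_a, miss_a = region_stats(pixels_a)
    rgb_b, miss_b = region_stats(pixels_b)
    color_diff = [abs(a - b) for a, b in zip(rgb_a, rgb_b)]
    return color_diff + [abs(miss_a - miss_b)]


# Two superpixels with identical color but opposite depth availability.
glass_side = [(10, 10, 10, None), (20, 20, 20, None)]
background_side = [(10, 10, 10, 1.5), (20, 20, 20, 1.2)]
feature = relative_pair_feature(glass_side, background_side)
```

Here the color components of the feature are zero while the depth component is maximal, which mirrors the case of a glass object with weak RGB cues but a clear depth-missing pattern.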
