In this thesis we discussed several object detection and segmentation scenarios with partial information. In many real-world problems, however, alternative information sources are available to address the partial information issues. In other words, the ambiguities induced by missing information may be resolved with sources beyond static images and the auxiliary information discussed in this thesis, such as the following.

1) Video sequences. Compared to static images, video sequences provide more information about scenes and the objects within them. In particular, additional temporal and spatial cues make it possible to identify moving and static objects (e.g., [27]), which may help resolve the appearance variations induced by occlusion and build a complete, high-quality context model. However, the problem is also more challenging, as we need to consider additional temporal and spatial priors.

2) Descriptive text. Recent work by Fidler, Sharma and Urtasun [48] suggests that text in the form of complex sentential descriptions can help improve semantic parsing performance for an image. In fact, many images from the Internet are accompanied by text tags, descriptions, and sometimes questions and answers. In particular, contextual information can be inferred from descriptive text (e.g., “the chair is behind the table”). It is therefore an interesting direction to incorporate textual information into a context-aware object detection and segmentation system.

3) Application-specific sensors. In some specific applications such as satellite imaging and autonomous navigation, we may be supplied with application-specific sensors. For example, spectral cameras provide multispectral imaging data beyond the visible spectrum. The problems are usually also highly domain-specific, meaning that additional domain knowledge can be integrated into the localization task. In practice, more efficient feature extraction and inference algorithms are usually necessary for real-time processing. It is an interesting direction to explore some specific applications and make use of additional sensory data to address the partial information issues discussed in this thesis.

1. Amazon mechanical turk. http://www.mturk.com. 115

2. R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk. SLIC superpixels. École Polytechnique Fédérale de Lausanne (EPFL), Tech. Rep., 2010. xix, xxi, 101, 102, 104, 106

3. E. Adelson and P. Anandan. Ordinal characteristics of transparency. In AAAI, 1990. 9, 45, 77, 84, 101

4. S. Albrecht and S. Marsland. Seeing the unseen: Simple reconstruction of transparent objects from point cloud data. In workshops.acin.tuwien.ac.at. 47

5. B. Alexe, T. Deselaers, and V. Ferrari. Measuring the objectness of image windows. IEEE Trans. PAMI, 34(11):2189–2202, 2012. 20

6. P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. IEEE Trans. PAMI, 33(5):898–916, 2011. 35

7. D. Ballard. Generalizing the hough transform to detect arbitrary shapes. Pattern recognition, 13(2):111–122, 1981. 20, 55

8. S. Y. Bao, M. Sun, and S. Savarese. Toward coherent object detection and scene layout understanding. In CVPR, 2010. 6, 33, 53

9. M. Belkin. Problems of learning on manifolds. 2003. 116

10. M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003. 11, 48, 116, 118, 119

11. M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. JMLR, 7:2399–2434, 2006. 48

12. K. Bennett, A. Demiriz, et al. Semi-supervised support vector machines. In NIPS, 1999. 48, 49

13. K. P. Bennett, A. Demiriz, and R. Maclin. Exploiting unlabeled data in ensemble methods. In KDD, 2002. 49

14. I. Biederman, R. J. Mezzanotte, and J. C. Rabinowitz. Scene perception: Detecting and judging objects undergoing relational violations. Cognitive psychology, 14(2):143–177, 1982. 32

15. C. Bishop. Pattern recognition and machine learning. Springer, 2006. 9, 17, 25, 48, 78, 83, 87, 103, 104

16. M. Blaschko and C. Lampert. Object localization with global and local context kernels. In BMVC, 2009. 6, 32, 53

17. A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In COLT, 1998. 48

18. L. Bo, X. Ren, and D. Fox. Depth kernel descriptors for object recognition. In IROS, 2011. 27

19. L. Bo, X. Ren, and D. Fox. Unsupervised feature learning for rgb-d based object recognition. In Experimental Robotics, pages 387–402, 2013. 27

20. U. Bonde, V. Badrinarayanan, and R. Cipolla. Robust instance recognition in presence of occlusion and clutter. In ECCV. Springer, 2014. 30

21. E. Borenstein and J. Malik. Shape guided object segmentation. In CVPR, 2006. 35

22. L. Bourdev and J. Malik. Poselets: Body part detectors trained using 3d human pose annotations. In ICCV, 2009. 3, 19

23. S. Boyd and L. Vandenberghe. Convex optimization. Cambridge Univ Pr, 2004. 122

24. Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Trans. PAMI, 23(11):1222–1239, 2001. 42

25. Y. Y. Boykov and M.-P. Jolly. Interactive graph cuts for optimal boundary & region segmentation of objects in nd images. In ICCV, 2001. 40

26. T. Brox, L. Bourdev, S. Maji, and J. Malik. Object segmentation by alignment of poselet activations to image contours. In CVPR, 2011. 30, 35

27. T. Brox and J. Malik. Large displacement optical flow: descriptor matching in variational motion estimation. IEEE Trans. PAMI, 33(3):500–513, 2011. 138

28. F. Buc, Y. Grandvalet, and C. Ambroise. Semi-supervised marginboost. In NIPS, 2002. 49

29. V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours. IJCV, 22(1):61–79, 1997. 46

30. S. Chandra, G. Chrysos, and I. Kokkinos. Surface based object detection in rgbd images. In BMVC, 2016. 27

31. O. Chapelle, B. Schölkopf, and A. Zien, editors. Semi-Supervised Learning. MIT Press, 2006. 48

32. J. Chen, X. Liu, and S. Lyu. Boosting with side information. In ACCV, 2012. 3

33. K. Chen and S. Wang. Regularized boost for semi-supervised learning. In NIPS, 2007. 49

34. W. Choi, Y.-W. Chao, C. Pantofaru, and S. Savarese. Understanding indoor scenes using 3d geometric phrases. In CVPR, 2013. 28

35. F. R. Chung. Spectral graph theory. In CBMS Regional Conference Series in Mathematics, No. 92, 1997. 119

36. M. Collins, R. Schapire, and Y. Singer. Logistic regression, AdaBoost and Bregman distances. Machine Learning, 48(1):253–285, 2002. 49

37. G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In ECCV Workshops, 2004. 23

38. N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005. 16, 17, 27, 101

39. A. Demiriz, K. Bennett, and J. Shawe-Taylor. Linear programming boosting via column generation. Machine Learning, 46(1):225–254, 2002. 50, 116, 122

40. S. K. Divvala, D. Hoiem, J. H. Hays, A. A. Efros, and M. Hebert. An empirical study of context in object detection. In CVPR, 2009. 4, 32

41. I. Endres and D. Hoiem. Category-independent object proposals with diverse ranking. IEEE Trans. PAMI, 36(2):222–234, 2014. 20

42. I. Endres, K. J. Shih, J. Jiaa, and D. Hoiem. Learning collections of part models for object recognition. In CVPR, 2013. 19, 115

43. M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html. 16, 27

44. A. Fathi, M. Balcan, X. Ren, and J. Rehg. Combining self training and active learning for video segmentation. In BMVC, 2011. 39, 100, 103, 109

45. L. Fei-Fei and P. Perona. A bayesian hierarchical model for learning natural scene categories. In CVPR, 2005. 23

46. P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. IEEE Trans. PAMI, 32(9):1627–1645, 2010. xvii, 18, 19, 29, 58, 66, 67, 71, 72, 75

47. P. F. Felzenszwalb and D. McAllester. Object detection grammars. In ICCV Workshops, 2011. 18

48. S. Fidler, A. Sharma, and R. Urtasun. A sentence is worth a thousand pixels. In CVPR, 2013. 139

49. B. Frank, R. Schmedding, C. Stachniss, M. Teschner, and W. Burgard. Learning the elasticity parameters of deformable objects with a manipulation robot. In IROS, 2010. 8, 78

50. R. Fransens, C. Strecha, and L. Van Gool. A mean field em-algorithm for coherent occlusion handling in map-estimation prob. In CVPR, 2006. 30

51. M. Fritz, M. Black, G. Bradski, and T. Darrell. An additive latent feature model for transparent object recognition. In NIPS, 2009. 44, 46, 77

52. J. Gall and V. Lempitsky. Class-specific hough forests for object detection. In CVPR, 2009. 22, 23, 55

53. T. Gao, B. Packer, and D. Koller. A segmentation-aware object detection model with occlusion handling. In CVPR, 2011. 30, 53

54. A. Garg and D. Roth. Margin distribution and learning algorithms. In ICML, 2003. 116

55. A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? the kitti vision benchmark suite. In CVPR, 2012. 27

56. S. Geman and D. Geman. Stochastic relaxation, gibbs distributions, and the bayesian restoration of images. IEEE Trans. PAMI, (6):721–741, 1984. 43

57. R. Girshick. Fast r-cnn. In ICCV, 2015. 17

58. R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014. 17, 28

59. R. B. Girshick, P. F. Felzenszwalb, and D. A. McAllester. Object detection with grammar models. In NIPS, 2011. 19, 30

60. S. Gould, J. Zhao, X. He, and Y. Zhang. Superpixel graph label transfer with learned distance metric. In ECCV, 2014. 39

61. H. Grabner, C. Leistner, and H. Bischof. Semi-supervised on-line boosting for robust tracking. In ECCV, 2008. 124

62. D. Greig, B. Porteous, and A. H. Seheult. Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society. Series B (Methodological), pages 271–279, 1989. 41

63. C. Gu, J. J. Lim, P. Arbeláez, and J. Malik. Recognition using regions. In CVPR, 2009. 26

64. X. Guo, X. Wang, L. Yang, X. Cao, and Y. Ma. Robust foreground detection using smoothness and arbitrariness constraints. In ECCV, 2014. 35

65. A. Gupta, A. A. Efros, and M. Hebert. Blocks world revisited: Image understanding using qualitative geometry and mechanics. In ECCV, 2010. 137

66. S. Gupta, R. Girshick, P. Arbeláez, and J. Malik. Learning rich features from rgb-d images for object detection and segmentation. In ECCV, 2014. 27, 28

67. K. Han, K.-Y. K. Wong, and M. Liu. A fixed viewpoint approach for dense reconstruction of transparent objects. In CVPR, 2015. 47

68. K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016. 17

69. X. He, R. S. Zemel, and M. Carreira-Perpiñán. Multiscale conditional random fields for image labeling. In CVPR, 2004. 38

70. X. He, R. S. Zemel, and D. Ray. Learning and incorporating top-down cues in image segmentation. In ECCV, 2006. 40

71. G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural computation, 14(8):1771–1800, 2002. 43

72. D. Hoiem, A. A. Efros, and M. Hebert. Geometric context from a single image. In ICCV, 2005. 115

73. D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 80(1):3–15, 2008. 32, 115

74. E. Hsiao and M. Hebert. Occlusion reasoning for object detection under arbitrary viewpoint. In CVPR, 2012. 30

75. I. Ihrke, K. N. Kutulakos, H. P. Lensch, M. Magnor, and W. Heidrich. State of the art in transparent and specular object reconstruction. In EUROGRAPHICS 2008 STAR–STATE OF THE ART REPORT, 2008. 44

76. S. D. Jain and K. Grauman. Predicting sufficient annotation strength for interactive foreground segmentation. In ICCV, 2013. 35

77. A. Janoch, S. Karayev, Y. Jia, J. Barron, M. Fritz, K. Saenko, and T. Darrell. A category-level 3-d object dataset: Putting the kinect to work. In ICCV Workshops, 2011. xvii, 27, 53, 63, 68, 69

78. H. Jiang and J. Xiao. A linear approach to matching cuboids in rgbd images. In CVPR, 2013. 137

79. A. E. Johnson and M. Hebert. Using spin images for efficient object recognition in cluttered 3d scenes. IEEE Trans. PAMI, 21(5):433–449, 1999. 27

80. P. Jolicoeur, M. A. Gluck, and S. M. Kosslyn. Pictures and names: Making the connection. Cognitive psychology, 16(2):243–275, 1984. 15

81. M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul. An introduction to variational methods for graphical models. Machine learning, 37(2):183–233, 1999. 87

82. M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. IJCV, 1(4):321–331, 1988. 35

83. M. Kim, S. Kumar, V. Pavlovic, and H. Rowley. Face tracking and recognition with visual constraints in real-world videos. In CVPR, 2008. xx, 116, 123, 125, 132

84. U. Klank, D. Carton, and M. Beetz. Transparent object detection and reconstruction on a mobile platform. In ICRA, 2011. 8, 46, 77, 78

85. J. Kleinberg and E. Tardos. Approximation algorithms for classification problems with pairwise relationships: Metric labeling and markov random fields. Journal of the ACM, 49(5):616–639, 2002. 42

86. G. J. Klinker, S. A. Shafer, and T. Kanade. A physical approach to color image understanding. IJCV, 4(1):7–38, 1990. 45

87. S. Kluckner, T. Mauthner, P. Roth, and H. Bischof. Semantic image classification using consistent regions and individual context. In BMVC, 2009. 32

88. P. Kohli, P. H. Torr, et al. Robust higher order potentials for enforcing label consistency. IJCV, 82(3):302–324, 2009. 40

89. D. Koller and N. Friedman. Probabilistic graphical models: principles and techniques. MIT press, 2009. 36

90. P. Krähenbühl and V. Koltun. Efficient inference in fully connected crfs with gaussian edge potentials. In NIPS, 2011. 41, 58

91. V. Kompella and P. Sturm. Detection and avoidance of semi-transparent obstacles using a collective-reward based approach. In ICRA, 2011. 45

92. S. Konishi and A. L. Yuille. Statistical cues for domain specific image segmentation with performance analysis. In CVPR, 2000. 38

93. P. D. Kovesi. Matlab and octave functions for computer vision and image processing. Online: http://www.csse.uwa.edu.au/~pk/Research/MatlabFns/#match, 2000. 81

94. A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. 17

95. M. P. Kumar, P. Torr, and A. Zisserman. Obj cut. In CVPR, 2005. 35

96. S. Kumar and M. Hebert. Discriminative random fields. IJCV, 68(2):179–201, 2006. 38, 40

97. L. Ladický, P. Sturgess, K. Alahari, C. Russell, and P. H. Torr. What, where and how many? combining object detectors and crfs. In ECCV, 2010. 35

98. K. Lai, L. Bo, X. Ren, and D. Fox. A large-scale hierarchical multi-view rgb-d object dataset. In ICRA, 2011. 8, 27, 78, 82, 101

99. K. Lai, L. Bo, X. Ren, and D. Fox. Sparse distance learning for object recognition combining rgb and depth information. In ICRA, 2011. 8, 78

100. K. Lai, L. Bo, X. Ren, and D. Fox. Detection-based object labeling in 3d scenes. In ICRA, 2012. 28

101. C. H. Lampert, M. B. Blaschko, and T. Hofmann. Beyond sliding windows: Object localization by efficient subwindow search. In CVPR, 2008. 20

102. D. Larlus and F. Jurie. Combining appearance models and markov random fields for category level object segmentation. In CVPR, 2008. 35

103. S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006. 39

104. D. C. Lee, M. Hebert, and T. Kanade. Geometric reasoning for single image structure recovery. In CVPR, 2009. 137

105. S. Lee and H. Shim. Skewed stereo time-of-flight camera for translucent object imaging. Image and Vision Computing, 43:27–38, 2015. 46

106. A. Lehmann, B. Leibe, and L. Van Gool. Fast prism: Branch and bound hough transform for object class detection. IJCV, 94(2):175–197, 2011. 22

107. A. D. Lehmann, B. Leibe, and L. J. Van Gool. Prism: Principled implicit shape model. In BMVC, 2009. 16

108. Z. Lei, K. Ohno, M. Tsubota, and E. Takeuchi. State of the art in transparent and specular object reconstruction. In ROBIO, 2011. 47

109. B. Leibe, A. Leonardis, and B. Schiele. Combined object categorization and segmentation with an implicit shape model. In ECCV Workshops, 2004. 22, 23, 35, 55, 57

110. C. Leistner, H. Grabner, and H. Bischof. Semi-supervised boosting using visual similarity learning. In CVPR, 2008. 124

111. B. Li, T. Wu, and S.-C. Zhu. Integrating context and occlusion for car detection by hierarchical and-or model. In ECCV, 2014. 30

112. S. Z. Li and S. Singh. Markov random field modeling in image analysis, volume 26. Springer, 2009. 42

113. D. Lin, S. Fidler, and R. Urtasun. Holistic scene understanding for 3d object detection with rgbd cameras. In ICCV, 2013. 28

114. B. Liu, S. Gould, and D. Koller. Single image depth estimation from predicted semantic labels. In CVPR, 2010. 5, 137

115. C. Liu, J. Yuen, and A. Torralba. Nonparametric scene parsing via label transfer. IEEE Trans. PAMI, 33(12):2368–2382, 2011. 38, 39, 100, 110

116. D. Liu, X. Chen, and Y.-H. Yang. Frequency-based 3d reconstruction of transparent and specular objects. In CVPR, 2014. 46

117. L. Liu and S. Sclaroff. Region segmentation via deformable model-guided split and merge. In ICCV, 2001. 34

118. M. Liu, X. He, and M. Salzmann. Building scene models by completing and hallucinating depth and semantics. In ECCV, 2016. 137

119. W. Liu, R. Ji, and S. Li. Towards 3d object detection with bimodal deep boltzmann machines over rgbd imagery. In CVPR, 2015. 28

120. H. Lodhi, G. Karakoulas, and J. Shawe-Taylor. Boosting the margin distribution. In IDEAL, 2000. 116

121. D. G. Lowe. Object recognition from local scale-invariant features. In ICCV, 1999. 17, 27

122. R. C. Luo, P.-J. Lai, and V. W. S. Ee. Transparent object recognition and retrieval for robotic bio-laboratory automation applications. In IROS, 2015. 47

123. I. Lysenkov, V. Eruhimov, and G. Bradski. Recognition and pose estimation of rigid transparent objects with a kinect sensor. In RSS, 2012. 47, 77

124. I. Lysenkov and V. Rabaud. Pose estimation of rigid transparent objects in transparent clutter. In ICRA, 2013. 47

125. C. Ma, X. Lin, J. Suo, Q. Dai, and G. Wetzstein. Transparent object reconstruction via coded transport of intensity. In CVPR, 2014. 46

126. S. Mahamud, L. R. Williams, K. K. Thornber, and K. Xu. Segmentation of multiple salient closed contours from real images. IEEE Trans. PAMI, 25(4):433–444, 2003. 35

127. M. Maire, S. Yu, and P. Perona. Object detection and segmentation from joint embedding of parts and pixels. In ICCV, 2011. 6, 32, 53

128. S. Maji, A. C. Berg, and J. Malik. Classification using intersection kernel support vector machines is efficient. In CVPR, 2008. 18

129. S. Maji and J. Malik. Object detection using a max-margin hough transform. In CVPR, 2009. xvii, 21, 25, 54, 62, 66, 67, 71, 72, 75

130. J. Malik, S. Belongie, T. Leung, and J. Shi. Contour and texture analysis for image segmentation. IJCV, 43(1):7–27, 2001. 85, 101

131. P. Mallapragada, R. Jin, A. Jain, and Y. Liu. Semiboost: Boosting for semi-supervised learning. IEEE Trans. PAMI, 31(11):2000–2014, 2009. 49, 116, 121, 123

132. D. Martin, C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans. PAMI, 26(5):530–549, 2004. 81, 82, 90, 91, 108

133. P. McCullagh and J. A. Nelder. Generalized linear models, volume 2. Chapman and Hall London, 1989. 38

134. K. McHenry and J. Ponce. A geodesic active contour framework for finding glass. In CVPR, 2006. 9, 46, 77

135. K. McHenry, J. Ponce, and D. Forsyth. Finding glass. In CVPR, 2005. 9, 44, 45, 77, 84, 91, 101

136. D. Meger, C. Wojek, J. J. Little, and B. Schiele. Explicit occlusion reasoning for 3d object detection. In BMVC, 2011. 30

137. F. Mériaudeau, R. Rantoson, D. Fofi, and C. Stolz. Review and comparison of non-conventional imaging systems for three-dimensional digitization of transparent objects. Journal of Electronic Imaging, 21(2):021105–1, 2012. 44

138. F. Metelli. The perception of transparency. Scientific American, 1974. 45

139. K. Mikolajczyk, B. Leibe, and B. Schiele. Multiple object class detection with a generative model. In CVPR, 2006. 23, 26

140. Y. Ming, H. Li, and X. He. Connected contours: A new contour completion model that respects the closure effect. In CVPR, 2012. 92

141. T. Minka. The summation hack as an outlier model. Tutorial note, 2003. 21

142. G. Mori, X. Ren, A. A. Efros, and J. Malik. Recovering human body configurations: Combining segmentation and recognition. In CVPR, 2004. 35

143. R. Mottaghi, X. Chen, X. Liu, N.-G. Cho, S.-W. Lee, S. Fidler, R. Urtasun, et al. The role of context for object detection and semantic segmentation in the wild. In CVPR, 2014. 32

144. H. Murase. Surface shape reconstruction of an undulating transparent object. In ICCV, 1990. 9, 45, 77

145. N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from rgbd images. In ECCV, 2012. xvii, 27, 53, 63, 68, 69

146. T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. PAMI, 24(7):971–987, 2002. 17

147. A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42(3):145–175, 2001. 39

148. B. Ommer and J. Malik. Multi-scale object detection by clustering lines. In ICCV, 2009. 23, 26

149. A. Opelt, A. Pinz, and A. Zisserman. Learning an alphabet of shape and appearance for