Afecto y acogida - DE RESULTADOS - Evaluación del proyecto de aulas cooperativas multitarea de

BLOQUE 1 DE RESULTADOS

1. Afecto y acogida

The goal of this dissertation is to improve the applicability of pedestrian detection

for real-life applications. Although we presented multiple techniques that

successfully improved both the accuracy and the detection speed, a guaranteed success in all circumstances is not reached. Real-life applications always have

challenges, which we currently do not take into account. These could be

as "simple" as occlusion or taking into account different appearances of the pedestrian such as falling down, which would require to use additional models

for parts of the person [65,85] or pose variations [95], but also as complex as

fog or night time, which make the use of commonly used cameras, functioning mainly in the visual spectrum, unreliable.

Therefore we should look into the integration with other hardware that could reliably used in these situations, to overcome such flaws. One possibility is the use of an Infra Red (IR) camera, which allows using the body heat of pedestrians. But such hardware is not limited to cameras, e.g. ultrasonic distance sensors could also be very helpfull. An important criterion when selecting such hardware is the complementarity over other used techniques, such that flaws of one technology are solved by another. The output of such hardware could be combined to obtain a higher average detection performance

[54], but should also benefit from a technique to detect the context (night time,

fog, ...) that detectors that could not handle these circumstances are avoided. Currently we can find an integration of Infra Red cameras and visual spectrum

cameras in literature [54], where the heat information is used as an additional

channel in a Channel Based detector (ACF). But this is not the only, or maybe not even the best, way to combine multiple information sources. Training a single model based on all the information sources does not take into account that certain sensors will not perform well in all circumstances (such as visual spectrum cameras at night). Therefore it may be more beneficial to create multiple models, and combine them similarly to our The Combinator of chapter

3technique, where the confidence depends on the context it is used in (day, night, fog, ...) such that the best working technique has the advantage. Techniques could also be implemented as a cascade. Similar to the technique of the VeryFast

detector [6] where pedestrians are only searched for on elements perpendicular

to the ground plane (Stixels) matching a pedestrian’s height, pedestrians, or part-models when occlusion is taken into account, can be searched for only on locations that radiate sufficient heat, according to an IR-camera.

But also inside the possibilities of classic pedestrian detection, and our techniques, there are some flaws. For example, in the exploitation of the

ground constraint we described in section5.5, we calibrate a first order function

based on the pedestrian height of adults. Although the accuracy improves over a complete dataset when applying the ground constraint by pruning false detections. This works for the test dataset since the larger portion of the annotations is in fact composed of adult pedestrians, almost no children are present. Note that this implies that pedestrians that do not cope with this constraint, including (small) children, will incorrectly be ignored. In the context of safety-critical applications, such as detection of vulnerable road users, this is a large safety risk! A possible solution to cope with this particular problem, is that we could use multiple calibrations, each targeting a specific pedestrian scale (range). Another possibility would be to use the scale of the pedestrian as part of a tracker’s state vector.

Before using pedestrian detection techniques in practical applications, it should be validated on a dataset containing very diverse appearances of, and possible scenarios with, pedestrians. Each important type of occurrence (adults, children, cyclists, wheel chair users, elderly, people that have fallen down, ...) should be validated independently of the others under the applied constraints. This to ensure that all of these are sufficiently covered, and possible constraints do not form the risk of ignoring particular categories.

Bibliography

[1] Ahonen, T., Hadid, A., and Pietikäinen, M. Face recognition with local binary patterns. In European Conference on Computer Vision (ECCV). Springer, 2004, pp. 469–481.

[2] Anthony, W. C++ concurrency in action. practical multithreading, 2008. [3] Appel, R., Fuchs, T., Dollár, P., and Perona, P. Quickly boosting

decision trees-pruning underachieving features early. In JMLR Workshop

and Conference Proceedings (2013), vol. 28, JMLR, pp. 594–602.

[4] Bach, M. J. The design of the UNIX operating system, vol. 5. Prentice-Hall Englewood Cliffs, NJ, 1986.

[5] Benenson, R., Mathias, M., Timofte, R., and Van Gool, L. Fast stixel computation for fast pedestrian detection. In European Conference

on Computer Vision (ECCV) (2012), pp. 11–20.

[6] Benenson, R., Mathias, M., Timofte, R., and Van Gool, L. Pedestrian detection at 100 frames per second. In Computer Vision and

Pattern Recognition (CVPR) (2012), pp. 2903–2910.

[7] Benenson, R., Mathias, M., Tuytelaars, T., and Van Gool, L. Seeking the strongest rigid detector. In Computer Vision and Pattern

Recognition (CVPR) (2013).

[8] Benenson, R., Omran, M., Hosang, J., , and Schiele, B. Ten years of pedestrian detection, what have we learned? In European Conference

on Computer Vision Workshop (ECCVW) (2014).

[9] Benenson, R., Timofte, R., and Van Gool, L. Stixels estimation without depth map computation. In International Conference on Computer

Vision Workshops (ICCV Workshops) (2011), IEEE, pp. 2010–2017.

[10] Benfold, B., and Reid, I. Stable multi-target tracking in real-time surveillance video. In Computer Vision and Pattern Recognition (CVPR) (June 2011), pp. 3457–3464.

[11] Bennett, S. A history of control engineering. Control Engineering Series (1930).

[12] Bourdev, L., and Brandt, J. Robust object detection via soft cascade. In Computer Vision and Pattern Recognition (CVPR) (2005), vol. 2, IEEE, pp. 236–243.

[13] Boyle, R. Belgium has nearly 300,000 surveillance cameras. http://www.

xpats.com/belgium-has-nearly-300000-surveillance-cameras. Ac- cessed: 2015-08-09.

[14] Bradski, G. Open computer vision library. Dr. Dobb’s Journal of Software

Tools (2000).

[15] Breitenstein, M. D., Reichlin, F., Leibe, B., Koller-Meier, E., and Van Gool, L. Robust tracking-by-detection using a detector confidence particle filter. In International Conference on Computer Vision

(ICCV) (2009), IEEE, pp. 1515–1522.

[16] Bron, C., and Kerbosch, J. Algorithm 457: finding all cliques of an undirected graph. Communications of the ACM 16, 9 (1973), 575–577. [17] Chandrakasan, A. P., Potkonjak, M., Mehra, R., Rabaey, J., and

Broderse, R. W. Optimizing power using transformations. Computer-

Aided Design of Integrated Circuits and Systems (TCAD) 14, 1 (1995),

12–31.

[18] Cho, H., Rybski, P., Bar-Hillel, A., and Zhang, W. Real-time pedestrian detection with deformable part models. In Intelligent Vehicles

Symposium (IV) (August 2012).

[19] Dalal, N., and Triggs, B. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition (CVPR) (June 2005), vol. 2, pp. 886–893.

[20] De Beugher, S., Brône, G., and Goedemé, T. Automatic analysis of in-the-wild mobile eye-tracking experiments using object, face and person detection. In VISAPP (2014).

[21] De Smedt, F., and Goedemé, T. Open framework for combined pedestrian detection. In VISAPP (2015), pp. 551–559.

BIBLIOGRAPHY 153

[22] De Smedt, F., Hulens, D., and Goedemé, T. On-board real-time

tracking of pedestrians on a uav. In Computer Vision and Pattern

Recognition Workshops (CVPRW) (2015).

[23] De Smedt, F., Struyf, L., Beckers, S., Vennekens, J.,

De Samblanx, G., and Goedemé, T. Is the game worth the

candle? evaluation of opencl for object detection algorithm optimization.

Perservative and Embedded Computing and Communication Systems (PECCS) (2012), 284–291.

[24] De Smedt, F., Struyf, L., Beckers, S., Vennekens, J., De Samblanx, G., and Goedemé, T. Faster and more intelligent object detection by combining opencl and kr. Journal of Ambient Intelligence

and Humanized Computing (JAIHC) 5, 5 (2014), 635–643.

[25] De Smedt, F., Van Beeck, K., Tuytelaars, T., and Goedemé, T. Pedestrian detection at warp speed: Exceeding 500 detections per second. In Computer Vision and Pattern Recognition Workshops (CVPRW) (2013), IEEE, pp. 622–628.

[26] De Smedt, F., Van Beeck, K., Tuytelaars, T., and Goedemé, T. The combinator: optimal combination of multiple pedestrian detectors. In

International Conference on Pattern Recognition (ICPR) (2014), IEEE,

pp. 3522–3527.

[27] Dollár, P. Piotr’s Computer Vision Matlab Toolbox (PMT). http:

//vision.ucsd.edu/~pdollar/toolbox/doc/index.html.

[28] Dollár, P., Appel, R., Belongie, S., and Perona, P. Fast feature pyramids for object detection. Pattern Analysis and Machine Intelligence

(PAMI) 36, 8 (2014), 1532–1545.

[29] Dollár, P., Appel, R., and Kienzle, W. Crosstalk cascades for frame- rate pedestrian detection. In European Conference on Computer Vision

(ECCV). Springer, 2012, pp. 645–659.

[30] Dollár, P., Belongie, S., and Perona, P. The fastest pedestrian detector in the west. In British Machine Vision Conference (BMVC) (2010), vol. 2, Citeseer, p. 7.

[31] Dollár, P., Tu, Z., Perona, P., and Belongie, S. Integral channel features–addendum.

[32] Dollár, P., Tu, Z., Perona, P., and Belongie, S. Integral channel features. In British Machine Vision Conference (BMCV) (2009).

[33] Dollár, P., Wojek, C., Schiele, B., and Perona, P. Pedestrian detection: A benchmark. In Computer Vision and Pattern Recognition

(CVPR) (June 2009).

[34] Dollár, P., Wojek, C., Schiele, B., and Perona, P. Pedestrian detection: An evaluation of the state of the art. Pattern Analysis and

Machine Intelligence (PAMI) 99 (2011).

[35] Dubout, C., and Fleuret, F. Exact acceleration of linear object detectors. In European Conference on Computer Vision (ECCV). Springer, 2012, pp. 301–311.

[36] Enzweiler, M., and Gavrila, D. M. Monocular pedestrian detection:

Survey and experiments. Pattern Analysis and Machine Intelligence

(PAMI) 31, 12 (Dec. 2009), 2179–2195.

[37] Ess, A., Leibe, B., Schindler, K., , and van Gool, L. A mobile vision system for robust multi-person tracking. In Computer Vision and

Pattern Recognition (CVPR) (June 2008), IEEE.

[38] Everingham, M., Van Gool, L., Williams, C., Winn, J., and Zisserman, A. The pascal visual object classes challenge 2009. In 2th

PASCAL Challenge Workshop (2009).

[39] Felzenszwalb, P., McAllester, D., and Ramanan, D. A

discriminatively trained, multiscale, deformable part model. In Computer

Vision and Pattern Recognition (CVPR) (2008), IEEE, pp. 1–8.

[40] Felzenszwalb, P. F., Girshick, R. B., and McAllester,

D. Discriminatively trained deformable part models, release 4.

http://people.cs.uchicago.edu/ pff/latent-release4/.

[41] Felzenszwalb, P. F., Girshick, R. B., and Mcallester, D. Cascade object detection with deformable part models. In Computer Vision and

Pattern Recognition (CVPR) (2010).

[42] Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and Ramanan, D. Object detection with discriminatively trained part based models. Pattern Analysis and Machine Intelligence (PAMI) 32, 9 (2010), 1627–1645.

[43] Fukunaga, K., and Hostetler, L. D. The estimation of the gradient of a density function, with applications in pattern recognition. Transactions

BIBLIOGRAPHY 155

[44] Garey, M. R., and Johnson, D. S. Complexity results for multiprocessor

scheduling under resource constraints. SIAM Journal on Computing

(SICOMP) 4, 4 (1975), 397–411.

[45] Geiger, A. Probabilistic Models for 3D Urban Scene Understanding from

Movable Platforms. PhD thesis, Karlsruher Institut für Technologie (KIT),

2013.

[46] Geiger, A., Lauer, M., Wojek, C., Stiller, C., and Urtasun, R. 3d traffic scene understanding from movable platforms. Pattern Analysis

and Machine Intelligence (PAMI) (2014).

[47] Girshick, R. B., Felzenszwalb, P. F., and McAllester,

D. Discriminatively trained deformable part models, release 5.

http://people.cs.uchicago.edu/ rbg/latent-release5/.

[48] Gove, D. Multicore Application Programming: For Windows, Linux, and

Oracle Solaris. Addison-Wesley Professional, 2010.

[49] Grabner, H., and Bischof, H. On-line boosting and vision. In Computer

Vision and Pattern Recognition (CVPR) (2006), vol. 1, IEEE, pp. 260–267.

[50] Hariharan, B., Malik, J., and Ramanan, D. Discriminative

decorrelation for clustering and classification. In European Conference

on Computer Vision (ECCV ). Springer, 2012, pp. 459–472.

[51] Hosang, J., Omran, M., Benenson, R., and Schiele, B. Taking a deeper look at pedestrians. In Computer Vision and Pattern Recognition

(CVPR) (2015).

[52] Hulens, D., Goedemé, T., and Rumes, T. Autonomous lecture recording with a ptz camera while complying with cinematographic rules. In Canadian Conference on Computer and Robot Vision (CRV) (2014), IEEE, pp. 371–377.

[53] Hulens, D., Verbeke, J., and Goedemé, T. How to choose the best embedded processing platform for on-board uav image processing ? In International Joint Conference on Computer Vision, Imaging and

Computer Graphics Theory and Applications. (2015), IEEE.

[54] Hwang, S., Park, J., Kim, N., Choi, Y., and So Kweon, I. Multispectral pedestrian detection: Benchmark dataset and baseline. In

Computer Vision and Pattern Recognition (CVPR) (June 2015).

[55] Kalman, R. E., et al. A new approach to linear filtering and prediction problems. International Journal of basic Engineering (IJE) 82, 1 (1960), 35–45.

[56] Khronos Group. Opencl - the open standard for parallel programming of heterogeneous systems, 2011.

[57] King, D. E. Dlib-ml: A machine learning toolkit. Journal of Machine

Learning Research (JMLR) 10 (2009), 1755–1758.

[58] Kirk, D., and Hwu, W.-m. Programming massively parallel processors:

a hands-on approach. Newnes, 2012.

[59] Kristan, M., Pflugfelder, R., Leonardis, A., Matas, J., Porikli, F., Cehovin, L., Nebehay, G., Fernandez, G., and Vojir, T. The vot2013 challenge: overview and additional results. In Computer Vision

Winter Workshop (CVWW) (2014).

[60] Krizhevsky, A., Sutskever, I., and Hinton, G. Imagenet

classification with deep convolutional neural networks. In Advances in

neural information processing systems (NIPS) (2012), pp. 1097–1105.

[61] Kuhn, H. W. The hungarian method for the assignment problem. Naval

research logistics quarterly 2, 1-2 (1955), 83–97.

[62] Ladický, L., Sturgess, P., Alahari, K., Russell, C., and Torr, P. H. S. What, where and how many? Combining object detectors and CRFs. In European Conference on Computer Vision (ECCV) (2010), pp. 424–437. [63] Lowe, D. G. Distinctive image features from scale-invariant keypoints.

International Journal on Computer Vision (IJCV) 60, 2 (Nov. 2004),

91–110.

[64] Lucas, B. D., Kanade, T., et al. An iterative image registration technique with an application to stereo vision. In Internation Joined

Conference on Artificial Intelligence (IJCAI) (1981), vol. 81, pp. 674–679.

[65] Mathias, M., Benenson, R., Timofte, R., and Van Gool, L. Handling occlusions with franken-classifiers. In International Conference

on Computer Vision (ICCV) (2013).

[66] Munshi, A., Gaster, B., Mattson, T. G., and Ginsburg, D. OpenCL

programming guide. Pearson Education, 2011.

[67] Nam, W., Dollár, P., and Han, J. H. Local decorrelation for improved

pedestrian detection. In Advances in Neural Information Processing

Systems (NIPS) (2014), pp. 424–432.

[68] Naseer, T., Sturm, J., and Cremers, D. Followme: Person following and gesture recognition with a quadrocopter. In International Conference

BIBLIOGRAPHY 157

[69] Ojala, T., Pietikäinen, M., and Harwood, D. A comparative study of texture measures with classification based on featured distributions.

Pattern recognition 29, 1 (1996), 51–59.

[70] Okuma, K., Taleghani, A., De Freitas, N., Little, J. J., and Lowe, D. G. A boosted particle filter: Multitarget detection and tracking. In European Conference on Computer Vision (ECCV). Springer, 2004, pp. 28–39.

[71] Ouyang, W., and Wang, X. Joint deep learning for pedestrian detection. In International Conference on Computer Vision (ICCV) (2013), IEEE, pp. 2056–2063.

[72] Ouyang, W., and Wang, X. Single-pedestrian detection aided by multi- pedestrian detection. In Computer Vision and Pattern Recognition (CVPR) (2013), IEEE, pp. 3198–3205.

[73] Park, D., Ramanan, D., and Fowlkes, C. Multiresolution models for object detection. In European Conference on Computer Vision (ECCV). Springer, 2010, pp. 241–254.

[74] Park, D., Zitnick, C. L., Ramanan, D., and Dollár, P. Exploring weak stabilization for motion feature extraction. In Computer Vision and

Pattern Recognition (CVPR) (2013), IEEE, pp. 2882–2889.

[75] Pedersoli, M., Vedaldi, A., and Gonzalez, J. A coarse-to-fine approach for fast deformable object detection. In Computer Vision and

Pattern Recognition (CVPR) (2011), IEEE, pp. 1353–1360.

[76] Pestana, J., Sanchez-Lopez, J. L., Saripalli, S., and Campoy, P. Computer vision based general object following for gps-denied multirotor unmanned vehicles. In American Control Conference (ACC) (2014), IEEE, pp. 1886–1891.

[77] PETS. Pets 2010 benchmark data.

http://www.cvg.rdg.ac.uk/PETS2010/a.html, 2010.

[78] Prisacariu, V., and Reid, I. fastHOG - a real-time gpu implementation of HOG. Tech. rep., Department of Engineering Science, Oxford University, 2009.

[79] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al. Imagenet large scale visual recognition challenge. International Journal of

[80] Sadeghi, M. A., and Forsyth, D. Fast template evaluation with vector

quantization. In Advances in Neural Information Processing Systems

(NIPS) (2013), pp. 2949–2957.

[81] Sadeghi, M. A., and Forsyth, D. 30hz object detection with dpm v5. In European Conference on Computer Vision (ECCV). Springer, 2014, pp. 65–79.

[82] Siek, J. G., Lee, L.-Q., and Lumsdaine, A. The boost graph library. c++ in-depth series, 2002.

[83] Stix, V. Finding all maximal cliques in dynamic graphs. Computational

Optimization and applications 27, 2 (2004), 173–186.

[84] Sudowe, P., and Leibe, B. Efficient use of geometric constraints for sliding-window object detection in video. In Computer Vision Systems

(CVS). Springer, 2011, pp. 11–20.

[85] Tang, S., Andriluka, M., and Schiele, B. Detection and tracking of occluded people. International Journal of Computer Vision (IJCV) 110, 1 (2014), 58–69.

[86] Van Beeck, K., De Smedt, F., Beckers, S., Struyf, L., Vennekens, J., De Samblanx, G., Goedemé, T., and Tuytelaars, T. Towards robust automatic detection of vulnerable road users: monocular pedestrian tracking from a moving vehicle. In ATINER 7th annual international

conference on computer science and information systems (2011).

[87] Van Beeck, K., Goedemé, T., and Tuytelaars, T. A warping window approach to real-time vision-based pedestrian detection in a truck’s blind spot zone. In International Conference on Informatics in Control,

Automation and Robotics (ICINCO) (2012), vol. 2, pp. 561–568.

[88] Vermaak, J., Doucet, A., and Pérez, P. Maintaining multimodality through mixture tracking. In International Conference on Computer Vision

(ICCV) (2003), IEEE, pp. 1110–1116.

[89] Viola, P., and Jones, M. Rapid object detection using a boosted cascade of simple features. In Computer Vision and Pattern Recognition

(CVPR) (2001), vol. 1, pp. 511–518.

[90] Welch, G., and Bishop, G. An introduction to the kalman filter. 2006.

University of North Carolina: Chapel Hill, North Carolina, US (2006).

[91] Willems, J., Debard, G., Bonroy, B., Vanrumste, B., and Goedeme, T. How to detect human fall in video. In Positioning and

BIBLIOGRAPHY 159

[92] Xu, P., Davoine, F., and Denœux, T. Evidential combination of pedestrian detectors. In British Machine Vision Conference (BMCV) (2014).

[93] Yan, J., Lei, Z., Wen, L., and Li, S. Z. The fastest deformable part model for object detection. In Computer Vision and Pattern Recognition

(CVPR) (2014), IEEE, pp. 2497–2504.

[94] Yan, J., Zhang, X., Lei, Z., Liao, S., and Li, S. Z. Robust multi- resolution pedestrian detection in traffic scenes. In Computer Vision and

Pattern Recognition (CVPR) (2013), IEEE, pp. 3033–3040.

[95] Yang, Y., and Ramanan, D. Articulated pose estimation with flexible mixtures-of-parts. In Computer Vision and Pattern Recognition (CVPR) (2011), IEEE, pp. 1385–1392.

[96] Zeng, X., Ouyang, W., and Wang, X. Multi-stage contextual deep learning for pedestrian detection. In International Conference on Computer

Vision (ICCV) (2013), IEEE, pp. 121–128.

[97] Zhang, H., Geiger, A., and Urtasun, R. Understanding high-level semantics by modeling traffic patterns. In International Conference on

Computer Vision (ICCV) (2013).

[98] Zhang, S., Benenson, R., and Schiele, B. Filtered channel features for pedestrian detection. In Computer Vision and Pattern Recognition

(CVPR) (2015).

[99] Ziegler, J. G., and Nichols, N. B. Optimum settings for automatic controllers. American Society on Mechanical Engineering (ASME) 64, 11 (1942).

List of publications

International conference publications

[1] Sander Beckers, Gorik De Samblanx, Floris De Smedt, Toon Goedemé, Lars Struyf, and Joost Vennekens, Parallel sat-solving with opencl, IADIS International Conference on Applied Computing, 2011, pp. 435–441. [2] Sander Beckers, Gorik De Samblanx, Floris De Smedt, Toon Goedemé,

Lars Struyf, and Joost Vennekens, Parallel hybrid sat solving using opencl, Benelux Conference on Artificial Intelligence (BNAIC) (2012), 11–18. [3] Gorik De Samblanx, Floris De Smedt, Lars Struyf, Sander Beckers, Joost

Vennekens, and Toon Goedemé, Cpcpu: Coreful programming on the

cpu: Why a cpu can benifit from massive multithreading, Perservative

and Embedded Computing and Communication Systems (PECCS), 2012, pp. 196–199.

[4] Floris De Smedt, Ive Billiauws, and Toon Goedemé, Neural networks

and low-cost optical filters for plant segmentation, IADIS conference on

computer graphics, visualization, computer vision and image processing, 2010, pp. 1–8.

[5] Floris De Smedt and Toon Goedemé, Fast rotation invariant object detection

with gradient based detection models, VISAPP, vol. 2, VISIGRAPP, 2015,

pp. 400–407.

[6] Floris De Smedt and Toon Goedemé, Open framework for combined

In document Evaluación del proyecto de aulas cooperativas multitarea del Centro de Formación Padre Piquer. Una perspectiva émica (página 179-184)