Estimated time per iteration for each database with mul-

Database        Time per iteration
300W Public     20 seconds
300W Private    24 seconds
Menpo           27 seconds
COFW            3 seconds
AFLW            69 seconds

With times like those estimated in Table 14, alternatives can be tested much more quickly. In addition, new training stages could be introduced without incurring such a high computational cost.

To analyse each point in more detail, we propose training several neural network models, each focused on a specific region of the face. Each model would observe only a given subset of the full set of points. This specialisation is expected to improve on the results obtained with the general network, which predicts all parts of the face.
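As a rough illustration of the per-region specialisation, the sketch below splits a 68-point annotation into regions and reassembles the outputs of the specialised models into a full set of points. The grouping in FACE_REGIONS follows the index ranges commonly used for the 68-point 300W markup, but it is an assumption made here, not a partition defined in this work. The function names select_region and merge_predictions are hypothetical.

```python
# Assumed grouping of the 68-point markup into facial regions (0-based indices).
# This partition is illustrative; the text does not fix a specific one.
FACE_REGIONS = {
    "jaw": range(0, 17),
    "eyebrows": range(17, 27),
    "nose": range(27, 36),
    "eyes": range(36, 48),
    "mouth": range(48, 68),
}


def select_region(landmarks, region):
    """Return only the (x, y) points that belong to one facial region."""
    return [landmarks[i] for i in FACE_REGIONS[region]]


def merge_predictions(per_region, num_points=68):
    """Reassemble a full landmark list from the outputs of region models.

    per_region maps a region name to the list of points predicted by the
    model specialised in that region. Points of regions that are not
    present in per_region stay as None.
    """
    full = [None] * num_points
    for region, points in per_region.items():
        indices = FACE_REGIONS[region]
        if len(points) != len(indices):
            raise ValueError(f"unexpected number of points for {region!r}")
        for idx, point in zip(indices, points):
            full[idx] = point
    return full


# Usage: split a dummy annotation into regions and put it back together.
landmarks = [(float(i), float(i)) for i in range(68)]
per_region = {name: select_region(landmarks, name) for name in FACE_REGIONS}
restored = merge_predictions(per_region)
```

Each specialised model would be trained only on the targets returned by select_region for its region, and merge_predictions would combine their outputs at test time.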

Finally, we propose training on several databases combined. Because the loss function was adapted in this work to handle missing labels, databases with different numbers of labels could be merged.
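One way a loss can tolerate missing labels is to skip unlabelled points when averaging the error. The sketch below is a minimal example under that assumption and is not the loss function used in this work. It maps the annotations of a smaller database into a common layout, marks absent points as None, and computes a masked mean squared error. The names to_common_layout and masked_l2_loss, and the index map in the usage lines, are hypothetical.

```python
def to_common_layout(points, index_map, num_points):
    """Place a database's points into a shared layout of num_points slots.

    index_map maps an index in the source annotation to its slot in the
    common layout. Slots without a source point are left as None.
    """
    full = [None] * num_points
    for src, dst in index_map.items():
        full[dst] = points[src]
    return full


def masked_l2_loss(pred, target):
    """Mean squared distance over the labelled points only.

    Entries of target that are None are treated as missing labels and do
    not contribute to the sum or to the normalisation.
    """
    total, count = 0.0, 0
    for p, t in zip(pred, target):
        if t is None:
            continue
        total += (p[0] - t[0]) ** 2 + (p[1] - t[1]) ** 2
        count += 1
    return total / count if count else 0.0


# Usage: a database that labels a single point, mapped into a 3-slot layout.
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
target = to_common_layout([(1.0, 3.0)], {0: 1}, 3)
loss = masked_l2_loss(pred, target)
```

Normalising by the number of labelled points keeps the loss on a comparable scale across samples from databases with different numbers of labels.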

