We reformulated road extraction as a deep reinforcement learning problem and developed a system that trains an agent to trace roads on satellite imagery. We compared two reinforcement learning algorithms, deep Q-learning (DQN) and Asynchronous Advantage Actor-Critic (A3C), and presented preliminary results.
Although the current results are very limited, the agent is able to trace straight roads, turn left and right, and move backwards at dead ends. This demonstrates the feasibility of applying deep reinforcement learning to the road extraction task. Further work will aim to improve training performance, for example by adding a step that recognizes road junctions. A proper termination condition also needs to be specified, beyond simply counting the number of visited roads during training, so that the agent can perform inference on an unknown map that has no ground-truth information.
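To make the termination issue concrete, the following is a minimal sketch of a count-based stopping rule in a toy grid world. It is not the system described here: the class name `ToyRoadEnv`, the four-direction action set, the reward values, and the binary `road_mask` are all assumptions chosen for illustration. The point it shows is that the `done` flag is computed from the ground-truth mask, which is exactly what is missing on an unknown map.

```python
import numpy as np

# Hypothetical action set for the toy example (not the agent's real actions).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class ToyRoadEnv:
    """Toy grid world: True marks a road cell, False marks background."""

    def __init__(self, road_mask, start):
        self.road_mask = np.asarray(road_mask, dtype=bool)
        self.start = start
        self.reset()

    def reset(self):
        self.pos = self.start
        self.visited = {self.start}
        return self.pos

    def step(self, action):
        dr, dc = MOVES[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        h, w = self.road_mask.shape
        on_road = 0 <= r < h and 0 <= c < w and bool(self.road_mask[r, c])
        if on_road:
            self.pos = (r, c)
            is_new = self.pos not in self.visited
            self.visited.add(self.pos)
            # Assumed reward: +1 for a newly visited road cell, 0 for a revisit.
            reward = 1.0 if is_new else 0.0
        else:
            # Assumed penalty for trying to leave the road; the agent stays put.
            reward = -1.0
        # Count-based termination: compares visited cells with the number of
        # road cells in the ground-truth mask, so it only works in training.
        done = len(self.visited) == int(self.road_mask.sum())
        return self.pos, reward, done


mask = [[0, 1, 0],
        [0, 1, 0],
        [0, 1, 1]]
env = ToyRoadEnv(mask, start=(0, 1))
for action in ["down", "left", "down", "right"]:
    pos, reward, done = env.step(action)
    print(action, pos, reward, done)
```

In this sketch the episode ends only once every road cell in `road_mask` has been visited. At inference time no such mask exists, so a different stopping signal would have to replace the final comparison.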