
UNIVERSIDAD POLITÉCNICA DE MADRID
ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN

TESIS DOCTORAL

CORRECTION OF THE COLOUR VARIATIONS UNDER UNCONTROLLED LIGHTING CONDITIONS

Juan Torres Arjona
Ingeniero de Telecomunicación

2014

Juan Torres Arjona: Correction of the Colour Variations under Uncontrolled Lighting Conditions, © 2014
Supervisor: Prof. José Manuel Menéndez García

DEPARTAMENTO DE SEÑALES, SISTEMAS Y RADIOCOMUNICACIONES
ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN

CORRECTION OF THE COLOUR VARIATIONS UNDER UNCONTROLLED LIGHTING CONDITIONS

Juan Torres Arjona
Ingeniero de Telecomunicación
2014

Supervisor: Prof. José Manuel Menéndez García


Department: Departamento de Señales, Sistemas y Radiocomunicaciones, Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid

PhD Thesis: Correction of the Colour Variations under Uncontrolled Lighting Conditions

Author: Juan Torres Arjona, Ingeniero de Telecomunicación (UPM)

Supervisor: Prof. José Manuel Menéndez García, Doctor Ingeniero de Telecomunicación (UPM)

Year: 2014

Committee appointed by the Rector of Universidad Politécnica de Madrid on . . . . . . .

Committee:
Prof. Guillermo Cisneros Pérez, Universidad Politécnica de Madrid
Prof. Eusebio Bernabéu Martínez, Universidad Complutense de Madrid
Prof. Luis Salgado Álvarez de Sotomayor, Universidad Politécnica de Madrid
Dr. Francisco Javier de la Portilla Muelas, Centro Superior de Investigaciones Científicas
Dr. Plinio Moreno López, Universidade de Lisboa
Dr. Luis Magdalena Layos, European Centre for Soft Computing
Dr. Marcos Nieto Doncel, Vicomtech-IK4

After the defence of the PhD Thesis on . . . . . . . , at the Escuela Técnica Superior de Ingenieros de Telecomunicación, the committee agrees to grant the following grade: . . . . . . .

PRESIDENT          MEMBERS          SECRETARY


ABSTRACT

This thesis discusses correction methods used to compensate for the variation of lighting conditions in colour image and video applications. These variations are such that Computer Vision algorithms that use colour features to describe objects mostly fail. Three research questions are formulated that define the framework of the thesis.

The first question addresses the similarities of the photometric behaviour between images of dissimilar adjacent surfaces. Based on the analysis of the image formation model in dynamic situations, this thesis proposes a model that predicts the colour variations of a region of an image from the variations of the surrounding regions. This proposed model is called the Quotient Relational Model of Regions. The model is valid when the light sources illuminate all of the surfaces included in the model and these surfaces are placed close to each other, have similar orientations, and are primarily Lambertian. Under certain circumstances, a linear combination is established between the photometric responses of the regions. No previous work proposing such a relational model was found in the scientific literature.

The second question examines whether those similarities could be used to correct the unknown photometric variations in an unknown region from the known adjacent regions. A method is proposed, called Linear Correction Mapping, which is capable of providing an affirmative answer under the circumstances previously characterised. A training stage is required to determine the parameters of the model. The method, initially designed for single-camera scenarios, is extended to cover non-overlapping multi-camera architectures. To this end, only several image samples of the same object acquired by all of the cameras are required. Furthermore, both the light variations and the changes in the camera exposure settings are covered by the correction mapping.

Every image correction method is unsuccessful when the image of the object to be corrected is overexposed or its signal-to-noise ratio is very low. Thus, the third question refers to the control of the acquisition process to obtain an optimal exposure under uncontrolled light conditions. A Camera Exposure Control method is proposed that is capable of holding a suitable exposure provided that the light variations can be accommodated within the dynamic range of the camera.

Each of the proposed methods was evaluated individually. The methodology of the experiments consisted of first selecting scenarios that cover the representative situations for which the methods are theoretically valid. Linear Correction Mapping was validated using three object re-identification applications (vehicles, faces and persons) based on the object colour distributions. Camera Exposure Control was tested in an outdoor parking scenario. In addition, several performance indicators were defined to objectively compare the results with other relevant state-of-the-art correction and auto-exposure methods. The results of the evaluation demonstrated that the proposed methods outperform the compared ones in most situations. Based on the obtained results, the answers to the above research questions are affirmative in limited circumstances: the hypotheses of the forecasting, the correction based on it, and the auto-exposure are feasible in the situations identified in the thesis, although they cannot be guaranteed in general. Furthermore, the presented work raises new questions and scientific challenges, which are highlighted as future research work.

RESUMEN

This thesis deals with correction methods that compensate for the variation of the lighting conditions in colour image and video applications. These variations often cause the failure of Computer Vision algorithms that use colour features to describe objects. Three research questions are formulated that define the framework of this thesis.

The first question addresses the similarities between images of adjacent surfaces regarding their photometric behaviour. Based on the analysis of the image formation model in dynamic situations, this thesis proposes a model capable of predicting the colour variations of a region of a given image from the variations of the neighbouring regions. This model is called the Quotient Relational Model of Regions. The model is valid when the light sources illuminate all of the surfaces included in it; when these surfaces are close to each other and have similar orientations; and when they are mostly Lambertian. Under certain circumstances, the photometric response of a region can be related to the rest through a linear combination. No previous work proposing this type of relational model could be found in the scientific literature.

The second question goes one step further and asks whether these similarities can be used to correct unknown photometric variations in an equally unknown region from known adjacent regions. To this end, a method called Linear Correction Mapping is proposed that is capable of giving an affirmative answer to this question under the circumstances characterised previously. A prior training stage is required to compute the parameters of the model. The method, which initially works for a single camera, is extended to work on architectures with several cameras without overlap between their fields of view. For this, only several image samples of the same object captured by all of the cameras are needed. Moreover, this method takes into account both the lighting variations and the changes in the exposure parameters of the cameras.

All image correction methods fail when the image of the object to be corrected is overexposed or when its signal-to-noise ratio is very low. Thus, the third question asks whether an acquisition control process can be established that obtains an optimal exposure when the lighting conditions are not controlled. Accordingly, a method called Camera Exposure Control is proposed that is capable of maintaining a suitable exposure as long as the lighting variations can be accommodated within the dynamic range of the camera.

The proposed methods were evaluated individually. The methodology followed in the experiments consisted of, first, selecting some scenarios covering representative situations where the methods are theoretically valid. Linear Correction Mapping was validated in three object re-identification applications (vehicles, faces and persons) that used the colour distribution of the objects as features. Camera Exposure Control was tested in an outdoor parking. In addition, several indicators were defined that allowed an objective comparison of the results of the proposed methods with other relevant correction and auto-exposure methods from the state of the art.

The results of the evaluation demonstrated that the proposed methods improve upon the compared ones in most situations. Based on the obtained results, the answers to the posed research questions are affirmative, although in limited circumstances. That is, the hypotheses regarding the prediction, the correction based on it, and the auto-exposure are feasible in the situations identified throughout the thesis; however, they cannot be guaranteed to hold in general. Finally, some new questions and scientific challenges arising from the work presented in this thesis are pointed out as future research work.

To my cousin Carlos.

"Only he who knows is free, and freer still is he who knows more. Do not proclaim the freedom to fly; give wings instead."
"True science teaches, above all, to doubt and to be ignorant."
— Miguel de Unamuno (1864–1936)


It's time to fly up to the sky. Against the wind, rise on fire.
I see that everything wasn't wrong. I'm back again, nothing was lost.
Breathing new life, embrace my fate. Look around and believe again.
— Angelus Apatrida (Reborn)

ACKNOWLEDGEMENTS

After a dark period, full of neglect and doubt, I have finally reached my own rebirth; my return to the light. But it has not been my doing alone, since in this life there are very few things one can do alone. That is why I would like to thank all the people who have stood by my side throughout this time. For almost ten years my life has revolved around this Thesis, and until now I have met with more hardship than joy. So I cannot restrict myself to professional acknowledgements, because this work has been very personal, and many are those who have lent me a hand, each in their own way. You will have to forgive me, though, because there have been so many that it will be hard to express my gratitude to everyone who deserves it.

First of all, I would not have made it here were it not for José Manuel. It has been many years now, since we started seeing beyond the visible around the turn of the century. During this time I have learned a great deal from you; you have helped me grow professionally and made it possible for me to finish. Thank you! I also have to thank the rest of the people with whom I started this adventure called GaTV. Fede, Carlos Alberto, David, Nuria, Iago, and Usoa, who, although no longer in the group, is still one of us to me. It has cost us a great deal to get here, even though not everyone appreciates it as they should. Together we learned what a research group is, and we have fought hard to move it forward. Thanks to that, I have had the fortune of developing as a professional and of carrying this Thesis through.

I also cannot fail to be grateful to the people of the Intelligent Imaging group at TNO (the Netherlands), led by Ronald Kersten, where I spent my stay abroad and where I was treated wonderfully. They are largely to blame for my having finished this Thesis. Especially: Klamer, Henri, you made sure that I rediscovered my enthusiasm for research. From you I learned once again to think critically. You also showed me how important it is to have fun with it. Auke, even though we met not so long ago and belong to different worlds, we became good friends very quickly. You make things go smoothly. Your ever-present smile is contagious and cheers everyone up. Stay as you are! I also have Ninghang in mind; my internship was very friendly thanks to you.

One of the reasons that pushed me to pursue a doctorate was how much I enjoyed, and how satisfied I felt during, my final degree project. I finally saw that useful things could be done with the education I had received during my degree; that was what I had studied for. But it was also largely thanks to the people of the Instituto del Patrimonio Cultural de España (then Instituto del Patrimonio Histórico Español) with whom we collaborated: Marian, Miriam, Maca, Tomás, and Araceli. This has earned me a good friendship with them, but it is with Araceli that I have kept the most special relationship. You tried to warn me how hard and how poorly rewarded doing a doctorate was. Even so, you have always supported and helped me. You are a bit like a mother to me, and I have learned a great deal thanks to you, things I have been able to apply to my work here, but also to my personal life.

I cannot forget Carol. Although our paths do not share the same destination, you were a great travel companion who bore all the difficulties this adventure entailed with patience. My cousins from Albacete also deserve my thanks: Gabri, Jorge, Paquito, Fran, Wences... You have always trusted my abilities and, even though I have not yet landed a job at NASA, you respect me. Nor can I forget Ruth; we have known each other for many years and, despite the distance, I have always felt your support, advice, and understanding.

Life sometimes gives you pleasant and unexpected surprises. Ángel has been one of those rare cases. We connected from the start. Being bikers, sharing that blessed passion and way of life, was only an initial reason to get to know each other better. We quickly realised that we share many views on life and understand each other perfectly. Besides, you and Lorena have hearts too big for all of Villanueva. Since we met, quite a few years ago now, you have supported and encouraged me. You have been a shoulder to cry on and have given me wings in my lowest moments. And I do not know whether it is because we bikers are special people, but something similar happened to me with Morgan. You know we still owe ourselves a celebration with some GP and a few good rides.

However far one goes, one cannot forget where one comes from. My origins are very humble and rooted in the sierra, and that forges a certain character of which I am proud. Fortunately, I still have many ties that bind me to it. I cannot forget my oldest friends, with whom I grew up and whom I know I have by my side when I need them: Miguel and Samu. I also have to thank Raúl for all these years together; he has always appreciated, supported, and loved me. And of course, I owe a great deal to Elisa, a sister to me, who granted me one of the things I am proudest of: being godfather to her son Álvaro. Your support has always been unconditional and you have helped me get back up many times. Thank you!

To my own people, Kike, José, Ainhoa, Marga, Lucía, and Silvia, who have been there at every moment: you appreciate me just as I am, for better and for worse, and you knew I would finish this in the end. Now all that remains is to share it with you, wherever you are. Because you have never failed me, and I miss every moment we have spent together. You cannot imagine how grateful I am. I have no words.

And the last acknowledgement is for my family: uncles and aunts, cousins, parents, and sister. I owe you everything, and I am who I am thanks to you. Mum, Dad, with time I have come to appreciate how valuable the upbringing you gave me is. I also have to thank you for the values you passed on to me: you have always shown me that with work, honesty, humility, and sacrifice one can achieve almost anything. Nina, although you are the little sister, you have always protected me and tried to keep me on the right path; I suppose because you have never trusted my judgement much. But deep down you have always supported me, in each and every decision I have made, even when you did not understand them. And finally, I have to ask your forgiveness. Unfairly, you are the ones who have suffered me most during this time. Yet you have been able to put up with my mood swings and, often, my selfishness, and I know it has not been easy at all. I love you.

To all of you: I carry you in my heart and will be eternally grateful.

Juan


CONTENTS

i Dissertation
1 Introduction
   1.1 Motivation
   1.2 Objectives
   1.3 Research questions and dissertation outline
2 Image formation models
   2.1 Light reflectance models
   2.2 The photometric perspective of a camera
      2.2.1 Light control module
      2.2.2 Sensor chip
      2.2.3 Digital signal processor
      2.2.4 Models used in radiometric calibration
   2.3 Colour understanding
   2.4 Dynamic model
3 Prediction of the intensity variation of a region
   3.1 Quotients relational model of regions
      3.1.1 Single region and single light source
      3.1.2 Photometric independent regions
      3.1.3 Intercorrelated regions
   3.2 Problem statement
   3.3 Solution methods
      3.3.1 Outliers management
      3.3.2 Multicollinearity
      3.3.3 Positive regressors
   3.4 Experiments
      3.4.1 Statistics definition
      3.4.2 Evaluation strategy
      3.4.3 Terrace dataset
      3.4.4 MUCT dataset
      3.4.5 Parking dataset
      3.4.6 MCDL dataset
      3.4.7 Results discussion
   3.5 Conclusions
4 Photometric correction
   4.1 State of the art
   4.2 Proposed method in single cameras
      4.2.1 Scenes modelling and training mode
      4.2.2 Runtime mode
   4.3 Proposed method in non-overlapping cameras
      4.3.1 Training mode
      4.3.2 Runtime mode
   4.4 Experiments
      4.4.1 Performance indicators
      4.4.2 Evaluation strategy
      4.4.3 Implementation of the reference methods
      4.4.4 Vehicle re-identification in outdoor parking
      4.4.5 Face re-identification
      4.4.6 People re-identification in corridors
      4.4.7 Results discussion
   4.5 Conclusions
5 Exposure control during video acquisition
   5.1 State of the art
      5.1.1 The measurement of the light
      5.1.2 The processing of the indicators
      5.1.3 The actuation of the camera
   5.2 Proposed method
      5.2.1 Control variables
      5.2.2 Algorithm
      5.2.3 Actuation
   5.3 Experiments
      5.3.1 Performance indicators
      5.3.2 Evaluation strategy
      5.3.3 Results
      5.3.4 Discussion
   5.4 Conclusions
6 Conclusions
   6.1 Contributions
      6.1.1 Dynamic Image Formation Model
      6.1.2 Quotient Relational Model of Regions
      6.1.3 Linear Correction Mapping
      6.1.4 Camera Exposure Control
   6.2 Future work

ii Appendices
a Notation conventions
b Changes in the albedo within a flat surface
c Image datasets description
   c.1 Terrace dataset
   c.2 Parking dataset
   c.3 MCDL dataset
   c.4 MUCT dataset
d Extended results
   d.1 Terrace dataset
   d.2 Parking dataset
   d.3 MCDL dataset

Bibliography

LIST OF FIGURES

Figure 1.1 The Machine output display from the TV series Person of Interest
Figure 1.2 Example of pictures obtained under different photometric conditions
Figure 2.1 Example of a HDR processing
Figure 2.2 The IFM schema
Figure 2.3 Reflection diagram over a Lambertian-specular surface
Figure 2.4 Principle of the pinhole camera
Figure 2.5 The camera pipeline
Figure 2.6 The diagram of the modules of the Marlin colour camera
Figure 2.7 The lens pipeline
Figure 2.8 The sensor chip pipeline
Figure 2.9 The DSP pipeline
Figure 2.10 The colour processing unit pipeline
Figure 2.11 The colour temperature chart
Figure 2.12 A colour gamut example
Figure 3.1 Image formation process with two regions
Figure 3.2 Diagram of the albedo in two points of a flat surface
Figure 3.3 CDF of the G_res,ii distribution
Figure 3.4 The G_res,ii / G_ii function
Figure 3.5 Terrace dataset samples
Figure 3.6 Distribution of qf vs qb_i per band using RAW pictures (Terrace dataset)
Figure 3.7 q̂f vs qf for RAW images of Terrace dataset. LS regression
Figure 3.8 Residuals histogram per band of the RAW pictures of Terrace dataset
Figure 3.9 Distribution of qf vs qb_i per band using JPEG pictures (Terrace dataset)
Figure 3.10 Residuals histogram per band of the JPEG pictures of Terrace dataset
Figure 3.11 Distribution of qf vs qb_i per band using the γ-JPEG pictures (Terrace dataset)
Figure 3.12 q̂f vs qf for the γ-JPEG images of Terrace dataset
Figure 3.13 Residuals histogram per band of the γ-JPEG pictures of Terrace dataset
Figure 3.14 Face samples of MUCT database
Figure 3.15 Regions of MUCT database
Figure 3.16 Distribution of qf vs qb_1 per band of MUCT dataset
Figure 3.17 q̂f vs qf for MUCT dataset
Figure 3.18 Residuals histogram of MUCT dataset
Figure 3.19 Regions and locations of the Parking scene
Figure 3.20 Distribution of qf vs qb_1 per band of the location #1 of Parking dataset
Figure 3.21 q̂f vs qf for location #1 of Parking dataset
Figure 3.22 Residuals histogram of location #1 of Parking dataset. LSM
Figure 3.23 Residuals histogram of location #1 of Parking dataset. LSM-R
Figure 3.24 Distribution of qf vs qb_1 per band of the location #2 of Parking dataset
Figure 3.25 Distribution of qf vs qb_1 per band of the location #3 of Parking dataset
Figure 3.26 q̂f vs qf for location #3 of Parking dataset
Figure 3.27 MCDL dataset samples. Scene #1
Figure 3.28 MCDL dataset samples. Scene #2
Figure 3.29 Regions and locations of MCDL dataset
Figure 3.30 Distribution of qf vs qb_i per band (Camera #1 - Location #1 in MCDL dataset)
Figure 3.31 q̂f vs qf for camera #1 and location #1 of MCDL dataset
Figure 3.32 Residuals histogram per band of camera #1 and location #1 of MCDL dataset
Figure 3.33 Distribution of qf vs qb_i per band (Camera #1 - Location #2 in MCDL dataset)
Figure 3.34 q̂f vs qf for camera #1 and location #2 of MCDL dataset
Figure 3.35 Residuals histogram per band of camera #1 and location #3 of MCDL dataset
Figure 3.36 Distribution of qf vs qb_i per band (Camera #2 - Location #1 in MCDL dataset)
Figure 3.37 q̂f vs qf for camera #2 and location #1 of MCDL dataset
Figure 3.38 q̂f vs qf for camera #2 and location #2 of MCDL dataset
Figure 3.39 Residuals histogram per band of camera #2 and location #2 of MCDL dataset
Figure 3.40 Distribution of qf vs qb_i per band (Camera #2 - Location #3 in MCDL dataset)
Figure 4.1 Types of correction methods
Figure 4.2 Indoor surveillance scene modelling
Figure 4.3 Example of intraclass and interclass normalised Cumulative Histograms
Figure 4.4 Diagram of the evaluation strategy
Figure 4.5 ROC curve for Parking scene
Figure 4.6 People samples of MUCT dataset
Figure 4.7 ROC curve for MUCT dataset
Figure 4.8 People samples in MCDL dataset
Figure 4.9 Example of person correction in MCDL dataset
Figure 4.10 Region setup used by the Liu et al. algorithm
Figure 4.11 Example of forced segmentation errors for the training
Figure 4.12 MCDL dataset. ROC curve for camera #1
Figure 4.13 MCDL dataset. ROC curve for camera #2
Figure 4.14 MCDL dataset. ROC curve for both cameras
Figure 5.1 Comparison between an old and a modern camera
Figure 5.2 Robert Cornelius' self-portrait (1839)
Figure 5.3 The auto-exposure algorithms workflow
Figure 5.4 Variations of light metering patterns
Figure 5.5 Inverted T pattern for light metering
Figure 5.6 Programmed exposure modes in Photography
Figure 5.7 Example of AE control algorithm
Figure 5.8 Indicators in intensities distribution
Figure 5.9 The CEC flowchart
Figure 5.10 Estimate of the variation of the exposure function in B vs Ef curve
Figure 5.11 The timeline of the CEC evaluation
Figure 5.12 Visual comparison of AE methods
Figure 5.13 Comparison of the evolution of the bright and contrast in a sunny day with clouds alternation
Figure 5.14 Comparison of the evolution of the WSL and BSL in a sunny day with clouds alternation
Figure 5.15 Comparison of the evolution of the bright and contrast during unstable weather conditions
Figure 5.16 Comparison of the evolution of the WSL and BSL during unstable weather conditions
Figure B.1 Diagram of the albedo in two points of a flat surface
Figure C.1 Terrace dataset samples
Figure C.2 Regions of the Terrace scene
Figure C.3 Parking dataset samples
Figure C.4 Regions and locations of the Parking scene
Figure C.5 MCDL dataset samples. Scene #1
Figure C.6 MCDL dataset samples. Scene #2
Figure C.7 People samples in MCDL dataset
Figure C.8 Regions and locations of MCDL dataset
Figure C.9 Face samples of the MUCT database
Figure C.10 Regions of the MUCT database
Figure D.1 q̂f vs qf for RAW images of Terrace dataset
Figure D.3 Residuals histogram per band of the RAW pictures of Terrace dataset (ext.)
Figure D.4 q̂f vs qf for JPEG images of Terrace dataset
Figure D.6 Residuals histogram per band of the JPEG pictures of the Terrace dataset (ext.)
Figure D.7 q̂f vs qf for γ-JPEG images of Terrace dataset
Figure D.8 Residuals histogram per band of the γ-JPEG pictures of Terrace dataset (ext.)
Figure D.9 q̂f vs qf for location #2 of Parking dataset
Figure D.10 Residuals histogram of location #2 of Parking dataset. LSM
Figure D.11 Residuals histogram of location #2 of Parking dataset. LSM-R
Figure D.12 Residuals histogram of location #3 of Parking dataset. LSM
Figure D.13 Residuals histogram of location #3 of Parking dataset. LSM-R
Figure D.14 Residuals histogram per band of camera #1 and location #1 of MCDL dataset (ext.)
Figure D.15 q̂f vs qf for camera #1 and location #2 of MCDL dataset (ext.)
Figure D.16 Residuals histogram per band of camera #1 and location #2 of MCDL dataset
Figure D.18 Distribution of qf vs qb_i per band (Camera #1 - Location #3 in MCDL dataset)
Figure D.19 q̂f vs qf for camera #1 and location #3 of MCDL dataset
Figure D.21 Residuals histogram per band of camera #1 and location #3 of MCDL dataset (ext.)
Figure D.22 q̂f vs qf for camera #2 and location #1 of MCDL dataset (ext.)
Figure D.23 Residuals histogram per band of camera #2 and location #1 of MCDL dataset
Figure D.25 Distribution of qf vs qb_i per band (Camera #2 - Location #2 in MCDL dataset)
Figure D.26 q̂f vs qf for camera #2 and location #2 of MCDL dataset (ext.)
Figure D.27 Residuals histogram per band of camera #2 and location #2 of MCDL dataset (ext.)
Figure D.28 q̂f vs qf for camera #2 and location #3 of MCDL dataset
Figure D.30 Residuals histogram per band of camera #2 and location #3 of MCDL dataset

LIST OF TABLES

Table 3.1 Correlation between BG regions in Terrace dataset
Table 3.2 Regressors estimates for RAW pictures of Terrace dataset
Table 3.3 Regressions statistics for the RAW pictures of the Terrace scene. Red band
Table 3.4 Regressors estimates for JPEG pictures of Terrace dataset
Table 3.5 Regressions statistics for the JPEG pictures of the Terrace scene. Red band
Table 3.6 Regressors estimates for γ-JPEG pictures of Terrace dataset
Table 3.7 Regressions statistics for the γ-JPEG pictures of the Terrace scene. Red band
Table 3.8 Mean and variance of the normalised MSE result of the LSM over γ
Table 3.9 Regressors estimates of the MUCT dataset
Table 3.10 Regressions statistics for MUCT dataset
Table 3.11 Regressors estimates for location #1 of Parking dataset
Table 3.12 Regressions statistics for location #1 of Parking scene
Table 3.13 Regressions statistics for location #2 of Parking scene
Table 3.14 Regressions statistics for location #3 of Parking scene
Table 3.15 Correlation between BG regions in MCDL dataset (Camera #1 - Location #1). Green band
Table 3.16 Regressors estimates for location #1 of camera #1 of MCDL dataset. Green band
Table 3.17 Regressions statistics for location #1 of camera #1 of MCDL dataset. Green band
Table 3.18 Correlation between BG regions in MCDL dataset (Camera #1 - Location #2). Green band
Table 3.19 Regressions statistics for location #2 of camera #1 of MCDL dataset. Green band
Table 3.20 Correlation between BG regions in MCDL dataset (Camera #1 - Location #3). Green band
Table 3.21 Regressors estimates for location #3 of camera #1 of MCDL dataset. Green band
Table 3.22 Regressions statistics for location #3 of camera #1 of MCDL dataset. Green band
Table 3.23 Correlation between BG regions in MCDL dataset (Camera #2 - Location #1). Green band
Table 3.24 Regressors estimates for location #1 of camera #2 of MCDL dataset. Green band
Table 3.25 Regressions statistics for location #1 of camera #2 of MCDL dataset. Green band
Table 3.26 Correlation between BG regions in MCDL dataset (Camera #2 - Location #2). Green band
Table 3.27 Regressors estimates for location #2 of camera #2 of MCDL dataset. Green band
Table 3.28 Regressions statistics for location #2 of camera #2 of MCDL dataset. Green band
Table 3.29 Correlation between BG regions in MCDL dataset (Camera #2 - Location #3). Green band
Table 3.30 Regressions statistics for location #3 of camera #2 of MCDL dataset. Green band
Table 4.1 Error rate of correction in Parking scene
Table 4.2 SaROC for Parking scene
Table 4.3 Error rate of correction in MUCT dataset
Table 4.4 SaROC for MUCT dataset
Table 4.5 Error rate of cameras correction in MCDL dataset
Table 4.6 SaROC for MCDL dataset
Table 5.1 Tuning parameters for the CEC algorithm
Table 5.2 The values for the CEC settings used in the evaluation
Table 5.3 Mean and standard deviation of the control variables
Table 5.4 Rate time of unsuitable exposure and average distance to maximum contrast
Table A.1 Mathematical syntax
Table A.2 Physical magnitudes
Table A.3 Symbol definitions and models
Table D.1 Regressions statistics for the RAW pictures of the Terrace scenario. Green band
Table D.2 Regressions statistics for the RAW pictures of the Terrace scenario. Blue band
Table D.3 Regressions statistics for the JPEG pictures of the Terrace scenario. Green band
Table D.4 Regressions statistics for the JPEG pictures of the Terrace scenario. Blue band
Table D.5 Regressions statistics for the γ-JPEG pictures of the Terrace scenario. Green band
Table D.6 Regressions statistics for the γ-JPEG pictures of the Terrace scenario. Blue band
Table D.7 Estimated regressors for location #2 of Parking dataset
Table D.8 Estimated regressors for location #3 of Parking dataset
Table D.9 Correlation between BG regions in MCDL dataset (Camera #1 - Location #1). Red band
Table D.10 Correlation between BG regions in MCDL dataset (Camera #1 - Location #1). Blue band
Table D.11 Estimated regressors for location #1 of camera #1 of MCDL dataset. Red band
Table D.12 Estimated regressors for location #1 of camera #1 of MCDL dataset. Blue band
Table D.13 Regressions statistics for location #1 of camera #1 of MCDL dataset. Red band
Table D.14 Regressions statistics for location #1 of camera #1 of MCDL dataset. Blue band
Table D.15 Correlation between BG regions in MCDL dataset (Camera #1 - Location #2). Red band
Table D.16 Correlation between BG regions in MCDL dataset (Camera #1 - Location #2). Blue band
Table D.17 Estimated regressors for location #2 of camera #1 of MCDL dataset. Red band
Table D.18 Regressors estimates for location #2 of camera #1 of MCDL dataset. Green band
Table D.19 Estimated regressors for location #2 of camera #1 of MCDL dataset. Blue band
Table D.20 Regressions statistics for location #2 of camera #1 of MCDL dataset. Red band
Table D.21 Regressions statistics for location #2 of camera #1 of MCDL dataset. Blue band
Table D.22 Correlation between BG regions in MCDL dataset (Camera #1 - Location #3). Red band
Table D.23 Correlation between BG regions in MCDL dataset (Camera #1 - Location #3). Blue band
Table D.24 Estimated regressors for location #3 of camera #1 of MCDL dataset. Red band
Table D.25 Estimated regressors for location #3 of camera #1 of MCDL dataset. Blue band
Table D.26 Regressions statistics for location #3 of camera #1 of MCDL dataset. Red band
Table D.28 Correlation between BG regions in MCDL dataset (Camera #2 - Location #1). Red band
Table D.30 Estimated regressors for location #1 of camera #2 of MCDL dataset. Red band
Table D.32 Regressions statistics for location #1 of camera #2 of MCDL dataset. Red band
Table D.33 Regressions statistics for location #1 of camera #2 of MCDL dataset. Blue band
Table D.34 Correlation between BG regions in MCDL dataset (Camera #2 - Location #2). Red band
Table D.35 Correlation between BG regions in MCDL dataset (Camera #2 - Location #2). Blue band
Table D.36 Estimated regressors for location #2 of camera #2 of MCDL dataset. Red band
Table D.37 Estimated regressors for location #2 of camera #2 of MCDL dataset. Red band
Table D.38 Regressions statistics for location #2 of camera #2 of MCDL dataset. Red band
Table D.39 Regressions statistics for location #2 of camera #2 of MCDL dataset. Blue band
Table D.40 Correlation between BG regions in MCDL dataset (Camera #2 - Location #3). Red band
Table D.41 Correlation between BG regions in MCDL dataset (Camera #2 - Location #3). Blue band
Table D.42 Estimated regressors for location #3 of camera #2 of MCDL dataset. Red band
Table D.43 Regressors estimates for location #3 of camera #2 of MCDL dataset. Green band
Table D.44 Estimated regressors for location #3 of camera #2 of MCDL dataset. Blue band
Table D.45 Regressions statistics for location #3 of camera #2 of MCDL dataset. Red band
Table D.46 Regressions statistics for location #3 of camera #2 of MCDL dataset. Blue band

LIST OF ALGORITHMS

Algorithm 5.1 Camera Exposure Control
Algorithm 5.2 Additional functions of Camera Exposure Control
Algorithm 5.3 Exposure variation functions of Camera Exposure Control

ACRONYMS

ADC  Analog Digital Converter
AE  Automatic Exposure
AVI  Audio Video Interleave
AWB  Automatic White Balance
BG  background
BRDF  Bidirectional Reflectance Distribution Function
BSL  Black Saturation Level
BTF  Brightness Transfer Function
CCD  Charge-Coupled Device
CDF  Cumulative Distribution Function
CEC  Camera Exposure Control
CEF  Conditional Expectation Function
CFA  Colour Filter Array
CHD  Cumulative Histogram Distribution
CIE  Commission Internationale de l'Eclairage
CMF  Colour Matching Function
CMOS  Complementary Metal Oxide Semiconductor
CRF  Camera Response Function
CRT  Cathodic Ray Tube
DIFM  Dynamic Image Formation Model
DMT  Diagonal-Matrix Transform
DSLR  Digital Single Lens Reflex
DSP  Digital Signal Processor
EMD  Earth Mover's Distance
EMoR  Empirical Model of Response
EV  Exposure Value
FG  foreground
FoV  Field of View
GigaE  Gigabit Ethernet
HDR  High Dynamic Range
HSV  Hue, Saturation, Value
HVS  Human Visual System
ICCM  Inter Camera Correction Mapping
ICCR  Inter Camera Colour Response
ICM  Illumination Correction Mapping
IFM  Image Formation Model
IIC  Inverse–Intensity Chromaticity
JPEG  Joint Photographic Experts Group
LCM  Linear Correction Mapping
LS  Least Squares
LSM-R  LS RR Method
LuT  Look-up Table
MCDL  Multi–Camera Dynamic Light
MEC  Minimal Error Criterium
MERC  Mean value of the Explanatory regions and Residuals Covariance
MKV  Matroska Multimedia Container
MPEG  Moving Picture Experts Group
MR  Mean value of the Residuals
MRC  Mean value of the Residuals Covariance
MSE  Mean Squared Error
NLS  Non-Linear Squares
NR  Noise range
OER  Overexposure range
PLS  Partial Least Squares
PCA  Principal Components Analysis
QRMR  Quotient Relational Model of Regions
RANSAC  RANdom SAmple Consensus
RGB  Red, Green and Blue
ROC  Receiver Operator Characteristic
RoI  Region of Interest
RR  Robust Regression
RRF  Radiometric Response Function
SaROC  Surface above the Receiver Operator Characteristic
SIFT  Scale-Invariant Features Transform
SLR  Single Lens Reflex
SNR  Signal Noise Ratio
SoA  State of the Art
ST  Saturation Tolerance
SVR  Standard deviation of the Variances of the Residuals
USB  Universal Serial Bus
WB  White Balance
WSL  White Saturation Level

Part I

DISSERTATION


1 INTRODUCTION

Computer Vision systems obtain information from images. These images are generated by cameras, which are theoretically able to provide information similar to that perceived by the human eye¹. While cameras have existed for centuries, it was not until the 1990s that Computer Vision algorithms arose, driven by the rapid evolution of the capabilities of computers. Currently, the number of cameras is growing at a very high rate due to their lower cost and their powerful potential applications. The possible uses of cameras are therefore countless: surveillance; traffic and environment monitoring; entertainment and commercial information systems; and so on. As precursors of future technologies, TV series and movies provide a large number of examples of such future applications of cameras (Figure 1.1). Nevertheless, science fiction is still far from reality in this field.

Figure 1.1: The Machine output display from the TV series Person of Interest. The Machine is a mass surveillance computer system capable of extracting and inferring any type of information from any surveillance device (based on Pedia of Interest²).

On the one hand, Computer Vision systems are not usually linked to each other; therefore, the information required to enable a holistic approach is sometimes missing. Although camera networks are broadly deployed around the world for a diverse set of purposes [99] and all types of images and videos are shared on the Internet, access to the sources of content is usually limited. Nevertheless, the problem is likely not scientific [48], but commercial, social, or ethical in nature. Commercial interests and privacy issues set boundaries that must be respected. On the other hand, artificial intelligence algorithms intend to infer information in the manner that human beings do; such algorithms, however, still require a large amount of development [33].

The great diversity of conditions under which images of objects can be acquired generates dissimilarities in their appearance.

¹ Considering only cameras operating on the visible range.
² Pedia of Interest, Machine Point of View photos (consulted 08/2014) http://personofinterest.wikia.com/. Copyright of the original photographer or company and used under Fair Use as covered by Wikia's licensing.

Such dissimilarities involve the camera, the scenario, poses, occlusions, illumination, and so on. This thesis addresses the problem of the variations that are produced in the images of the same objects by differences in the photometric³ conditions.

1.1 Motivation

Figure 1.2: Example of pictures obtained under different photometric conditions by modifying the camera exposure settings: (a) initial exposure; (b) different white balance; (c) reduced exposure; (d) increased exposure. (a) Image taken using the correct exposure and colour. (b) Image taken with a change in the illuminant of the scene, which produces colour variations. (c) and (d) show a change in the exposure; the effect is similar to a decrease and an increase, respectively, in the intensity of the light source.

The appearance of an object within an image is determined by the way the camera captures the interaction between light and the object. Changes in the photometric conditions affect the appearance of the object (Figure 1.2). This change in appearance is aligned with the observation of Finlayson et al. [30]: "in [ . . . ] tasks, such as object recognition, to digital photography, it is important that the colours recorded by a device are constant across a change in the scene illumination. As an illustration, consider using the colour of an object as a cue in a recognition task. Clearly, if such an approach is to be successful, then this colour must be stable across illumination change. Of course it is not stable, since changing the colour of the illumination changes the colour of the light reflected from an object."

Thus, the photometric conditions influence the first stage of any Computer Vision system: the acquisition stage. If the appearance of the same object in multiple images is very different and this difference is not properly corrected, then the performance of the remaining stages (such as segmentation, classification, recognition, etc.) cannot be guaranteed.

³ Photometry is the science of the measurement of the light visible to the Human Visual System (HVS). Although some cameras capture information from non-visible wavelengths, we only account for images captured in the visible spectrum. Nevertheless, throughout this thesis photometric as well as radiometric magnitudes are used. For more information regarding the differences between Radiometry and Photometry see [53, Chapter 2].
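The exposure effect shown in panels (c) and (d) of Figure 1.2 can be approximated, for linear (non-gamma-encoded) intensities, as a per-pixel multiplicative gain followed by saturation. The following sketch is illustrative only and is not taken from the thesis; the function name and the [0, 1] intensity convention are our own assumptions.

```python
import numpy as np

def simulate_exposure_change(image, stops):
    """Approximate a camera exposure change of `stops` EV as a gain.

    Assumes `image` holds linear intensities in [0, 1]. One stop up
    doubles the collected light; values driven past 1.0 saturate,
    reproducing the over-exposure of Figure 1.2 (d).
    """
    gain = 2.0 ** stops
    return np.clip(image * gain, 0.0, 1.0)
```

An analogous gain applied independently per colour band mimics the illuminant change of Figure 1.2 (b).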

The possible photometric variations are due to changes in:

• The light sources: their intensity, and the number and type of illuminants.
• The position, orientation, and pose of the objects, camera, and light sources.
• The shape of the objects.
• The photometric response of the camera: the exposure settings or the camera properties.

Thus, maintaining the same appearance of objects across several acquisitions is quite difficult to achieve.

Two types of approaches are used to correct the effects of photometric variations: i) absolute methods, and ii) relative methods. Absolute methods do not require any reference from previously captured images; they base the correction on the image itself and on some information about the scenario. Nevertheless, the effectiveness of such methods is often limited because it is difficult to define a general method that works in every situation. In addition, absolute methods sometimes rely on calibration stages in which some knowledge of the illuminants is required; otherwise, the calibration is difficult to implement in a real environment. Relative methods use a suitable photometric condition as a reference, which is established as a target. The variation of the image with respect to this reference is then estimated and properly corrected. The complexity of these methods comes from the definition of a suitable reference and the measurement of the variation.

1.2 Objectives

The objectives of this thesis are described in its title: Correction of the Colour Variations under Uncontrolled Lighting Conditions. The main goal of this thesis is to study and propose image correction methods. Each correction method consists of replacing an undesirable photometric behaviour with a suitable one; thus, we seek photometrically invariant responses of the captured objects. These methods address colour variations, which involve changes in the spectral response of the lights and the surfaces that produce variations in the pixel intensities.

Furthermore, these methods must operate under uncontrolled lighting conditions. In the scope of this thesis, uncontrolled lighting conditions mean that several issues related to the lighting of the scene are unknown. More specifically, no previous knowledge of the type of light sources or their intensity is expected. The same lack of previous knowledge applies to the colour distributions of the surfaces and their type. Furthermore, the position and orientation of the elements of the scene are also unknown. The exceptions are the non-moving regions of the Field of View (FoV) of the camera, which are analysed to tune the methods. As a result, the correction applies to scenes taken by static cameras, or at least by cameras that keep the same FoV. Both absolute and relative correction methods are considered. In all cases, simple calibration stages and fast, low-cost computational algorithms are pursued.
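To make the relative approach described in Section 1.1 concrete, the sketch below estimates a per-band gain on a static reference patch and applies it to the whole image, a diagonal (von Kries-style) transform. It is a minimal illustration under our own conventions (RGB float arrays in [0, 1]) and is deliberately far simpler than the Linear Correction Mapping proposed later in this thesis.

```python
import numpy as np

def diagonal_correction(image, patch_ref, patch_now, eps=1e-6):
    """Relative photometric correction with a per-band (diagonal) gain.

    `patch_ref` and `patch_now` are the same static scene patch under
    the reference and the current photometric conditions; `image` is
    the current frame to correct. All arrays are RGB floats in [0, 1].
    """
    ref_mean = patch_ref.reshape(-1, 3).mean(axis=0)
    now_mean = patch_now.reshape(-1, 3).mean(axis=0)
    gain = ref_mean / (now_mean + eps)   # diagonal of a 3x3 matrix
    return np.clip(image * gain, 0.0, 1.0)
```

The quality of such a correction hinges on the choice of the reference patch, which is exactly the difficulty the relative methods face.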

1.3 Research questions and dissertation outline

To accomplish the stated objectives, two main problems arise:

1. The variation of the photometric responses of the objects cannot be computed directly, because the responses that should be corrected are unknown.
2. The desired information about the objects may be irrecoverable from their image intensity if the camera sensor receives too little or too much light.

Fortunately, during the acquisition process of a static camera, a large region of the captured image most likely does not change; therefore, its intensity variation can be calculated. The existence of this region depends mainly on the scenario and the application. For example, in surveillance applications, in which a large area of coverage is usually required, large zones of the FoV where nothing moves are likely to be found. In addition, it seems reasonable to think that, under certain circumstances, the light variation of objects illuminated by the same light sources is similar. Therefore, the non-moving regions can be used to estimate the photometric variations of the unknown target objects. Furthermore, current cameras often have programming interfaces that allow control of the acquisition process; these interfaces can be used to avoid undesirable captures.

The previous reasoning raises the following research questions:

a) Can we forecast the changes in the intensity of a region using the variations of the surrounding regions?
b) Can we use the previous forecasting results to properly correct the photometric variations of unknown objects in a camera? Can we extend the correction to a multi-camera architecture?
c) Can we ensure a well-exposed image capture by controlling the acquisition process? If so, under which conditions?

To the best of our knowledge, the first two issues have not been addressed before. Although the third issue has been tackled by other authors, who proposed automatic methods that control the camera exposure and the colour management, it remains an unsolved question.

The structure of this dissertation is established to answer these questions. First, Chapter 2 defines an Image Formation Model (IFM), which is used in the remainder of the dissertation. To achieve this definition, a complete study of the image acquisition pipeline is performed. The next three chapters answer the three research questions. Each chapter starts with a problem statement, followed by a review of the State of the Art (SoA) methods related to that problem. Going beyond the current work, we establish a proposal that addresses the problem; this proposal is then evaluated and, based on the generated results, analysed.

Chapter 3 addresses the photometric relationships between adjacent regions and proposes a method that forecasts the variation of the intensity of a region using the adjacent ones. This method works under certain circumstances; the boundaries within which it applies are also provided. In Chapter 4, we propose an algorithm based on the previous analysis that is capable of correcting the photometric variations of unknown objects in a non-overlapping multi-camera architecture.

The structure of this dissertation is established to answer the previous questions. First, Chapter 2 defines an Image Formation Model (IFM), which is used in the remainder of the dissertation. To achieve this definition, a complete study of the image acquisition pipeline is performed.

The next three chapters answer the three research questions. Each chapter starts with a problem statement, followed by a review of the State of the Art (SoA) methods related to that problem. Going beyond the current work, we then establish a proposal that addresses the problem. This proposal is later evaluated and, based on the generated results, analysed.

Chapter 3 addresses the photometric relationships between adjacent regions and proposes a method that forecasts the variation of the intensity of a region using the adjacent ones. This method works under certain circumstances; the boundaries within which it is valid are also provided. In Chapter 4, we propose an algorithm, based on the previous analysis, that is capable of correcting the photometric variations of unknown objects in a non-overlapping multi-camera architecture. Chapter 5 tackles the photometric correction from a different perspective: the algorithms analysed in this chapter control the acquisition process to hold an optimal exposure, avoiding under- and over-exposed captures.

Finally, the last chapter (Chapter 6) establishes some conclusions that provide a reasoned response to the research questions while identifying the scientific contributions of this dissertation. In addition, this chapter identifies the future work that would extend these contributions, namely the research lines identified throughout this thesis that fall outside the scope of the original objectives.


2 IMAGE FORMATION MODELS

(a) Darker image. (b) Lighter image. (c) Human-like perception.
Figure 2.1: The human eye and cameras observe the world differently. Camera sensors are more limited as regards photometric sensitivity. (a) and (b) are images of the same scene obtained under different exposure settings. (c) is an image similar to the one perceived by a human. Under-exposed pixels in (a) and over-exposed pixels in (b) are correctly exposed in (c) using an image processing technique called HDR.

The images that humans perceive as a source of information about the world depend on how the light interacts with it. Cameras are devices that capture these interactions and transform the incoming light into electronic signals. These signals, suitably processed and presented, represent a limited version of what the eyes can see¹. This limitation narrows the photometric conditions under which a picture is well-exposed, even though an eye can correctly perceive the scene (Figure 2.1).

Computer Vision algorithms are also less robust to light variations than the human brain. For those algorithms that use several samples of the same object (tracking, 3D reconstruction, recognition, re-identification, mosaicing, etc.), light changes are still an open problem.

Before designing Computer Vision algorithms that handle light variations, it is important to study the image formation process (Figure 2.2). To form a picture of a scene from the visible spectrum, at least one light source is required². The emitted light hits the objects and, depending on the properties of their surfaces, is reflected, refracted or scattered, undergoing different alterations. The produced changes define the surfaces and the scene geometry. These interactions are described via multiple models, usually called light reflectance models. They are introduced in Section 2.1.

1 For our purpose, this limitation is mainly related to the dynamic range. In the HVS, the dynamic range is several orders of magnitude greater than in any digital camera. Nevertheless, in aspects like sensitivity to spectral bands, digital cameras provide information that the HVS cannot perceive [105, 109]. Using the camera obscura fundamentals, cameras are even able to extract information not directly seen in the scene [103].
2 Except for some kinds of fluorescent surfaces, which are not addressed in this thesis.
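As a rough illustration of the HDR idea mentioned in the caption of Figure 2.1 — not necessarily the technique used to produce that figure — the following sketch fuses a darker and a lighter exposure by weighting each pixel according to how well-exposed it is. The Gaussian weighting and its parameters are assumptions of this sketch:

```python
import numpy as np

def naive_exposure_fusion(dark, light):
    """Fuse two differently exposed images of the same scene.

    Each pixel is weighted by its closeness to mid-grey, so
    under-exposed pixels are taken mostly from `light` and
    over-exposed pixels mostly from `dark`. A simplistic stand-in
    for proper HDR; inputs are float arrays scaled to [0, 1].
    """
    def well_exposedness(img):
        return np.exp(-((img - 0.5) ** 2) / (2 * 0.2 ** 2))

    w_dark, w_light = well_exposedness(dark), well_exposedness(light)
    total = w_dark + w_light + 1e-6  # avoid division by zero
    return (w_dark * dark + w_light * light) / total
```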

Figure 2.2: The image formation process. The radiation coming from the light source S is reflected by the object O, creating an image of it that is captured by the camera C, which forms a digital image B. (The figure labels the directions n̂s, n̂o, n̂e, n̂v, n̂c and the radiometric quantities Es, Ld, Le, Lo.)

When the light of a scene reaches a digital camera, it goes through a lens and is transformed into an electronic signal by the camera sensor. The camera implements several processing algorithms before the final digital image is formed. Section 2.2 addresses this process; knowing it makes it possible to understand how light variations affect the intensity values of the image.

Colour is a visual perception that deserves special attention. Like any perception, colour is subject to multiple interpretations. Indeed, a branch of visual Psychophysics [26] studies the relations between the physical measurements of a stimulus and the sensations that it produces. A colour image includes more information than a grey-scale image. The processing of this extra information and its subjective nature increase the complexity of the IFMs. Section 2.3 provides some insight into colour understanding.

Considering the goal of correcting the light variations, in Section 2.4 we define a dynamic IFM that accounts for the variations produced in the digital image when the light conditions change.

2.1 light reflectance models

Knowledge of the illumination and the design of the lighting is decisive for the success of any image capture process. In this thesis, the relevant issue is that any light source emits electromagnetic radiation, composed of photons, that interacts with matter. This radiation is characterised by its irradiance E [53, Section 2.3], the power of electromagnetic radiation received per unit area. The magnitude of the irradiance generally varies with wavelength. When the irradiance hits a surface, another radiation is emitted into a given solid angle. This radiation is called radiance L. The magnitude of the radiance also varies with wavelength and depends on the irradiance and the surface properties. The function that quantitatively describes the ratio of the incident irradiance to the reflected radiance of a surface is called the Bidirectional Reflectance Distribution Function (BRDF).
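For completeness, the BRDF admits a standard differential definition, written here in the notation of Figure 2.2; this is a textbook statement added as a complement, not a formula taken from the thesis:

$f_r(\hat{n}_s, \hat{n}_v, \lambda) = \dfrac{dL(\hat{n}_v, \lambda)}{dE(\hat{n}_s, \lambda)}$

where E is the incident irradiance, L the reflected radiance, λ the wavelength, and n̂s and n̂v the incident and viewing directions.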

The BRDF is a function of the incident and the reflected directions. Three types of surfaces can be distinguished depending on how they reflect the light:

1. Lambertian: reflects the light uniformly in all directions. The outgoing light is called diffuse or scattered.
2. Specular: reflects the light in one direction, following the laws of reflection.
3. Fluorescent: emits photons when an external radiation excites its ions, molecules or atoms.

Most real-world surfaces are a combination of the Lambertian and specular types. Fluorescent surfaces are not common and are not addressed in this thesis. Refracted light is also produced when the light goes through the surface; this effect is typical of changes in the transmission medium and is not addressed in this thesis either.

Figure 2.3: Radiation components and angles in the reflection phenomenon over a surface that combines Lambertian and specular behaviour. (The figure labels Es, n̂s, n̂o, n̂v, n̂e, Ld, Le, θd, θe and ρo.)

Lambertian surfaces take their name from Lambert [62], who formulated Lambert's cosine law in 1760. The law states that the outgoing light from a diffuse surface is directly proportional to the irradiance Es at the input (Figure 2.2) and to the cosine of the angle θd (the foreshortening factor) between the surface normal n̂o and the light source direction n̂s (Figure 2.3). This law may be expressed in terms of the radiance Ld as:

$L_d = E_s \, \rho_o \cos\theta_d = E_s \, \rho_o \, (\hat{n}_s \cdot \hat{n}_o)$  (2.1)

where ρo is the BRDF, which is constant over all directions for these surfaces. The BRDF is also called albedo, surface reflection coefficient or photometric response.

In 1975, Phong [82] introduced into this equation an additive diffuse term caused by the ambient light created by the inter-reflections. This term is called Phong's shading. Nevertheless, unless the irradiance is low, the value of this term is often negligible.
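As a brief worked example of Equation 2.1 (with illustrative values, not taken from the text): for a Lambertian surface with ρo = 0.6 illuminated at θd = 60°,

$L_d = E_s \cdot 0.6 \cdot \cos 60^{\circ} = 0.3 \, E_s$

that is, the reflected radiance is half of what the same surface would return under normal incidence (θd = 0°, Ld = 0.6 Es).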

Knowledge about specular reflections is even older than that about diffuse ones: Hero of Alexandria (AD 10–70) [44] set the basis of the laws of reflection. This type of reflection is also called highlight. There are several models for the specular reflection; Zhang et al. [121] provided a survey of the most representative ones. Phong's model is one of the most popular in the literature. This model represents the specular reflection as a power of the cosine of the angle θe between the viewing direction n̂v and the specular direction n̂e (Figure 2.3):

$L_e = E_s \, \rho_o \, (\cos\theta_e)^{k_{ep}}$  (2.2)

where θe = cos⁻¹(n̂v · n̂e). The specular direction can be expressed as n̂e = 2(n̂o · n̂s) n̂o − n̂s. As regards kep, more specular surfaces yield larger exponents.

Torrance and Sparrow [104] (1967) presented a more accurate model based on the idea of surfaces composed of randomly distributed mirrors, which can be modelled via a Gaussian function:

$L_e = E_s \, \rho_o \, e^{-(k_{et} \theta_e)^2}$  (2.3)

where θe is the same as in Equation 2.2 and ket has an interpretation similar to that of kep.

For common surfaces, which combine diffuse and specular properties, the combined radiance is the weighted sum of both components:

$L = K_d L_d + (1 - K_d) L_e$  (2.4)

with 0 ≤ Kd ≤ 1 depending on the surface. Considering Equation 2.3, Equation 2.4 yields:

$L = E_s \, \rho_o \left( K_d \cos\theta_d + (1 - K_d) \, e^{-(k_{et} \theta_e)^2} \right)$  (2.5)

In Computer Vision, specular reflections are undesirable because they often over-expose local pixels of the camera sensor even when most of the pixels are well-exposed. As Section 2.2 explains, these reflectance models become invalid for over-exposed pixels.

In 1998, Wolff et al. [117] added other types of reflections produced on rough specular surfaces, thereby improving the classic reflectance models by separating the diffuse models for rough and smooth surfaces. The difference between them is basically that the power of the rough-surface reflection is not equally distributed in every direction; rather, it can be seen as a lobe oriented in the specular direction with less magnitude than the specular component. Other approaches use these models for rendering purposes, e. g., via multiplexing the components of several light sources [91]. In Section 2.4, we use similar concepts to collect several light contributions at the same time.

The study of the lighting and the reflectance models is especially important in Computer Graphics. However, this introduction is sufficient for this thesis, because we are more interested in characterising the light that reaches a camera than in simulating the interactions between light and surfaces. Further reading and discussion can be found in [53, Chapter 2] and [98, Chapter 2].
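To make Equation 2.5 concrete, the following is a small numerical sketch; the parameter values at the bottom are illustrative assumptions, not values from the thesis:

```python
import numpy as np

def combined_radiance(E_s, rho_o, K_d, k_et, n_o, n_s, n_v):
    """Radiance of a mixed Lambertian/specular surface (Equation 2.5).

    `n_o`, `n_s`, `n_v` are unit vectors: the surface normal, the light
    source direction and the viewing direction, respectively.
    """
    cos_theta_d = max(np.dot(n_s, n_o), 0.0)          # foreshortening factor
    n_e = 2.0 * np.dot(n_o, n_s) * n_o - n_s          # specular direction
    theta_e = np.arccos(np.clip(np.dot(n_v, n_e), -1.0, 1.0))
    diffuse = K_d * cos_theta_d
    specular = (1.0 - K_d) * np.exp(-(k_et * theta_e) ** 2)
    return E_s * rho_o * (diffuse + specular)

# Example with illustrative values: light 30 degrees off the normal,
# camera looking along the normal.
n_o = np.array([0.0, 0.0, 1.0])
n_s = np.array([np.sin(np.pi / 6), 0.0, np.cos(np.pi / 6)])
n_v = np.array([0.0, 0.0, 1.0])
L = combined_radiance(E_s=100.0, rho_o=0.6, K_d=0.8, k_et=3.0,
                      n_o=n_o, n_s=n_s, n_v=n_v)
```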

2.2 the photometric perspective of a camera

The camera fundamentals have been known since the ancient Chinese [77] and Greeks [12]. The pinhole camera (or camera obscura) that they used is still a reference model in Computer Vision today. The pinhole camera model assumes that the light rays come through a small aperture and are projected onto a screen, forming an inverted image of the outside (Figure 2.4).

Figure 2.4: Principle of the pinhole camera. The light from the real world comes through a small hole and forms an inverted image (based on Wikipedia³).

The human eye follows the same principle of operation as the pinhole camera. Stockham [97] (1972) studied this parallelism between the representation of images within a camera and within the HVS. More realistic models account for the fact that light traces more complex paths inside the lens. These models are mainly relevant in Photogrammetry, which requires reliable measurements of the distances within the images. Since our interest lies only in the light intensities, we use the simpler pinhole camera principle.

Until Computer Vision arose, cameras and image processing techniques were designed to obtain representations of the real world suitable for the HVS. Applications capable of extracting information from the images beyond visual perception may change what a suitable way of forming and processing the images is. The evolution of surveillance cameras is an example of this change. At the beginning, these cameras were used for monitoring remote locations under the supervision of human beings. Thus, when a surveillance camera network was designed, the visual perception of the highest number of locations was often the priority. As a consequence, these cameras implemented compression algorithms that degraded the quality of the images but increased the number of monitored locations and minimised the bandwidth requirements. These cameras also had low-cost sensors that provided understandable but noisy images with a very limited dynamic range. When Computer Vision advancements made it possible to extract valuable information for surveillance tasks, such as the presence of intruders or the detection of abnormal behaviours, these cameras became inappropriate. Thus, better sensors and low-rate compression algorithms were built into the cameras.

This change implies that the photometric requirements, and also the adopted IFMs, depend on the camera and the image processing technique. In this thesis, we make assumptions that are not valid for every situation; thus, we clarify the validity of each assumption when it is established.

Depending on their functionality and usability, several types of cameras can be identified. Besides the mentioned surveillance cameras, there are: professional digital cameras for Computer Vision, TV cameras, Digital Single Lens Reflex (DSLR) cameras, compact cameras, camcorders, webcams, and so on. Further reading can be found in [52, Chapter 8].

3 Wikipedia, Pinhole camera (consulted 01/2014) http://en.wikipedia.org/wiki/Pinhole_camera

Cameras provide images or videos; nowadays, however, most cameras provide both. In this thesis, a video is a set of temporally consecutive images.

The photometric camera response is commonly known as the Camera Response Function (CRF). Radiometric calibration is the Computer Vision area whose goal is to estimate the CRF (also known as the Radiometric Response Function (RRF)), which maps the irradiance at the camera input to pixel intensities. We revisit the models used by the techniques that compute the CRF in Section 2.2.4; the techniques themselves are analysed in Section 4.1. Before that, we explain the architecture of a camera and the physical phenomena produced inside it.

Figure 2.5: The camera pipeline. (Diagram: the radiance Lo enters the light control block — lens and exposure control — producing the irradiances {Ei}; the sensor chip — photosensor, amplifier and ADC — yields the signals {Schip,i}; and the Digital Signal Processor — image wrapping, compression, signal adaptation and Colour Processor Unit — outputs the digital image B(u, v).)

In Figure 2.5, we depict a diagram of the general camera pipeline based on the models developed by Healey and Kondepudy [43], Jacobson et al. [52], Tsin et al. [111] and Szeliski [98]. Not every camera has all of the modules, nor are the modules necessarily arranged in this order, but every possible module usually fits this model. The model is composed of the following parts:

light control. The part composed of the lens and the mechanism that controls the exposure (Section 2.2.1).
sensor chip. In charge of collecting the photons and transforming them into a digital signal (Section 2.2.2).
digital signal processor (dsp). Adjusts the digital signal to produce the desired image in terms of colour, quality, and so on (Section 2.2.3).

In Figure 2.6, the camera pipeline of a commercial camera for Computer Vision (an AVT Marlin model) is shown as a real example. The diagram includes practically every module of Figure 2.5 except the lens, which is a separate component, and the compression module, which does not exist in this type of camera. The shading correction module corrects the fixed pattern noise (Section 2.2.2). Modules 4, 11–13 correspond to colour operations (Section 2.3). The diagram also includes some modules that are particular implementations (modules 7–10). For a complete explanation of each module, see [1].
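The following sketch is a toy forward simulation, under simplifying assumptions of our own (linear sensor, additive Gaussian noise, gamma-shaped response), of how the pipeline of Figure 2.5 turns irradiance into pixel intensities; it is not a model proposed in this thesis:

```python
import numpy as np

def simulate_pipeline(E, exposure_time, gain=1.0, gamma=1 / 2.2,
                      noise_std=0.01, rng=np.random.default_rng(0)):
    """Toy forward model inspired by the camera pipeline of Figure 2.5.

    `E` is the irradiance map reaching the sensor, scaled to [0, 1].
    The photosensor integrates it over `exposure_time`, the amplifier
    applies `gain`, noise is added, a gamma-shaped response stands in
    for the DSP stages, and the ADC quantises to 8 bits.
    """
    exposure = E * exposure_time                     # photosensor integration
    signal = gain * exposure                         # amplifier
    signal += rng.normal(0.0, noise_std, E.shape)    # sensor noise
    signal = np.clip(signal, 0.0, 1.0)               # saturation (over-exposure)
    response = signal ** gamma                       # non-linear response stage
    return np.round(response * 255).astype(np.uint8) # ADC quantisation
```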

Figure 2.6: The diagram of the modules of the Marlin colour camera (source [1]).

2.2.1 light control module

Figure 2.7: The lens pipeline. The radiance Lo goes through the lens (transmittance ηlens, which also introduces geometric distortion and vignetting) and the exposure control (T, N) to produce the sensor irradiance Esensor. GD: Geometric Distortion; VIG: Vignetting.

The first element is the lens, which guides the radiance from the real world, Lo, to the photoplane. The radiance is attenuated by a factor of ηlens (the optical transmittance). Geometric distortion is introduced in the lens due
