Given an image sequence I1:T ={I1, I2, . . . , IT}, our segmentation task translates to finding
the target object’s contour Ct in each image It, yielding the contour sequence C1:T =
{C1, C2, . . . , CT}. Similarly, behavior recognition amounts to determining the action class
st which corresponds to the observed image It, yielding the action class sequence s1:T =
{s1, s2, . . . , sT}. The action classes that compose the behaviors under study belong to a
finite set S ={S1, S2, . . . , SM}. The different behaviors (and their component actions) are
by means of segmentation. Formally, this can be written as At = fA(It, Ct), where fA is a
function which associates to a given image It and contour Ct the corresponding extracted
attribute At.
In this context, we model the joint segmentation and behavior recognition problem using the Dynamic Bayesian Network (DBN) shown in Fig. 3.4. The figure represents the model for two time slices, t−1 and t; the dots indicate that the DBN structure and parameters repeat in the same way from the first time slice up to the one corresponding to the last image in the modeled sequence. Our model couples an HMM, whose hidden state at time t is given by the action class st, with a probabilistic generative segmentation model in which the image It depends on the contour Ct and the attribute At. The coupling of the two models at each time t is realized through the attribute At. We represent observed variables by shaded nodes (the images It, t = 1..T) and hidden variables by clear nodes (the classes st, the attributes At and the contours Ct, t = 1..T). Moreover, we depict discrete variables by square nodes (the classes st, t = 1..T) and continuous variables by circular ones (the attributes At, the contours Ct and the images It, t = 1..T).
Figure 3.4: The Dynamic Bayesian Network supporting our joint segmentation / behavior recognition framework. This model can be regarded as containing an HMM (in the upper half), coupled with a probabilistic segmentation model (in the lower half). For time slice t, the hidden state of the HMM is given by action class st. Within the generative segmentation model, the image It is dependent on the contour Ct and the attribute At. The observation at time t is given by the image It. We depict hidden variables by clear nodes and observed variables by shaded nodes. The square nodes designate discrete variables, whereas circular ones designate continuous variables.
joint variable distribution:
$$P(I_{1:T}, C_{1:T}, A_{1:T}, s_{1:T}) = \prod_{t=1}^{T} P(I_t \mid A_t, C_t)\, P(C_t)\, P(A_t \mid s_t)\, P(s_t \mid s_{t-1}), \tag{3.1}$$
where P(s1|s0) ≡ P(s1) is the initial action class distribution. In the following, we explain the assumptions underlying our model and we detail each of the probability factors from the right-hand side product in (3.1).
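As an illustration of how the factorization (3.1) can be evaluated, the following minimal sketch accumulates the logarithm of the product as a sum over time slices. The function name `joint_log_prob` and the callables `log_p_image`, `log_p_contour` and `log_p_attr` are hypothetical placeholders for the application-dependent factors; they are not part of the framework itself.

```python
import math

def joint_log_prob(images, contours, attrs, classes,
                   log_p_image, log_p_contour, log_p_attr,
                   log_trans, log_init):
    """Log of (3.1): sum over time slices of the four log-factors.

    The three callables are hypothetical stand-ins for the
    application-dependent models; log_trans[i][j] = log t_ij and
    log_init[i] = log pi_i.
    """
    total = 0.0
    prev = None
    for img, c, a, s in zip(images, contours, attrs, classes):
        total += log_p_image(img, a, c)   # log P(I_t | A_t, C_t)
        total += log_p_contour(c)         # log P(C_t)
        total += log_p_attr(a, s)         # log P(A_t | s_t)
        # By convention, P(s_1 | s_0) is the initial distribution.
        total += log_init[s] if prev is None else log_trans[prev][s]
        prev = s
    return total

# Toy usage with constant (zero) log-factors, so that only the HMM part counts.
LOG_INIT = [math.log(0.5), math.log(0.5)]
LOG_TRANS = [[math.log(0.9), math.log(0.1)],
             [math.log(0.2), math.log(0.8)]]
toy_value = joint_log_prob(
    [None] * 3, [None] * 3, [None] * 3, [0, 0, 1],
    lambda img, a, c: 0.0, lambda c: 0.0, lambda a, s: 0.0,
    LOG_TRANS, LOG_INIT,
)
```

Working in the log domain avoids numerical underflow when the sequence is long, since the product of many small probabilities becomes a sum.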
Our model relies on the first-order Markov assumption, namely that the action class at time t depends only on the action class at time t−1 and is independent of the action classes before time t−1:
$$P(s_t \mid s_{t-1}, s_{t-2}, \ldots) \equiv P(s_t \mid s_{t-1}). \tag{3.2}$$
The right-hand side of (3.2), which is part of our model (3.1), is considered to be independent of time, and thus definable in terms of the set of action class transition probabilities T = {tij}:
$$P(s_t = S_j \mid s_{t-1} = S_i) = t_{ij}, \quad i, j = 1, \ldots, M, \tag{3.3}$$
under the standard stochastic constraints:
$$t_{ij} \ge 0, \qquad \sum_{j=1}^{M} t_{ij} = 1. \tag{3.4}$$
The initial action class distribution, corresponding to the action class of the first image in a sequence, is given by π ={πi}, with
$$\pi_i = P(s_1 = S_i), \quad i = 1, \ldots, M. \tag{3.5}$$
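The HMM part of the model, defined by (3.3) to (3.5), can be sketched as follows. The numerical values of the transition matrix and of the initial distribution are invented for illustration (M = 3), and the function names are hypothetical. Classes are indexed from 0 in the code, whereas the text indexes them from 1.

```python
import random

def check_stochastic(trans, init, tol=1e-9):
    """Return True if every row of trans, and init, is non-negative and sums to 1."""
    def is_distribution(probs):
        return all(p >= 0.0 for p in probs) and abs(sum(probs) - 1.0) <= tol
    return all(is_distribution(row) for row in trans) and is_distribution(init)

def sample_action_classes(trans, init, length, rng):
    """Draw s_1 from the initial distribution, then each s_t from row s_{t-1} of trans."""
    classes = list(range(len(init)))
    seq = [rng.choices(classes, weights=init)[0]]
    for _ in range(length - 1):
        seq.append(rng.choices(classes, weights=trans[seq[-1]])[0])
    return seq

# Hypothetical parameters for M = 3 action classes.
T_MAT = [[0.8, 0.2, 0.0],
         [0.1, 0.7, 0.2],
         [0.3, 0.0, 0.7]]
PI = [1.0, 0.0, 0.0]

example_sequence = sample_action_classes(T_MAT, PI, 10, random.Random(0))
```

A zero entry in the transition matrix forbids the corresponding transition, which is one simple way to encode the allowed ordering of actions within a behavior.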
In order to incorporate the attributes At, the object contour Ct and the image It in
our probabilistic model, we need to treat those quantities as random variables. Within our DBN, which is founded on an HMM, this is achieved by defining the joint distribution of these variables given the action class st, that is, P (It, Ct, At|st). Directly working with
such a joint distribution is in general too complicated. The model can often be made more tractable by considering a simpler factorized distribution, where some of the dependencies between the variables are removed. In our model, we propose to use a joint distribution of the form
$$P(I_t, C_t, A_t \mid s_t) = P(I_t \mid A_t, C_t)\, P(C_t)\, P(A_t \mid s_t). \tag{3.6}$$
In our framework, the attributes At represent the essential characteristics of the object
captured in image It, which are relevant for the recognition task. The prior knowledge we
have about these attributes, associated with a particular action class, is given by P(At|st),
which represents the probability of the attributes At given the action class st. Of course,
on the type of attributes that were chosen. Thus, we leave the modeling of this probability as one of the “degrees of freedom” of our framework, to be chosen according to the application at hand. To simplify notation, we denote the attribute probability given an action class Si by:
$$P_i(A_t) = P(A_t \mid s_t = S_i). \tag{3.7}$$
To support cooperation with the segmentation process, we only require that these probabilities be modeled by functions Pi(At) which are differentiable with respect to At. Examples of models for this probability will be offered in Chapter 4, where we present implementations of our framework for particular applications.
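To make the differentiability requirement concrete, the sketch below uses a diagonal Gaussian as the attribute model Pi(At). This choice is only an assumption made for illustration and is not necessarily one of the models presented in Chapter 4; the function names are hypothetical.

```python
import math

def log_gaussian_attr(attr, mean, var):
    """log P_i(A_t) for a diagonal Gaussian (an assumed, illustrative model)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - 0.5 * (a - m) ** 2 / v
        for a, m, v in zip(attr, mean, var)
    )

def grad_log_gaussian_attr(attr, mean, var):
    """Gradient of log P_i(A_t) with respect to the attribute vector A_t."""
    return [-(a - m) / v for a, m, v in zip(attr, mean, var)]
```

The gradient vanishes at the class mean and pulls the attributes towards it elsewhere, which is the kind of closed-form derivative that a gradient-based segmentation scheme could use.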
The probabilities P(Ct) and P(It|At, Ct) in (3.1) constitute a probabilistic segmentation model, which we will translate into a variational segmentation formulation. In this context, we would like to note that the object contour C is a continuous function, belonging to an infinite-dimensional space. Generally, the modeling of probability distributions on infinite-dimensional spaces is an open issue. Thus, in practice, we consider a finite-dimensional representation of the contour, obtained by sampling over a regular grid.
In our framework, the prior probability of the contour P (Ct) is another free parameter,
which gives us the possibility to include (application-dependent) a priori knowledge about the target object contour, which is independent of the action class. As we have seen in Chapter 2, a common choice for this probability in the variational segmentation community favors a short length |Ct| of the segmenting contour:
$$P(C_t) \propto e^{-\nu |C_t|}, \quad \nu > 0. \tag{3.8}$$
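The length prior (3.8) can be illustrated on a discretized contour. The sketch below assumes, purely for illustration, that the contour is given as a closed polygon of sample points; this is not necessarily the grid-based representation used in our implementation, and the function names are hypothetical.

```python
import math

def contour_length(points):
    """Perimeter of a closed polygon given as a list of (x, y) vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

def log_contour_prior(points, nu):
    """Unnormalized log P(C_t) = -nu * |C_t|, following (3.8)."""
    if nu <= 0:
        raise ValueError("nu must be strictly positive")
    return -nu * contour_length(points)

# Unit square, used here only as a toy contour.
UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Since only the unnormalized log-prior is computed, the value is meaningful for comparing contours with each other, not as an absolute probability.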
Moreover, P (It|At, Ct) corresponds to a generative image formation model. This model
states that, given a set of prior attributes At and a prior contour Ct, an image It can be
obtained by sampling from the distribution P (It|At, Ct). In other words, this means that
we focus on the attributes and object contour only, and consider all the other properties of the image as resulting from random variations. The distribution P (It|At, Ct) represents the
probability of observing image It, given that Ct is the boundary of the object of interest and
At = fA(It, Ct) are the attributes extracted from the image via the function fA. Since fA
is a deterministic function of It and Ct, we need to give it a probabilistic interpretation
in order to be able to incorporate it into our model. A simple approach is to consider that the probability of observing an image It whose extracted contour is Ct and whose
extracted attributes At are different from fA(It, Ct), is zero. Formally, this can be achieved
by defining
$$P(I_t \mid A_t, C_t) \propto \delta\big(A_t - f_A(I_t, C_t)\big)\, e^{-E_{\mathrm{image}}(I_t, C_t)}, \tag{3.9}$$
where δ represents a Dirac distribution, which selects the images with the right attributes. Moreover, Eimage is a free parameter of our framework, given by a variational segmentation
energy, which expresses image-based constraints on the contour. It can be made up of any boundary- or region-based energy terms suitable for the application at hand (such as the ones adopted in [30] or [112], presented in Chapter 2). Denoting by Ω ⊂ ℝ² the image
assuming the image feature values I(x, y) (which can be scalar or vectorial) to be independent and identically distributed samples of two independent random processes, corresponding to the object and background region, respectively:
$$E_{\mathrm{image}}(I_t, C_t) = \iint_{\omega} -\log P\big(I_t(x, y) \mid (x, y) \in \omega\big)\, dx\, dy + \iint_{\Omega \setminus \omega} -\log P\big(I_t(x, y) \mid (x, y) \in \Omega \setminus \omega\big)\, dx\, dy. \tag{3.10}$$
A common modeling choice for the region probabilities P(It(x, y)|(x, y) ∈ ω) and P(It(x, y)|(x, y) ∈ Ω \ ω) is the Gaussian distribution. Concrete modeling examples for the application-dependent parameters of our framework, i.e., P(At|st), P(Ct) and Eimage(It, Ct), will be offered in Chapter 4.
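A discrete counterpart of the region-based energy (3.10) with Gaussian region models can be sketched as follows. The sketch assumes scalar pixel values, a binary mask marking the object region ω (taken here to be the interior of the contour), and known (mean, variance) pairs for the two regions; all names are hypothetical, and the integrals are replaced by sums over pixels.

```python
import math

def gaussian_nll(value, mean, var):
    """Negative log-density of a scalar Gaussian."""
    return 0.5 * math.log(2.0 * math.pi * var) + 0.5 * (value - mean) ** 2 / var

def region_energy(image, mask, obj, bg):
    """Discrete version of (3.10).

    image: rows of scalar pixel values; mask: rows of booleans, True inside
    the object region; obj and bg: (mean, variance) of the Gaussian models
    assumed for the object and the background.
    """
    energy = 0.0
    for img_row, mask_row in zip(image, mask):
        for value, inside in zip(img_row, mask_row):
            mean, var = obj if inside else bg
            energy += gaussian_nll(value, mean, var)
    return energy

# Toy image: dark background on the left, bright object on the right.
TOY_IMAGE = [[0.0, 0.0, 1.0, 1.0],
             [0.0, 0.0, 1.0, 1.0]]
GOOD_MASK = [[False, False, True, True],
             [False, False, True, True]]
BAD_MASK = [[not m for m in row] for row in GOOD_MASK]
```

With an object model of mean 1 and a background model of mean 0, the mask that matches the bright region yields the lower energy, which is the behavior that minimizing Eimage over the contour relies on.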