

The second problem of the classification based on projection histograms is that it is too sensitive to non-rigid movements of the human body model. This problem has been addressed by constructing the Projection Probabilistic Maps (PPMs) with a supervised machine learning phase [139]. By means of manual annotation of the videos, the learning phase builds a model that represents the memory of people's appearance in each posture.

Figure 6.4: SP tracking with Kalman Filters. Trajectories of SP estimated over the blob (black) and tracked with the Kalman Filter (white).

The probabilistic approach included in the PPMs allows us to filter out the distracting moving parts of the body (such as the arms and the legs, which are not important for posture classification), thus solving the problem reported above. In fact, due to the non-rigid motion of the human body, these parts are likely to distract the classifier, resulting in misclassification of the posture. It must be pointed out (and it will become clearer further on) that this probabilistic approach, being based on a learning phase, shifts the problem to the completeness of the training set. We will demonstrate that, as soon as the training set contains a wide enough variety of samples, this approach achieves an average performance of 95% or above in correct classification.

Given a generic set of classes of postures $C = \{C_i\}$, $i = 1, \dots, K$, the probability of belonging to the class $C_i$ is:

$$P(C_B = C_i \mid Ph_B) = \frac{P(Ph_B \mid C_B = C_i)\, p(C_i)}{\sum_{j=1}^{K} P(Ph_B \mid C_B = C_j)\, p(C_j)} \tag{6.10}$$

The a priori probabilities $p(C_i)$ can be estimated from the habits of the observed person and the type of the supervised room (e.g., in a kitchen, people stand or sit more often than they lie down, whereas in a bedroom the opposite is more likely), or can be dynamically modified in accordance with the history of the blob $B$. For simplicity, we now assume $p(C_i)$ to be equal for all the classes $C_i$ and independent of the blob $B$, that is:

$$p(C_i) = \frac{1}{K}, \quad i = 1, \dots, K \tag{6.11}$$
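As a minimal sketch of Eqs. 6.10 and 6.11, the posterior over the $K$ classes can be computed from per-class likelihoods as follows. The function name `posterior` and the numeric likelihoods are hypothetical and chosen only for illustration; they do not come from the experiments.

```python
def posterior(likelihoods, priors=None):
    """Bayes rule over K posture classes (Eq. 6.10).

    likelihoods[i] stands for P(Ph_B | C_B = C_i), priors[i] for p(C_i).
    When priors is None, the uniform prior 1/K of Eq. 6.11 is used.
    """
    k = len(likelihoods)
    if priors is None:
        priors = [1.0 / k] * k
    # Numerator of Eq. 6.10 for every class
    joint = [lik * p for lik, p in zip(likelihoods, priors)]
    # Denominator of Eq. 6.10: sum over all classes
    evidence = sum(joint)
    if evidence == 0.0:
        raise ValueError("all classes have zero likelihood")
    return [j / evidence for j in joint]


# Hypothetical likelihood values for K = 4 classes
post = posterior([0.02, 0.06, 0.01, 0.01])
best = post.index(max(post))
```

With a uniform prior the ranking of the classes is decided by the likelihoods alone; passing a non-uniform `priors` list would model room-dependent habits such as those of the kitchen and bedroom examples.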

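The conditional probability of observing the histograms $Ph_B$, assuming the blob to be in the class $C_i$, is computed by treating the two projections $\theta_B$ and $\pi_B$ as independent. Thus:

$$P(Ph_B \mid C_B = C_i) = P(\theta_B \wedge \pi_B \mid C_B = C_i) = P(\theta_B \mid C_B = C_i) \cdot P(\pi_B \mid C_B = C_i) \tag{6.12}$$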

If we take $p(C_i)$ as a constant and neglect the normalization factor, Eq. 6.10 becomes proportional to the likelihood of Eq. 6.12 and acts as a similarity function between the current silhouette and the model of the silhouette in the posture $C_i$.

In order to calculate Eq. 6.12, the similarity between each projection histogram and the corresponding model must be computed. In a previous work [140], we tested the efficacy of PPMs by computing, in a very simple way, an average distance (arithmetic mean) between the current projection $\theta_B$ (or $\pi_B$) and the models given by the PPMs.

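According to probability theory, and considering the $\theta$ and $\pi$ projections as vectors of approximately independent measures, the two terms of Eq. 6.12 can be computed as the probability of the intersection of the events:

$$\begin{aligned}
P(\theta_B \mid C_B = C_i) &= P\left(\bigcap_{x=0}^{B_x-1} \big(\theta_B(x) \mid C_B = C_i\big)\right) = \prod_{x=0}^{B_x-1} P(\theta_B(x) \mid C_B = C_i) \\
P(\pi_B \mid C_B = C_i) &= P\left(\bigcap_{y=0}^{B_y-1} \big(\pi_B(y) \mid C_B = C_i\big)\right) = \prod_{y=0}^{B_y-1} P(\pi_B(y) \mid C_B = C_i)
\end{aligned} \tag{6.13}$$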
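The product of Eq. 6.13 can be sketched in a few lines. Working with logarithms is an assumption of this example (it avoids numerical underflow when many bins are multiplied) and is not stated in the text; the name `log_likelihood`, the table layout `ppm[x][v]` and the toy probabilities are likewise hypothetical.

```python
import math


def log_likelihood(hist, ppm):
    """Logarithm of the product in Eq. 6.13 for one projection.

    hist[x] is the observed integer value theta_B(x);
    ppm[x][v] stands for P(theta(x) = v | C_i).
    Returns -inf as soon as one bin has zero probability.
    """
    total = 0.0
    for x, value in enumerate(hist):
        p = ppm[x][value]
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


# Toy table for a projection with 3 bins and 2 admissible values per bin
toy_ppm = [[0.1, 0.9], [0.5, 0.5], [0.8, 0.2]]
ll = log_likelihood([1, 0, 0], toy_ppm)  # log(0.9 * 0.5 * 0.8)
```

The same function applies to the $\pi$ projection if its table is indexed by $y$ first.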

The probability distributions of the events $P(\theta_B(x) \mid C_B = C_i)$ and $P(\pi_B(y) \mid C_B = C_i)$ are estimated through a supervised learning phase, during which two 2D functions $\Theta_i(x, y)$ and $\Pi_i(x, y)$ are created for each class $C_i$ as follows:

$$P(\theta(x) = y \mid C_i) = \Theta_i(x, y); \quad P(\pi(y) = x \mid C_i) = \Pi_i(x, y) \tag{6.14}$$

where $\Theta_i(x, -)$ and $\Pi_i(-, y)$ are the probability distributions of $\theta(x)$ and $\pi(y)$, respectively, assuming the blob to be in the class $C_i$. We will refer to them as PPMs hereinafter.

The supervised learning phase for the construction of the above-mentioned maps is performed exploiting a training set $TS$ of $T_i$ 2D blobs referred to the $i$-th class, $TS = \{B_i^t\}$, $t = 1, \dots, T_i$, where the $B_i^t$ are blob masks defined similarly to Eq. 6.1. For each $B_i^t$, the couple $Ph_i^t = (\theta_i^t(x), \pi_i^t(y))$ of projection histograms is computed as in Eq. 6.2. Then, we construct $\Theta_i(x, y)$ and $\Pi_i(x, y)$ as follows:

$$\Theta_i(x, y) = \frac{1}{T_i} \sum_{t=1}^{T_i} g\left(\theta_i^t(x), y\right) \tag{6.15}$$

$$\Pi_i(x, y) = \frac{1}{T_i} \sum_{t=1}^{T_i} g\left(x, \pi_i^t(y)\right) \tag{6.16}$$

Figure 6.5: Example of PPMs compared with projection histograms (a), sparse PPMs (b), and dense PPMs (c) for the Standing frontal posture. Brighter colors correspond to higher probabilities.

Please note that we cannot generate a PPM by either simply averaging each training set contribution, or using a Gaussian distribution, since the measures computed for each sample (i.e., for each class) are multimodal. Thus, the $g(s, t)$ function must take into account all the variations of the samples. A possible $g(s, t)$ function could be:

$$g(s, t) = \begin{cases} 1 & \text{if } s = t \\ 0 & \text{otherwise} \end{cases} \tag{6.17}$$

This function simply accumulates all the training set information without generalization, and it is acceptable only if we have an almost infinite training set. In fact, even if the current histogram is very similar to those used during the learning phase, the probability (computed as the product in Eq. 6.13) could be zero if a single bin has a value that never occurred during training; this is due to the sparseness of the PPMs.
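The sparseness problem can be reproduced with a small sketch of Eq. 6.15 using the indicator function of Eq. 6.17. The helper names, the list-of-lists layout `theta_map[x][y]`, the parameter `n_values` and the toy training histograms are assumptions made for this illustration.

```python
def g_indicator(s, t):
    """Eq. 6.17: 1 when the two values coincide, 0 otherwise."""
    return 1.0 if s == t else 0.0


def build_theta_map(train_hists, n_values, g):
    """Accumulate Theta_i(x, y) = (1/T_i) * sum_t g(theta_i^t(x), y) (Eq. 6.15)."""
    t_i = len(train_hists)
    n_bins = len(train_hists[0])
    theta_map = [[0.0] * n_values for _ in range(n_bins)]
    for hist in train_hists:
        for x, value in enumerate(hist):
            for y in range(n_values):
                theta_map[x][y] += g(value, y) / t_i
    return theta_map


# Hypothetical training set: T_i = 4 histograms with 3 bins, values in 0..4
train = [[1, 3, 2], [1, 3, 2], [1, 4, 2], [1, 3, 1]]
theta_map = build_theta_map(train, n_values=5, g=g_indicator)

# A test histogram that differs from the training data in one bin only:
# the value 3 never occurred at x = 2
test_hist = [1, 3, 3]
likelihood = 1.0
for x, value in enumerate(test_hist):
    likelihood *= theta_map[x][value]
```

Although `test_hist` matches the training samples in two bins out of three, the single unseen value forces the whole product to zero.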

Consequently, the function $g(s, t)$ should be less “rough”. We adopted the following function:

$$g(s, t) = \frac{1}{|s - t| + 1} \tag{6.18}$$

The number 1 in the denominator is inserted to avoid dividing by zero. Finally, the values of the maps obtained through Eq. 6.18 must be normalized to obtain probability distributions. A comparison between the maps created with these two $g(s, t)$ functions is reported in Fig. 6.5(b)-(c).
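A minimal sketch of the dense variant for a single bin $x$ is given below. Normalizing each bin separately, so that $\Theta_i(x, -)$ sums to one over the admissible values, is an assumption of this example, since the text does not specify the normalization domain; the names `g_smooth`, `dense_column` and `n_values` and the training values are hypothetical.

```python
def g_smooth(s, t):
    """Eq. 6.18: weight that decays with the distance between s and t."""
    return 1.0 / (abs(s - t) + 1)


def dense_column(train_values, n_values):
    """Distribution of theta(x) at one bin x, built with g_smooth.

    train_values holds theta_i^t(x) for t = 1..T_i at that bin.
    The result is normalised to sum to 1 (assumed per-bin normalisation).
    """
    t_i = len(train_values)
    raw = [sum(g_smooth(v, y) for v in train_values) / t_i
           for y in range(n_values)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical training values observed at one bin: 2, 2 and 3
col = dense_column([2, 2, 3], n_values=6)
```

Every admissible value now receives a non-zero probability, with the mass concentrated around the values seen in training, so a single unseen value no longer cancels the product of Eq. 6.13.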

Once the PPMs are created during the learning phase, at the testing stage the projection histograms obtained from each blob $B$ are compared, as described, with the PPM of each class, and the resulting posture for the blob $B$ is the one that maximizes the conditional probability reported in Eq. 6.10, i.e.:

$$\text{posture}_B = \arg\max_{i = 1, \dots, K} P(C_B = C_i \mid Ph_B) \tag{6.19}$$
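The decision rule of Eq. 6.19 can be sketched as follows, assuming the uniform prior of Eq. 6.11 so that maximizing the posterior is equivalent to maximizing the likelihood of Eq. 6.12. The log-domain scores, the function names, the class labels and the toy maps are assumptions made for this illustration only.

```python
import math


def log_lik(hist, ppm):
    """Sum of log-probabilities of the observed bin values (log of Eq. 6.13)."""
    return sum(math.log(ppm[i][v]) if ppm[i][v] > 0.0 else float("-inf")
               for i, v in enumerate(hist))


def classify(theta, pi, models):
    """Return the class maximising Eq. 6.12 under a uniform prior (Eq. 6.19).

    models maps a posture label to a pair (Theta_i, Pi_i) of tables,
    indexed as table[bin][value].
    """
    scores = {name: log_lik(theta, th) + log_lik(pi, p)
              for name, (th, p) in models.items()}
    return max(scores, key=scores.get), scores


# Toy models with 2 bins per projection and 2 admissible values per bin
models = {
    "standing": ([[0.1, 0.9], [0.2, 0.8]], [[0.7, 0.3], [0.6, 0.4]]),
    "lying": ([[0.8, 0.2], [0.7, 0.3]], [[0.2, 0.8], [0.3, 0.7]]),
}
label_a, _ = classify([1, 1], [0, 0], models)
label_b, _ = classify([0, 0], [1, 1], models)
```

If every class scores minus infinity, `max` simply returns the first label, so a practical system would need an explicit "unknown" outcome for that case.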

Finally, Fig. 6.6 shows a snapshot of the environment we used for training. Given a training video shot with a single person, we provide a simple manual annotation of the posture of the actor.

Figure 6.6: Snapshot of the training procedure of our system.