
4. ANALYSIS, INTERPRETATION AND DISCUSSION OF RESULTS

4.2. RISK IDENTIFICATION MATRIX

4.2.1. Risk identification matrix by job position PERFILES

The present thesis proposes a novel and highly efficient probabilistic framework for the integration of one or more soft biometric traits in any biometric recognition system, by taking advantage of the error induced by the system when measuring each soft biometric characteristic.

At this point it should be emphasized that, similarly to [220] and [6] and contrary to [222], no fusion between the conventional biometric recognition score and the soft biometric matching score takes place. As such, there is no need to compute a soft biometric score, weighting functions or posterior probabilities.

Let $\Omega = \{\omega_1, \omega_2, \dots, \omega_M\}$ be the set of all identities in the $M$-sized user population, $x_c$ be the hard biometric information (e.g. geometric gait) and $x_s$ be a continuous soft biometric trait (e.g. the height of the user) from a set $X$ with $N$ available soft biometrics, $X = \{x_{s1}, x_{s2}, \dots, x_{sN}\}$. As such, $p(\omega|x_c) = 1 - p(\bar{\omega}|x_c)$ is the matching score of the conventional biometric system.

Partitioning the feature space

In both previously referenced works [220] and [6], the boosting augments the final recognition performance only when applied to specific user groups. Specifically, in [220], users are categorized into “minority” and “majority” groups according to the frequency of appearance of their soft biometric traits, while in [6] only extreme cases of soft biometric traits are boosted. Another drawback of these approaches is that an “extreme” or “minority” case is only defined in a single dimension. This leads to a uniform, linear quantization of the feature space, which does not match the distribution of users in most real scenarios.

Contrary to the simple 3-stage partitioning (i.e. small-, normal- and large-sized population) of [6] and [220], a more sophisticated spatial partitioning of the feature space $F$ into $N_C$ clusters $C_i$ that exhibit notable variation in terms of their defining soft biometrics is proposed herein. This way, the authentication probability of a client user is augmented when the incoming soft biometric traits refer to the same cluster as the claimed ID. In all other cases, the matching probability $p(\omega|x_c)$ remains untouched and is based solely on the hard biometric trait of the user. Although there is no actual limitation on the dimensionality of the clusters, the simple case of 2D clusters is studied herein, without loss of generality.

In this respect, a cluster $C_i$ of the multidimensional soft biometric feature space, associated with a subset $S_i \subset \Omega$ of the set of identities $\Omega$, is characterized as a valid cluster iff the following conditions hold.

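In particular, the a-priori probability of an identity $\omega$ to belong to a cluster has to be low:

$$0 < p(\omega \in C_i \mid x_{s1}, \dots, x_{sN}) = p_i(\omega) \ll 1$$

and there should exist a subset $S_i$ of $\Omega$ such that

$$\exists S_i \subset \Omega : \begin{cases} \forall \omega \in S_i, & p(x_s \in C_i \mid \omega) > \alpha \\ \forall \omega \notin S_i, & p(x_s \in C_i \mid \omega) \approx p_i(\omega) \end{cases}$$

where $\alpha$ is a minimum non-zero value and $C$ is the union of all clusters, whose number $N_C$ should be significantly lower than the size $|\Omega|$ of the identity set $\Omega$:

$$C = \bigcup_{i=1}^{N_C} C_i, \qquad N_C \ll |\Omega|$$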

Three different partitioning alternatives have been implemented herein, all of which fulfil the requirements of a cluster:

Orthogonal Grouping (OG): A linear and possibly the simplest way of clustering the feature space is to partition it into uniform orthogonal clusters (Figure 5.3a). This kind of partitioning has been proposed in [220] and [6]. Using a brute-force iterative algorithm on an adequately large reference soft biometric feature dataset, the dimensions of the prototype orthogonal cluster can be optimally defined. However, the major drawback of this clustering method, as implemented in [220] and [6], is that it does not consider combined extreme cases of soft biometrics. On the contrary, it deals with each biometric feature separately. This way, some clusters are expected to be “left empty”, while others will possibly be “overcrowded”.
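As a minimal sketch of the orthogonal grouping, the following Python snippet maps a soft biometric feature vector to the index of a uniform rectangular cell and checks whether a measured vector falls in the same cell as the claimed one. The function name `orthogonal_cell`, the grid origin and the cell dimensions (5 cm by 5 kg) are illustrative assumptions, not values taken from [220] or [6].

```python
import math

def orthogonal_cell(features, origin, cell_size):
    """Return the integer grid index of `features` in a uniform orthogonal partition.

    All three arguments are equal-length sequences; `cell_size` holds the
    (hypothetical) side length of the prototype cell along each dimension.
    """
    return tuple(
        math.floor((x - o) / s)
        for x, o, s in zip(features, origin, cell_size)
    )

# Hypothetical example: (height in cm, weight in kg).
origin = (150.0, 40.0)
cell_size = (5.0, 5.0)
claimed = orthogonal_cell((178.0, 74.0), origin, cell_size)
measured = orthogonal_cell((176.5, 73.0), origin, cell_size)
same_cluster = claimed == measured
```

Each dimension is quantized independently here, which is exactly why combined extreme cases are not captured by this kind of partitioning.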

Figure 5.3: a) Orthogonal cluster, b) Hexagonal cluster, c) Gaussian cluster

Hexagonal Cell Grouping (HCG): A more efficient alternative for clustering the feature space is to partition it into adjacent identical hexagonal cells (Figure 5.3b). This way, isotropy is preserved along the whole feature space, while increased non-linearity is introduced compared with the orthogonal grouping in Section 5.2.1.

In this case, the only parameter that has to be estimated and optimized is the hexagon’s radius. Similarly to the orthogonal grouping case, this can be specified experimentally on an adequately large reference soft biometric feature dataset of the same dimensionality. All soft biometric data in each dimension have to be normalized by their corresponding standard deviation before being assigned to a hexagonal cluster, in order to conform to the isotropy of this grouping.
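The assignment of a normalized 2D feature vector to a hexagonal cell can be sketched as follows. The sketch assumes pointy-top hexagons indexed by axial coordinates (q, r), with `radius` taken as the centre-to-vertex distance. The orientation, the coordinate system and the function names `hex_cell` and `_cube_round` are assumptions made for illustration and are not specified in the text.

```python
import math

def _cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon centre."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Re-derive the coordinate with the largest rounding error so q + r + s == 0.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)

def hex_cell(point, std, radius):
    """Map a 2D feature vector to the axial (q, r) index of a hexagonal cell."""
    # Normalize each dimension by its standard deviation, as the text requires.
    x = point[0] / std[0]
    y = point[1] / std[1]
    # Fractional axial coordinates for pointy-top hexagons of the given radius.
    q = (math.sqrt(3) / 3 * x - y / 3) / radius
    r = (2 / 3 * y) / radius
    return _cube_round(q, r)

# Hypothetical usage: two nearby normalized measurements share a cell.
cell_a = hex_cell((0.10, 0.10), std=(1.0, 1.0), radius=1.0)
cell_b = hex_cell((-0.05, 0.20), std=(1.0, 1.0), radius=1.0)
```

Only `radius` has to be tuned, which mirrors the single free parameter mentioned above.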

Similarly to Section 5.2.1, neither the issue of possible empty clusters nor that of overcrowded ones is solved by this approach, despite the increased non-linearity.

Gaussian Grouping (GG): Theoretically, the less linear the clustering is, the more efficiently it will cover the feature space. In this respect, creating multidimensional Gaussian clusters on the feature space is expected to provide increased flexibility in grouping similar users.

To this end, an unsupervised clustering approach is utilized. Initially, the optimal number of clusters is estimated by utilizing the ISODATA clustering algorithm [223]. Then, by exploiting the expectation-maximization (EM) algorithm [224], the soft biometric feature space can be described as a mixture of multidimensional Gaussians (Figure 5.3c), whereby each Gaussian is described as

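$$\mathcal{N}(f_\omega \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{Z/2}\,|\Sigma_k|^{1/2}}\; e^{-\frac{1}{2}(f_\omega - \mu_k)^T \Sigma_k^{-1} (f_\omega - \mu_k)} \tag{5.4}$$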

Vector $f_\omega$ includes the soft biometric trait values, $f_\omega = \{x_{sn,1}(\omega), \dots, x_{sn,Z}(\omega)\}$, while $\mu_k$ and $\Sigma_k$ are the $Z$-dimensional mean vector and the $Z \times Z$ covariance matrix of the $k$th Gaussian, respectively.

At the authentication stage, the assignment of a user’s incoming soft biometric feature vector to a cluster occurs according to the maximum likelihood (ML) criterion.
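The maximum likelihood assignment can be illustrated for the 2D case studied here. The sketch below evaluates the logarithm of Equation (5.4) with Z = 2 using the closed-form inverse of a 2×2 covariance matrix and picks the Gaussian with the highest value. The function names and the two example clusters (height in cm, weight in kg) are hypothetical and chosen only for illustration.

```python
import math

def log_gauss_2d(f, mean, cov):
    """Log-density of a 2D Gaussian, i.e. the logarithm of Equation (5.4) with Z = 2."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = f[0] - mean[0], f[1] - mean[1]
    # Quadratic form (f - mu)^T Sigma^{-1} (f - mu) via the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def assign_cluster(f, clusters):
    """Index of the (mean, covariance) pair under which f is most likely."""
    return max(range(len(clusters)), key=lambda k: log_gauss_2d(f, *clusters[k]))

# Hypothetical clusters: (mean, covariance) pairs.
clusters = [
    ((170.0, 65.0), ((36.0, 10.0), (10.0, 49.0))),
    ((185.0, 90.0), ((25.0, 5.0), (5.0, 64.0))),
]
first = assign_cluster((172.0, 68.0), clusters)
second = assign_cluster((186.0, 92.0), clusters)
```

Mixture weights are deliberately left out of the comparison, since the text specifies a maximum likelihood rather than a maximum a-posteriori criterion.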

Modelling the Noise

Let us now define the ground truth value $x^g_{sn}$ as the soft biometric trait $n$ of user $\omega$ and $\tilde{x}_{sn,l}$ as the $l$th value measured by the system, with $\tilde{X}_{sn}(\omega) = \{\tilde{x}_{sn,1}(\omega), \dots, \tilde{x}_{sn,L}(\omega)\}$, where $L$ is the total number of measurements. For an adequately large number $T = M \times L$ of measurements, the distribution of the measurement error (i.e. noise) induced by the system can be estimated as described hereafter.

As long as $T$ is large enough for reliable statistical estimates, the normalized values $e_{sn,l}(\omega_m) = \tilde{x}_{sn,l}(\omega_m) - x^g_{sn}(\omega_m)$ can be produced. Having these data for the whole registered population, it is straightforward to fit the distribution of the normalized values by a 1D Gaussian mixture of the following type:

$$p(e_s \mid \omega) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}_p(e_s \mid \mu_k, \sigma_k)$$

where $\mathcal{N}_p(e_s \mid \mu_k, \sigma_k)$ stands for the $k$th single Gaussian distribution that contributes to the mixture. The values $\pi_k$, $\mu_k$ and $\sigma_k$ can be computed by applying the iterative Expectation-Maximization (EM) algorithm to the data’s histogram until convergence.
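A minimal EM sketch for such a 1D mixture is given below. It departs from the text in two labelled ways: it runs on the raw error samples rather than on their histogram, and it uses a fixed number of iterations instead of a convergence test. The function names, the initial parameters and the sample errors are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a single 1D Gaussian."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_gmm_1d(errors, mus, sigmas, weights, n_iter=100):
    """Plain EM for a 1D Gaussian mixture; initial parameters come from the caller."""
    mus, sigmas, weights = list(mus), list(sigmas), list(weights)
    K = len(mus)
    for _ in range(n_iter):
        # E-step: responsibility of component k for each sample.
        resp = []
        for e in errors:
            p = [weights[k] * normal_pdf(e, mus[k], sigmas[k]) for k in range(K)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(errors)
            mus[k] = sum(r[k] * e for r, e in zip(resp, errors)) / nk
            var = sum(r[k] * (e - mus[k]) ** 2 for r, e in zip(resp, errors)) / nk
            sigmas[k] = max(math.sqrt(var), 1e-6)  # floor avoids a collapsed component
    return weights, mus, sigmas

# Hypothetical measurement errors (measured value minus ground truth), in cm.
errors = [-2.1, -1.9, -2.0, -2.2, -1.8, 2.9, 3.1, 3.0, 3.2, 2.8]
weights, mus, sigmas = fit_gmm_1d(errors, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])
```

With well-separated error modes such as these, the two components settle on the two sample groups. Real error data would normally need a convergence check on the log-likelihood and several initializations.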

The number $K$ of single Gaussian distributions in the 1D mixture model is selected experimentally as the one that produces an acceptable error value in the $\chi^2$ test:

$$\chi^2 \equiv \sum_{b=1}^{B} \frac{(O_b - E_b)^2}{E_b}, \tag{5.5}$$

where $O_b$ is the number of samples in each bin and $E_b \equiv T\,p(x_s)$. Once the two parameters, namely the degrees of freedom and the minimum allowed confidence $f$, are set for the test, the value of $\chi^2$ is cross-checked against the corresponding statistical tables. If it is below the corresponding threshold, it can be claimed that the data are compatible with the imposed mixture model with confidence $f$ [225].
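The statistic of Equation (5.5) is simple to compute once the observed and expected bin counts are available. The bin counts below are hypothetical and serve only to illustrate the arithmetic; the critical value still has to be looked up for the chosen degrees of freedom and confidence.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic of Equation (5.5)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 55, 21, 6]            # hypothetical O_b: samples counted per bin
expected = [20.0, 50.0, 25.0, 5.0]    # hypothetical E_b predicted by the mixture
stat = chi_square(observed, expected)

# 7.815 is the tabulated 95% critical value for 3 degrees of freedom; using
# 3 degrees of freedom here is an assumption made for the sake of the example.
compatible = stat < 7.815
```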

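Consequently, $p(e_s \mid \bar{\omega})$ can be calculated as

$$p(e_s \mid \bar{\omega}) = \frac{p(e_s) - p(\omega)\,p(e_s \mid \omega)}{1 - p(\omega)} \tag{5.6}$$

where $p(\omega) = \frac{1}{M}$ and $p(e_s) = \frac{1}{L}$ are priors.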
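Equation (5.6) is a rearrangement of the law of total probability, p(e_s) = p(ω)p(e_s|ω) + (1 − p(ω))p(e_s|ω̄). The short sketch below evaluates it and checks that identity numerically. The function name and the numeric values are hypothetical.

```python
def p_error_given_impostor(p_e, p_e_given_client, p_client):
    """Equation (5.6): p(e_s | not omega) obtained from the law of total probability."""
    return (p_e - p_client * p_e_given_client) / (1.0 - p_client)

# Hypothetical values: M = 100 enrolled users, so p(omega) = 1/100.
p_client = 1.0 / 100
p_e = 0.05
p_e_given_client = 0.4
p_imp = p_error_given_impostor(p_e, p_e_given_client, p_client)

# Recombining both branches should give back the prior p(e_s).
recombined = p_client * p_e_given_client + (1.0 - p_client) * p_imp
```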

It should be noted that the augmentation process is applied only to those users whose soft biometric traits resemble the claimed ones. Yet, it is important to highlight that the previous frameworks for augmenting biometric recognition with soft biometrics assumed independence between the soft biometrics in an ad hoc manner, which does not hold per se. On the contrary, the independence between the systematic errors introduced for each soft biometric is guaranteed by definition, since the distribution models refer to the measurement errors (not to the soft biometric traits), which are produced by independent measurement processes.

In this context, the goal herein is to find a generic expression of the conditional probability $p(\omega \mid x_c, e_{s1}, \dots, e_{sN})$ that denotes the final recognition score:

$$p(\bar{\omega} \mid x_c, e_{s1}, \dots, e_{sN}) = 1 - p(\omega \mid x_c, e_{s1}, \dots, e_{sN}) \tag{5.7}$$

while according to Bayes’ theorem

$$p(\bar{\omega} \mid x_c, e_{s1}, e_{s2}, \dots, e_{sN}) = \frac{p(x_c, e_{s1}, e_{s2}, \dots, e_{sN} \mid \bar{\omega})}{p(x_c, e_{s1}, e_{s2}, \dots, e_{sN})}\, p(\bar{\omega}) \tag{5.8}$$

The numerator can be analyzed as follows:
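One factorization consistent with Equation (5.16) is sketched below. It assumes that the hard biometric $x_c$ and the error terms $e_{sn}$ are conditionally independent given $\bar{\omega}$, an assumption introduced here for illustration:

```latex
\[
p(x_c, e_{s1}, e_{s2}, \dots, e_{sN} \mid \bar{\omega})
  = p(x_c \mid \bar{\omega}) \prod_{n=1}^{N} p(e_{sn} \mid \bar{\omega})
\]
```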

Regarding the denominator, the following holds:

$$p(x_c, e_{s1}, e_{s2}, \dots, e_{sN}) = p(x_c)\,p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN}) \tag{5.10}$$

since by definition the geometric gait signature is uncorrelated to the soft biometric error measurements,

$$p(x_c, e_{s1}, e_{s2}, \dots, e_{sN}) = p(x_c)\,p(e_{s1}, e_{s2}, \dots, e_{sN}) \tag{5.11}$$

and, provided that the latter stem from distinct measurement processes, the variables $e_{sn}$ are held to be i.i.d.:

$$p(e_{s1}, e_{s2}, \dots, e_{sN}) = p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN}) \tag{5.12}$$

In this context, Equation (5.8) can be expressed as the combination of Equation (5.9) with Equation (5.10), as
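Assuming the numerator of Equation (5.8) factorizes as $p(x_c \mid \bar{\omega}) \prod_n p(e_{sn} \mid \bar{\omega})$ (conditional independence given $\bar{\omega}$, an assumption stated here for illustration), substituting it together with Equation (5.10) into Equation (5.8) would give:

```latex
\[
p(\bar{\omega} \mid x_c, e_{s1}, \dots, e_{sN})
  = \frac{p(x_c \mid \bar{\omega}) \prod_{n=1}^{N} p(e_{sn} \mid \bar{\omega})}
         {p(x_c) \prod_{n=1}^{N} p(e_{sn})}\; p(\bar{\omega})
\]
```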

This way, provided that

$$p(\bar{\omega} \mid x_c) = \frac{p(x_c \mid \bar{\omega})}{p(x_c)}\, p(\bar{\omega}) \tag{5.15}$$

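Equation (5.14) can be written as follows:

$$\begin{aligned}
p(\bar{\omega} \mid x_c, e_{s1}, e_{s2}, \dots, e_{sN})
&= \frac{p(e_{s1} \mid \bar{\omega})\,p(e_{s2} \mid \bar{\omega}) \cdots p(e_{sN} \mid \bar{\omega})\,\dfrac{p(\bar{\omega} \mid x_c)\,p(x_c)}{p(\bar{\omega})}}{p(x_c)\,p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN})}\; p(\bar{\omega}) \\
&= \frac{p(e_{s1} \mid \bar{\omega})\,p(e_{s2} \mid \bar{\omega}) \cdots p(e_{sN} \mid \bar{\omega})\,p(\bar{\omega} \mid x_c)}{p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN})} \\
&= \frac{1}{p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN})}\; p(e_{s1} \mid \bar{\omega})\,p(e_{s2} \mid \bar{\omega}) \cdots p(e_{sN} \mid \bar{\omega})\,p(\bar{\omega} \mid x_c) \\
&= \frac{1}{p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN})} \prod_{n=1}^{N} p(e_{sn} \mid \bar{\omega})\; p(\bar{\omega} \mid x_c)
\end{aligned} \tag{5.16}$$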

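The term

$$A = \frac{1}{p(e_{s1})\,p(e_{s2}) \cdots p(e_{sN})} \tag{5.17}$$

consists of priors and is constant for any user $\omega$. This way,

$$p(\bar{\omega} \mid x_c, e_{s1}, e_{s2}, \dots, e_{sN}) = A \prod_{n=1}^{N} p(e_{sn} \mid \bar{\omega})\; p(\bar{\omega} \mid x_c) \tag{5.18}$$

where the term $f_b = \prod_{n=1}^{N} p(e_{sn} \mid \bar{\omega})$ is the attenuation factor.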
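Putting Equations (5.7), (5.17) and (5.18) together, the final client score can be sketched as below. The function name and the numeric inputs are hypothetical. The per-trait values of $p(e_{sn} \mid \bar{\omega})$ are assumed to have been obtained beforehand, for example via Equation (5.6).

```python
import math

def boosted_client_score(p_client_given_xc, p_err_given_impostor, p_err):
    """Return p(omega | x_c, e_s1, ..., e_sN) following Equations (5.7) and (5.18).

    p_err_given_impostor and p_err are equal-length sequences holding
    p(e_sn | not omega) and the prior p(e_sn) for each soft biometric trait.
    """
    A = 1.0 / math.prod(p_err)               # Equation (5.17)
    f_b = math.prod(p_err_given_impostor)    # attenuation factor
    p_impostor = A * f_b * (1.0 - p_client_given_xc)   # Equation (5.18)
    return 1.0 - p_impostor                  # Equation (5.7)

# Hypothetical example with N = 2 soft biometric traits.
baseline = 0.70
score = boosted_client_score(baseline, [0.04, 0.05], [0.05, 0.05])
```

In this example the product A·f_b equals 0.8, so the impostor probability is attenuated and the client score rises above the baseline. No clamping to [0, 1] is applied in this sketch.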

5.3. Summary

The work presented in the current chapter is a significant contribution to the field of multi-biometric systems. In particular, the chapter highlights the contribution derived from the combination of soft anthropometric traits that can be captured unobtrusively. Two alternatives are delivered for the enhancement of existing biometric systems via the utilization of anthropometric or soft biometric traits that can be efficiently captured, without requiring any additional sensors and without imposing any significant processing burden on top of the baseline biometric system.

The chapter was divided into two main sections. Initially, the recognition capacity of the features that can be extracted from the upper body was further evaluated. Specifically, the static lengths of the limbs of the upper body of each user (i.e. the length of the arm sections, the length of the shoulders and the height of the head) are estimated and their recognition potential is evaluated. To this end, an attributed graph matching based methodology has been proposed for the validation of the skeleton lengths of the person requesting authentication.

Although the aforementioned anthropometric characteristics of the upper body can be easily represented as a fully connected graph, this is not the general case for other soft biometric traits. Moreover, the graph-based approach treats the static biometrics as a separate biometric modality, the result of which should then be fused with the baseline biometric system. However, fusion of biometrics is a computationally expensive process and, given the limited recognition capacity of soft biometrics in large datasets in general [222] [66], the final result may not always be in favour of the recognition performance.

For this reason, a generic probabilistic framework for boosting client recognition has been presented, utilizing continuous soft biometric traits. In order to exhibit the general applicability of the p [...] Indicatively, the improvements in the given datasets (see Section 3.2.2) amount to ∼2.5% and >20% on average

