The key component of an ensemble system is its strategy for combining classifiers.

Combination rules can be grouped in two ways: (1) trainable versus non-trainable combination rules, and (2) combination rules applied to class labels versus rules applied to class-specific continuous outputs. In trainable combination rules, the parameters of the combiner are called weights and are determined through a separate training algorithm. The combination parameters produced by trainable rules are instance specific, so these rules are also called dynamic combination rules (Guermeur, 2002). Conversely, in non-trainable rules there is no separate training step for the combiner.

Combination rules that are applied to class labels require only the classification decision of each classifier. Combination rules that are applied to continuous outputs require the continuous-valued outputs of the classifiers. These values show the degree of support the classifiers give to each class. In this chapter we first discuss the combination rules that apply to class labels, followed by combination rules based on class-specific continuous outputs.

3.8 Combining Class Labels

In combining class labels, we assume that only the class labels are available from the classifier outputs. The decision of the $t$th classifier is defined as $d_{t,j} \in \{0, 1\}$, where $t = 1, \ldots, T$ and $j = 1, \ldots, C$; $T$ is the number of classifiers and $C$ is the number of classes. If the $t$th classifier selects class $w_j$, then $d_{t,j} = 1$, and $d_{t,j} = 0$ otherwise.

3.8.1 Majority Voting

There are three versions of majority voting, in which the ensemble selects the class (1) on which all the classifiers agree (unanimous voting); (2) predicted by more than half of the classifiers (simple majority); or (3) that receives the highest number of votes (plurality voting) (Bagui, 2005). The ensemble decision based on plurality voting can be expressed as in (3.1) (Mohan, Papageorgiou, & Poggio, 2001): select class $w_J$ if

$$\sum_{t=1}^{T} d_{t,J} = \max_{j=1,\ldots,C} \sum_{t=1}^{T} d_{t,j} \qquad (3.1)$$

Under certain assumptions, majority voting is an optimal combination rule for class labels.
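The three voting versions described above can be sketched in a few lines of Python. This is a minimal illustration and not code from any of the cited works: the function names, the use of strings such as "w1" as class labels, the requirement that the list of votes is non-empty, and the tie-breaking choice in `plurality` are all assumptions made for the example.

```python
from collections import Counter


def unanimous(labels):
    """Return the class only if every classifier chose it, else None."""
    return labels[0] if len(set(labels)) == 1 else None


def simple_majority(labels):
    """Return the class chosen by more than half of the classifiers, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def plurality(labels):
    """Return the class with the highest number of votes, as in rule (3.1).

    Assumption of this sketch: ties are broken in favour of the tied
    label that appears first in `labels`.
    """
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


# One label per classifier (T = 5 classifiers, C = 3 classes).
votes = ["w1", "w2", "w2", "w3", "w2"]
print(unanimous(votes))        # None: the classifiers do not all agree
print(simple_majority(votes))  # w2: 3 of 5 votes is more than half
print(plurality(votes))        # w2: the highest vote count
```

Counting the labels directly is equivalent to summing the binary decisions $d_{t,j}$ over $t$ for each class $j$, so no explicit 0/1 matrix is needed here.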

3.8.2 Weighted Majority Voting

If we have the knowledge and expertise to differentiate experts from non-experts, then giving more weight to the decisions of the experts may improve accuracy and performance beyond what plurality voting achieves (Erdem, Polikar, Gurgen, & Yumusak, 2005). Let the decision of hypothesis $h_t$ on class $w_j$ be $d_{t,j}$, such that $d_{t,j}$ is 1 if $h_t$ selects $w_j$ and 0 otherwise. It is further assumed that we can estimate the future performance of each classifier, and that we can assign a weight $\omega_t$ to classifier $h_t$ in proportion to its estimated performance.

The classifiers whose decisions are aggregated through weighted majority voting choose class $w_J$ according to (3.2) (Valentini & Masulli, 2002), if

$$\sum_{t=1}^{T} \omega_t d_{t,J} = \max_{j=1,\ldots,C} \sum_{t=1}^{T} \omega_t d_{t,j} \qquad (3.2)$$

that is, if the total weighted vote received by class $w_J$ is higher than the total weighted vote received by any other class. For convenience, the weights can be normalised so that they sum to 1. However, normalisation does not alter the results of weighted majority voting (Polikar, 2006).
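As a rough illustration of weighted majority voting and of the remark on normalisation, the following Python sketch sums the weights of the classifiers that voted for each class and returns the class with the largest total. The function names, the example weights and the tie-breaking behaviour (the first class to reach the maximum wins) are assumptions of this sketch, not part of the cited formulations.

```python
from collections import defaultdict


def weighted_majority_vote(labels, weights):
    """Return the class with the largest total weighted vote, as in rule (3.2).

    labels[t] is the class chosen by classifier t; weights[t] is its weight.
    """
    if len(labels) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight  # same as adding weight * d[t][j] with d in {0, 1}
    return max(totals, key=totals.get)


def normalise(weights):
    """Rescale the weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


labels = ["w1", "w2", "w2"]
weights = [6.0, 2.5, 1.5]  # hypothetical performance estimates

print(weighted_majority_vote(labels, weights))             # w1 (6.0 against 4.0)
print(weighted_majority_vote(labels, normalise(weights)))  # w1, unchanged
```

In this example plurality voting would select w2 with two of the three votes, whereas the weighted rule selects w1 because the single classifier voting for it carries more weight than the other two combined.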

3.8.3 Behavior Knowledge Space (BKS)

The Behavior Knowledge Space (BKS) method works on the principle of a lookup table, constructed from the classification of the training data, that keeps track of how often each labeling combination is produced by the classifiers (…, 2000; Raudys & Roli, 2003). For each labeling combination, the true class most frequently observed during training is recorded, and that class is selected every time the same combination of class labels occurs during testing. The BKS procedure is best described with an example, illustrated in Fig. 3-3 (Polikar, 2006).

Figure 3-3: Behavior Knowledge Space Illustration (Polikar, 2006)

We assume that we have three classifiers, $C_1$, $C_2$ and $C_3$, for a three-class problem. Since each classifier can select any of the classes $w_1$, $w_2$ and $w_3$, there are $3^3 = 27$ possible labeling combinations. During training we keep track of how often each combination occurs and of the true class each time it occurs. Fig. 3-3 gives sample counts for each combination and each class, with the maximum circled (Polikar, 2006). The combination $\{w_1, w_2, w_3\}$ occurs a total of 28 times, of which 10 have true class $w_1$, 15 have true class $w_2$ and 3 have true class $w_3$. The winner for this combination is $w_2$, the most frequently observed true class for this combination of labels. Therefore, when the combination $\{w_1, w_2, w_3\}$ occurs during testing, the ensemble selects $w_2$.
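The lookup-table idea can be sketched as follows, using the counts from the example above (28 occurrences of one combination, split 10, 15 and 3 across the true classes). The class name `BehaviorKnowledgeSpace`, its `fit` and `predict` methods, and the decision to return `None` for a combination never seen during training are assumptions of this sketch; the source does not specify how unseen combinations are handled.

```python
from collections import Counter, defaultdict


class BehaviorKnowledgeSpace:
    """Lookup table from a labeling combination to counts of the true class."""

    def __init__(self):
        # table[(label of C1, label of C2, ...)][true class] = count
        self.table = defaultdict(Counter)

    def fit(self, combinations, true_classes):
        """Record how often each true class occurs for each labeling combination."""
        for combo, truth in zip(combinations, true_classes):
            self.table[tuple(combo)][truth] += 1
        return self

    def predict(self, combo):
        """Return the most frequent true class for this combination.

        Assumption of this sketch: an unseen combination yields None.
        """
        counts = self.table.get(tuple(combo))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Training data reproducing the counts quoted in the text for one combination.
combo = ("w1", "w2", "w3")
truths = ["w1"] * 10 + ["w2"] * 15 + ["w3"] * 3
bks = BehaviorKnowledgeSpace().fit([combo] * len(truths), truths)

print(bks.predict(("w1", "w2", "w3")))  # w2: 15 of the 28 occurrences
print(bks.predict(("w3", "w3", "w3")))  # None: combination not seen in training
```

A practical implementation would need a fallback for unseen or tied combinations, for instance reverting to plurality voting, but that choice is outside what the text describes.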

3.8.4 Borda Count

Borda count differs from the other rules in that it does not ignore the support given to non-winning classes (Nanni & Lumini, 2008). Borda count is used when the classifiers can rank order the classes, which is easily done if the classifiers produce continuous outputs. However, Borda count relies only on the rankings, not on the values of the continuous outputs. Hence it is a combination rule that can be applied to class labels.

In Borda count, each voter (classifier) rank orders the candidates (classes). If there are $N$ candidates, the first-place candidate receives $N - 1$ votes, the second-place candidate receives $N - 2$, and the candidate in $i$th place receives $N - i$ votes. The last candidate receives 0 votes. The votes of all classifiers are added, and the class with the most votes is selected as the ensemble decision.
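The scoring scheme can be illustrated with a short Python sketch. The function name `borda_count`, the representation of each ranking as a list ordered from best to worst, and the tie-breaking behaviour (the first class to reach the maximum wins) are assumptions made for this example.

```python
from collections import defaultdict


def borda_count(rankings):
    """Combine rankings with the Borda count.

    rankings: one list per classifier, ordering the classes from best to worst.
    Returns the winning class and the total votes received by every class.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for place, label in enumerate(ranking, start=1):
            scores[label] += n - place  # place i receives N - i votes
    winner = max(scores, key=scores.get)
    return winner, dict(scores)


# Four classifiers ranking three classes (N = 3).
rankings = [
    ["w1", "w2", "w3"],
    ["w1", "w2", "w3"],
    ["w2", "w3", "w1"],
    ["w3", "w2", "w1"],
]
winner, scores = borda_count(rankings)
print(winner)  # w2
print(scores)  # {'w1': 4, 'w2': 5, 'w3': 3}
```

Here w1 is ranked first by two classifiers and would win a plurality vote, yet w2 wins the Borda count because it is never ranked last. This is the sense in which the support for non-winning classes is taken into account.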