
To assess the usefulness of the attributes employed by these classifiers, they are compared with similar models trained with fewer features. This assessment uses 10-fold cross-validation on the training data. Support vector machines are used for all three tasks, as these classifiers perform best. The assessment is performed under two different conditions.

In the first condition, the features are removed one at a time from the optimal feature set, and the resulting classifiers are evaluated. This yields a ranking of the classifier features according to their usefulness. The most useful feature is the one with the greatest impact on classifier performance, i.e. the feature that the lowest-scoring classifier is lacking. For the sake of illustration, the left part of Table 4.8 shows this ranking for Task A Event-Timex. There, it can be seen that the feature with the highest impact on classifier performance is predictor-parser. Removing this feature from the optimal set of features results in a classifier with an accuracy that is 2.6 percentage points lower than that of the classifier trained with the optimal feature set.

In the second condition, features are removed successively from the optimal feature set, starting with the best feature and finishing with the worst one. At each point, the remaining features are re-evaluated. That is, in a first step, the features are ranked in the same way as in the first condition, and the best feature is then removed from the feature set. The ranking operation is performed again on this reduced feature set and the new best feature is removed. This procedure is repeated until no feature is left. The right-hand side of Table 4.8 shows the result for Task A Event-Timex. As can be seen there, performance degrades very rapidly when multiple features are successively removed. This condition also produces an ordering of features according to their impact on the classification scores, but one that takes feature interactions into account: the second best feature is the best feature left once the very best feature is removed from the original optimal feature set.

Comparing these two orderings can shed some light on interactions between features. Sometimes, the usefulness of one feature depends on the presence of other features.
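The two ablation conditions can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the function names, the toy feature names and the scoring function toy_score are all hypothetical. In the actual setting, the scorer would train an SVM on the given feature subset and return its 10-fold cross-validation accuracy.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# A scorer maps a feature subset to an accuracy (in percentage points).
Scorer = Callable[[FrozenSet[str]], float]


def individual_ablation(features: Iterable[str], score: Scorer) -> List[Tuple[str, float]]:
    """First condition: remove each feature alone from the full set."""
    full = frozenset(features)
    baseline = score(full)
    impacts = [(f, score(full - {f}) - baseline) for f in sorted(full)]
    # Most harmful removal (most negative impact) comes first.
    return sorted(impacts, key=lambda pair: pair[1])


def successive_ablation(features: Iterable[str], score: Scorer) -> List[Tuple[str, float]]:
    """Second condition: repeatedly remove the currently best feature."""
    remaining = frozenset(features)
    baseline = score(remaining)
    steps = []
    while remaining:
        # Re-rank the remaining features on the reduced set.
        candidates = [(f, score(remaining - {f})) for f in sorted(remaining)]
        best, reduced_score = min(candidates, key=lambda pair: pair[1])
        remaining = remaining - {best}
        # Cumulative impact is measured against the full feature set.
        steps.append((best, reduced_score - baseline))
    return steps


def toy_score(subset: FrozenSet[str]) -> float:
    """Hypothetical scorer in which 'parser' is partly redundant with 'tense'."""
    accuracy = 50.0
    has_tense = "tense" in subset
    if has_tense:
        accuracy += 4.0
    if "parser" in subset:
        accuracy += 0.5 if has_tense else 3.0
    if "previous" in subset:
        accuracy += 1.5 if has_tense else 0.5
    return accuracy


TOY_FEATURES = ["tense", "parser", "previous"]
individual = individual_ablation(TOY_FEATURES, toy_score)
successive = successive_ablation(TOY_FEATURES, toy_score)
```

With this toy scorer the two orderings differ: the hypothetical parser feature ranks last when removed individually but second under successive removal, which is the kind of feature interaction that comparing the two orderings is meant to expose.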

Task A Event-Timex  For Task A Event-Timex, the features revealed as the best ones under these two conditions are presented in Table 4.8.

The number after each feature describes the impact on performance of removing that feature from the feature set. More specifically, it is the difference in accuracy, in percentage points, between the classifier trained with that feature removed and the classifier using the full feature set. The features shown in this table are the five whose removal had the largest impact on the classifier scores. The difference in scores is statistically significant only for the feature predictor-parser, at a significance level of 0.05, according to Weka’s PairedCorrectedTTester. When removed individually, the other features do not produce statistically significant differences. The best features according to the second condition are also in Table 4.8.

4.5 Feature Selection and Results

Individual removal of features
Feature                                    Individual impact
predictor-parser                           -2.6
event-intervening-following-tense          -1.9
closure-B-for-A                            -1.5
predictor-dep-parser                       -1.1
event-temporal-direction                   -1.1

Successive removal of features
Feature                                    Cumulative impact
predictor-parser                           -2.6
event-intervening-following-tense          -4.8
predictor-dep-parser                       -7.2
closure-B-for-A                            -9.9
timex3-relevant-lemmas                     -13.6

Table 4.8: Ablation analysis of the SMO classifier for Task A Event-Timex

From this table, it can be seen that the most informative feature is the one based on the phrase structure parser (Section 4.4.5). After removing this feature, the feature event-intervening-following-tense (Section 4.4.6) is the strongest. This feature records the tense of another event in the sentence, namely the one closest to the temporal expression when both are mentioned after the event in the temporal relation. This feature can be useful for examples such as the one in (29), repeated here in (36).

(36) a. Soviet Foreign Ministry spokesman Yuri Gremitskikh said special ambassador Mikhail Sytenko left Tuesday for consultations with the governments of Syria, Jordan, Egypt and other Arab countries.

b. O porta-voz do Ministério dos Negócios Estrangeiros soviético Yuri Gremitskikh disse que o embaixador especial Mikhail Sytenko partiu terça-feira para consultar os governos da Síria, Jordânia, Egito e outros países árabes.

In this example, there is a temporal relation between said and Tuesday. The event is after the date. The fact that there is another event, the one denoted by left, closer to the time expression, is an indication that the time expression modifies this other event, and thus describes the time when the leaving event happened, rather than the saying event. The fact that left follows said in the sentence is a cue to the syntactic relation between the two verbs: left is the main verb of the complement clause of said. This, together with the tense of the two verbs, is a strong indication that the event for said is after the event for left, even more so in Portuguese, where the two perfective past forms only allow this possibility (see Section 5.3.3.4, specifically on this phenomenon).

Individual removal of features
Feature                                    Individual impact
event-simplified-tense                     -2.9
previous-temporal-relation-type            -1.6
event-class                                -1.4
event-closest-to-event-class               -1.1
event-closest-to-event-simplified-tense    -0.9

Successive removal of features
Feature                                    Cumulative impact
event-simplified-tense                     -2.9
predictor-parser                           -5.1
predictor-dep-parser                       -9.1
event-class                                -12.8
event-indicator-st1                        -14.8

Table 4.9: Ablation analysis of the SMO classifier for Task B Event-DocTime

After that, the other feature based on parsing, predictor-dep-parser (Section 4.4.5), is the best. Temporal deduction also ranks high in both conditions.
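The significance check mentioned for Task A can be illustrated with a small sketch. It assumes that the test in question is the corrected resampled paired t-test, in which the variance of the per-fold score differences is inflated by the ratio of test to training set size; this is an assumption about the tester, not something stated in this section. The function name and the per-fold differences below are hypothetical and serve only to show the computation.

```python
import math
from statistics import mean, variance


def corrected_paired_t(diffs, test_train_ratio):
    """Corrected resampled t statistic for paired per-fold score differences.

    diffs: score of system A minus score of system B on each fold.
    test_train_ratio: test set size divided by training set size
                      (1/9 for 10-fold cross-validation).
    """
    k = len(diffs)
    if k < 2:
        raise ValueError("at least two folds are needed")
    var = variance(diffs)  # sample variance (divides by k - 1)
    if var == 0:
        raise ValueError("all differences are identical; t is undefined")
    # The usual paired t-test uses var / k; the correction adds the ratio term.
    return mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * var)


# Hypothetical accuracy differences (full model minus ablated model) on 10 folds.
toy_diffs = [1.0, 3.0] * 5
t_corrected = corrected_paired_t(toy_diffs, test_train_ratio=1.0 / 9.0)
t_uncorrected = mean(toy_diffs) / math.sqrt(variance(toy_diffs) / len(toy_diffs))

# Two-sided critical value of Student's t for 9 degrees of freedom at 0.05.
T_CRITICAL_DF9 = 2.262
is_significant = abs(t_corrected) > T_CRITICAL_DF9
```

The corrected statistic is always smaller in magnitude than the uncorrected one, so the corrected test is the more conservative of the two. That is desirable here because the training sets of different cross-validation folds overlap heavily.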

Task B Event-DocTime  The ablation tests for Task B Event-DocTime are shown in Table 4.9. For Task B, each of the three best features (event-simplified-tense, previous-temporal-relation-type and event-class) produces statistically significant differences when it is individually removed from the complete feature set.

A number of comments can be made about Table 4.9. Tense remains the most informative feature. It seems that the information provided by the temporally decorated grammatical representations (the features predictor-parser and predictor-dep-parser) is somewhat redundant with that provided by tense (the feature event-simplified-tense), as they have a much bigger impact when tense is not available (the two right columns of the table) than when it is (the left columns of the table). This is because, in these automatically produced representations, the temporal relations between a verb and the DCT are mostly based on the grammatical tense of the verb.

The temporal relation between the previously mentioned event and the DCT (the feature previous-temporal-relation-type) is another useful feature, but it is dependent on the tense of the event in the current temporal relation (the feature event-simplified-tense). When this tense is known (first condition) this feature has high impact, but once tense is removed this feature becomes much less useful (second condition). Support vector machines are difficult for humans to inspect, so if one wants to look inside the models induced from these data, other algorithms must be used (although the results must be taken with a grain of salt, as different algorithms differ in the way they can separate the instances into classes). Inspecting a decision tree trained on the training data with the J48 algorithm and just the three best features (according to the left side of Table 4.9) shows that the information about the previous temporal relation is used to disambiguate Task B temporal relations involving present tense verbs, which can enter OVERLAP, BEFORE and AFTER relations with the DCT.

The event-class feature (which is part of the initial set of features used in the baselines) is also one of the three features producing statistically significant differences, as mentioned. Inspection of a similar decision tree trained with a reduced set of features, as well as of the distribution of the values for this feature in the training data, reveals that the useful piece of information is whether this feature takes the REPORTING value, as it does for verbs like say, announce, etc. The reason is that the temporal relation with the DCT is then practically never AFTER: reporting events are almost never future events. This may be particular to the corpus used (it is made of news articles), or it may also hold for other types of texts. It is an interesting piece of information about the world that is not captured by the new features developed in our work.

Individual removal of features
Feature                                    Individual impact
event-simplified-tense 1                   -4.6
event-class 2                              -1.7
previous-instance-event-tense 1            -1.2
event-class 1                              -1.2
predictor-parser                           -1.1

Successive removal of features
Feature                                    Cumulative impact
event-simplified-tense 1                   -4.6
event-class 1                              -5.2
event-temporal-direction 1                 -7.8
event-class 2                              -9.1
event-temporal-direction 2                 -9.7

Table 4.10: Ablation analysis of the SMO classifier for Task C Event-Event
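The distributional check described for the event-class feature in Task B amounts to a cross-tabulation of feature values against relation classes. The following is a minimal sketch of such a tabulation; the function name is hypothetical and the handful of instances is invented toy data, not taken from the corpus.

```python
from collections import Counter, defaultdict


def class_distribution(instances, feature):
    """Count relation classes separately for each value of one feature."""
    table = defaultdict(Counter)
    for features, relation in instances:
        table[features[feature]][relation] += 1
    return {value: dict(counts) for value, counts in table.items()}


# Invented toy instances: (feature dictionary, temporal relation with the DCT).
toy_instances = [
    ({"event-class": "REPORTING"}, "BEFORE"),
    ({"event-class": "REPORTING"}, "BEFORE"),
    ({"event-class": "REPORTING"}, "OVERLAP"),
    ({"event-class": "OCCURRENCE"}, "AFTER"),
    ({"event-class": "OCCURRENCE"}, "BEFORE"),
    ({"event-class": "STATE"}, "OVERLAP"),
]

distribution = class_distribution(toy_instances, "event-class")
for value, counts in sorted(distribution.items()):
    print(value, counts)
```

A feature value whose row lacks one of the classes almost entirely, as REPORTING lacks AFTER in this toy data, is informative for the classifier even when the feature's other values are not.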

Task C Event-Event  The ablation tests for Task C Event-Event are shown in Table 4.10. Since Task C is about temporal relations between two events, the features that describe properties of these events come in pairs. In this table, the number 1 after a feature’s name indicates a feature describing the event that is the first argument of the temporal relation; the number 2 is used with features describing the second argument.

Instance number    Feature vector      Class
1                  <TRUE, TRUE>        A
2                  <TRUE, TRUE>        B
3                  <TRUE, TRUE>        B
4                  <TRUE, FALSE>       B
5                  <FALSE, FALSE>      A
6                  <FALSE, FALSE>      A
7                  <FALSE, FALSE>      A
8                  <FALSE, FALSE>      B
9                  <FALSE, FALSE>      B

Table 4.11: A hypothetical set of instances

For Task C Event-Event, only the best feature produces statistically significant differences when removed from the complete feature set. This is the feature describing the simplified tense of the event that is the first argument of the temporal relation. The second best feature is not about the tense of the second event, but rather its annotated class, which contains some information about aspectual type, namely a binary distinction between states and the remaining types (Section 3.3.1 and Section 4.4.2). As mentioned before in Section 4.2, a state as the second event is expected to go with overlap relations more often than a non-stative situation does, because states tend to be used to describe the way things were when the previously mentioned events occurred, whereas non-stative situations move a narrative forward. A decision tree trained with just these two features for this task indeed shows this sort of association.
