Chapter II: Theoretical Framework
2.3 Conceptual Definitions
The experiments on the annotated video data described in this chapter provide a very simple first approach to empirically grounding the assumptions concerning the availability of meaning independently of language. To this end, we made some simplifying assumptions. We were only concerned with features that were actively being attended to, following research on joint attention (Tomasello 2003), and we assigned hardly any socio-cognitive skills to the learner, beyond assuming that whatever situations are present between the previous utterance and the subsequent one constitute the set of candidate meanings for the current utterance.
Furthermore, we assumed that the features were independent within a situation, thereby not distinguishing bundles of features that occur together (properties always being the property of an object, events always having participants). This inherent structure of the situations may provide valuable cues for the learner. We will exploit this structure in the modeling work described in the later chapters.
Finally, starting from a set of semantic primitives is problematic. Although one can argue for a universal set of features underlying the semantics of all natural languages (Jackendoff 1990), typological research shows that such a set at least has to be very flexible to accommodate the distinctions made in different languages. Conceptualizing the space of potential meanings in terms of continuous scales rather than discrete features may prove to be a more insightful starting point (Bowerman 1993; Levinson, Meira, & the Language and Cognition Group 2003; Majid, Boster & Bowerman 2008) for describing language-specific categories. Beekhuizen, Fazly & Stevenson (2014) describe how we can use these continuous spaces to study semantic error patterns in language acquisition, showing how overgeneralizations can be predicted on the basis of continuous spaces and the insight that groupings of situations under one linguistic marker that are cross-linguistically more common are probably also easier to acquire than groupings that are cross-linguistically less common.
One can always push realism further. I believe, however, that the current proposal at least provides more realism than input generation procedures hitherto proposed. With a computational model satisfying many constraints or desiderata imposed by usage-based theorizing and a realistic input generation procedure, we can now see how the model behaves and what kinds of representations it acquires. These issues will be addressed in the subsequent three chapters.
Comprehension experiments
5.1 Measuring comprehension
The previous two chapters set out a computational model of early grammar acquisition and a procedure for generating realistic input items. The time has come to look at the behavior of the model given these two. In this chapter, we look at the ability of the model to understand the utterances it processes. Recall that, at every turn, the model is presented with an utterance in the context of a number of situations, one of which may be the situation the speaker refers to. Can SPL, given noise and uncertainty in the situation, build up an inventory of symbolic units allowing it to comprehend the utterances? This question first requires us to define what understanding means in formal terms. That is: how do we define and operationalize ‘comprehension’?
Because the input items are generated randomly, we run 10 simulations of 10,000 input items. The latter number was established on the basis of prior testing as the number of input items at which most scores had become stable. Recall that the referential uncertainty was found to be 15 entities (events, entities) in section 4.3.2. Translating this to a number of situations, we set the number of situations co-present with the utterance to 6 (I will henceforth call the propositional uncertainty parameter uncertainty). It is hard to establish a motivated number of situations, but given the overlap between situations (given the continuation parameters), having six situations co-present is roughly equivalent to having 15 unique entities (not counting the roles). One of these six situations is the target situation, while the other five are distractors. Furthermore, we set the value for propositional noise, $P_{\text{noise}}$, to 0.1, meaning that in one out of ten cases, the target situation is absent.
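As a rough illustration of the setup just described, the following Python sketch assembles the situational context for a single input item. The function name `build_context`, its parameters, and the uniform sampling of distractors are assumptions made for this example; the text does not specify how distractors are drawn, and the actual generation procedure also depends on the continuation parameters, which are left out here.

```python
import random


def build_context(target, distractor_pool, n_situations=6, p_noise=0.1, rng=random):
    """Return (context, target_present) for one hypothetical input item.

    Assumptions (not from the source): distractors are sampled uniformly
    without replacement, and under noise the target is simply left out,
    so the context then holds n_situations - 1 situations.
    """
    # Five distractors when n_situations is 6.
    distractors = rng.sample(distractor_pool, n_situations - 1)
    # With probability p_noise the target situation is absent.
    target_present = rng.random() >= p_noise
    context = distractors + [target] if target_present else distractors
    # Shuffle so that position carries no information about the target.
    rng.shuffle(context)
    return context, target_present


if __name__ == "__main__":
    rng = random.Random(1)
    pool = [f"distractor_{i}" for i in range(20)]
    context, present = build_context("target", pool, rng=rng)
    print(len(context), present)
```

Under these assumptions, a context contains six situations when the target is present and five when it is absent, matching the 90%/10% split described in the text.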
5.1.1 General evaluation
A first measure of successful comprehension is the ability of the model to identify the target situation $s_{\text{target}}$ out of all candidate situations $S$. Recall that SPL always identifies a situation $s_{\text{identified}}$ as the situation the speaker was thought to refer to. The identification score of an input item, then, is 1 if $s_{\text{identified}} = s_{\text{target}}$ and 0 otherwise. Because the noise is set to 0.1 and the uncertainty to 5 situations, there are 6 situations in the situational context $S$ in 90% of the cases, and 5 in 10%. In that latter 10%, the model can, moreover, not retrieve the target situation, because it is simply absent. A chance baseline for identification is therefore $0.9 \times \frac{1}{6} = 0.15$, or one out of six for all situations in which the model can be expected to identify the target situation. Similarly, the maximum proportion of situations the model can correctly identify, or ceiling level for identification, is 0.9, as in 10% of the cases, the target situation is not present.
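The arithmetic behind the identification score, the chance baseline, and the ceiling can be sketched in a few lines of Python. The function names are hypothetical and chosen for this example only; representing an absent target as `None` is likewise an assumption.

```python
def identification_score(identified, target):
    """1 if the identified situation is the target situation, 0 otherwise."""
    return 1 if identified == target else 0


def chance_baseline(n_situations=6, p_noise=0.1):
    """Probability of a correct random guess: the target must be present
    (1 - p_noise) and be picked out of n_situations candidates."""
    return (1 - p_noise) * (1 / n_situations)


def ceiling(p_noise=0.1):
    """Best achievable score: the target can only be found when present."""
    return 1 - p_noise


def mean_identification(pairs):
    """Average identification score over (identified, target) pairs.
    A target of None stands for an item in which the target was absent."""
    scores = [identification_score(i, t) for i, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(round(chance_baseline(), 2))  # 0.9 * 1/6
    print(ceiling())
```

With the parameter values from the text, the baseline comes out at approximately 0.15 and the ceiling at 0.9.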
The input items do not have a single correct mapping of the parts of the utterance to the target situation, and without such a gold standard, we cannot evaluate how well the linguistic analysis maps to parts of the situation. What we can evaluate, however, is what proportion of the utterance the model has processed, and what proportion of the identified situation (whether correctly or incorrectly identified) is being mapped to by the best analysis. The first of these, utterance coverage, is given by the proportion of the utterance $U$ that is governed by rules other than rule iii, i.e., the rule for ignoring words. In other words: the proportion of $U$ that is assigned a proper function in the analysis. Let $U_{\text{analyzed}}$ be the substring of $U$ that is governed by rules other than rule iii in the derivations underlying $a_{\text{best}}$. The utterance coverage can then be given by:
$$\text{utterance coverage} = \frac{|U_{\text{analyzed}}|}{|U|} \tag{5.1}$$
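Equation 5.1 can be illustrated with a small Python sketch. It assumes, purely for the sake of the example, that an utterance is a list of words and that the best analysis is summarized as the set of word positions handled by the ignore rule (rule iii); the model's actual derivation structures are not reproduced here, and the function name is hypothetical.

```python
def utterance_coverage(utterance, ignored_positions):
    """Proportion of words governed by rules other than the ignore rule.

    utterance: list of words (U).
    ignored_positions: indices of words handled by rule iii
        (a hypothetical summary of the best analysis).
    """
    if not utterance:
        raise ValueError("utterance must contain at least one word")
    # U_analyzed: the words that receive a proper function in the analysis.
    analyzed = [w for i, w in enumerate(utterance) if i not in ignored_positions]
    return len(analyzed) / len(utterance)


if __name__ == "__main__":
    # One of four words is ignored, so three quarters of U is analyzed.
    print(utterance_coverage(["you", "want", "the", "ball"], {2}))  # 0.75
```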
The second of these measures, situation coverage, works similarly, but applies to the situation. The combined mappings of all constructions used in the best analysis specify a subgraph of the (correctly or incorrectly) identified situation $s_{\text{identified}}$ that is analyzed by $a_{\text{best}}$. Let us call this subgraph $s_{\text{analyzed}}$. The situation coverage is then given as the proportion of vertices of $s_{\text{identified}}$ that $s_{\text{analyzed}}$ constitutes, or:

$$\text{situation coverage} = \frac{|V(s_{\text{analyzed}})|}{|V(s_{\text{identified}})|}$$
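The situation coverage can be sketched analogously. The example below assumes that a situation is represented by its set of vertex labels and that each construction in the best analysis contributes the set of vertices it maps to; both representations, the function name, and the vertex labels are hypothetical simplifications.

```python
def situation_coverage(identified_vertices, construction_mappings):
    """Proportion of vertices of the identified situation that are mapped to.

    identified_vertices: vertex labels of s_identified.
    construction_mappings: one set of vertex labels per construction used
        in the best analysis (an assumed representation).
    """
    identified = set(identified_vertices)
    if not identified:
        raise ValueError("identified situation must contain at least one vertex")
    # V(s_analyzed): union of all mapped vertices, restricted to s_identified.
    analyzed = set().union(*construction_mappings) & identified
    return len(analyzed) / len(identified)


if __name__ == "__main__":
    situation = {"grab", "agent", "mother", "patient", "ball"}
    mappings = [{"grab", "agent"}, {"ball"}]
    # Three of the five vertices are covered by the analysis.
    print(situation_coverage(situation, mappings))  # 0.6
```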
Figure 5.1: Identification scores for 10 simulations over time (x-axis: time, 0 to 10,000 input items; y-axis: identification, 0.00 to 1.00).
5.1.2 Evaluating the used representations
Foreshadowing the study of the representations acquired by SPL in section 6, we can also inquire what the representations are that the model actually uses. For the grammatical constructions, two interesting parameters are their length (in number of constituents) and their abstraction. From Brown’s law of cumulative complexity, it follows that the inventory of linguistic representations grows more complex over time, which I take to mean that the representations become longer and the number of abstract slots increases. How this affects the choice of representations that the model actually uses in comprehension is not evident from Brown’s law itself.
Furthermore, we cannot speak of true ‘evaluation’ of the used representations: after all, we simply do not know what representations an actual language user employs when trying to comprehend an utterance. In section 5.3 we will look at the representations and mechanisms the model employs in analyzing input items, and compare them to hypotheses within the usage-based framework.