2 RODRIGUEZ SANTAMARIA LUCIA SANT SADURNI,CN 35.21 0:35.21 50 E BARCELONA 17/12/2017

Pos. Nedador/a Club Temps Real Lloc 1 E

2 RODRIGUEZ SANTAMARIA LUCIA SANT SADURNI,CN 35.21 0:35.21 50 E BARCELONA 17/12/2017

The LLDT as a way to understand adaptive control and eye movement behavior is interesting in its own right. As a way to understand reading and sentence processing, it is only interesting to the extent that it can be shown to be similar to more sophis- ticated reading tasks. Chapter 4 showed that the LLDT does show some empirical signatures of reading, such as proportions of single xations and regressions. On the other hand, Chapter6suggests that the wide spacing in the LLDT largely eliminates parafoveal preview that could yield spillover a convenient simplication, but also a departure from more closely-spaced reading behavior.

Therefore, one future direction must be scaling the model to more complex sentence- level tasks. The challenge is largely methodological and computational rather than theoretical: the Bayesian sequential sampling approach is agnostic as to both the likelihood function and the form of the belief distribution. Recall that the string- level belief update in the thesis computes two quantities: P (Sk _{= s}

i|ek, T = t), i.e. the posterior probability that the string in position k is some string si conditioned on incoming evidence from position k and the multinomial T corresponding to trial type; and P (T = t|ek₎, i.e. the posterior probability that the trial is of a particular type (word or nonword) given incoming evidence. T could instead be a probability distribution over sentence structures (i.e. a grammar), and the theory becomes in the abstract a theory of sentence processing.

Practically speaking, the extension is less simple. On the computational front the challenges are in specifying a likelihood function that lets words of dierent lengths be confused for each other, maintaining and updating a belief distribution over po- tentially innitely many sentence structures, and handling a broader set of actions than only saccading forward. On the methodological front, the challenge is in nd- ing a sentence-reading task with a decision variable that can be computed from the distribution over sentence structures. These challenges are briey considered below. Handling words of dierent length The letter-level unit-basis vector representation of strings in the thesis model does not enable a likelihood function that confuses strings of dierent lengths. This is because the length of the vector implicitly provides noiseless information about the length of the string. An alternative is needed

nite state transducer (wFST) inspired by Levenshtein distance. Levenshtein distance is a string distance metric that counts the number of insertions, deletions, and substitutions needed to transform one string into another. A wFST probabilistically transforms its input `tape' into an output `tape'. One can dene a wFST that takes in a sequence of letters and probabilistically inserts, deletes, or transposes letters to generate another sequence.

Such a transducer will probabilistically transform some true word being viewed into a sequence of letters. The likelihood of a particular word given this sequence of letters is the probability of generating the sequence from that particular word using that wFST. To compute it, one composes the deterministic nite state automaton representing the sample (letter sequence) with the reversed sample generation wFST. The resultant wFST represents a probability distribution over all the strings that could have generated the noisy sample (whether a word in the lexicon or not). As a side benet, moving to this formulation of the likelihood function makes its parameters possible to estimate from data, for example from letter confusability in a speeded recognition task.

Maintaining and updating belief over many structures A model that can pro- cess sentences will need to maintain and update its belief distribution over sentence structures. Exhaustively listing all possible structures for all but trivial problems is dicult, and becomes impossible when any recursion is introduced and the grammar produces an innite number of strings. The computational linguistics and natural language processing literature provides a number of solutions to this problem, how- ever.

One option, and the one chosen by Bicknell & Levy (2010a) in their sequential sampling model used to understand regressions, is to represent the grammar as a weighted nite state automaton (wFSA). They induce their grammar by counting bigrams in a large corpus. The primary advantage of this approach is that it is fast, easy to estimate from data, and can have broad coverage. The chief disadvantage is that wFSAs make it dicult to engage with much of modern (psycho)linguistic theory, which primarily works with context-free and mildly-context-sensitive grammars.

This leads to a second option, which is to use the state of an incremental parser to maintain the current posterior probability over sentence structure. One such parser is the Earley-Stolcke parser (Earley, 1970; Stolcke, 1995), which parses probabilistic context-free grammars (PCFGs) and has previously been used to predict expectation- based sentence processing diculty Hale (2001). In a footnote, Stolcke notes that

the input-recognition scanning action of the parser could be extended to probabilistic input. Because the Stolcke parser is a dynamic programming parser (i.e. keeps track of partial derivations), it may be possible to iterate the scanning step (this would be the posterior update), normalizing correctly at each timestep. This option would be slower than intersecting wFSAs but would allow use of a hand-written PCFG to investigate psycholinguistically-interesting phenomena. The disadvantages of this approach are that it will be slower than using wFSAs, and that while a PCFG is a more expressive formalism than a wFSA, still cannot represent concepts like movement, and other state-of-the-art aspects of linguistic theory.

A third option relies on a result from Bar-Hillel, Perles & Shamier (1961), who showed a method for intersecting a context-free language with a regular language, and that such an intersection is again context-free. Nederhof & Satta (2003) extended the result to a probabilistic grammar and probabilistic nite state automaton, and Chen, Hunter, Yun & Hale (2014) provide an implemented tool that intersects wFSAs with a weighted multiple context-free grammar (WMCFG), a class of mildly context-sensitive grammars that can be derived from minimalist grammars (Stabler, 1997). By iteratively intersecting wFSAs representing samples with the grammar, this toolchain should allow sequential sampling with a probabilistic minimal grammar as the representation of the belief distribution over sentence structures. But the method is very slow: informal testing shows that it is about two orders of magnitude slower than using the Earley-Stolcke parser, itself slower than the wFSA intersection method or the method in the thesis.

Tasks and decision variables in sentence-reading With the question of the belief representation and update tackled using one of the methods above, what remains is the set of decision variables.

For the eye movement decision, if the action is to only move forward, then the decision can still be conditioned on the max a posteriori string probability (as in Chapter 6). But Bicknell & Levy (2010a) showed that agents that can also move their eyes back to previous words perform better at any speed-accuracy tradeo than agents that cannot. They used threshold decisions for both the forward and backward eye movement actions. This larger policy space may work at a rst approximation, but may also turn out to be suboptimal. The general problem in both the LLDT and sentence-reading is that of choosing which sensor to attend to when there is a time cost of switching sensors (i.e. the saccade itself), and neither the `always go forward' nor the `go back when condence drops' policy space may contain the policy optimal

in this setting. While any model with such a richer policy space could be compared to human data using the same position-based measures used in the thesis, the inclusion of movements to words other than the next one would also motivate the inclusion of novel scanpath-based analyses (von der Malsburg & Vasishth,2011) and in particular explicitly computing similarity scores between the model's scanpaths and those of human participants.

For the trial-level decision, the model outlined above could support any decision variable that can be computed from a probability distribution over sentence structures. Judgments about structure like subject-verb relations (`was it the feisty ballerina who jumped?') or anaphor coindexing (`is himself referring to the tall milkman or the stout wrestler?') could be computed at each timestep and thersholded. A criticism of such a task may be that it does not tap into naturalistic reading, which likely involves discriminating between meanings rather than merely between structures. But the same criticism holds with respect to most psycholinguistic tasks, and in fact the verb relation question variant looks remarkably like a standard comprehension question in a psycholinguistic sentence-reading experiment). A notable exception to this characterization of psycholinguistic tasks is recent work on sentence processing during story comprehension (Brennan, Nir, Hasson, Malach, Heeger & Pylkkänen, 2012).

Moreover, in the computationally rational perspective the criticism about dissim- ilarity between the experimental task used and naturalistic reading is not necessarily valid because the notion of `normal reading' is ill-formed. If reading strategies vary with task goals (as Chapter 4 and other work shows) then the goal of a theory of reading is not to explain some abstract platonic notion of reading, but to build a theory of the way reading strategy is adapted to the joint constraints of agent and task. A theory of a sentence structure query task is very much on the path towards that goal.

In document 100 m. BRAÇA FEMENÍ 14 anys (página 101-104)