I now discuss whether my initial hypotheses have been confirmed, the conclusions arising from the results, the limitations of the study, and plans for further work.
4.7.1 Confirmation of hypotheses
Hypothesis (1)- Self-repairs have a systematic surface form.
This is the case; however, there is a long tail of substitutions, which means a string alignment approach to surface form is not adequate.
Hypothesis (2)- Position of interruption point contributes to predicting the type of self-repair.
This study shows an effect of position: repeats are more likely to occur utterance-initially, and substitutions and deletes less so, with the probability of a substitution peaking 4-6 words into the utterance. The probability of a delete rises gradually through the utterance, while after the initial disparity substitutions and repeats remain roughly equally probable.
Hypothesis (3)- The presence and type of an edit term contributes to predicting the presence and type of self-repair.
The presence of an edit term alone is surprisingly under-predictive of repair: the most predictive form, ‘uh’, only raises the probability of an upcoming repair onset to 0.155. However, if an interregnum is correctly identified as one, it helps predict type. Deletes are more likely to have interregna than substitutions, and substitutions are more likely to have them than repeats.
Hypothesis (4)- the syntactic context of an utterance (the partial tree) can contribute to predicting the form and structure of a self-repair.
This is true to a degree. There is still a sparsity problem for scaling this to an automatic detector; however, the results in terms of tree path length are encouraging. Speakers are less likely to repair material beyond the main constituent boundary in the partial tree they are constructing.
Hypothesis (5)- Processing context (fluency of ongoing utterance) can help predict the occurrence and type of a repair.
This is very much the case, with repair contagion being found within utterances, and also in chaining (embedded) repairs.
Hypothesis (6)- Self-repairs can be interpreted by interlocutors and annotators as having a particular dialogue function.
The annotation of the meaning of self-repairs is problematic; however, progress is being made. The repair phase of the bracketing structure in the Switchboard annotations may not always be suitable for defining the function of the repair as a replacement of the reparandum by the repair, and deletes seem to be under-annotated. This may be one of the reasons why the task of classification, or even of assigning the bracketing structure as a whole, has been avoided in evaluations of automatic repair detection in the literature. What is clear is that repairs can perform a range of dialogue functions and that annotators can come to some agreement within the taxonomy proposed; however, there is surprising disagreement even at a coarse-grained level.
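The conditional probabilities discussed under Hypotheses (2) and (3), such as the probability of a repair type given the position of the interruption point, or of a repair onset given a preceding edit term, can be estimated by relative frequency from annotated counts. The following is a minimal sketch of that estimation only. The function name, the position labels and the toy observations are illustrative assumptions and are not figures from this study.

```python
from collections import Counter, defaultdict


def conditional_distribution(pairs):
    """Estimate P(outcome | condition) by relative frequency.

    `pairs` is an iterable of (condition, outcome) tuples, for example
    (position_bin, repair_type) or (edit_term, repair_follows).
    """
    counts = defaultdict(Counter)
    for condition, outcome in pairs:
        counts[condition][outcome] += 1
    return {
        condition: {
            outcome: n / sum(outcomes.values())
            for outcome, n in outcomes.items()
        }
        for condition, outcomes in counts.items()
    }


# Toy observations (hypothetical, NOT corpus data): the position of the
# interruption point paired with the type of the repair found there.
toy_observations = [
    ("initial", "repeat"), ("initial", "repeat"),
    ("initial", "repeat"), ("initial", "substitution"),
    ("later", "repeat"), ("later", "substitution"),
    ("later", "substitution"), ("later", "delete"),
]

type_given_position = conditional_distribution(toy_observations)
for position, distribution in type_given_position.items():
    print(position, distribution)
```

The same function applies unchanged to pairs of the form (edit term, whether a repair onset follows), which is the shape of the estimate reported for ‘uh’ above. Relative frequencies of this kind are sensitive to sparse conditions, so a real estimate would need smoothing or a minimum count threshold.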
4.7.2 The interpretation of self-repair and the problem of annotating function
The strikingly negative result from this study on the annotation disagreement over repair types may be more interesting than it would first seem. The fact that annotators did not use the audio data may have contributed to the disagreement. Not only our function annotation but also the original Switchboard repair annotations were done after transcription (Meteer et al., 1995), so clearly more work can be done to improve the resources available to repair annotators. On the other hand, this may be evidence for a more gradient effect of self-repair interpretation, whereby it is not clear whether a given repair only functions to cancel commitments to part of the utterance or whether it uses the reparandum to elaborate further content. The following examples show possible alternative interpretations (italicized) to the Switchboard annotations:
(4.19) “and [ there’s, + ?] it’s ] completely generic.” Substitution or delete? (sw4619)
(4.20) “a matter where priorities are [ at, + ] placed.?]” Delete or substitution? (sw4360)
The first type of disagreement, shown in (4.19), where the Switchboard annotation suggests a substitution and we suggest a delete, is the most common. The bracketing structure used may need modification in future work, as its affordances mean annotators may be tempted to use the repair phase more than they should. The higher agreement scores based on categorization are a useful start in this direction; however, introducing gradient categories may help further here, as recent work on grammaticality judgements has shown (Lau et al., 2014).
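To make concrete how agreement can differ between a fine-grained and a coarse-grained labelling of the same repairs, the following sketch computes a chance-corrected agreement coefficient at both levels. It assumes Cohen's kappa for two annotators, which is a choice made here for illustration and not necessarily the measure used in this study. The labels, the coarse mapping (merging substitutions and deletes into one class) and the toy annotations are likewise hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical coarse mapping: substitutions and deletes both change
# information, repeats do not.
COARSE = {"sub": "revision", "del": "revision", "rep": "repeat"}

# Toy annotations (NOT data from this study).
annotator_1 = ["sub", "del", "rep", "rep", "sub", "del", "rep", "sub"]
annotator_2 = ["del", "del", "rep", "sub", "sub", "sub", "rep", "sub"]

fine_kappa = cohen_kappa(annotator_1, annotator_2)
coarse_kappa = cohen_kappa(
    [COARSE[label] for label in annotator_1],
    [COARSE[label] for label in annotator_2],
)
print(f"fine-grained kappa:   {fine_kappa:.3f}")
print(f"coarse-grained kappa: {coarse_kappa:.3f}")
```

In this toy case most disagreements are between substitution and delete, so merging those two classes raises the coefficient. A gradient scheme of the kind suggested above would need a weighted or distance-based coefficient instead of this categorical one.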
4.7.3 From string alignment to incremental information processing
Several pieces of evidence suggest that, instead of viewing repair interpretation as a string alignment problem, it is more fruitful to see it as an incremental information processing one, in which speakers try to minimise the amount of information they revoke in ongoing talk. This is made clear by the distributions of the repair types. Repeats are the most common and add the least information (in fact no new information), acting as stalling devices much like isolated edit terms, and so the view of Ginzburg et al. (2014) that they are forward-looking problem indicators seems appropriate. The inverse power-law distribution of reparandum lengths suggests that speakers try to repair as soon as possible, as per Clark’s principle of repair (Clark, 1996, p. 284); however, I would add to this imperative of locality that this is done covertly when possible.
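As an illustration of what an inverse power law over reparandum lengths means in practice, the sketch below fits count ≈ c · length^(−alpha) by least squares on log-transformed values. This is a rough diagnostic under stated assumptions, not the analysis procedure of this study: the function name is hypothetical, and the counts are synthetic, generated to follow an exact power law.

```python
import math


def fit_power_law(length_counts):
    """Fit count = c * length**(-alpha) by least squares in log-log space.

    `length_counts` maps a reparandum length in words (>= 1) to the number
    of repairs observed with that length (> 0). Returns (alpha, c).
    """
    xs = [math.log(length) for length in length_counts]
    ys = [math.log(count) for count in length_counts.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct lengths")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope of the log-log line is -alpha; the intercept is log(c).
    return -slope, math.exp(intercept)


# Synthetic counts (NOT corpus data) following count = 1000 / length**2.
synthetic_counts = {length: 1000 / length ** 2 for length in range(1, 6)}

alpha, c = fit_power_law(synthetic_counts)
print(f"alpha = {alpha:.3f}, c = {c:.1f}")
```

A larger fitted alpha corresponds to a steeper drop-off, that is, a stronger preference for short reparanda. Log-log regression is known to give biased exponents on real count data, so a maximum-likelihood fit would be preferable for anything beyond a first look.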
Although the correlation is striking, the fact that mid-utterance partial words predict a repair with near certainty may be a non-argument, as mid-utterance partial words are in fact reparandum ends: the signal that the speaker is having trouble is made as clear as possible by not completing the word and starting the next one. Such partial words minimise the information change by revoking commitment to a potentially problematic piece of information.
Finally, speakers are clearly sensitive to previous repair, as shown in the contagion and embedding effects, and they also use edit terms to signal the nature of the upcoming repair. The preference for interregna in the more severe information-changing repairs, deletes and substitutions, shows they can be indicators of upcoming information revision. The interaction of reparandum length and interregnum presence also supports this, which suggests the more severe the upcoming