(Richardson et al. 2008) studied postural and gaze features, and reviewed a body of previous work in this area. The general goal was to investigate entrainment of conversants' body swing and eye focus while they were standing upright. In a series of experiments, subjects took part in several tasks designed to elicit spontaneous speech, such as watching sitcoms and discussing their favorite characters, discussing a painting, or performing tasks in a common area through wall-mounted monitors, with or without visual contact with other subjects. During these experiments, body swing (lateral movement of the upper body in upright position) and eye movement and focus were recorded continuously.

Statistical analysis of the resulting time series was performed by means of recurrence analysis, a method which, according to (Richardson et al. 2008), offers a more straightforward way of revealing recurrent (or cyclic) patterns through inspection of recurrence plots. A point is registered on a recurrence plot only when events that occur at fixed intervals (recurrently) are sufficiently “similar” (within a preset threshold). Thus, the density of points registered along lines that represent specific periods yields the amount of recurrence for that period. The density can be expressed as percent recurrence: the proportion of points registered on the plot relative to all possible points. An extension of this method to bivariate time series (comprising cross-recurrence plots and percent cross-recurrence measures) was used to assess coordination between the behavioral patterns of two participants.
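The percent cross-recurrence measure described above can be illustrated with a minimal sketch. It assumes scalar, equally sampled series and uses absolute difference as the similarity criterion; the function names, the `radius` parameter, and the toy data are hypothetical and are not taken from (Richardson et al. 2008), whose exact analysis settings are not given here.

```python
def cross_recurrence_points(xs, ys, radius):
    """Return all (i, j) where xs[i] and ys[j] differ by at most `radius`."""
    return {
        (i, j)
        for i, x in enumerate(xs)
        for j, y in enumerate(ys)
        if abs(x - y) <= radius
    }


def percent_recurrence(points, n_rows, n_cols):
    """Share of recurrent points among all cells of the plot, in percent."""
    return 100.0 * len(points) / (n_rows * n_cols)


def percent_recurrence_at_lag(points, lag, n_rows, n_cols):
    """Share of recurrent points on the diagonal line j - i == lag, in percent."""
    # Number of cells (i, j) with j == i + lag that lie inside the plot.
    diagonal_length = max(0, min(n_rows, n_cols - lag) - max(0, -lag))
    if diagonal_length == 0:
        raise ValueError("lag lies outside the plot")
    hits = sum(1 for (i, j) in points if j - i == lag)
    return 100.0 * hits / diagonal_length


# Toy data (assumption): speaker B repeats speaker A's sway one sample later.
sway_a = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
sway_b = [-1.0] + sway_a[:-1]

points = cross_recurrence_points(sway_a, sway_b, radius=0.1)
n = len(sway_a)
print(percent_recurrence(points, n, n))            # overall percent cross-recurrence
print(percent_recurrence_at_lag(points, 1, n, n))  # density along the lag-1 diagonal
print(percent_recurrence_at_lag(points, 0, n, n))  # density along the main diagonal
```

In this toy case the lag-1 diagonal is fully populated while the main diagonal is empty, which is the kind of line density that the text associates with recurrence at a specific period or delay.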

Interestingly, (Richardson et al. 2008) found coordination of body swing even when the subjects were facing away from each other (interacting through monitors on opposite-facing walls), or when they had no visual contact with each other while interacting through monitors. In addition, eye movement (gaze) coordination was found not only between partners in an on-going conversation, but also between listeners and speakers when the former were listening to a recorded description of a painting. The authors concluded that rhythm is transmitted through speech, and that this is not only a by-product of interaction but also has an effect on its outcome. They also presented some evidence that common ground (Clark and Schaefer 1989) is relevant to coordination of gaze, as listeners answered questions about the painting correctly more often when their gaze was coordinated with that of the speaker.

4.6 Discussion

This chapter has reviewed methods of measuring accommodation phenomena in various modalities. Regardless of the theoretical foundations or goals, each study measured accommodation in one or more verbal or non-verbal features (see Table 4.1). As mentioned in section 4.1, these measurements can be broadly categorized into across-dialogue and within-dialogue measurements.

Across-dialogue measurements are the most robust method, as the whole dialogue is used to calculate an average value of a feature: if the dialogue is long enough, the arithmetic mean can safely be assumed to be unbiased by any single event during the interaction that caused unusual behaviour deviating from the mean. Provided that a sufficient number of dialogues is available, conclusions can be drawn on whether or not accommodation generally occurs under specific conditions. Although this methodology produces informative results, there are two arguments against it. First, it has been questioned whether the observed correlation is the result of accommodation at all. The alternative explanation offered is that it may be a result of topic liveliness (Benus 2009), or of the overall liveliness of the dialogue (Bosch et al. 2005). Second, it fails to capture the dynamic evolution of accommodation over time as the dialogue progresses (Edlund et al. 2009).

Within-dialogue measurements can be further sub-categorized into continuous and non-continuous methods. Continuous methods consider utterances, turns, or other arbitrarily constructed units, on which a feature value can be measured or accumulated (averaged). Each value is then placed at a single point on the dialogue time-line. For example, the “center” of the utterance was used in (Nishimura et al. 2008), and a particular recurring syllable was used in (Kakita 1996). This process results in a time series for each speaker. Another option for creating a time series is to use the values from one speaker and linearly interpolated values from the second speaker at these points (Nishimura et al. 2008; Edlund et al. 2009). These time series are often simply inspected, in order to provide preliminary evidence of dynamic patterns (Kakita 1996; McRoberts and Best 1997; Campbell 2009). In other cases, the time series undergo statistical analysis with one of the various methods available in standard statistics handbooks (e.g. Chatfield 1996).
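The interpolation option can be sketched as follows: one speaker's values are kept at their own time points, the other speaker's values are linearly interpolated at those same points, and the two aligned series are then compared with a Pearson coefficient. This is a minimal illustration only; the function names and the F0 values are hypothetical, and the cited studies may differ in how they place and compare the points.

```python
import math
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate a series at time t (times strictly increasing).

    Outside the observed range the nearest end value is returned.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    k = bisect_right(times, t)  # times[k - 1] <= t < times[k]
    t0, t1 = times[k - 1], times[k]
    weight = (t - t0) / (t1 - t0)
    return values[k - 1] + weight * (values[k] - values[k - 1])


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical utterance centers (seconds) and mean F0 (Hz) per utterance.
times_a, f0_a = [1.0, 3.0, 5.0, 7.0], [200.0, 210.0, 220.0, 230.0]
times_b, f0_b = [0.0, 4.0, 8.0], [100.0, 120.0, 140.0]

# Speaker B's F0 estimated at the time points of speaker A's utterances.
f0_b_at_a = [interpolate(times_b, f0_b, t) for t in times_a]
print(f0_b_at_a)
print(pearson(f0_a, f0_b_at_a))
```

Interpolating at one speaker's time points is a design choice: it yields two series of equal length on a shared time axis, at the cost of treating the second speaker's feature as if it varied linearly between observations.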

| METHODOLOGY | FEATURE | CORPUS | STUDY |
|---|---|---|---|
| Time series (lag regression) | Rhythm, duration coordination | Mother-infant | Jaffe et al. (2001) |
| Across dialogues & Time series (plot observations) | F0 accommodation | Parent-infant | McRoberts and Best (1997) |
| GLMM, frames of fixed length after prime | Syntactic priming | Spontaneous, task-oriented | Reitter et al. (2006) |
| ANOVA, perceptual test of pronunciation pre-task, task, post-task | Phonetic convergence | Task-based | Pardo (2006) |
| Time series (trend line fit) | F0 convergence & divergence | Scripted answer-question pairs | Kakita (1996) |
| Across dialogues & Histograms | Pause duration overlaps | Spontaneous face-to-face & telephone | Bosch et al. (2004, 2004b, 2005) |
| Time series (spectral analysis) | F0 and Intensity synchrony | Laboratory adult conversations | Buder & Eriksson (1997, 1999) |
| Histograms & phase component | Syllable & accent timing entrainment | Spontaneous, elicitation | Benus (2009) |
| Superimposed time series plot observations | Multimodal synchrony | Multi-party conversation (video) | Campbell (2009) |
| Time series (recurrence analysis) | Swing & eye movement entrainment | Task-oriented | Richardson et al. (2008) |
| Time series (by interpolation), Pearson coefficient | Pause and gap length accommodation | Spontaneous | Edlund et al. (2009) |
| Linear regression, frames of fixed length after prime | F0 & lexical convergence | Tutorial sessions | Ward & Litman (2007, 2007b) |
| Across dialogues | Speech rate adaptation | Task-oriented (telephone) | Ward & Nakagawa (2004) |
| Time series (by interpolation), lag zero coefficient | F0, Intensity and speed synchrony | Spontaneous | Nishimura et al. (2008) |
| Percentage of success | Lexical entrainment | Spontaneous & WoZ & text | Brennan (1996) |
| Same word/different word ratio | Lexical entrainment | WoZ – Automatic translation | Fais (1996) |
| Across dialogues, Half-split dialogue ANOVA | F0, Intensity, speech rate, pause length | WoZ – Multimodal SDS | Oviatt et al. (2002, 2002b, 2004) |
| Per turn type ANOVA | Speech rate adaptation | WoZ – Multimodal SDS | Bell et al. (2003) |
| Half-split dialogues t-test | Intensity, speech rate | WoZ – Quiz SDS | Suzuki & Katagiri (2003, 2004, 2005) |

Table 4.1: Measurements of inter-speaker accommodation in various studies

The advantages of continuous (time series) methods are that (a) the variations in the feature value over time are captured, hence analysis can be performed on a single dialogue (McRoberts and Best 1997), and (b) it is possible to determine whether only one or both speakers converge/diverge (Jaffe et al. 2001). In addition, cyclical patterns can be identified and models can be fitted to them based on their periodicity (Buder and Eriksson 1997; 1999). In the latter study, a physical interpretation was given to the period of the fitted sinusoids, namely that of rhythmic entrainment across the two speakers during turn exchanges (different periods were found for F0 and intensity). Similarly, (Jaffe et al. 2001) proposed an “optimal lag”, which was found to be the most significant in a series of lagged regressions between the two time series, and suggested that this may be evidence of rhythm (periodicity) in dialogue interaction. Aside from the question of whether such assumptions are valid, the findings themselves show that continuous approaches reveal much more information about accommodation than across-dialogue comparisons. The disadvantages of time series methods are the increased complexity (Edlund et al. 2009), and the fact that the usual assumptions for time-series analysis (stationarity, normal distribution of variance) are probably not satisfied in a strict sense (this is discussed in section 7.4.1).
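The idea of searching for an “optimal lag” can be sketched with a simplified stand-in: instead of the lagged regression models of (Jaffe et al. 2001), the example below computes a plain Pearson correlation at each candidate lag and picks the lag with the largest absolute value. The function names, the lag range, and the data are assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(xs, ys, lag):
    """Correlate xs[t] with ys[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(xs, ys)
    return pearson(xs[:-lag], ys[lag:])


def best_lag(xs, ys, max_lag):
    """Lag in 0..max_lag with the largest absolute lagged correlation."""
    return max(
        range(max_lag + 1),
        key=lambda lag: abs(lagged_correlation(xs, ys, lag)),
    )


# Toy data (assumption): speaker B follows speaker A with a two-step delay.
speaker_a = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 7.0, 5.0, 8.0]
speaker_b = [0.0, 0.0] + speaker_a[:-2]

print(best_lag(speaker_a, speaker_b, max_lag=4))
```

A positive lag here means that the second series trails the first. Swapping the arguments tests the opposite direction of influence, which is how a lag-based analysis can indicate whether only one or both speakers adapt.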

Non-continuous methods encompass all other within-dialogue measurements. Priming measurements, for example, make use of fixed-length frames that are defined by the location of the prime. Histograms display the distribution of values for a feature (such as pause duration), which can often provide valuable information. A somewhat crude method of measuring within-dialogue accommodation is the “half-split” approach: a dialogue is divided into two halves of equal length, and a feature average (for each speaker) is calculated for each half (e.g. Oviatt et al. 2004). This can be used to show whether the speakers converged, diverged, or did neither. Although this method has been criticized for the same reasons as across-dialogue approaches (Edlund et al. 2009), it does combine merits from both, as the result is, in a sense, a two-point time series. One can imagine further splits into quarters etc., but there is a trade-off: unless the “pieces” are big enough, the average of a calculated feature may be biased by local events in the interaction.
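The half-split approach can be sketched as follows. This minimal example assumes one feature value per turn and splits each speaker's values by count rather than by elapsed time; the function names and the speech-rate figures are hypothetical.

```python
def half_means(values):
    """Mean of the first and of the second half of a sequence."""
    mid = len(values) // 2
    first, second = values[:mid], values[mid:]
    return sum(first) / len(first), sum(second) / len(second)


def half_split(speaker_a, speaker_b):
    """Compare the gap between two speakers across the two dialogue halves.

    Returns (gap in first half, gap in second half, label).
    """
    a_first, a_second = half_means(speaker_a)
    b_first, b_second = half_means(speaker_b)
    gap_first = abs(a_first - b_first)
    gap_second = abs(a_second - b_second)
    if gap_second < gap_first:
        label = "converged"
    elif gap_second > gap_first:
        label = "diverged"
    else:
        label = "no change"
    return gap_first, gap_second, label


# Hypothetical speech rates (syllables per second), one value per turn.
rate_a = [5.0, 6.0, 7.0, 5.0, 5.0, 5.0]
rate_b = [2.0, 3.0, 4.0, 4.0, 4.0, 4.0]

print(half_split(rate_a, rate_b))
```

The result is exactly the two-point time series mentioned in the text: one gap per half. Splitting into more pieces would follow the same pattern, with fewer values behind each mean.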

In conclusion, time series analysis is the only method which has been used so far to measure inter-speaker accommodation in a continuous way. Despite the disadvantages mentioned above, time series analysis provides the most complete description of accommodation phenomena and constitutes the most promising route towards a quantitative model that can be useful for SDS, as online monitoring and real-time accommodation require a continuous description.