In the previous section, a module for tracking color-salient regions and a module for detecting prominent syllables in speech were described. Both modules communicate their hypotheses to the Active Memory, so other modules within the system already have access to this information. However, some additions to the existing components are necessary to visualize the data and to use the new information in the acoustic packaging system. In the following, these changes are described first. Subsequently, the initial integration with the iCub platform (Metta et al., 2010) is reported. Furthermore, findings on speech and object properties based on their local synchrony are presented.

Chapter 6. Acoustic Packaging as a Basis for Feedback on the iCub Robot

Figure 6.4.: Cue visualization tool showing motion peaks (row 1), trajectory coordinates (row 2 shows x(t) and y(t)), acoustic signal energy (row 3), speech segmentation (row 4), visual segmentation (row 5), and acoustic packages (row 6).

Figure 6.5.: Inspection tool showing a list of acoustic packages with details on each package’s temporal extent and its associated segmentation hypotheses.

6.3.1. Additions to the Existing System Components

Including the new modules, the acoustic packaging system consists of five modules connected to the Active Memory (see Figure 6.3). The cue visualization tool and the inspection tool were extended to display trajectory data provided by the color saliency module. The cue visualization tool shows the temporal development of the x and y coordinates of each trajectory together with the trajectory's average color (see Figure 6.4). The inspection tool was extended to display the x and y coordinates as an overlay on the frames that mark the beginning and end of the current motion peak. This way, the temporal development of each trajectory, its spatial accuracy, and its relation to motion peaks can all be analyzed. The segmentation of speech into syllables with their prominence ratings is also displayed by the cue visualization tool, as nested segments within each speech segment (see Figure 6.4). The inspection tool allows prominent syllables to be replayed for each speech segment.

Besides the changes related to visualization and inspection, the temporal association module was extended to additionally associate trajectories with acoustic packages. The association method follows a concept similar to that used for motion and speech: overlapping trajectory and speech segments are associated with an acoustic package by the temporal association module. This step directly allows for an additional interpretation of the interaction based on the content of acoustic packages. Acoustic packages with no associated trajectory likely do not contain significant changes to the items in the interaction. These packages are generated by communication that does not involve moving items, such as showing an item or talking to the interaction partner. The cue visualization tool highlights these packages in a different color (see Figure 6.4).
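The overlap-based association described above can be illustrated with a minimal sketch. It assumes one acoustic package per speech segment and a strict interval-overlap test; the exact criterion used by the temporal association module is not stated here, and motion peaks are left out for brevity. All names (`Segment`, `AcousticPackage`, `associate`) are hypothetical and are not taken from the actual system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """A time interval in seconds with an optional label (hypothetical type)."""
    start: float
    end: float
    label: str = ""


@dataclass
class AcousticPackage:
    """A speech segment plus the trajectories that overlap it in time."""
    speech: Segment
    trajectories: List[Segment] = field(default_factory=list)

    @property
    def involves_moving_item(self) -> bool:
        # Packages without any trajectory likely stem from communication
        # that does not move an item (e.g. showing or talking).
        return bool(self.trajectories)


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two intervals share a non-empty time span."""
    return a.start < b.end and b.start < a.end


def associate(speech_segments: List[Segment],
              trajectory_segments: List[Segment]) -> List[AcousticPackage]:
    """Attach every temporally overlapping trajectory to each speech segment."""
    packages = []
    for speech in speech_segments:
        matched = [t for t in trajectory_segments if overlaps(speech, t)]
        packages.append(AcousticPackage(speech=speech, trajectories=matched))
    return packages
```

A visualization tool could then use `involves_moving_item` to decide which packages to draw in a different color.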

6.3.2. Acoustic Packaging as a Basis for Feedback on the iCub Robot

Figure 6.6.: A human user demonstrates cup stacking to the iCub robot. The iCub observes the scene through one of its eye cameras. Speech is recorded using an external microphone (middle). Visualization tools of the acoustic packaging system are displayed on the screen in the background.

The acoustic packaging system was tested on the iCub robot (see Figure 6.6). For this purpose, a feedback module was implemented which uses information from acoustic packages to provide feedback to the tutor (see Figure 6.3). Currently, the feedback module focuses on extracting word-meaning pairs from the running interaction. This is realized as a two-step process. In the first step, during an action demonstration accompanied by the caregiver's verbal comments, the feedback module clusters acoustic packages by their trajectory color. In the second step, if the tutor shows an object, for example a cup, without commenting on the action verbally, the robot provides feedback by replaying the most prominent syllable from one of the acoustic packages whose trajectory color matches the current one. This way, the system communicates which information it has identified as relevant in the caregiver's demonstration. Furthermore, the feedback module replays the trajectory using the right arm of the iCub by mapping the trajectory coordinates onto a two-dimensional plane in front of the robot. The mapping process acts as a bridge component between the Active Memory and the Cartesian controller (Pattacini et al., 2010) of the iCub. On the one hand, the component monitors the Active Memory for trajectories which have been selected for replay. On the other hand, the mapped coordinates are sequentially communicated to the Cartesian controller. Note that this aspect of feedback is at an experimental stage.
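The two-step feedback process and the trajectory mapping can be sketched as follows. This is an illustration under several assumptions, not the implemented module: trajectory colors are assumed to be available as discrete labels, the syllable chosen is the most prominent one across all packages of the matching color, and the mapping is a simple linear rescaling of image coordinates onto a plane whose origin and size (in meters) are made-up parameters. All names are hypothetical, and no iCub or Active Memory API is used.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Syllable:
    audio_id: str      # hypothetical handle to the recorded syllable
    prominence: float  # prominence rating from the detection module


@dataclass
class Package:
    color: str                             # assumed discrete color label
    syllables: List[Syllable]
    trajectory: List[Tuple[float, float]]  # image coordinates in pixels


def cluster_by_color(packages: List[Package]) -> Dict[str, List[Package]]:
    """Step 1: group acoustic packages by their trajectory color."""
    clusters = defaultdict(list)
    for package in packages:
        clusters[package.color].append(package)
    return dict(clusters)


def select_feedback_syllable(clusters: Dict[str, List[Package]],
                             observed_color: str) -> Optional[Syllable]:
    """Step 2: pick the most prominent syllable among matching packages."""
    candidates = [s for p in clusters.get(observed_color, [])
                  for s in p.syllables]
    return max(candidates, key=lambda s: s.prominence, default=None)


def map_to_plane(trajectory: List[Tuple[float, float]],
                 image_size: Tuple[int, int],
                 plane_origin: Tuple[float, float],
                 plane_size: Tuple[float, float]) -> List[Tuple[float, float]]:
    """Rescale pixel coordinates onto a plane in front of the robot.

    Image y grows downward, so it is flipped to make the plane's
    vertical axis point upward.
    """
    width, height = image_size
    h0, v0 = plane_origin
    dh, dv = plane_size
    return [(h0 + (x / width) * dh, v0 + (1.0 - y / height) * dv)
            for x, y in trajectory]
```

In such a design, the mapped points would be handed one at a time to whatever component drives the arm, which mirrors the sequential communication described above.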

First tests of the feedback module on the iCub robot showed that the robot's response referred to semantically relevant parts of the utterance. However, to close the loop between tutor and robot, strategies for handling corrections or other types of feedback regarding the quality of the acoustic packages have to be implemented. Such methods would allow the system to adapt to the tutor and to keep only those packages containing information that the tutor also considers relevant.

6.3.3. Summary

The prominence detection module and the color saliency tracking module were added to the acoustic packaging system. The temporal association module includes these additional cues when forming acoustic packages. Furthermore, the system was tested on the iCub robot, where it provides initial feedback using the prominent syllables and trajectory information linked to acoustic packages.
