The main experimental platform consists of two fixed 7-DOF industrial manipulator arms, each carrying a Shadow “Dextrous Hand”, and the anthropomorphic robot torso BARTHOC (Hackel et al., 2005). The industrial manipulators are mounted from the top; the humanoid torso is situated at the back. Figure 2.3(a) shows the setup from the human partner’s point of view, and figure 2.3(b) shows a bird’s-eye sketch of the setup. Besides the visible hardware, a camera (640x480, RGB) looks down at the table from the ceiling. Two stereo microphones to the left and right of the humanoid robot provide speech localization, and a head-set is used for speech recognition. Lastly, a CyberGlove II (LLC) provides hand posture sensing.
Figure 2.3.: The static interaction setup. (a) View from the human’s position. (b) Sketch of setup (labels: Table, Bowl, Visible area, BARTHOC, Human; dimensions: 180 cm, 131.8 cm, 97.4 cm).
2.3.2. Development History
Several different integrated systems have been created over the course of this thesis, as specified in table 2.1. The dates given are the completion dates for integration. Partial integrations or developments on the level of individual components which did not result in a significant change of the scenario have not been listed.
Date      Name                Comments
Feb 2008  Pre-Study           BARTHOC only, no manipulation, cf. 9.1
Jun 2008  Curious Robot       Bi-manual interaction, cf. 2.3.3
Mar 2009  CeBit 2009          No manipulation, first gaze feedback, cf. 9.5.1
Nov 2009  Final Demonstrator  Haptic learning, advanced dialog, cf. 9.5.2
Of these scenarios, the June 2008 scenario is the one with which the most user testing was carried out. A version of the scenario demonstrated at the CeBit 2009 trade fair also received a lot of user exposure, but it was limited in functionality, and the trade fair situation precluded capturing personalized data. At the time of this writing, the final demonstrator has been demonstrated and tested on various occasions, but not with naive users.
2.3.3. Interaction Overview
The overall goal in the scenario is that the robot learns facts about everyday objects: specifically, the names humans use for them and which of its grasp types it may apply. This information is determined in an interaction setting where these objects are present, through a natural speech dialog with a human partner. Once it has acquired the necessary information, the robot moves the object away and proceeds to the next one.
At any point during this basic scheme, the human may interject with corrections, requests to abort an action, requests to present the knowledge acquired, and so on. Doing this during the activities themselves is much more efficient than doing it afterwards, e.g. by aborting an action as soon as the human has detected an error, or by filling wait time with questions about the acquired knowledge. This should of course be possible without requiring detailed information about the implementation, e.g. by learning from the robot’s explanations.
#   Speaker  Dialog-Act          Verbal                           Non-verbal
1   Human    -                   Hello, robot.                    -
2   Robot    Greet               Hello.                           -
3   Robot    Learn label         What is that?                    Point
4   Human    -                   That is a banana.                -
5   Robot    Confirm hypothesis  Banana. OK.                      -
6   Robot    Learn grip          How can I grasp the banana?      -
7   Human    -                   With power grasp.                -
8   Robot    Confirm hypothesis  Power grasp. OK.                 -
9   Robot    Explore grip        I am going to grasp the banana.  -
10  Robot    Confirm             OK, I start grasping now.        Grasp
11  Human    -                   Stop!                            Release
12  Robot    Abort action        OK, I stop.                      -
13  Human    -                   Grasp the banana!                -
14  Robot    Confirm start       OK, I start grasping now.        Grasp
15  Robot    Confirm end         OK.
16  Human    -                   Good bye.                        -
17  Robot    Say goodbye         Good bye.                        -
Table 2.2.: Example Dialog (adapted from Lütkebohle et al., 2011, table 2)
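The flow of the example dialog, including the possibility to abort and resume a grasp, can be sketched as a small state machine. This is only an illustrative sketch and not the dialog system used in the thesis: the names `State`, `ObjectKnowledge`, `CuriousDialog`, `on_human` and `on_grasp_finished`, the event kinds ("label", "grip", "stop", "grasp") and the fallback replies are assumptions made for this example. Only the robot utterances are taken from table 2.2.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class State(Enum):
    """Hypothetical dialog states, loosely following table 2.2."""
    ASK_LABEL = auto()   # robot has asked "What is that?"
    ASK_GRIP = auto()    # robot has asked how to grasp the object
    GRASPING = auto()    # grasp action is running
    ABORTED = auto()     # grasp was stopped by the human
    DONE = auto()        # grasp completed


@dataclass
class ObjectKnowledge:
    """The two facts learned per object: its name and an applicable grasp type."""
    label: Optional[str] = None
    grip: Optional[str] = None


@dataclass
class CuriousDialog:
    state: State = State.ASK_LABEL
    knowledge: ObjectKnowledge = field(default_factory=ObjectKnowledge)

    def on_human(self, kind: str, value: str = "") -> str:
        """Process one (already recognized) human input and return the robot's reply."""
        # Interjections are checked first, so they work regardless of the current step.
        if kind == "stop":
            if self.state is State.GRASPING:
                self.state = State.ABORTED
                return "OK, I stop."
            return "Nothing to stop."  # assumed fallback, not from the source
        if kind == "label" and self.state is State.ASK_LABEL:
            self.knowledge.label = value
            self.state = State.ASK_GRIP
            return f"{value.capitalize()}. OK. How can I grasp the {value}?"
        if kind == "grip" and self.state is State.ASK_GRIP:
            self.knowledge.grip = value
            self.state = State.GRASPING
            return (f"{value.capitalize()}. OK. "
                    f"I am going to grasp the {self.knowledge.label}. "
                    "OK, I start grasping now.")
        if kind == "grasp" and self.state is State.ABORTED:
            self.state = State.GRASPING
            return "OK, I start grasping now."
        return "Sorry, I did not understand."  # assumed fallback, not from the source

    def on_grasp_finished(self) -> str:
        """Called by the (hypothetical) manipulation component when the grasp ends."""
        if self.state is State.GRASPING:
            self.state = State.DONE
            return "OK."
        return ""
```

Feeding the human turns 4, 7, 11 and 13 of the table into `on_human`, followed by `on_grasp_finished`, walks through the same sequence of robot replies. Handling "stop" before any state-specific input is one simple way to let the human interject during an ongoing action.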
2.3. The “Curious Robot” Scenario
The frames shown in figure 2.4 are also reproduced at full size in appendix B. Please note that the images were taken from the initial version of the object learning system (cf. section 2.3.2), as it was the one that received the most user testing.
(a) “What is that?” (b) “That’s a banana.” (c) “Banana, OK”
(d) “How do I grasp...” (e) “... power grasp” (f) “Grasping now”
Figure 2.4.: Frames from the interaction sequence with utterances overlaid.
2.3.4. System Capabilities
In the capabilities of the system, a trade-off had to be achieved between the overall complexity and realistic interaction challenges. On the one hand, it was felt that the project held enough challenges in manipulation and interaction (see next section) and that, therefore, perception had to be simplified to make the project feasible. On the other hand, the characteristics of perception are an important influence on the interaction and must not be stubbed out entirely.
Therefore, object detection and learning have been included and must run quickly, so that they can be performed during the interaction. This was achieved with comparatively little effort by using simple features and a black background. Furthermore, real speech recognition was used, because any real interaction system will have to cope with the errors introduced by imperfect recognition. However, to prevent the robot’s noise (which is substantial, both during operation and otherwise) from impairing recognition too much, a head-set was used for speech input and the stereo microphones were only used for localization, which is more robust.
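To illustrate why a black background keeps detection cheap, the following sketch finds objects by brightness thresholding followed by connected-component grouping. It is a minimal example under stated assumptions, not the detector used in the thesis: the function name `detect_objects`, the image representation (rows of RGB tuples), the threshold of 40 and the minimum blob size of 20 pixels are all hypothetical choices. Only the standard library is used to keep the example self-contained.

```python
from collections import deque


def detect_objects(image, threshold=40, min_pixels=20):
    """Return bounding boxes (row_min, col_min, row_max, col_max) of bright blobs.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    Every pixel whose brightest channel exceeds `threshold` counts as foreground,
    which only works because the background is assumed to be black.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    foreground = [[max(pixel) > threshold for pixel in row] for row in image]
    visited = [[False] * width for _ in range(height)]
    boxes = []

    for r0 in range(height):
        for c0 in range(width):
            if not foreground[r0][c0] or visited[r0][c0]:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            queue = deque([(r0, c0)])
            visited[r0][c0] = True
            rows, cols = [], []
            while queue:
                r, c = queue.popleft()
                rows.append(r)
                cols.append(c)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < height and 0 <= nc < width
                            and foreground[nr][nc] and not visited[nr][nc]):
                        visited[nr][nc] = True
                        queue.append((nr, nc))
            # Discard tiny blobs, which are most likely sensor noise.
            if len(rows) >= min_pixels:
                boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return boxes
```

Because each pixel is visited at most once, the cost grows linearly with image size, which is compatible with running detection during the interaction. A practical system would likely use an optimized image-processing library instead of pure Python loops.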
Correspondingly, the challenges have been selected to further the interaction:
• No prior object knowledge is present in the system, both regarding an object’s [...] of these aspects must be learned. Naturally, the position of objects is also variable.
• No unnatural interaction restrictions. In particular, the interaction partner should not need to be known in advance, both regarding position and appearance. This also requires speaker-independent speech recognition.
• No prior instruction. The human interaction partners should not require prior instruction to operate the system¹.