The long-term objective of this thesis is to advance toward incremental individual recognition as a prerequisite for long-term human-robot social interaction. Social robotics is a growing research area based on the notion that human-robot social interaction is a necessary step toward integrating robots into humans’ everyday lives [3] and, for some, also a crucial element in the development of robot intelligence [26, 11, 17].
[32] presents a survey of different approaches to developing socially interactive robots. These systems vary in their goals and implementations. The following robots are mainly focused on one-on-one and shorter-term interaction in controlled environments. Kismet at MIT is an expressive active vision head robot, developed to engage people in natural and expressive face-to-face interaction [11]. The research motivation is to bootstrap from social competences to allow people to provide scaffolding to teach the robot and facilitate learning.
WE-4R at Waseda University is an emotionally expressive humanoid robot, developed to explore new mechanisms and functions for natural communication between humanoid robots and humans [62]. The robot has also been used to explore emotion-based conditional learning from the robot’s experience [61]. Leonardo at MIT is an embodied humanoid robot designed to utilize social interaction as a natural interface to participate in human-robot collaboration [13]. Infanoid at the National Institute of Information and Communications Technology (NICT) is an expressive humanoid robot, developed to investigate joint attention as a crucial element in the path of children’s social development [47].
There have also been a number of approaches to developing social robotic platforms which can operate for longer time scales in uncontrolled environments outside the laboratory. The Nursebot at Carnegie Mellon University is a mobile platform designed and developed toward achieving a personal robotic assistant for the elderly [63].
In a two-day experiment, the Nursebot performed various tasks to guide elderly people in an assisted living facility. Similar to our findings in dealing with uncontrolled environments, the Nursebot’s speech recognition system initially encountered difficulties and had to be re-adjusted during the course of the experiment. Grace at CMU is an interactive mobile platform which has participated in the AAAI robot challenge of attending, registering, and presenting at a conference [91].
Robovie at ATR, an interactive humanoid robot platform, has been used to study long-term interaction with children for two weeks in their classrooms [43]. Keepon at NICT is a creature-like robot designed to perform emotional and attention exchange with human interactants, especially children [48]. Keepon was used in a year-and-a-half-long study at a day-care center to observe interaction with autistic children.
Robox, an interactive mobile robotic platform, was installed for three months at the Swiss National Exhibition Expo 2002 [89]. RUBI and QRIO at the University of California San Diego are two humanoid robots which were embedded at the Early Childhood Education Center as part of a human-robot interaction study for at least one year on a daily basis [67]. Robovie-M, a small interactive humanoid robot, was tested in a two-day human-robot interaction experiment at the Osaka Science Museum [88].
Most relevant to our project focus is Valerie at Carnegie Mellon University, a mobile robotic platform designed to investigate long-term human-robot social interaction [36]. Valerie was installed for nine months at the entranceway to a university building.
It consists of a commercial mobile platform, an expressive animated face displayed on an LCD screen mounted on a pan-tilt unit, and a speech synthesizer. It uses a SICK scanning laser range finder to detect and track people. People can interact with Valerie by either speech or keyboard input. Similar to our case, the authors report that a headset microphone is not an option and therefore the robot’s speech recognition is limited, especially given the noisy environment. Valerie recognizes individuals by using a magnetic card-stripe reader. People can swipe any magnetic ID card in order to uniquely identify themselves. One of Valerie’s primary interaction modes is storytelling through 2-3 minute monologues about its own life stories.
During these nine months, people interacted with Valerie over 16,000 times, counted by keyboard input of at least one line of text. An average of over 88 people interacted with Valerie each day. Typical interaction sessions are just under 30 seconds. Out of 753 people who have swiped an ID card to identify themselves, only 233 have done it again during subsequent visits. Valerie encounters 7 repeat visitors on average each day. These repeat visitors tend to interact with the robot for longer periods, typically for a minute or longer. The authors suggest that in order to study true long-term interactions with Valerie, the robot needs to be able to identify repeat visitors automatically. Moreover, Valerie should not only identify but also get to know people who frequent the booth.
We have the common goal of extending human-robot social interaction. Moreover,
Valerie’s setup in the midst of passersby and public environments is similar to ours.
However, Valerie has been installed and tested for a much longer period. In terms of user interface and perceptual capabilities, Mertz differs from Valerie in a number of ways. Mertz can only interact with people through visual and verbal exchange. Thus, it can rely on only noisy camera and microphone input in its interaction with people.
Mertz is a mechanical robotic head and is more expressive in terms of head postures.
Valerie’s flat-screen face was reported to have difficulties in expressing gaze direction.
However, Mertz only has four degrees of freedom allocated to its facial expression, allowing a much smaller range than an animated face.