thus the gradual nature of its veracity within the domain of argumentation; (iii) define a quality threshold at which
8.1. Elements of the Labeled Argumentation Framework
We used multiple evaluation methods to determine whether our robotics classes affected the amount of collaboration shown among the children participating in our study: semi-structured interviews, written questionnaires, and video analysis. Using a form of data triangulation, we compared each method’s findings against the others, both to check the validity of each method’s findings and to synthesize all of the findings into valid conclusions. Additionally, we use the terms significant, marginally significant, and insignificant in describing the results of our data analysis to indicate statistical confidence levels of 95%, 90%, or less than 90% (p-values of 0.05, 0.1, and larger than 0.1), respectively.
Figure 5.2: Left: The “turn-taking wheel” the children used. Right: The children play with the robots and interact with each other during class.
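The three significance labels can be written down as a small lookup. This is only a sketch: the function name `significance_label` is hypothetical, and treating the boundaries 0.05 and 0.1 as inclusive is an assumption, since the text gives the thresholds but not how ties at the boundary are handled.

```python
def significance_label(p_value: float) -> str:
    """Map a p-value to one of the three labels used in this chapter.

    Assumption: the thresholds 0.05 and 0.1 are treated as inclusive.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie in [0, 1]")
    if p_value <= 0.05:
        return "significant"             # 95% confidence level
    if p_value <= 0.1:
        return "marginally significant"  # 90% confidence level
    return "insignificant"               # below the 90% confidence level
```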
Questionnaires
Written questionnaires using 7-point Likert scales1 were administered after each robotics class both to the children participating in our study and to their parents or carers. This was done to obtain multiple perspectives on the children’s behaviours; a similar technique is used in the Social Skills Rating System, or SSRS [Gresham and Elliott, 1990]. The items on the children’s questionnaires asked them to describe how enjoyable each robotics class was, how often they worked with others in their group during each class, and how easy it was to work with other children in their group during each class. The items on the parents’/carers’ questionnaires asked them how much their children seemed to enjoy each class, how often their children collaborated with others during each class and how well they did so, and how sociably their children behaved outside of each robotics class. For both the parents’/carers’ and children’s questionnaires, a response of 1 meant the equivalent of “very little” and a response of 7 meant the equivalent of “very much”. The differences between children’s and parents’ responses to various questionnaire items were statistically analyzed using single-sample, two-tailed versions of Student’s t-test to determine how much the sets of responses differed from each other; two-sample versions of the test were not used because the data in each set of responses varied greatly, and we wanted to determine whether two sets of responses varied in the same ways at the same times.
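One possible reading of this analysis is a single-sample t-test applied to the per-class differences between a child’s and a parent’s ratings, tested against a mean difference of zero. The sketch below computes only the t statistic and degrees of freedom; the function name `one_sample_t` and the ratings are hypothetical, and the two-tailed p-value would still need to be looked up in a t table or obtained from a statistics library.

```python
import math
import statistics


def one_sample_t(differences, mu0=0.0):
    """Return (t statistic, degrees of freedom) for a single-sample t-test.

    `differences` holds child-minus-parent ratings, one per class
    (an assumed pairing; the source does not spell it out).
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    mean = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample standard deviation
    if sd == 0:
        raise ValueError("t is undefined when all differences are equal")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical 7-point Likert ratings for one questionnaire item.
child = [6, 7, 5, 6, 7, 4, 6]
parent = [5, 7, 6, 6, 6, 5, 6]
diffs = [c - p for c, p in zip(child, parent)]
t_stat, df = one_sample_t(diffs)
```

Working on the differences rather than on two independent samples keeps each child’s rating matched with the parent’s rating for the same class, which is what the comparison "in the same ways at the same times" calls for.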
Video Analysis
We used camcorders to record the children’s interactions during class time and taped over 41 hours of video footage. We observed some behaviours among the children which Bauminger described earlier in her studies on social interaction among children with autism [Bauminger, 2002]. Inspired by her coding scheme, we chose to code five behaviours that we felt were potentially collaborative in the context of our robotics classes:
1. group proxemics, when groupmates stood within 120 cm of each other, which Hall describes as the limit of “personal distance” in conversational interaction [Hall, 1966];
2. shared gaze, when groupmates looked at the same object or at each other;
3. robot-related speech, how many times the children talked about the robotic activities with either the experimenter or their groupmates;
4. pointing behaviour, or indicating the robots or computers to either the experimenter or groupmates by pointing at them;
5. shared positive affect, how many times the children would laugh or smile with groupmates.
By describing the above behaviours as “potentially collaborative”, we mean that we considered the children to be behaving collaboratively only if some instances of these behaviours co-occurred with other behaviours. Specifically, a child would need to exhibit one or more of the last three behaviours (robot-related speech, pointing, or shared positive affect) while they were both close to their groupmates and looking at the same object as them for us to have considered the child as collaborating. Otherwise, the observed instances in question would still be coded in our records, but they would not be considered collaborative behaviours. This was done because studies have shown that when group members are not in close proximity to each other and do not have face-to-face communication, they will have difficulty in collaborating [Kiesler and Cummings, 2002]. Furthermore, our own experiences in the robotics class showed us that the children were more apt to ignore their groupmates’ actions if they were not paying attention to them, were not close to them, or both.
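The co-occurrence rule described above can be expressed as a simple predicate over one coded observation. This is a minimal sketch: the `CodedInstance` record and its field names are hypothetical and are not taken from the coding software used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedInstance:
    """One coded moment for one child (hypothetical record layout)."""
    within_personal_distance: bool        # groupmates within 120 cm
    shared_gaze: bool                     # looking at the same object or each other
    robot_related_speech: bool = False
    pointing: bool = False
    shared_positive_affect: bool = False


def is_collaborative(obs: CodedInstance) -> bool:
    """Apply the rule: proximity AND shared gaze AND at least one of the
    last three behaviours (speech, pointing, shared positive affect)."""
    context = obs.within_personal_distance and obs.shared_gaze
    signal = (obs.robot_related_speech
              or obs.pointing
              or obs.shared_positive_affect)
    return context and signal


# Pointing while close and sharing gaze counts as collaborative ...
near_pointing = CodedInstance(True, True, pointing=True)
# ... but the same pointing from across the room is only recorded.
far_pointing = CodedInstance(False, True, pointing=True)
```

Instances that fail the predicate would still be kept in the records, as the text notes; the predicate only decides whether they count toward collaboration.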
To assess inter-rater reliability, the above behaviours were coded by one of the experimenters as well as by a second, independent rater who coded 10% of the data. When the independent rater’s video codings were compared with the experimenter’s codings to see how well they agreed, the average agreement value was 0.91, which is generally considered to be good. We also examined the above sets of codings for reliability and obtained an average value for Cohen’s kappa of κ = 0.72. This is acceptable, as a Cohen’s kappa value higher than 0.60 suggests that the agreement observed between the raters is not due to chance alone [Bakeman, 1986]2.
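For readers unfamiliar with the two reliability measures, the following sketch computes raw agreement and Cohen’s kappa for two raters who labelled the same items. The function names and the ten binary codes are hypothetical and only illustrate the calculation; they are not the study’s data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty codings of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_e is the agreement
    expected by chance from each rater's marginal code frequencies."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c]
              for c in set(counts_a) | set(counts_b)) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes: 1 = behaviour present, 0 = absent.
experimenter = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
second_rater = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
agreement = percent_agreement(experimenter, second_rater)
kappa = cohens_kappa(experimenter, second_rater)
```

Kappa is lower than raw agreement here because it discounts the matches two raters would produce by chance, which is why the two values are reported separately.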
We analyzed the above data for four different classes for each of the seven children that attended over 60% of the classes, which amounted to 25.55 hours of data: their first class, their last class (because of the voluntary nature of the classes, the last day of class was not necessarily the same day for each child), and, according to the 7-point Likert scale questionnaires they filled out, the classes with the highest and lowest values for the children’s enjoyment, that is, the children’s most and least enjoyable/fun classes, respectively (see Figure 5.1). This was to determine whether the number of classes spent interacting with the same people and robots or the amount of enjoyment from a class affected collaborative behaviour. To get as much data as possible, we did not allow the most or least enjoyable classes to overlap with the first or last classes; this also avoided the novelty effect of the first class and the influence of the last class, which was very close to the start of the vacation period. The overlap could be avoided easily, since the children whose most or least enjoyable classes overlapped with the first or last ones had multiple classes that they reported as equally most or least enjoyable. To select among these candidates, we used the parents’ ratings of class enjoyability on these days to decide which classes to analyze. This is reasonable because, as we will show in section 5.2.3, the answers on the parents’ questionnaires were not significantly different from those of their children.
2Kappa values of 0.4 - 0.6 have been characterized as fair, 0.6 - 0.75 as good, and over 0.75 as
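The selection procedure can be sketched as follows. Several details are assumptions made for illustration: the function name `select_classes`, the dictionary layout, excluding the first and last classes before ranking, and breaking ties for the least enjoyable class by the lowest parent rating. The ratings are hypothetical.

```python
def select_classes(attended, child_enjoyment, parent_enjoyment):
    """Pick the four classes to analyze for one child.

    attended: class ids in chronological order.
    child_enjoyment / parent_enjoyment: class id -> 7-point rating.
    Assumption: first and last classes are excluded up front, and ties in
    the child's ratings are broken by the parent's rating.
    """
    first, last = attended[0], attended[-1]
    middle = [c for c in attended if c not in (first, last)]
    if len(middle) < 2:
        raise ValueError("need at least two classes besides first and last")

    def pick(sign):
        # sign=+1 finds the most enjoyable class, sign=-1 the least.
        return max(middle, key=lambda c: (sign * child_enjoyment[c],
                                          sign * parent_enjoyment[c]))

    most, least = pick(+1), pick(-1)
    if most == least:
        raise ValueError("ratings do not distinguish most from least")
    return {"first": first, "last": last,
            "most_enjoyable": most, "least_enjoyable": least}


# Hypothetical ratings: classes 1, 2 and 4 tie as the child's favourite.
attended = [1, 2, 3, 4, 5, 6]
child = {1: 7, 2: 7, 3: 5, 4: 7, 5: 3, 6: 3}
parent = {1: 6, 2: 5, 3: 5, 4: 6, 5: 4, 6: 2}
selection = select_classes(attended, child, parent)
```

In this example class 1 is excluded as the first class, and the parent’s higher rating selects class 4 over class 2 as the most enjoyable.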
We used Wilcoxon’s signed-rank tests to determine the significance of differences in social behaviours on different days; we could not use paired t-tests because our paired sets of data were not from large enough populations, were not random, and were not normally distributed.
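As an illustration of the test, the sketch below computes the Wilcoxon signed-rank statistic and an exact two-sided p-value by enumerating every possible assignment of signs, which is feasible for samples as small as seven children. The function names and behaviour counts are hypothetical, and dropping zero differences is one common convention rather than something stated in the text.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def wilcoxon_signed_rank(x, y):
    """Return (W, exact two-sided p-value) for paired samples x and y.

    Assumption: pairs with a zero difference are dropped. The enumeration
    is exponential in the sample size, so it only suits small samples.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely.
    as_extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        pos = sum(r for s, r in zip(signs, ranks) if s)
        if min(pos, total - pos) <= w:
            as_extreme += 1
    return w, as_extreme / 2 ** len(ranks)


# Hypothetical counts of one coded behaviour for seven children.
first_class = [2, 4, 1, 3, 5, 2, 3]
last_class = [3, 6, 4, 7, 10, 8, 10]
w_stat, p_value = wilcoxon_signed_rank(first_class, last_class)
```

Because the test uses only the ranks of the differences, it does not rely on the normality assumption that ruled out the paired t-test.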
Semi-structured Interviews
A semi-structured interview is a data-gathering method in which an experimenter asks a series of guiding questions to steer an interview with a participant toward specific topics. However, it also allows additional questions and topics to arise naturally and be followed up during the course of the interview [Rosenthal and Rosnow, 2008]. We conducted a one-on-one, semi-structured interview with the parents/carers of each of our study’s participants after the last class and used a digital recorder to record what was said. During the interviews, we asked questions about changes in the children’s attitudes toward the robotics club, changes in the children’s collaborative/social interaction skills in different settings, and the children’s diagnoses for ASD. Because we asked guiding questions in the interviews, we used analytic induction to interpret and categorize the answers given by the interviewees.
In addition to the above analyses, we also recorded the behaviours of two of the participants, M and Sh, as a case study evaluation during ten of the fourteen total robotics classes, hereafter referred to as S1 through S10, or the S-classes. We would have observed them in more classes, but the first two of the fourteen robotics classes were too chaotic for us to gather any useful data, the fourteenth class was conducted differently as described in section 5.2.4, and neither M nor Sh attended one other class. Specifically, we coded their behaviours according to the previously described coding scheme and also described their behaviours in a more ethnographic manner.