Task Outcomes - Experiment: Four Consumer Products

Chapter 7 Experiment: Four Consumer Products

7.3 Results

7.3.4 Task Outcomes

For each task, there were three possible outcomes: (1) the task was not attempted; (2) the participant successfully completed the task; and (3) the participant failed to complete the task, either by believing they had succeeded when they had in fact failed, or by giving up while performing the task. Figure 7-19 shows the proportion of participants who attempted, succeeded at or failed the task with each of the consumer products.

Of the 16 participants who performed the clock radio task, 56% successfully completed the task, and of the 16 participants who performed the mobile phone task, 19% completed it successfully. Of the 19 participants who performed the blender task, 100% successfully completed the task, and of the 18 participants who performed the vacuum cleaner task, 100% completed it successfully. Thus the mobile phone task and the clock radio task had the highest and second-highest failure rates, respectively. This is summarised in Table 7-10.

Table 7-10 Proportion of participants performing, successfully completing and giving up during tasks

Product        | Performed Task              | Successfully Completed     | Failed/Gave Up
Clock radio    | 16 of 19 participants (84%) | 9 of 16 participants (56%) | 7 of 16 participants (44%)
Mobile         | 16 of 19 participants (84%) | 3 of 16 participants (19%) | 13 of 16 participants (81%)
Blender        | All 19 participants (100%)  | 19 participants (100%)     | 0 of 19 participants (0%)
Vacuum cleaner | 18 of 19 participants (95%) | 18 participants (95%)      | 0 of 18 participants (0%)
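The percentages in Table 7-10 follow directly from the participant counts. The short sketch below recomputes them; the counts are transcribed from the table, while the variable and function names are illustrative assumptions and not part of the study. It assumes the attempt rate is taken over all 19 participants and the success and failure rates over those who performed each task.

```python
# Counts transcribed from Table 7-10; names are illustrative only.
TOTAL_PARTICIPANTS = 19

# product -> (number who performed the task, number who completed it successfully)
counts = {
    "Clock radio": (16, 9),
    "Mobile": (16, 3),
    "Blender": (19, 19),
    "Vacuum cleaner": (18, 18),
}


def outcome_percentages(performed, succeeded, total=TOTAL_PARTICIPANTS):
    """Return (attempt %, success %, failure %) rounded to whole percentages.

    The attempt rate uses all participants as the denominator; the success
    and failure rates use only those who performed the task.
    """
    failed = performed - succeeded
    return (
        round(100 * performed / total),
        round(100 * succeeded / performed),
        round(100 * failed / performed),
    )


for product, (performed, succeeded) in counts.items():
    attempt, success, failure = outcome_percentages(performed, succeeded)
    print(f"{product}: attempted {attempt}%, succeeded {success}%, failed {failure}%")
```

Under this convention the vacuum cleaner success rate is 100% of the 18 participants who performed the task, which matches the running text; the 95% shown in the table corresponds to 18 of all 19 participants.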

Figure 7-19 Graph of proportion of participants attempting, failing and being successful with each of the four tasks

Participants 8, 15 and 17 could not perform the clock radio and mobile phone tasks due to visual problems. Of the 7 participants who failed the clock radio task, 6 gave up when they could not achieve the goal, while the remaining participant thought she had achieved the goal when she had not. All 13 participants who failed the mobile phone task gave up after trying to reach the goal.

7.3.4.1 Mean Rated Difficulty for Each Task

Mean difficulty and frustration ratings were plotted for each product and compared. Figure 7-20 shows the mean ratings for the clock radio task. For this task, the most difficult actions were cognitive actions such as figuring out how to start the task and how to move on to the next action. Seeing the text ‘PM’ on the clock radio screen was also rated as being relatively difficult. Overall mental demand was rated higher than physical demand.


Figure 7-20 Mean difficulty ratings for clock radio task (error bars 95% confidence interval)

Figure 7-21 shows the mean ratings for the mobile phone task. As with the clock radio task, participants rated cognitive actions such as figuring out how to start the task and how to move on to the next action much higher than all other actions. Overall mental demand was also rated considerably higher than overall physical demand. Seeing the text and the buttons on the phone also proved difficult for participants, and participants were fairly frustrated after performing this particular task.

Figure 7-21 Mean difficulty ratings for mobile phone task (error bars 95% confidence interval)


Figure 7-22 shows the mean ratings for the blender task. In this task, the physical actions of opening/closing the cover of the blender were rated the most difficult. Lifting and pouring from the glass jug were also difficult motor actions. Overall physical demand ratings were higher than overall mental demand ratings, but not by a significant amount.

Figure 7-22 Mean difficulty ratings for blender task (error bars 95% confidence interval)

Figure 7-23 shows the mean ratings for the vacuum cleaner task. Figuring out how to start the task was rated the most difficult action, as participants had difficulty finding the power switch on the vacuum cleaner. Figuring out the next action in the sequence of operating actions and overall mental demand were also rated as more difficult than the other actions. Among the physical aspects of the task, participants rated overall physical demand, including pushing and pulling the vacuum cleaner, as the most difficult. The vacuum cleaner also posed visual challenges to users in seeing the buttons and text on the chassis.

Across all products, the mobile phone had the highest mean ratings for difficulty in starting the task (M=78.24, SD=30.00), difficulty in working out subsequent actions (M=84.59, SD=28.96), and overall mental demand (M=76.47, SD=30.25). The mobile phone also had the highest mean rating for frustration experienced during the task (M=48.89, SD=41.82). In terms of visual demands, the small text on the clock radio display was rated the most difficult to see (M=52.37, SD=34.78), followed by seeing the numbers on buttons (M=46.05, SD=36.84) and seeing the actual buttons (M=39.47, SD=41.16) on the mobile phone.


Figure 7-23 Mean difficulty ratings for vacuum cleaner task (error bars 95% confidence interval)

The physical actions of opening (M=47.37, SD=30.25) and closing (M=38.68, SD=26.03) the blender cover and pushing the vacuum cleaner forward (M=28.06, SD=30.69) were rated as the most difficult physical actions. In terms of overall mental demand, the mobile phone ranked the highest (M=76.47, SD=30.25), followed by the clock radio (M=42.94, SD=37.54), the blender (M=28.11, SD=32.58) and the vacuum cleaner (M=27.94, SD=27.60). For mean frustration ratings, the mobile phone once again ranked the highest (M=48.89, SD=41.82), followed by the vacuum cleaner (M=28.33, SD=36.22), the clock radio (M=26.39, SD=39.91) and the blender (M=22.11, SD=36.03). The mobile phone provided the greatest cognitive challenge, while the vacuum cleaner provided the greatest physical challenge to participants.

The error bars on all four graphs of mean difficulty ratings were quite large, possibly due to the small sample size of the study. The results should therefore be interpreted as indicative only; a larger sample would be required for statistically significant results.
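To illustrate how sample size affects the width of such error bars, the sketch below computes the half-width of a 95% confidence interval for a mean from a standard deviation and a sample size. It is a minimal sketch under stated assumptions: it uses a normal approximation (a t-distribution would give somewhat wider intervals for small samples), the sample size of 16 is a hypothetical value chosen for illustration, and the function name is not taken from the study. How the error bars in the figures were actually computed is not stated here.

```python
import math
from statistics import NormalDist


def ci_half_width(sd, n, level=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # roughly 1.96 for a 95% interval
    return z * sd / math.sqrt(n)


ASSUMED_N = 16  # hypothetical number of raters, an assumption for illustration

# Standard deviation reported for the mobile phone frustration rating.
sd_frustration_mobile = 41.82

# Compare the assumed sample size with one four times larger.
for n in (ASSUMED_N, 4 * ASSUMED_N):
    half_width = ci_half_width(sd_frustration_mobile, n)
    print(f"n={n}: 95% CI half-width = {half_width:.2f}")
```

Because the half-width scales with one over the square root of n, quadrupling the sample size halves the width of the interval.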

7.3.4.2 Times and Errors

Histograms of task times (in seconds) are shown in Figure 7-24 for each product. The vacuum cleaner had the longest mean task time (M=169.94, SD=77.11), followed by the blender (M=140.89, SD=86.12), then the mobile phone (M=128.31, SD=99.59) and finally the clock radio (M=93.56, SD=58.26).


Figure 7-24 Histograms of task times for each of four products

The clock radio and mobile phone tasks had the highest failure rates, and participants in general gave up quickly after finding the task too difficult. This explains the shorter mean task times for these two products. Figure 7-25 shows histograms of task errors for each product. The mobile phone recorded the highest mean number of errors (M=3, SD=2.63), followed by the clock radio (M=1.5, SD=0.89), the blender (M=1.26, SD=1.41), and finally the vacuum cleaner (M=1.11, SD=1.32). The high frequency of one error for the clock radio task was due to participants not being able to figure out that a ‘time set’ button must be held in order to set the time, and thus resorting to trial-and-error exploration of the product interface. This acted as a barrier to successfully completing the task. Similarly, the high frequency of one error for the mobile phone task was due to some participants not being able to figure out the correct menu option once the phone menu had been entered. The random exploration that ensued culminated in task failure. Designers therefore need to be aware that interaction strategies such as ‘hold and set’ can introduce an insurmountable barrier to task completion.


Figure 7-25 Histograms of task errors for each of four products

Figure 7-26 shows scatter plots of task time versus task errors for each of the four products used in the study. For the mobile phone, blender and vacuum cleaner, the graphs show a positive relationship, i.e. task time increases as the number of errors increases. The mobile phone graph shows the strongest linear relationship of the three, with r(14)=0.726, p<0.01. This indicates that in these cases there is no ‘speed versus accuracy’ tradeoff, in which faster task times lead to more errors and vice versa. The clock radio scatter plot shows a weak, statistically non-significant negative relationship: r(14)=-0.149. Participants who made one error in this task showed a large variation in task time, which skews the graph. Given the distribution of the other points, it can be assumed that there is no ‘speed versus accuracy’ tradeoff occurring in the clock radio task, despite the negative correlation coefficient.
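The reported significance levels can be checked from the correlation coefficients alone. The sketch below converts a Pearson r with 14 degrees of freedom into a t statistic using the standard relation t = r * sqrt(df / (1 - r^2)) and compares it with two-tailed critical values from a standard t table. The function name and the hard-coded critical values are additions for illustration; this is one common way to test a correlation and is not necessarily the procedure used in the study.

```python
import math

# Two-tailed critical values of Student's t for 14 degrees of freedom,
# taken from a standard t table (alpha -> critical value).
T_CRIT_DF14 = {0.05: 2.145, 0.01: 2.977}


def t_from_r(r, df):
    """Convert a Pearson correlation r with df = n - 2 into a t statistic."""
    return r * math.sqrt(df / (1.0 - r * r))


# Correlation coefficients reported for task time versus task errors.
reported = {"Mobile phone": 0.726, "Clock radio": -0.149}

for product, r in reported.items():
    t = t_from_r(r, 14)
    significant_01 = abs(t) > T_CRIT_DF14[0.01]
    significant_05 = abs(t) > T_CRIT_DF14[0.05]
    print(
        f"{product}: r={r:+.3f}, t(14)={t:+.2f}, "
        f"p<0.01: {significant_01}, p<0.05: {significant_05}"
    )
```

The mobile phone coefficient exceeds the 0.01 critical value, consistent with the reported p<0.01, while the clock radio coefficient falls well short of the 0.05 threshold.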

Participants 11 (retired technology consultant) and 13 (retired army officer) tended to perform well in the product tasks, with relatively low task times and few errors. Participant 11 was also the only participant to succeed on all product tasks. This may be attributable to their previous experience and training with technology.


Figure 7-26 Scatter plots of task time versus errors made for the four product tasks
