The experimenter, although primarily engaged in entering the subject’s commands into the computer, nevertheless had plenty of time to note down any interesting observations. In the diagram of the experimental setup shown earlier, it can be seen that the subject faces slightly away from the experimenter, meaning that whilst the subject cannot see the experimenter’s screen, the experimenter has a full view of the subject’s screens and is thus completely aware of the current state of the carwash. In addition, directly facing the experimenter, albeit slightly obscured by his terminal, is a video monitor showing the subject’s face.
2.6.3. Tertiary Measures (for interest)
There is only one tertiary measure, and it is classified as such since it is not at all contributory to testing the stated hypotheses of the investigation. This measure is concerned with individual differences and is included for interest. It is largely taking advantage of the design of the study. (The experimental design, since it utilises the same first trial for all subjects, is able to provide data on individual differences at no extra cost.)
The Cognitive Failures Questionnaire (CFQ) was originally proposed by Broadbent et al (1982), and has subsequently been used by many others (e.g. Reason 1988). It is a set of 25 questions, each answered on a scale of zero to four, concerned with the frequency of cognitive failures. The questionnaire is reproduced in Appendix D, and the reader interested in its content is directed there.
It has been proposed as a measure of ability to do complex tasks, since it correlates well with performance on these, but not at all with performance on traditional simple
example of such complex behaviour, and if successful, the CFQ could serve in the future as an a priori indication of multitasking ability. Subjects were given the questionnaire to fill in as the final part of the experimental session.
3. Results
3.1. Level of assertions
The study was designed to allow the gathering of quantitative data. These data have been analysed and the hypotheses of interest subjected to statistical test. In many cases, it was not possible to reject the null hypothesis at the 0.05 level typically accepted in Experimental Psychology (Cognitive Ergonomics had no convention of its own and this seemed the most appropriate to use).
Multitasking behaviour is very complex. Even in a relatively simple and controlled job such as that of a carwash manager used here, there is a great deal of scope for different strategies and tactics. The variability in performance is apparent in the data. Fig 7.5 shows, for all subjects, the initial idle period (until the first car is driven in), and also the unadjusted time of driving the last car out (see Section 2.6.1. A). It can be seen from this that there is a wide range of times taken. The graph also serves another purpose. The data points are arranged according to the four conditions, and it is apparent that, broadly, the pattern of times does not differ between groups. It is thus reasonable in the rest of this chapter to compare performance across conditions. It is interesting to note that two of the three longest times for driving the last car out correspond to unusually long initial idle periods, giving support to the decision to utilise the difference between the two. The dotted line on the graph is the average of this adjusted strategy time, and it can be seen that there is very little difference between groups.
[Graph of raw time data for all groups. Vertical axis: time, in ticks (100 to 600). Horizontal axis: Experimental Group (Base + Sol, Base + PR + Sol, Base, Base + PR). Legend: ♦ time, in ticks, of last car out; X time, in ticks, of first car in; dotted line, group average time taken (last car out minus first car in).]
Fig. 7.5. Graph of the unadjusted time data for all groups.
Even given these adjustments, there is still variability, and it is believed that this is the main reason for the inability to support statistically many of the trends which, it will be argued, appear to be present. The level of assertion associated with the following results is therefore necessarily low. There are nevertheless visible trends in many of the primary measures, and it is the intention to reinforce these by considering them in combination with the secondary measures, some of which are quantitative, others of which are qualitative.
The statistical tests performed were non-parametric, largely because the fewer assumptions underlying these tests were more compatible with the above manipulations. Unless stated otherwise, tests for a difference in two populations were Mann-Whitney U tests (n1 = n2 = 8, U must be < 15 to reject H0 with p < 0.05).
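The decision rule just stated can be illustrated with a minimal Python sketch. It computes the Mann-Whitney U statistic by ranking the pooled observations and compares the smaller U with the critical value of 15 quoted above. The function names and the sample values are hypothetical, chosen only for illustration; they are not data from the study, and this is not necessarily how the original analysis was carried out.

```python
def rank_with_ties(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return the smaller of the two U statistics for samples a and b."""
    ranks = rank_with_ties(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


CRITICAL_U = 15  # value quoted in the text for n1 = n2 = 8, p < 0.05


def rejects_null(a, b, critical=CRITICAL_U):
    """Apply the rule from the text: reject H0 when U is below the critical value."""
    return mann_whitney_u(a, b) < critical


# Hypothetical relative free-time scores for two groups of eight subjects.
base_group = [1.0, 1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7]
pr_group = [2.5, 3.0, 2.8, 1.4, 3.5, 2.2, 2.9, 3.1]
print(mann_whitney_u(base_group, pr_group), rejects_null(base_group, pr_group))
```

Taking the smaller of the two U values makes the result independent of which group is passed first, which matches the way a single U is reported per comparison in the tables that follow.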
Finally, to partially overcome the problem of individual differences, it was decided to adjust an individual’s performance measures to be relative to their performance on the first trial (it will be remembered from the experimental design that this is identical for all subjects in all groups). In the following figures, which show an average of this adjusted measure, within a condition, first trial performance is therefore always represented as 1. Performance on subsequent trials is relative to this, so if, for example, the average relative improvement in time (i.e. the time taken was less) on trial 2 was 25%, then this would be represented as 0.75.
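The adjustment described above amounts to dividing each of a subject's trial scores by that subject's first-trial score and then averaging the resulting ratios within a condition. The following Python sketch shows this under the assumption that the per-subject ratios are averaged trial by trial; the function names and the numbers are hypothetical and are not taken from the study.

```python
def relative_to_first_trial(trial_values):
    """Express each trial score as a ratio of the subject's first-trial score."""
    first = trial_values[0]
    if first == 0:
        raise ValueError("first-trial score must be non-zero")
    return [value / first for value in trial_values]


def condition_average(subjects):
    """Average the relative scores across subjects, one value per trial."""
    relative = [relative_to_first_trial(scores) for scores in subjects]
    n_trials = len(relative[0])
    return [
        sum(subject[t] for subject in relative) / len(relative)
        for t in range(n_trials)
    ]


# Hypothetical strategy times (in ticks) for two subjects over two trials.
# Each subject is 25% faster on trial 2, so the average is shown as 0.75.
example_condition = [[400, 300], [200, 150]]
print(condition_average(example_condition))
```

Because every subject's first trial maps to 1, differences in absolute speed between individuals drop out, and only the change across trials contributes to the group average.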
3.2. Considering the Manipulations individually
The main question to be addressed in this section concerns whether the presence of either manipulation, or both together, had any noticeable effect. The relative effects of these manipulations will be considered later.
Each of the three experimental conditions will be considered in turn, relative to the base, control condition, starting with the PR condition, followed by the Sol condition and the combined PR + Sol condition. Each condition is presented in terms of the primary data (Strategy Time and Free Time), then the secondary data (Perceived Efficiency and Difficulty, and Direction of Gaze), and then also any other comments or observations made by the experimenter. However, to simplify this data presentation, when considering Direction of Gaze, only the information pertaining to the screens of interest will be given. This, of course, can only be appreciated in the context of the direction of gaze for the remaining time. This information is presented once, now, rather than repeated for each case below.
The video recordings of subjects were analysed for gaze in five directions. These are the MAIN screen, the PR screen, the SOL screen, the DESK, and OTHER (which includes staring at the ceiling, walls and floor etc). Fig. 7.6 shows the average proportion of time spent looking in these directions for the four conditions on trial 1. It can be seen that the greatest proportion of time is spent looking in the direction of the main screen (average 88%), with smaller proportions accounted for by the desk and ‘other’ (9% and 3% respectively). Two points in particular need to be made:
• In the conditions where the PR and Sol screens are present, any time spent looking in the direction of these screens is accounted for by a general decrease in time spent gazing in the other directions.
• The pattern of time spent looking in the three basic directions remains the same on trial 4, with no perceivable differences between conditions. Thus there is no indication of any after effect on this trial - for example looking in the direction of a screen no longer present. There is a slight shift in balance across all groups so that slightly less time is spent looking in the direction of the main screen (average 86%) and at the desk (7%) and more time spent looking elsewhere (6%). These differences are only slight.
[Histogram. Categories: MAIN screen, SOL screen, PR screen, Desk, Other.]
Fig. 7.6. Histogram showing the average proportion of time spent gazing in five directions, for all four conditions, trial 1.
3.2.1. The PR Condition
A. Primary Measures
i) Strategy Time
                      Average (S.D.)
Condition             Trial 2       Trial 3       Trial 4
Base                  0.95 (0.09)   0.92 (0.05)   0.93 (0.09)
PR                    0.92 (0.07)   0.90 (0.09)   0.94 (0.09)
U (* = significant)   32            30            31
Table 7.6. Strategy time data for Base and PR conditions.
[Graph: Strategy Time (vertical axis, 0.80 to 1.00) against trial (T1 to T4) for the Base+PR and Base conditions.]
Fig. 7.7. Graph showing the average relative strategy time over four trials for the Base and PR conditions.
Although there are no statistically supported differences in the above, there is a slight visible trend in the direction which would be expected if the PR manipulation were having its predicted effect on efficiency measured in this way, i.e. performance on trials 2 & 3 is relatively better for the PR group, compared with the Base, control, group. The apparently identical data for trial 4 adds weight to the notion that there may be such an effect, albeit a small one.
ii) Free Time
                      Average (S.D.)
Condition             Trial 2       Trial 3       Trial 4
Base                  1.10 (1.09)   1.10 (1.19)   1.79 (1.89)
PR                    2.78 (1.28)   3.23 (2.45)   2.17 (2.48)
U (* = significant)   6*            11*           29
Table 7.7. Free time data for Base and PR conditions.
[Graph: Proportion of Free Time (vertical axis, 0.00 to 5.00) against trial (T1 to T4).]
Fig. 7.8. Graph showing average relative free time (as a proportion of strategy time), over four trials for the Base and PR conditions.
Here, the data is statistically more supportive. Performance, on this measure, is better when the manipulation is present, shown as a higher proportion of free time, returning to base level on trial 4 when it is removed.
B. Secondary Measures