The studies in this thesis challenged the theoretical substantiation as well as the practical scope of the modality principle. In the previous sections I pointed out how the cognitive theories and the design principles derived from them may be further specified in order to account for the current results. However, some aspects of the results need to be corroborated through further research, and other aspects can be expanded in new directions.
First of all, I concentrated on well-established learning material to ensure maximal comparability with other studies on the modality effect (e.g. Mayer & Chandler, 2001; Mayer & Moreno, 1998; Moreno & Mayer, 1999). The multimedia instruction on the formation of lightning can surely be considered a prototypical case of the application of dynamic visualizations. However, the studies are worth replicating with other learning material. For example, replicating the studies with the materials used by Sweller and his colleagues would help to specify the impact of my findings on the current theoretical approaches. Obtaining the same results with their instructions would indicate the generalisability of the interactions of pacing and its control with text modality, and would further stress the importance of visual processes in multimedia learning.
Moreover, the method of eye tracking can be applied to different learning material. In the material used in the studies of this thesis, the patterns of viewing behavior highlighted the role of text comprehension. Concentrating on the verbal explanations appeared reasonable in the present instruction since the visualizations were fairly concrete. However, one can easily imagine more complex and/or abstract pictorial information, for example electric circuits or statistical graphs, that requires more processing resources. Similarly, text difficulty depends on the text structure and the subject matter. Furthermore, multimedia learning material can differ with respect to the referential connections between text and visualizations. It can be assumed that all these characteristics of a learning material affect the learner’s viewing behavior. The differences in viewing behavior can help to estimate the relative load of verbal explanations in comparison to visualizations and the amount of “element interactivity” (Tindall-Ford et al., 1997) across different learning materials.
There is another exciting design characteristic of multimedia instructions for which eye tracking must be employed. In order to reduce the perceptual and cognitive load caused by high element interactivity, visual cues can be used to guide visual attention to the appropriate referents (Kalyuga et al., 1999). When such design features are purposely introduced, observing the actual fixation paths makes it possible to evaluate whether these features were effective in improving attention allocation and reducing visual search.
Another aspect of eye tracking is that it yields an extensive body of data, and there are numerous ways in which those data can be analyzed. For example, in reading research viewing behavior is usually described in terms of gaze durations or even single fixations on words and the saccadic movements between these gazes or fixations (e.g. Just & Carpenter, 1980). Such analyses are accompanied by theoretical models accounting for eye movements on the same level of description (e.g. Reichle, Pollatsek, Fisher, & Rayner, 1998). These fine-grained cognitive process models can be tested by tracing the eye-movement protocol (e.g. Salvucci & Anderson, 2001). Admittedly, current theories on the integration of verbal and pictorial information are less elaborate. They may, however, be advanced to the point where they allow predictions of fixation paths based on an accurate model of the learning process.
Considering a more practical aspect, the research questions of my thesis must be successively extended to broader classes of multimedia learning material in order to estimate the practical scope of the findings. Some studies have already varied the pacing of instruction and the degree of learner control in multimedia learning, using a linearly structured website containing texts and diagrams (Tabbers, 2002). Although the material was much more complex than the instruction used in this thesis, the effects on cognitive load and learning outcomes reported in these studies are largely in accordance with the present results. Most notably, however, Tabbers found a “reversed” modality effect. With self-paced instructions, students learning from a version in which the text accompanying a diagram was presented on-screen outperformed those students who received spoken text. I deduced the possibility of a reversed modality effect from considerations based on the viewing behavior observed in system- vs. self-paced instructions. The rationale for such an effect is that reading is more amenable than listening to deliberate strategies for remembering expository text. This difference may not affect learning success with single instructions of approximately 3 minutes in length. However, written text appears superior to spoken text when the amount of displayed information is increased, as in the studies by Tabbers. The average time on task that learners spent in his studies was above 20 minutes. It appears worth examining the amount of content, and especially the amount of text, that is necessary to evoke such a reversed modality effect in self-paced instructions.
The results obtained with self-paced instructions also underline the importance of extending the research to more interactive learning environments. As supported by the studies referenced in the previous paragraph, effects that apply under stricter system-paced conditions might not hold or might have different outcomes when learners interact with the program (Tabbers, 2002). Thus, forms of interactivity other than control over the pacing should be investigated as well. For example, giving the learner the choice over the mode of text presentation (Plass, Chun, Mayer, & Leutner, 1998), or offering further navigational interaction such as stopping and replaying, allows the learner to adjust a presentation to individual preferences. Whether those interactions further promote learning, or whether some of them put an additional cognitive load on the learner, has yet to be investigated.