The final discussion presented here concerns the method of statistical analysis used for eye movement studies of reading. In 1973, Clark provided a critique of the statistical procedures used in studies of language; specifically, he proposed that, in addition to considering participants as random variables (in which it is acknowledged that the research outcomes need to be generalised beyond the individual participants to the wider population), researchers should also treat language materials as random variables. Indeed, not only do individual participants vary at a range of levels (due to factors such as genetic, developmental, environmental, social, or political
influences), but experimental stimuli such as sentences also vary on a range of levels (such as variations in words, syllables, and language). Thus, both participants and language materials should be considered as random variables within the analysis. This recommendation by Clark (1973) resulted in many studies then reporting statistical analyses (typically in the form of analysis of variance, ANOVAs) for both the participants (F1) involved within the study and the items (F2, the specific set of words or sentences) used within the study (Raaijmakers, Schrijnemakers, &
Gremmen, 1999). This was based on a widely held assumption that research findings could be generalised to both the participant population and the language as a whole if the participant and item analyses were conducted separately (Raaijmakers et al., 1999). Although this practice was not actually in line with the recommendations of Clark (1973), it became the typical format for analysing eye movement studies of reading. The original issues raised by Clark (1973) were not addressed and, in many cases, it was still unclear whether research findings could in fact be
generalised to both the participant population and the language.
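The separate F1 and F2 analyses described above can be illustrated with a minimal sketch. It assumes a small, invented set of trial-level fixation times (the participant labels, item labels, condition names, and values are hypothetical and are not data from this thesis). It shows only the aggregation step: each analysis averages over the other factor before any test is run, which is why neither analysis alone treats participants and items as random at the same time.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial-level data (not from the thesis):
# (participant, item, preview condition, fixation time in ms).
trials = [
    ("p1", "i1", "valid", 210), ("p1", "i2", "invalid", 260),
    ("p2", "i1", "invalid", 250), ("p2", "i2", "valid", 230),
    ("p3", "i1", "valid", 220), ("p3", "i2", "invalid", 280),
]

def aggregate(rows, unit_index):
    """Mean fixation time per (unit, condition), collapsing over the other factor."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[unit_index], row[2])].append(row[3])
    return {key: mean(values) for key, values in cells.items()}

# Input to a by-participants (F1) analysis: item variability is averaged away.
by_participant = aggregate(trials, 0)

# Input to a by-items (F2) analysis: participant variability is averaged away.
by_item = aggregate(trials, 1)

for key in sorted(by_item):
    print(key, by_item[key])
```

The two dictionaries would feed two separate ANOVAs, one per aggregation, so that the variability of the collapsed factor never enters either test.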
A new statistical approach, linear mixed modeling (LMM), has recently become popular within the eye movement and reading literature (Baayen, Davidson,
& Bates, 2008; Barr, Levy, Scheepers, & Tily, 2013), as it is able to address these issues raised by Clark (1973). Linear mixed models (LMMs) can include both random and fixed effects within one model, with both effects being modeled as having a linear form. Fixed effects are the variables that have been purposely manipulated and thus have an a priori theoretical motivation for statistical analysis (Pinheiro & Bates, 2000); for example, the fixed effects reported in the current thesis were the manipulations of parafoveal preview and also the reading group. In contrast, random effects are variables that typically occur when individual data points cluster together, via association with a set of entities, but are not variables that have been manipulated. For example, the random effects reported in the current thesis were participants and items (sets of stimuli), as the data points can be grouped by individual participants and individual items.
It is the inclusion of these additional random effects alongside the fixed effects that makes the linear mixed model a mixed model and a popular method within research into language. Adding the random effects into the linear model provides structure within the model error and allows for the variation in the data to be characterised. Thus, in studies of language, as reported in this thesis, both participants and items can be included as random factors within the same model, which then characterises the variation in the data that is due to individual differences in both participants and the selected stimuli. Therefore, analysis conducted using LMMs does address the concerns raised by Clark (1973), by allowing both
participants and items to be considered as random variables within the same model. This means that findings from research using LMMs can be generalised across both participants and items, which is one of the reasons that LMMs are now becoming the preferred method of analysis for researchers exploring eye movement behaviour during reading (e.g. Marx et al., 2016; Marx, Hawelka, Schuster, & Hutzler, 2017; Pagán et al., 2016; Tiffin-Richards & Schroeder, 2015).
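As a sketch of the model structure described above, a minimal LMM with one fixed effect and crossed random intercepts for participants and items can be written as follows. The notation is an illustrative assumption and is not taken from the thesis: y_{ij} is the eye movement measure for participant i reading item j, x_{ij} codes the manipulated condition, s_i and w_j are the participant and item intercepts, and ε_{ij} is the residual error. The models reported in the experimental chapters may include further fixed effects or random slopes.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + s_i + w_j + \varepsilon_{ij},
\qquad
s_i \sim \mathcal{N}(0, \sigma_s^2), \quad
w_j \sim \mathcal{N}(0, \sigma_w^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here \beta_0 and \beta_1 are the fixed effects, while \sigma_s^2 and \sigma_w^2 capture the variation attributable to participants and items respectively, both estimated within the same model.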
In addition to the ability to include both fixed and random effects within one model, LMMs are also known to accommodate instances of missing data (Gurka & Edwards, 2011; Kutner, Nachtsheim, Neter, & Li, 2005; Smith, 2012; West, Welch, & Galecki, 2007), which is not the case for more traditional methods of analysis such as the ANOVA. Indeed, missing data and unbalanced designs occur regularly when testing special populations, as data collection can be often
Furthermore, missing data occur regularly in eye movement studies of reading, particularly in boundary paradigm studies, where strict exclusion criteria are
followed and portions of the data have to be excluded to ensure accuracy. In line with a range of boundary paradigm studies (Angele & Rayner, 2011; Angele,
Slattery, Yang, Kliegl, & Rayner, 2008; Chace et al., 2005; Häikiö et al., 2010; Johnson et al., 2007; Kliegl, Risse, & Laubrock, 2007; Marx et al., 2015; Pagán et al., 2016; Pollatsek et al., 1992; Tiffin-Richards & Schroeder, 2015), strict exclusion criteria were followed to ensure that the gaze-contingent change worked efficiently and effectively in presenting a parafoveal preview independently of the foveal preview (see the experimental chapters for detail on the individual criteria). Therefore, due to the nature of this thesis (exploring eye movement behaviour for readers with dyslexia using the boundary paradigm), it was considered highly likely that there would be instances of missing data and that using LMMs to analyse the data would be beneficial. For these reasons, and in line with a range of studies in eye movements and reading (e.g. Bélanger, Mayberry, & Rayner, 2013; Hawelka et al., 2010; Kirkby et al., 2011; Marx et al., 2017; Pagán et al., 2016; Sperlich et al., 2015; Tiffin-Richards & Schroeder, 2015; Yan et al., 2013), LMMs were selected as the main method of analysis for the eye movement data.