
In the real world, data is often nested. For example, trials are nested within participants, students are nested within schools, and voters are nested within districts. Models need to account for these patterns to accurately capture real effects. Here I will focus on the case relevant for the empirical chapters, where multiple participants each complete many trials within an experiment.

Say we have a sample of 20 participants completing 500 trials each in a psychophysics experiment, and we want to find out whether self-reported confidence increases with response times. Historically, researchers would have either estimated a single regression line for all of the trials (a pooled model) or analysed each participant individually (an unpooled model). Both of these approaches are suboptimal. The pooled model is suboptimal because it conflates between- and within-participants effects. Returning to our example, suppose that slower participants tend to be more confident, but that within each participant faster trials are associated with higher confidence ratings. By pooling the data we obscure at least one of these effects, depending on which one is dominant, even though both effects are real. The problem with the unpooled model is that it is underpowered because it treats participants as entirely unrelated to one another. That is, we model the data as if the first 19 participants provide no information about how participant 20 will behave. Therefore, the unpooled approach needlessly impairs our ability to draw strong conclusions from the data.
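The contrast between the pooled and unpooled models can be illustrated with a small simulation. This is only a sketch under assumed values: the effect sizes, noise levels and variable names below are hypothetical and are not taken from the empirical chapters. It generates 20 participants with 500 trials each, where slower participants are more confident but slower trials within a participant are less confident, and then compares the pooled slope with the per-participant slopes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants, n_trials = 20, 500

# Hypothetical generative values (assumptions for illustration only):
# slower participants are more confident (between-participants effect),
# but within a participant slower trials are less confident (slope of -1.0).
mean_rt = rng.uniform(0.5, 2.5, size=n_participants)  # participant mean RT (s)
rt = mean_rt[:, None] + rng.normal(0.0, 0.3, size=(n_participants, n_trials))
confidence = (
    2.0 * mean_rt[:, None]           # between-participants effect
    - 1.0 * (rt - mean_rt[:, None])  # within-participant effect
    + rng.normal(0.0, 0.5, size=rt.shape)
)

# Pooled model: one regression line through all trials.
pooled_slope = np.polyfit(rt.ravel(), confidence.ravel(), 1)[0]

# Unpooled model: a separate regression line for each participant.
unpooled_slopes = np.array(
    [np.polyfit(rt[p], confidence[p], 1)[0] for p in range(n_participants)]
)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"mean unpooled slope: {unpooled_slopes.mean():.2f}")
```

With these assumed values the pooled slope is dominated by the between-participants effect and comes out positive, while every per-participant slope is negative, so the two approaches disagree about the direction of the relationship.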

There are three ways to account for nested data in a GLM framework: allowing the intercepts to vary, allowing the slopes to vary, or allowing both to vary. We will consider a hierarchical model with varying intercepts first.

$$\forall i \in \{1, 2, \dots, n\}: \quad Y_i = \beta_{0,i} + X_i \beta_1 + E_i$$

The model above estimates one intercept for each participant but only a single slope parameter shared by all participants (see Figure 2.1b). This addresses the risk that effects operating at different levels in the data occlude each other in the model. In terms of the example above, each participant would get their own intercept, capturing that slower respondents were more confident, but there would be only one slope, capturing that within people faster responses tended to be more confident. However, this approach is still inefficient because it assumes that each intercept is completely independent of the others. This can be addressed by adding a second level to the model, which draws the intercepts at the first level from a distribution of intercepts. The shape of this distribution will influence our results. For convenience, in this example we assume that the intercepts are drawn from a normal distribution:

$$\beta_{0,i} \sim N(\mu, \sigma^2)$$
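How this second level constrains the intercepts can be sketched for the simplest possible case. The sketch below assumes an intercept-only model with normally distributed trial noise and treats the group mean, the between-participants variance and the residual variance as known, whereas a real hierarchical fit estimates them jointly with the intercepts. The function name `partial_pool` and all numbers are hypothetical. Under those assumptions, each participant's intercept is a precision-weighted average of their own mean and the group mean.

```python
import numpy as np

def partial_pool(participant_means, n_trials, resid_var, mu, sigma2):
    """Shrink per-participant intercepts towards the group mean mu.

    Assumes normal trial noise with known variance resid_var and a known
    N(mu, sigma2) distribution of intercepts (a simplification: a full
    hierarchical model estimates mu and sigma2 from the data).
    """
    # Weight on the participant's own mean: large when sigma2 is large
    # relative to the sampling variance of that mean.
    weight = sigma2 / (sigma2 + resid_var / n_trials)
    return weight * participant_means + (1.0 - weight) * mu

means = np.array([2.0, 3.0, 3.2, 6.0])  # hypothetical unpooled intercepts
mu = means.mean()                        # stand-in for the group mean

tight = partial_pool(means, n_trials=10, resid_var=4.0, mu=mu, sigma2=0.1)
loose = partial_pool(means, n_trials=10, resid_var=4.0, mu=mu, sigma2=100.0)

print("small sigma^2:", np.round(tight, 2))
print("large sigma^2:", np.round(loose, 2))
```

When the assumed between-participants variance is small, outlying intercepts are pulled strongly towards the group mean. When it is large, the estimates stay close to the unpooled values.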

Both levels of the model are fitted simultaneously, so that if all of the participants had similar mean response times, $\sigma^2$ would be small, constraining the possible intercepts for outliers. On the other hand, if the mean response time differed a lot between participants, $\sigma^2$ would be large, so the hierarchical model would give similar results to the unpooled model. Hierarchical models have the added benefit that they can be expanded to account for predictors that operate at the higher levels of the data structure. In the context of our example, perhaps the researchers suspect that participant age influences confidence. This could be tested by adding age (with a slope parameter) to the higher level of the model:

$$\beta_{0,i} \sim N(\eta_0 + \eta_1 u_i, \sigma^2)$$

where $u_i$ is the age of participant $i$, $\eta_1$ is a slope parameter and $\eta_0$ is an intercept term. The same principle applies if we extend the model to slopes: like the intercepts above, they can be allowed to vary by participant, with each slope drawn from a distribution (Figure 2.1c). Predictors can be added at any level of the model, and additional levels can be added (e.g. perhaps we believe that gender influences confidence judgments, so we want to nest our participants within gender).
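The two-level structure with an age predictor can be made concrete by simulating from it. The sketch below is illustrative only: all parameter values are hypothetical, and the recovery step is a two-stage approximation (per-participant regressions followed by a regression of the estimated intercepts on age) rather than the simultaneous fit that a hierarchical model performs.

```python
import numpy as np

rng = np.random.default_rng(1)
n_participants, n_trials = 20, 500

# Hypothetical parameter values, chosen only for illustration.
eta0, eta1, sigma = 1.0, 0.05, 0.3  # higher level: intercept, age slope, SD
beta1, resid_sd = -1.0, 0.5         # lower level: shared RT slope, trial noise

# Higher level: each participant's intercept depends on their age u_i.
age = rng.uniform(18, 60, size=n_participants)
beta0 = rng.normal(eta0 + eta1 * age, sigma)

# Lower level: trial-wise confidence given response time.
rt = rng.normal(1.0, 0.3, size=(n_participants, n_trials))
confidence = beta0[:, None] + beta1 * rt + rng.normal(0.0, resid_sd, size=rt.shape)

# Two-stage approximation (not a simultaneous fit): estimate each
# participant's intercept, then regress those intercepts on age.
intercepts = np.array(
    [np.polyfit(rt[p], confidence[p], 1)[1] for p in range(n_participants)]
)
eta1_hat, eta0_hat = np.polyfit(age, intercepts, 1)

print(f"estimated eta0: {eta0_hat:.2f}, estimated eta1: {eta1_hat:.3f}")
```

Writing the model as a generative simulation makes explicit which quantities live at which level: age enters only through the distribution of intercepts, while response time enters only at the trial level.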

An alternative to maximum likelihood estimation for hierarchical models is Bayesian estimation. In Bayesian estimation the information in the likelihood function is combined with prior information the researcher might have about the data (represented as a distribution). This can be useful when the researcher has a lot of information that is not captured in the data set. However, in cases with large data sets and diffuse prior distributions, the likelihood tends to dominate the Bayesian computation, so that Bayesian and maximum likelihood estimates are very similar. There are three important practical advantages of the Bayesian approach: 1) As mentioned above, the prior is combined with the likelihood, which gives the researcher freedom to incorporate information from outside the data into the model in a principled way. 2) Bayesian statistics treats all unknown quantities probabilistically, so everything from parameter estimates to predictions of new observations is represented as a distribution that reflects the uncertainty in the model and the data. 3) The software tools implementing Bayesian analysis tend to be more flexible than those implementing pure maximum likelihood approaches, so it is easier for researchers to build models that are optimised for their data sets and research questions.
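The claim that the likelihood dominates with large data sets and diffuse priors can be checked in the simplest conjugate case. The sketch below assumes a normal mean with known data standard deviation and a normal prior; the function name `posterior_mean` and all numbers are hypothetical. It is not the hierarchical model itself, only an illustration of how prior and likelihood are weighted by their precisions.

```python
import numpy as np

def posterior_mean(data, data_sd, prior_mean, prior_sd):
    """Posterior mean of a normal mean with known data_sd and a normal prior."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / data_sd**2
    return (prior_precision * prior_mean + data_precision * np.mean(data)) / (
        prior_precision + data_precision
    )

rng = np.random.default_rng(2)
data = rng.normal(3.0, 1.0, size=1000)  # hypothetical confidence ratings
mle = data.mean()                        # maximum likelihood estimate

# Large data set, diffuse prior: the posterior mean is close to the MLE.
diffuse = posterior_mean(data, 1.0, prior_mean=0.0, prior_sd=100.0)

# Few observations, informative prior: the estimate is pulled towards the prior.
informative = posterior_mean(data[:5], 1.0, prior_mean=0.0, prior_sd=0.5)

print(f"MLE: {mle:.3f}, diffuse prior: {diffuse:.3f}, informative prior: {informative:.3f}")
```

With 1000 observations and a prior standard deviation of 100, the prior carries almost no weight. With five observations and a tight prior, the estimate lies between the prior mean and the sample mean.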

To summarise, the hierarchical GLM framework is both flexible and powerful. Its capacity to evaluate within-participants and between-participants effects simultaneously makes it particularly well suited to exploring questions related to metacognition, where the norm is that a small number of participants complete a large number of trials.
