An open problem when fitting an LMM is to decide upon a model for the error covariances $R_i = \sigma^2 \Lambda_i$. The simplest variant is to set $\Lambda_i = I$ for all $i = 1, \dots, n$, leading to the CIM. In practice, however, allowing for heteroscedasticity and/or correlation of the within-subject errors can improve the model fit substantially. This is indispensable in ELMs with no random effects (besides the errors) but may also be advantageous in any LMM where the random effects specification captures the within-subject associations insufficiently. In any case the goal is not to devise the correct covariance structure (whatever that means) but rather to find a viable and economical approximation. Indeed, estimating a full unstructured correlation matrix and heterogeneous variances is often self-defeating:
1. Parameters may be unestimable, especially for small datasets.
2. With increasing dimensionality (i.e., number of occasions), estimation quickly becomes a computational burden.
3. Many parameters are wasted where a much more parsimonious structural model would serve the same purpose.
This undesirable situation can be resolved by
a) checking thoroughly to what degree heteroscedasticity needs to be modeled, and
b) imposing a sparse parametric pattern on the matrix of within-subject correlations.
General considerations on the choice of variance and correlation structure are given in the following. A good strategy is probably to work out a few plausible models for the $R_i$ and “let the data choose” by means of a selection criterion (see 3.2).
To describe structural covariance modeling, we decompose the covariance matrix of the within-subject errors into a variance part $V_i$ and a correlation part $\Lambda_i$ which are independent of one another:
\[ R_i = V_i \Lambda_i V_i \]
where $V_i$ is a diagonal matrix with strictly positive entries on the main diagonal, and $\Lambda_i$ is symmetric and positive definite with all diagonal elements equal to one.
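The decomposition above can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the function names and the example values for m = 3 occasions are hypothetical and only serve to show how a covariance matrix is assembled from, and split back into, its variance and correlation parts.

```python
import numpy as np


def compose_covariance(sd, corr):
    """Return R = V @ corr @ V, where V = diag(sd)."""
    sd = np.asarray(sd, dtype=float)
    corr = np.asarray(corr, dtype=float)
    if np.any(sd <= 0):
        raise ValueError("standard deviations must be strictly positive")
    if not np.allclose(np.diag(corr), 1.0):
        raise ValueError("correlation matrix must have a unit diagonal")
    V = np.diag(sd)
    return V @ corr @ V


def decompose_covariance(R):
    """Recover the standard deviations and the correlation matrix from R."""
    R = np.asarray(R, dtype=float)
    sd = np.sqrt(np.diag(R))
    corr = R / np.outer(sd, sd)
    return sd, corr


# Hypothetical example values (not taken from any dataset).
sd = np.array([1.0, 2.0, 3.0])
corr = np.array([[1.00, 0.50, 0.25],
                 [0.50, 1.00, 0.50],
                 [0.25, 0.50, 1.00]])

R = compose_covariance(sd, corr)
sd_back, corr_back = decompose_covariance(R)
```

Because the two parts are specified separately, any variance scheme can in principle be paired with any correlation pattern.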
Variance Structures: For any subject $i$ belonging to treatment group $k$, the variance part is
\[ V_i = \begin{pmatrix} \sigma_{i1} & & \\ & \ddots & \\ & & \sigma_{im} \end{pmatrix} \]
with $\sigma_{ij}$ the square root of the variance at measurement occasion $j$. Now we obviously do not want to model subject-specific variances; patterns of practical relevance allow the variance to vary across measurement occasions $j = 1, \dots, m$ and/or treatment groups $k = 1, \dots, q$, i.e., we work with $\sigma_{jk}$. We consider four variance schemes of increasing complexity:
• Homoscedastic: $\sigma_{jk} = \sigma$
Variances are assumed constant across both treatment groups and measurement occasions. This is the simplest model but highly unrealistic in any actual longitudinal dataset.
• Heteroscedastic over time: $\sigma_{jk} = \sigma_j$
Variances changing in the course of time are a common occurrence in longitudinal studies, and it may be reasonable to assume that they are not considerably different in the treatment groups.
• Heteroscedastic over treatments: $\sigma_{jk} = \sigma_k$
Variances that are constant over time but differ from treatment to treatment are unlikely to occur in longitudinal data but may arise in other repeated measurement settings, e.g., with multiple endpoints.
• Fully heteroscedastic: $\sigma_{jk}$
Variances are allowed to vary between measurement occasions and treatment groups. Such a detailed model will be difficult to fit and to motivate with small sample sizes but may be justified for larger datasets.
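The four variance schemes can be compared by laying out the standard deviations as an m × q table, one row per occasion and one column per treatment group. The sketch below assumes NumPy; the scheme labels, function names, and numeric values are this illustration's own and do not come from the text.

```python
import numpy as np


def sd_table(scheme, m, q, params):
    """Return an (m, q) array of standard deviations sigma_jk.

    scheme is one of 'homoscedastic', 'time', 'treatment', 'full'
    (labels chosen for this sketch). params must be a scalar, a
    length-m vector, a length-q vector, or an (m, q) array, respectively.
    """
    shapes = {
        "homoscedastic": (),
        "time": (m,),
        "treatment": (q,),
        "full": (m, q),
    }
    p = np.asarray(params, dtype=float)
    if p.shape != shapes[scheme]:
        raise ValueError(f"params must have shape {shapes[scheme]} for '{scheme}'")
    if np.any(p <= 0):
        raise ValueError("standard deviations must be strictly positive")
    if scheme == "time":
        p = p[:, None]  # one value per occasion, repeated across groups
    return np.broadcast_to(p, (m, q)).copy()


def n_variance_params(scheme, m, q):
    """Number of free variance parameters under each scheme."""
    return {"homoscedastic": 1, "time": m, "treatment": q, "full": m * q}[scheme]


# Hypothetical values for m = 4 occasions and q = 2 treatment groups.
m, q = 4, 2
by_time = sd_table("time", m, q, [1.0, 1.2, 1.5, 2.0])
by_treatment = sd_table("treatment", m, q, [1.0, 1.8])
```

The parameter counts (1, m, q, and m·q) make the increase in complexity from the first to the last scheme explicit.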
A rather parsimonious strategy to model heteroscedasticity could involve a variance function of some sort, e.g., an exponential or power variance function.
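As an illustration of such variance functions, the sketch below assumes one common parametrization, in which the standard deviation is a baseline σ scaled by |v|^δ (power) or exp(δ·v) (exponential) for a variance covariate v such as time. These specific forms, the function names, and the values are assumptions of this example; the text does not commit to a particular form.

```python
import numpy as np


def power_sd(v, sigma, delta):
    """Assumed power variance function: sd(v) = sigma * |v| ** delta."""
    return sigma * np.abs(np.asarray(v, dtype=float)) ** delta


def exp_sd(v, sigma, delta):
    """Assumed exponential variance function: sd(v) = sigma * exp(delta * v)."""
    return sigma * np.exp(delta * np.asarray(v, dtype=float))


# Hypothetical measurement times used as the variance covariate.
times = np.array([1.0, 2.0, 4.0, 8.0])
sd_power = power_sd(times, sigma=1.0, delta=0.5)
sd_exp = exp_sd(times, sigma=1.0, delta=0.1)
```

Either function describes the variances at all m occasions with only two parameters (σ and δ), whereas the "heteroscedastic over time" scheme needs m.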
Correlation Structures: The second component of the residual covariance matrix $R_i$ is the correlation part $\Lambda_i$. We list here some frequently used correlation patterns and discuss their applicability to longitudinal and repeated measures settings. The matrices are exemplified for $m = 4$, and since they are symmetric, only their upper triangles are displayed. The restriction $|\rho| \le 1$ applies to all correlation parameters with or without subscripts.
• Independence (IND):
\[ \Lambda_i = I_m = \begin{pmatrix} 1 & 0 & 0 & 0 \\ & 1 & 0 & 0 \\ & & 1 & 0 \\ & & & 1 \end{pmatrix} \]
The most naive way to deal with a repeated measures situation is to flatly ignore any correlation among the time points. The assumption of independent errors (which is implicit in the standard linear model) is highly unrealistic for longitudinal or any other correlated data and will lead to grossly invalid standard errors (SEs).
• Compound symmetry (CS):
\[ \Lambda_i = \begin{pmatrix} 1 & \rho & \rho & \rho \\ & 1 & \rho & \rho \\ & & 1 & \rho \\ & & & 1 \end{pmatrix} \]
Compound symmetry requires just one parameter ρ to be estimated but on the other hand implies that all measurements are equally correlated. This is a questionable assumption for longitudinal data, where the strength of association is likely to decrease with increasing separation in time.
• First-order autoregressive (AR(1)):
\[ \Lambda_i = \begin{pmatrix} 1 & \rho & \rho^2 & \rho^3 \\ & 1 & \rho & \rho^2 \\ & & 1 & \rho \\ & & & 1 \end{pmatrix} \]
AR(1) is as parsimonious in parameters as CS but is able to reflect that correlation decreases (exponentially) with increasing time gaps between occasions. This makes it a favored pattern for longitudinal data. As a limitation, it requires that measurements are obtained at equally spaced points in time. This restriction can be overcome with the generalization to CAR(1). Higher-order autoregressive structures are conceivable but rarely realized in practice.
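• Continuous first-order autoregressive (CAR(1)):
\[ \Lambda_i = \begin{pmatrix} 1 & \rho^{|t_2 - t_1|} & \rho^{|t_3 - t_1|} & \rho^{|t_4 - t_1|} \\ & 1 & \rho^{|t_3 - t_2|} & \rho^{|t_4 - t_2|} \\ & & 1 & \rho^{|t_4 - t_3|} \\ & & & 1 \end{pmatrix} \]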
The continuous generalization of AR(1) is appropriate if the measurements are not equally spaced in time, as it takes into account the lags $|t_{j'} - t_j|$ between time points $t_j$ and $t_{j'}$, with $j, j' = 1, \dots, m$. For data with constant lags, CAR(1) is the same as AR(1).
• Toeplitz (TOEP):
\[ \Lambda_i = \begin{pmatrix} 1 & \rho_1 & \rho_2 & \rho_3 \\ & 1 & \rho_1 & \rho_2 \\ & & 1 & \rho_1 \\ & & & 1 \end{pmatrix} \]
Toeplitz structures assume that correlation of (equally spaced) occasions varies with their separation in time. Unlike with AR(1), however, there is no restriction to exponential decay. This flexibility comes at the cost of having to estimate m − 1 correlation parameters instead of just one.
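• Unstructured (UN):
\[ \Lambda_i = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} & \rho_{14} \\ & 1 & \rho_{23} & \rho_{24} \\ & & 1 & \rho_{34} \\ & & & 1 \end{pmatrix} \]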
A completely unstructured pattern will reflect the data’s correlation structure most accurately, thus minimizing the risk of misspecification. However, the absence of constraints for the matrix elements inflates the number of parameters to $\frac{m(m+1)}{2}$ (the $m(m-1)/2$ correlations plus $m$ variances).
Further correlation patterns include higher order autoregressive (e.g., AR(2)), moving average (MA), autoregressive moving average (ARMA), antedependence (ANTE), factor analytic (FA), spherical, Huynh-Feldt (HF), and various spatial structures; in addition, banding can be introduced where all entries in higher off-diagonals are set to zero. For an overview of covariance patterns for longitudinal and repeated measures designs consult, e.g., Jennrich and Schluchter (1986), Diggle et al. (1994), Wolfinger (1993), Wolfinger (1996), and Littell et al. (2006).
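The structured patterns listed above can be generated with a few lines of code, which makes it easy to inspect how each one behaves for given parameter values. The following is a minimal sketch, assuming NumPy; all function names and numeric values are hypothetical, and the CAR(1) builder assumes a non-negative ρ so that fractional powers are well defined.

```python
import numpy as np


def corr_cs(m, rho):
    """Compound symmetry: every off-diagonal entry equals rho."""
    return np.full((m, m), rho) + (1.0 - rho) * np.eye(m)


def corr_ar1(m, rho):
    """AR(1): entry (j, j') equals rho ** |j - j'|."""
    idx = np.arange(m)
    lags = np.abs(idx[:, None] - idx[None, :])
    return rho ** lags


def corr_car1(times, rho):
    """CAR(1): entry (j, j') equals rho ** |t_j - t_j'| (rho >= 0 assumed)."""
    if rho < 0:
        raise ValueError("this sketch assumes a non-negative rho for CAR(1)")
    t = np.asarray(times, dtype=float)
    lags = np.abs(t[:, None] - t[None, :])
    return rho ** lags


def corr_toeplitz(rhos):
    """Toeplitz: one free correlation per lag, rhos = (rho_1, ..., rho_{m-1})."""
    band = np.concatenate(([1.0], np.asarray(rhos, dtype=float)))
    idx = np.arange(band.size)
    return band[np.abs(idx[:, None] - idx[None, :])]


def n_corr_params(m):
    """Number of free correlation parameters per pattern."""
    return {"IND": 0, "CS": 1, "AR(1)": 1, "CAR(1)": 1,
            "TOEP": m - 1, "UN": m * (m - 1) // 2}


def is_positive_definite(mat):
    """Check positive definiteness via the eigenvalues of a symmetric matrix."""
    return bool(np.all(np.linalg.eigvalsh(mat) > 0))


# Hypothetical parameter values for m = 4 occasions.
cs = corr_cs(4, 0.3)
ar1 = corr_ar1(4, 0.5)
car1_equal = corr_car1([0, 1, 2, 3], 0.5)    # unit lags
car1_unequal = corr_car1([0, 1, 3, 7], 0.5)  # unequally spaced occasions
toep = corr_toeplitz([0.5, 0.3, 0.1])
```

With unit lags the CAR(1) matrix coincides with the AR(1) matrix, while for unequally spaced occasions the correlation between two measurements depends on their actual distance in time. A positive definiteness check such as the one above is worth running for Toeplitz patterns in particular, since freely chosen lag correlations do not always yield a valid correlation matrix.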