CHAPTER II: OPERATIONAL APPROACH
4. Experimental scheme
4.3. Design of experiments: statistical designs
As stated in Section 3.3.3, quasi-likelihood methods are potentially biased, and therefore this section considers model selection and assessment only for the maximum likelihood methods. Zuur et al. (2009) suggested a top-down strategy to fit mixed-effects models, and their general guidance for this strategy is summarized by the following steps:
3.3 Mixed-effects logistic regression model 67
• step 1: start with a model containing as many fixed-effects explanatory variables as indicated by the data and potential interactions, which is referred to as a ‘beyond-optimal’ model (compare to the final selected ‘optimal’ model);
• step 2: decide the structure of the random effects based on the beyond-optimal model from step 1;
• step 3: given the random-effects structure decided in step 2, select the fixed-effects predictors; and
• step 4: assess the selected model, with the random-effects variance structure from step 2 and the fixed-effects predictors from step 3, in terms of its goodness-of-fit, the normality assumption for the random effects, and other related aspects.
Note that the above steps are only a general protocol for fitting mixed-effects models. They usually need to be adjusted according to the observed data and, if prediction is the practical objective of the statistical analysis, according to the intended predictive inference. The statistical tools and issues that arise within each step are discussed further below:
• In step 1: it should be feasible to fit the beyond-optimal model with the chosen statistical software. For a large data set with many explanatory variables and potential interactions, plotting the data usually gives a good idea of which predictors contribute to explaining the variation in the responses.
• In step 2: given the beyond-optimal model fitted in step 1, it is expected that all the fixed-effects predictors contribute to explaining the variation of the responses, so that the random-effects coefficients do not contain any information from the fixed-effects predictors. The problem then becomes that of testing whether or not a random-effects variance component is significant. Likelihood ratio tests can be applied here, but in their standard form they are not appropriate for testing the standard deviations of random coefficients. To explain this, let $\sigma_0^2$ denote the variance of a random intercept. Then the null hypothesis is $H_0 : \sigma_0^2 = 0$ for the simpler model against the alternative hypothesis $H_1 : \sigma_0^2 > 0$ for the more complicated model. In this case, the value of $\sigma_0^2$ under the null hypothesis of the likelihood ratio test lies on the boundary of the parameter space, because a variance cannot be negative.
68 Chapter 3. Logistic regression models with application to anglerfish
Because the variance is constrained to be non-negative, the test is referred to as a constrained likelihood ratio test. For constrained likelihood ratio tests, the distribution of the test statistic under the null hypothesis is no longer a chi-squared distribution; instead, it is a mixture of chi-squared distributions whose form depends on the specific case (see Self & Liang, 1987; Molenberghs & Verbeke, 2007, for more discussion). Pinheiro & Bates (2000) pointed out that likelihood ratio tests on random-effects variances are conservative and that, in the case of a single random-effects standard deviation, the p-value of the likelihood ratio test is approximately twice as large as it should be. Unlike the constrained case for testing random-effects variances, the p-values are correct for testing random-effects correlations using the likelihood ratio test.
The estimate of a random-effects variance may be zero even when the true variance is not zero. Variance estimates that are very small, or correlation estimates whose absolute values are very close to one, indicate that the assumed random-effects variance structure either cannot be identified or is over-fitted given the observed data set. This can be caused by a lack of information in the data for fitting the intended model with a complex random-effects variance structure.
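The halving of the p-value described in step 2 for a single variance component can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original analysis: it assumes the commonly used 50:50 mixture of a point mass at zero and a chi-squared distribution with one degree of freedom as the null distribution, and the function names and log-likelihood values are hypothetical.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 df."""
    if x <= 0.0:
        return 1.0
    # For 1 df, X = Z**2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def variance_lrt_pvalues(loglik_null: float, loglik_alt: float):
    """Return (statistic, naive p-value, mixture p-value) for H0: sigma_0^2 = 0.

    Assumption: a single variance component is tested, so the null
    distribution is taken to be 0.5 * chi2_0 + 0.5 * chi2_1.
    """
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    naive = chi2_sf_1df(stat)
    # The point mass at zero never exceeds a positive statistic, so only the
    # chi2_1 half of the mixture contributes to the tail probability.
    mixture = 0.5 * naive if stat > 0.0 else 1.0
    return stat, naive, mixture


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods without / with a random intercept.
    stat, naive, mixture = variance_lrt_pvalues(-120.0, -118.0)
    print(f"LRT statistic                = {stat:.2f}")
    print(f"naive chi-squared(1) p-value = {naive:.4f}")
    print(f"mixture p-value              = {mixture:.4f}")
```

Under this assumption the naive p-value is exactly twice the mixture p-value whenever the statistic is positive, which matches the factor of two noted by Pinheiro & Bates (2000). With more than one variance component the mixture weights change, so this sketch should not be reused unchanged in that case.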
• In step 3: once the random-effects structure is decided in step 2, the next step is to select the fixed-effects predictors. Wald tests are usually used for this purpose in fixed-effects models. For mixed-effects models, however, the p-value for the test of $H_0 : \beta_s = 0$ versus $H_1 : \beta_s \neq 0$ is not as straightforward as in fixed-effects models. First, the test statistic does not have a t-distribution under the null hypothesis, because the independence of observations within each group is no longer assumed. Second, the denominator degrees of freedom, which penalize uncertainty, are unknown for mixed-effects models, as the degrees of freedom for a random-effects parameter can be counted as 1, or as some value between 1 and the total number of level-two units (see Hodges & Sargent, 2001, for more discussion).
Therefore, for testing the significance of a particular explanatory variable, the likelihood ratio test is suggested for comparing two nested models with the same random-effects variance structure: one with the predictor of interest and the other without. It is important to note that the corresponding p-value is only a guide to the significance of the particular predictor. If the research question is about the significance of this predictor, then it is strongly suggested that Monte Carlo based methods or the parametric bootstrap be used for drawing a conclusion about the effect of this predictor.
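The mechanics of the parametric bootstrap suggested in step 3 can be sketched as follows. To keep the example self-contained, it uses a deliberately simple stand-in: a fixed-effects comparison of a common success probability against group-specific probabilities for binomial counts. In a real application the hypothetical functions `lrt_statistic` and `simulate_under_null` would be replaced by fitting and simulating from the two nested mixed-effects models; all names and data here are illustrative assumptions.

```python
import math
import random


def binom_loglik(successes: int, trials: int, p: float) -> float:
    """Binomial log-likelihood without the constant term (safe at p = 0 or 1)."""
    loglik = 0.0
    if successes > 0:
        loglik += successes * math.log(p)
    if trials - successes > 0:
        loglik += (trials - successes) * math.log(1.0 - p)
    return loglik


def pooled_probability(data) -> float:
    """MLE of the common success probability under the null model."""
    return sum(s for s, _ in data) / sum(n for _, n in data)


def lrt_statistic(data) -> float:
    """LRT statistic: common probability (null) vs one probability per group."""
    p0 = pooled_probability(data)
    ll_null = sum(binom_loglik(s, n, p0) for s, n in data)
    ll_alt = sum(binom_loglik(s, n, s / n) for s, n in data)
    return max(0.0, 2.0 * (ll_alt - ll_null))


def simulate_under_null(data, p0: float, rng: random.Random):
    """Draw new success counts from the fitted null model, keeping the trials."""
    return [(sum(rng.random() < p0 for _ in range(n)), n) for _, n in data]


def bootstrap_lrt_pvalue(data, n_boot: int = 999, seed: int = 1):
    """Return (observed statistic, parametric bootstrap p-value)."""
    rng = random.Random(seed)
    observed = lrt_statistic(data)
    p0 = pooled_probability(data)
    exceed = sum(
        lrt_statistic(simulate_under_null(data, p0, rng)) >= observed
        for _ in range(n_boot)
    )
    # Adding 1 to numerator and denominator counts the observed data set
    # as one draw from the null distribution.
    return observed, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Hypothetical (successes, trials) for two groups.
    observed, p_value = bootstrap_lrt_pvalue([(12, 40), (25, 40)])
    print(f"observed LRT = {observed:.3f}, bootstrap p-value = {p_value:.4f}")
```

The key design point is that the reference distribution of the statistic is generated by refitting both models to data simulated from the fitted null model, so no assumption about a chi-squared or t reference distribution is needed.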
• Using AIC in steps 2 and 3:
AIC is widely used for model selection in terms of the relative goodness of fit of a model. It is defined as minus twice the maximized log-likelihood, penalized by twice the degrees of freedom of the fitted model. Such a definition raises two issues when using AIC for mixed-effects models.
– First, the likelihood for random-effects logistic regression models is approximated by numerical integration methods, such as the Laplace approximation or adaptive Gaussian quadrature. The latter can be thought of as a higher-order Laplace approximation in the case of multiple quadrature points. Therefore, if the fitted models being compared are not approximated to the same order in the numerical integration, then we cannot be sure whether the difference in their likelihoods is caused by the different model structure or by the different accuracy level of the numerical integration. It is important to note that the AIC of a fixed-effects model is not commensurate with the AIC of a corresponding random-effects model with the same fixed-effects component. Taking a random-intercept mixed-effects model as an example, the AIC of the random-intercept model should not be compared with that of a fixed-effects model (without the random intercept) when deciding the significance of the random intercept. Similarly, it is not advisable to compare the AICs of mixed-effects models fitted by adaptive Gaussian quadrature with different numbers of quadrature points.
– Second, counting the degrees of freedom for random-effects variance parameters is another issue when using AIC for mixed-effects models. Some adjusted forms of AIC have been proposed for mixed-effects models, such as the marginal AIC and the conditional AIC, but these developments consider only linear mixed-effects models. Florin & Blanchard (2005) proposed a conditional AIC to compare linear mixed-effects models with different random-effects structures, which can be viewed as a finite-size correction for AIC. Greven & Kneib (2010) carried out a simulation study of both the marginal AIC and the conditional AIC for linear mixed-effects models.
More importantly, model selection for mixed-effects models should be considered in the light of the final inferences or predictions. If further inferences, such as prediction, are of interest only at the population level, then the marginal AIC is suggested; if they are of interest for a particular group or cluster, then the conditional AIC is recommended (see Greven & Kneib (2010) for more discussion). More specifically, if the statistical inferences are based on the mode of the random-effects coefficient $b_i$ for the $i$th group, then the degrees of freedom of the random effects should be counted as the total number of groups, and model selection should be based on the conditional AIC. On the other hand, if the statistical inferences are based on the estimated distribution of $b_i$, equivalently $\Sigma_b$, then the degrees of freedom of the random-effects parameters should be counted as the number of parameters in $\Sigma_b$, and model selection should be based on the marginal AIC.
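The first issue, that the approximate likelihood depends on the order of the numerical integration, can be illustrated with a small sketch. It evaluates the marginal log-likelihood of a single cluster in a random-intercept logistic model with 1, 3 and 5 Gauss-Hermite nodes. Two simplifications are assumed here: the quadrature is ordinary rather than adaptive (an adaptive rule would recentre and rescale the nodes at the conditional mode), and the cluster data, linear predictor and standard deviation are hypothetical values.

```python
import math

SQRT_PI = math.sqrt(math.pi)

# Gauss-Hermite (node, weight) pairs for the weight function exp(-x**2).
GH_RULES = {
    1: [(0.0, SQRT_PI)],
    3: [(-math.sqrt(1.5), SQRT_PI / 6.0),
        (0.0, 2.0 * SQRT_PI / 3.0),
        (math.sqrt(1.5), SQRT_PI / 6.0)],
    5: [(-2.0201828704560856, 0.019953242059045913),
        (-0.9585724646138185, 0.3936193231522412),
        (0.0, 0.9453087204829419),
        (0.9585724646138185, 0.3936193231522412),
        (2.0201828704560856, 0.019953242059045913)],
}


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def cluster_loglik(successes: int, trials: int, eta: float,
                   sigma: float, n_nodes: int) -> float:
    """Approximate marginal log-likelihood of one cluster.

    The random intercept b ~ N(0, sigma**2) is integrated out with an
    n_nodes-point Gauss-Hermite rule; the binomial coefficient is omitted.
    """
    total = 0.0
    for x, w in GH_RULES[n_nodes]:
        b = math.sqrt(2.0) * sigma * x  # change of variable for N(0, sigma**2)
        p = expit(eta + b)
        total += w * p ** successes * (1.0 - p) ** (trials - successes)
    return math.log(total / SQRT_PI)


if __name__ == "__main__":
    # Hypothetical cluster: 7 successes in 10 trials, eta = 0, sigma = 1.
    for n_nodes in (1, 3, 5):
        ll = cluster_loglik(7, 10, eta=0.0, sigma=1.0, n_nodes=n_nodes)
        print(f"{n_nodes} node(s): log-likelihood = {ll:.4f}, "
              f"-2 log-likelihood = {-2.0 * ll:.4f}")
```

The model and data are identical in all three evaluations, so any difference in the printed values, and hence in AIC, comes only from the accuracy of the integration. This is why models should be compared only when their likelihoods are approximated to the same order.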
Consider the anglerfish application, incorporating haul as a random effect with 36 levels (i.e., 36 individual hauls in the data) and fitting a random-intercept logistic regression model. The haul effect is then taken into account at a cost of one parameter (the variance of the haul-specific random intercept). For fixed-effects models, however, the cost of incorporating the haul effect is 35 parameters. This shows that, when comparing the random-effects model with fixed-effects models, it is not appropriate to count the degrees of freedom for the random-effects variance parameter as 1, as this would probably understate the degrees of freedom.
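The size of the discrepancy in the anglerfish example can be made concrete with a short arithmetic sketch. It uses the standard definition AIC = -2 log-likelihood + 2 (number of parameters); the number of other model parameters is a hypothetical value chosen only for illustration.

```python
def aic(loglik: float, n_params: int) -> float:
    """Standard AIC: minus twice the log-likelihood plus twice the parameter count."""
    return -2.0 * loglik + 2.0 * n_params


def haul_penalties(n_hauls: int, n_other_params: int):
    """AIC penalty terms for two ways of including the haul effect.

    Returns (penalty with a random intercept counted as one variance
    parameter, penalty with haul as a fixed factor with n_hauls - 1 contrasts).
    """
    random_intercept = 2 * (n_other_params + 1)
    fixed_factor = 2 * (n_other_params + n_hauls - 1)
    return random_intercept, fixed_factor


if __name__ == "__main__":
    # 36 hauls as in the anglerfish data; 3 other parameters is hypothetical.
    pen_random, pen_fixed = haul_penalties(n_hauls=36, n_other_params=3)
    print(f"penalty, random intercept counted as 1 df: {pen_random}")
    print(f"penalty, haul as a fixed factor:           {pen_fixed}")
    print(f"difference in penalty:                     {pen_fixed - pen_random}")
```

With 36 hauls the two penalties differ by 2 × (35 − 1) = 68 AIC units regardless of the other parameters, which is large enough to dominate most differences in fit.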
• In step 4: for the final selected model from step 3, it is sensible to check this model in an absolute sense, e.g. using a goodness-of-fit test, checking the normality assumption of the random effects, and looking for over-dispersion. Graphical tools can be very useful for understanding the fitted model. The normality assumption of the random-effects distribution can be checked by plotting the conditional modes of the random effects, which can be thought of as the MLE of $b$ obtained in the iterative process of the P-IRLS algorithm (i.e., the $\tilde{b}(\beta, \Sigma_b)$ described in Section 3.3.2). When the number of groups or clusters is large, the normality assumption for the random effects is usually reasonable. However, when the number of groups is small, the normality assumption could be problematic. This assumption can be relaxed by using more complicated models, e.g. a mixture of normal distributions for the random effects; see Komárek & Lesaffre (2008) for an example.
The Hosmer-Lemeshow goodness-of-fit test described in Section 3.1.5 can be extended to mixed-effects logistic regression models, with $\hat{p}$ in step 1 calculated as
$$\hat{p}_{ij} = \int \cdots \int_{\mathbb{R}^m} p_{ij|b_i} \, \hat{f}_b(b_i) \, db_i, \qquad (3.65)$$
where $p_{ij|b_i}$ is the conditional success probability given in (3.47) and $\hat{f}_b(b_i)$ is the estimated density of the random effects $b_i$. Based on these $\hat{p}_{ij}$, the expected counts $E_{i\delta}$ in (3.30) are then calculated for mixed-effects logistic regression.
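For a random-intercept model ($m = 1$), the integral in (3.65) is one-dimensional and can be approximated numerically. The sketch below is an illustration under stated assumptions, not the implementation used in this chapter: it assumes a normal random intercept with standard deviation `sigma`, a conditional success probability of the form expit(eta + b), a fixed 5-point Gauss-Hermite rule, and the usual Hosmer-Lemeshow grouping by sorted fitted probabilities (the grouping in (3.30) may be defined differently). All function names and inputs are hypothetical.

```python
import math

SQRT_PI = math.sqrt(math.pi)

# 5-point Gauss-Hermite (node, weight) pairs for the weight function exp(-x**2).
GH5 = [
    (-2.0201828704560856, 0.019953242059045913),
    (-0.9585724646138185, 0.3936193231522412),
    (0.0, 0.9453087204829419),
    (0.9585724646138185, 0.3936193231522412),
    (2.0201828704560856, 0.019953242059045913),
]


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta: float, sigma: float) -> float:
    """Approximate the integral of expit(eta + b) over b ~ N(0, sigma**2)."""
    total = sum(w * expit(eta + math.sqrt(2.0) * sigma * x) for x, w in GH5)
    return total / SQRT_PI


def expected_counts(probabilities, n_groups: int):
    """Group sorted probabilities and return (expected successes, expected failures)."""
    ordered = sorted(probabilities)
    n = len(ordered)
    counts = []
    for g in range(n_groups):
        chunk = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        exp_success = sum(chunk)
        counts.append((exp_success, len(chunk) - exp_success))
    return counts


if __name__ == "__main__":
    # Hypothetical fixed-effects linear predictors and random-intercept sd.
    etas = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
    p_hat = [marginal_probability(eta, sigma=1.0) for eta in etas]
    for group, (e1, e0) in enumerate(expected_counts(p_hat, n_groups=2), start=1):
        print(f"group {group}: expected successes = {e1:.3f}, "
              f"expected failures = {e0:.3f}")
```

Note that the marginal probability is pulled towards 0.5 relative to expit(eta), so plugging the conditional probability at $b_i = 0$ into the Hosmer-Lemeshow expected counts would not be equivalent to using $\hat{p}_{ij}$.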