Tests of hypotheses: variance parameters

Inference concerning variance parameters of a linear mixed effects model usually relies on approximate distributions for the (RE)ML estimates derived from asymptotic results.

It can be shown that the approximate variance matrix for the REML estimates is given by the inverse of the expected information matrix (Cox and Hinkley, 1974, section 4.8). Since this matrix is not available in ASReml, we replace the expected information matrix by the AI matrix. Furthermore, the REML estimates are consistent and asymptotically normal, though in small samples this approximation appears to be unreliable (see later).

A general method for comparing the fit of nested models fitted by REML is the REML likelihood ratio test, or REMLRT. The REMLRT is only valid if the fixed effects are the same for both models. In ASReml this requires not only the same fixed effects model, but also the same parameterisation.

If $\ell_{R2}$ is the REML log-likelihood of the more general model and $\ell_{R1}$ is the REML log-likelihood of the restricted model (that is, the REML log-likelihood under the null hypothesis), then the REMLRT is given by

$$D = 2\log(\ell_{R2}/\ell_{R1}) = 2\,[\log(\ell_{R2}) - \log(\ell_{R1})] \qquad (2.14)$$

which is strictly positive. If $r_i$ is the number of parameters estimated in model $i$, then the asymptotic distribution of the REMLRT under the restricted model is $\chi^2_{r_2 - r_1}$.
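As a minimal sketch of the test just described, the following Python snippet computes the REMLRT statistic and its asymptotic P-value. It assumes the two inputs are the REML log-likelihood values of the two fits, so that D is simply twice their difference. The function names `chi2_sf` and `remlrt` and the numerical values are hypothetical illustrations, not part of ASReml.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability Pr(chi2_df > x) for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then step up two df at a time.
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def remlrt(logl_general, logl_restricted, r_general, r_restricted):
    """Return (D, degrees of freedom, P-value) for two nested REML fits.

    Both fits are assumed to share the same fixed effects model and
    parameterisation; the inputs are REML log-likelihood values.
    """
    d = 2.0 * (logl_general - logl_restricted)
    df = r_general - r_restricted
    return d, df, chi2_sf(d, df)

# Hypothetical log-likelihoods and parameter counts.
d, df, p = remlrt(-1200.4, -1204.9, 5, 3)
print(f"D = {d:.2f} on {df} df, P = {p:.4f}")
```

This P-value applies only when the null hypothesis does not place a parameter on the boundary of the parameter space.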

The REMLRT is implicitly two-sided, and must be adjusted when the test involves a hypothesis with the parameter on the boundary of the parameter space. It can be shown that for a single variance component, the theoretical asymptotic distribution of the REMLRT is a mixture of $\chi^2$ variates, where the mixing probabilities are 0.5, one with 0 degrees of freedom (spike at 0) and the other with 1 degree of freedom. The approximate P-value for the REMLRT statistic ($D$) is $0.5\,(1 - \Pr(\chi^2_1 \le d))$, where $d$ is the observed value of $D$. This has a 5% critical value of 2.71, in contrast to the 3.84 critical value for a $\chi^2$ variate with 1 degree of freedom. The distribution of the REMLRT for the test that $k$ variance components are zero, or tests involved in random regressions, which involve both variance and covariance components, involves a mixture of $\chi^2$ variates from 0 to $k$ degrees of freedom.

2 Some theory 18
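The boundary adjustment for a single variance component can be illustrated with a short Python sketch. It evaluates the 50:50 mixture P-value, 0.5(1 − Pr(χ²₁ ≤ d)), using the identity Pr(χ²₁ > d) = erfc(√(d/2)). The function name `boundary_pvalue` is a hypothetical helper, and clamping negative d to zero is an assumption of this sketch.

```python
import math

def boundary_pvalue(d):
    """Approximate P-value for testing one variance component equal to zero.

    Implements 0.5 * (1 - Pr(chi2_1 <= d)). Negative d is clamped to 0,
    which is an assumption of this sketch.
    """
    d = max(d, 0.0)
    return 0.5 * math.erfc(math.sqrt(d / 2.0))

def unadjusted_pvalue(d):
    """Ordinary chi-square P-value on 1 degree of freedom, for comparison."""
    return math.erfc(math.sqrt(max(d, 0.0) / 2.0))

for d in (2.71, 3.84):
    print(d, round(boundary_pvalue(d), 4), round(unadjusted_pvalue(d), 4))
```

The adjusted P-value is exactly half the unadjusted one, which is why the 5% critical value drops from 3.84 to 2.71.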

Tests concerning variance components in generally balanced designs, such as the balanced one-way classification, can be derived from the usual analysis of variance. It can be shown that the REMLRT for a variance component being zero is a monotone function of the F statistic for the associated term.

To compare two (or more) non-nested models we can evaluate the Akaike Information Criteria (AIC) or the Bayesian Information Criteria (BIC) for each model. These are given by

$$\mathrm{AIC} = -2\ell_{Ri} + 2t_i$$

$$\mathrm{BIC} = -2\ell_{Ri} + t_i \log \nu \qquad (2.15)$$

where $t_i$ is the number of variance parameters in model $i$ and $\nu = n - p$ is the residual degrees of freedom. AIC and BIC are calculated for each model and the model with the smallest value is chosen as the preferred model.
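The two criteria are straightforward to compute by hand. The sketch below assumes each model is summarised by its REML log-likelihood and its number of variance parameters; the model names, log-likelihoods, n and p are hypothetical values chosen for illustration.

```python
import math

def information_criteria(logl, t, n, p):
    """Return (AIC, BIC) from a REML log-likelihood.

    logl: REML log-likelihood of the model
    t:    number of variance parameters
    n, p: number of observations and of fixed-effect parameters,
          giving residual degrees of freedom nu = n - p
    """
    nu = n - p
    aic = -2.0 * logl + 2.0 * t
    bic = -2.0 * logl + t * math.log(nu)
    return aic, bic

# Hypothetical fits: (REML log-likelihood, number of variance parameters).
models = {"model_a": (-1204.9, 3), "model_b": (-1201.7, 5)}
n, p = 240, 4

scores = {name: information_criteria(logl, t, n, p)
          for name, (logl, t) in models.items()}
best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])
print(scores, best_aic, best_bic)
```

With these made-up numbers the two criteria disagree, because BIC penalises the extra variance parameters more heavily than AIC whenever log ν exceeds 2.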

Diagnostics

In this section we will briefly review some of the diagnostics that have been implemented in ASReml for examining the adequacy of the assumed variance matrix for either R or G structures, or for examining the distributional assumptions regarding e or u. Firstly we note that the BLUP of the residual vector is given by

$$\tilde{e} = y - W\tilde{\beta} = RPy \qquad (2.16)$$

It follows that

$$E(\tilde{e}) = 0, \qquad \operatorname{var}(\tilde{e}) = R - W C^{-1} W'$$

The matrix $\theta W C^{-1} W'$ is the so-called 'extended hat' matrix. It is the linear mixed effects model analogue of $\sigma^2 X(X'X)^{-1}X'$ for ordinary linear models. The diagonal elements are returned in the fourth field of the .yht file.

Outliers

The !OUTLIER qualifier invokes a partial implementation of research by Alison Smith, Ari Verbyla and Brian Cullis. With this qualifier, ASReml writes $G^{-1}u$ and $G^{-1}u / \sqrt{\operatorname{diag}(G^{-1} - G^{-1}C^{ZZ}G^{-1})}$ to the .sln file, $R^{-1}e$ and $R^{-1}e / \sqrt{\operatorname{diag}(\ldots)}$ …, and copies lines where the last ratio exceeds 3 in magnitude to the .res file and reports the number of such lines to the .asr file.

It is not debugged for multivariate models or XFA models with zero Ψs.

Variogram

The variogram has been suggested as a useful diagnostic for assisting with the identification of appropriate variance models for spatial data (Cressie, 1991). Gilmour et al. (1997) demonstrate its usefulness for the identification of the sources of variation in the analysis of field experiments. If the elements of the data vector (and hence the residual vector) are indexed by a vector of spatial coordinates, $s_i$, $i = 1, \ldots, n$, then the ordinates of the sample variogram are given by

$$v_{ij} = \tfrac{1}{2}\,[\tilde{e}_i(s_i) - \tilde{e}_j(s_j)]^2, \qquad i, j = 1, \ldots, n;\ i \ne j$$
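The ordinates are half the squared difference between each pair of residuals. A minimal Python sketch follows, assuming the residuals are held in a plain list; the function name `variogram_ordinates` is hypothetical. Because the ordinate for (i, j) equals that for (j, i), only pairs with i < j are stored.

```python
from itertools import combinations

def variogram_ordinates(residuals):
    """Return {(i, j): v_ij} for i < j, with v_ij = 0.5 * (e_i - e_j)**2."""
    return {
        (i, j): 0.5 * (residuals[i] - residuals[j]) ** 2
        for i, j in combinations(range(len(residuals)), 2)
    }

# Three hypothetical residuals.
ordinates = variogram_ordinates([1.0, 3.0, 0.0])
print(ordinates)
```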

The sample variogram reported by ASReml has two forms depending on whether the spatial coordinates represent a complete rectangular lattice (as typical of a field trial) or not. In the lattice case, the sample variogram is calculated from the triple $(l_{ij1}, l_{ij2}, v_{ij})$ where $l_{ij1} = s_{i1} - s_{j1}$ and $l_{ij2} = s_{i2} - s_{j2}$ are the displacements.

As there will be many $v_{ij}$ with the same displacements, ASReml calculates the means for each displacement pair $(l_{ij1}, l_{ij2})$ either ignoring the signs (default) or separately for same sign and opposite sign (!TWOWAY), after grouping the larger displacements: 9-10, 11-14, 15-20, .... The result is displayed as a perspective plot (see page 238) of the one or two surfaces indexed by absolute displacement group. In this case, the two directions may be on different scales.
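The default lattice calculation (ignoring signs) can be sketched in Python as follows. This is an illustration under stated assumptions, not ASReml's implementation: coordinates are (row, column) tuples, the function name `lattice_variogram` is hypothetical, and the grouping of the larger displacements is omitted because the full grouping rule is not given here.

```python
from collections import defaultdict
from itertools import combinations

def lattice_variogram(coords, residuals):
    """Mean variogram ordinate for each absolute displacement pair.

    coords:    list of (row, column) positions on a rectangular lattice
    residuals: list of residuals in the same order as coords
    Returns {(|l1|, |l2|): mean v_ij}. Larger displacements are not grouped.
    """
    sums = defaultdict(lambda: [0.0, 0])  # lag -> [running total, count]
    for i, j in combinations(range(len(residuals)), 2):
        l1 = abs(coords[i][0] - coords[j][0])
        l2 = abs(coords[i][1] - coords[j][1])
        v = 0.5 * (residuals[i] - residuals[j]) ** 2
        cell = sums[(l1, l2)]
        cell[0] += v
        cell[1] += 1
    return {lag: total / count for lag, (total, count) in sums.items()}

# Hypothetical 2 x 2 lattice.
coords = [(1, 1), (1, 2), (2, 1), (2, 2)]
means = lattice_variogram(coords, [0.0, 2.0, 4.0, 6.0])
print(means)
```

A !TWOWAY-style variant would keep two accumulators per lag, one for pairs whose displacements share a sign and one for pairs with opposite signs.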

Otherwise ASReml forms a variogram based on polar coordinates. It calculates the distance between points, $d_{ij} = \sqrt{l_{ij1}^2 + l_{ij2}^2}$, and an angle $\theta_{ij}$ ($-180 < \theta_{ij} < 180$) subtended by the line from $(0,0)$ to $(l_{ij1}, l_{ij2})$ with the x-axis. The angle can be calculated as $\theta_{ij} = \tan^{-1}(l_{ij1}/l_{ij2})$, choosing $0 < \theta_{ij} < 180$ if $l_{ij2} > 0$ and $-180 < \theta_{ij} < 0$ if $l_{ij2} < 0$. Note that the variogram has angular symmetry in that $v_{ij} = v_{ji}$, $d_{ij} = d_{ji}$ and $|\theta_{ij} - \theta_{ji}| = 180$. The variogram presented averages the $v_{ij}$ within 12 distance classes and 4, 6 or 8 sectors (selected using a !VGSECTORS qualifier) centred on an angle of $(i-1)180/s$ ($i = 1, \ldots, s$). A figure is produced which reports the trends in $\bar{v}_{ij}$ with increasing distance for each sector.
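A rough Python sketch of the polar form is given below. Several choices are assumptions of the sketch rather than documented ASReml behaviour: the angle is obtained with `math.atan2`, taking the second displacement as the vertical component so that the sign convention above holds; angles are folded into [0, 180) using the angular symmetry; and the 12 distance classes are taken to be of equal width up to the largest observed distance. The function name `polar_variogram` is hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def polar_variogram(coords, residuals, n_sectors=4, n_classes=12):
    """Mean variogram ordinate by (sector, distance class).

    Sectors are centred on (i - 1) * 180 / n_sectors degrees. Distance
    classes are equal-width bins up to the maximum distance (an assumption).
    """
    pairs = []
    for i, j in combinations(range(len(residuals)), 2):
        l1 = coords[i][0] - coords[j][0]
        l2 = coords[i][1] - coords[j][1]
        dist = math.hypot(l1, l2)
        # Fold the angle into [0, 180): pairs (i, j) and (j, i) are equivalent.
        theta = math.degrees(math.atan2(l2, l1)) % 180.0
        v = 0.5 * (residuals[i] - residuals[j]) ** 2
        pairs.append((dist, theta, v))

    max_dist = max((d for d, _, _ in pairs), default=0.0)
    if max_dist == 0.0:
        return {}

    width = 180.0 / n_sectors
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for dist, theta, v in pairs:
        sector = math.floor(theta / width + 0.5) % n_sectors  # nearest centre
        dclass = min(int(dist / max_dist * n_classes), n_classes - 1)
        cell = sums[(sector, dclass)]
        cell[0] += v
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Three hypothetical points that are not on a complete lattice.
result = polar_variogram([(0, 0), (1, 0), (0, 1)], [0.0, 2.0, 4.0])
print(result)
```

Plotting the mean ordinate against distance class, one line per sector, gives a rough analogue of the figure described above.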

ASReml also computes the variogram from predictors of random effects which appear to have variance structures defined in terms of distance. The variogram details are reported in the .res file.
