Here, we justify Rubin's rules from the Bayesian perspective, following the details given in Carpenter and Kenward, pp. 46–47 [44]. We have observed data $Y_o$ and missing data $Y_m$. We now suppress $\eta$, the parameters of the distribution of the observed data, because interest lies in $\theta$, the key parameters of the substantive model. The missing data and parameters of interest have the following joint posterior distribution,
\[ f(Y_m, \theta \mid Y_o). \]
Using standard conditional probability rules this can be partitioned as follows,
\[ f(Y_m, \theta \mid Y_o) = f(Y_m \mid Y_o)\, f(\theta \mid Y_m, Y_o). \]
The missing data can now be separated out, and the marginal posterior for our parameters of interest can be expressed as,
\[ f(\theta \mid Y_o) = E_{Y_m \mid Y_o}\{ f(\theta \mid Y_m, Y_o) \}. \]
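The step from the partition to the marginal posterior is an integration over the missing data. As a sketch of the intermediate reasoning, assuming the densities are defined so that the integral over $Y_m$ exists:

```latex
\begin{align*}
f(\theta \mid Y_o)
  &= \int f(Y_m, \theta \mid Y_o)\, dY_m \\
  &= \int f(\theta \mid Y_m, Y_o)\, f(Y_m \mid Y_o)\, dY_m \\
  &= E_{Y_m \mid Y_o}\{ f(\theta \mid Y_m, Y_o) \}.
\end{align*}
```

The last line simply recognises the integral as an expectation of the complete-data posterior taken over the predictive distribution of the missing data.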
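By the rules of iterated expectations, the posterior mean and variance of $\theta$ are then given by
\[ E(\theta \mid Y_o) = E_{Y_m \mid Y_o}\{ E_\theta(\theta \mid Y_m, Y_o) \}, \tag{1.9} \]
\[ \mathrm{VAR}(\theta \mid Y_o) = E_{Y_m \mid Y_o}\{ \mathrm{VAR}_\theta(\theta \mid Y_m, Y_o) \} + \mathrm{VAR}_{Y_m \mid Y_o}\{ E_\theta(\theta \mid Y_m, Y_o) \}. \tag{1.10} \]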
Let $Y_{m,k}$ denote the $k$th draw of the missing data (for $k = 1, \ldots, K$) from the Bayesian predictive distribution $f(Y_m \mid Y_o)$. Then (1.9) can be approximated as follows,
\[ E(\theta \mid Y_o) \approx \frac{1}{K} \sum_{k=1}^{K} E_\theta(\theta \mid Y_{m,k}, Y_o). \]
That is, the posterior mean is approximated by the average of the estimates of $\theta$ from the imputed datasets. For scalar $\theta$ this gives (1.5). Equation (1.10) can be approximated as,
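\[ \mathrm{VAR}(\theta \mid Y_o) \approx \frac{1}{K} \sum_{k=1}^{K} \mathrm{VAR}_\theta(\theta \mid Y_{m,k}, Y_o) + \frac{1}{K-1} \sum_{k=1}^{K} \left( E_\theta(\theta \mid Y_{m,k}, Y_o) - \frac{1}{K} \sum_{j=1}^{K} E_\theta(\theta \mid Y_{m,j}, Y_o) \right)^{2}. \]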
This is the average of the estimates of the variance of $\theta$ from each imputed dataset (the within-imputation variance) combined with the variance of the estimates of $\theta$ across imputed datasets (the between-imputation variance). For large $K$ these quantities are valid approximations for the mean and variance of the posterior distribution. As discussed in [44], p. 50, for small $K$ there is increased uncertainty in the estimated mean of $\theta$. To account for this, the second term on the right-hand side (RHS), the between-imputation variance, must be inflated by a factor of $1 + 1/K$, which for a scalar $\theta$ gives us (1.6).
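The combination step described above can be sketched in a few lines of Python for a scalar parameter. This is an illustrative sketch only: the function name `pool_rubin` and the input numbers are hypothetical, and the small-$K$ adjustment is taken to be the standard $1 + 1/K$ multiplier on the between-imputation variance.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool K per-imputation estimates of a scalar theta and their variances.

    Hypothetical helper for illustration; returns the pooled estimate,
    the within- and between-imputation variances, and the total variance.
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need K >= 2 estimates with matching variances")
    theta_mi = mean(estimates)    # average of the K point estimates
    w = mean(variances)           # within-imputation variance
    b = variance(estimates)       # between-imputation variance (divisor K - 1)
    total = w + (1 + 1 / k) * b   # between term inflated for finite K
    return theta_mi, w, b, total


# Made-up results from K = 3 imputed datasets.
estimates = [1.0, 2.0, 3.0]
variances = [0.5, 0.5, 0.5]
theta_mi, w, b, total = pool_rubin(estimates, variances)
print(theta_mi, w, b, total)
```

As $K$ grows, the multiplier $1 + 1/K$ tends to one and the total variance approaches the large-$K$ approximation of the posterior variance.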
Despite being Bayesian in nature, provided some subtle conditions hold, Rubin's combination rules also provide valid frequentist inference: they give an estimator that is asymptotically unbiased, together with an estimate of variance that can be used to construct confidence intervals with nominal coverage. Rubin outlines the requirements for this as follows; we elaborate on them below.
1. Draw imputations following the Bayesian paradigm as repetitions from a Bayesian posterior distribution of the missing values under the chosen models for non-response and data, or an approximation to this posterior distribution that incorporates appropriate between-imputation variability.
2. Choose models of nonresponse appropriate for the posited response mechanism.
3. Choose models for the data that are appropriate for the complete-data statistics likely to be used - if the model for the data is correct, then the model is appropriate for all complete-data statistics.
[19]
Conditions 1 and 2 essentially refer to what Rubin termed proper imputation [19]. They require sampling from a properly defined posterior distribution under a correct model for the non-response and data. That is, the imputation model's assumptions about the missing data mechanism are correct, such that the estimate of the treatment effect and its variance, averaged over the imputed datasets, $\hat{\theta}_{MI}$ and $\hat{W}$, are unbiased for the complete-data treatment effect and estimated variance, were these to exist. The estimated variance of the treatment effect across the imputations, $\hat{B}$, should be approximately unbiased for the sampling variance of $\hat{\theta}_{MI}$ over repeated imputations.
We note that conditions 1–3 also imply that the substantive model is correctly specified for valid frequentist inference.
These conditions incorporate the requirement for congeniality between the imputation model and analysis model which, subsequent to the development of MI, was described and explored by Meng [46]. Assume there exists a full Bayesian procedure for obtaining the posterior of $\theta$ from the joint data distribution, and that this is partitioned accordingly and also used to impute the missing data. If the resulting imputation distribution is the same as the predictive distribution obtained by the imputation model, then the imputation model and substantive model are said to be congenial. That is, the imputation model is derivable from the joint data distribution. We interpret this to mean that, to be congenial, the imputation and analysis models must have the same content and structure, and so be built on the same assumptions.
When the substantive model and imputation model do not satisfy this condition, they are described as uncongenial [46]. The validity of Rubin's variance estimator is not guaranteed when this is the case. Uncongeniality may occur when the imputation model contains more variables or structure than the substantive model. When the imputation model contains a congenial imputation model nested within it, the imputation model is said to be richer than the substantive model. Alternatively, the imputation model may lack variables or structure present in the substantive model, in which case it is said to be poorer.
When the imputation model is richer and the additional information built into it is correct, Meng [46] and Rubin [41] show that $\hat{\theta}_{MI}$ will be more efficient or, as Rubin terms it, “superefficient” [41]. Because the imputer has used additional, superior knowledge, the sampling variance of the MI estimate is reduced. Since the additional predictors used in the imputation are not incorporated into the analysis, Rubin's variance formula overestimates the sampling variability, and confidence intervals will therefore have greater than nominal coverage. Carpenter and Kenward [47] note that this overestimation is typically not large, so in practice it tends not to be much of a disadvantage.
However, when additional structure included in the imputation model is not correct, the consequences are more unwelcome: the imputations will be biased, and therefore so will the results of the analysis. Biased estimation will also occur when the imputation model is poorer than the analysis model, for example if an important predictor of the outcome or of missingness is not included in the imputation model. As Schafer clearly outlines [43], the main danger from uncongeniality comes when the imputer makes poorly grounded assumptions that the analyst does not. It is therefore recommended that the imputation model for a variable $Y$ include all variables related to the missingness of $Y$, along with variables related to $Y$ itself.
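The effect of a poorer imputation model can be illustrated with a small simulation. The sketch below is not taken from the cited sources: the data-generating model, the missingness probabilities and the names `fit_line` and `pooled_mean_of_y` are all assumptions made for illustration. An outcome $Y$ depends on a covariate $X$, and $Y$ is more often missing when $X$ is large. One imputation model uses $X$; the other ignores it and resamples from the observed $Y$ values. For brevity the regression parameters are fixed at their estimates rather than drawn from a posterior, so the imputations are not proper in Rubin's sense, and only the pooled point estimate is examined.

```python
import random


def avg(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Least-squares fit of ys on xs; returns intercept, slope, residual SD."""
    mx, my = avg(xs), avg(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(xs, ys))
    return intercept, slope, (rss / (len(xs) - 2)) ** 0.5


def pooled_mean_of_y(use_x, n=20000, k=5, seed=1):
    """Pooled estimate of E(Y) over k imputations; the true value is 0."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [2.0 * xi + rng.gauss(0, 1) for xi in x]
    # Missingness depends only on the fully observed X.
    miss = [rng.random() < (0.8 if xi > 0 else 0.2) for xi in x]
    x_obs = [xi for xi, m in zip(x, miss) if not m]
    y_obs = [yi for yi, m in zip(y, miss) if not m]
    a, b, s = fit_line(x_obs, y_obs)
    estimates = []
    for _ in range(k):
        completed = []
        for xi, yi, m in zip(x, y, miss):
            if not m:
                completed.append(yi)                              # observed value
            elif use_x:
                completed.append(a + b * xi + rng.gauss(0, s))    # uses X
            else:
                completed.append(rng.choice(y_obs))               # ignores X
        estimates.append(avg(completed))
    return avg(estimates)


print("imputation model with X:   ", pooled_mean_of_y(use_x=True))
print("imputation model without X:", pooled_mean_of_y(use_x=False))
```

Under these assumptions, the model that omits the predictor of missingness reproduces the bias of the observed cases, whereas the model that includes $X$ recovers an estimate close to the true mean of zero.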