3.2 CHARACTERISTICS OF THE MATERIALS FOR H.A.R
3.2.1 PORTLAND CEMENT
Even if Fisher’s null hypothesis H0 were true, birth outcomes might be different
in Philadelphia and elsewhere because mothers in Philadelphia differ from mothers elsewhere. This may be expressed in terms of a model that relates delivery in Philadelphia to characteristics of mothers and their neighborhoods in F. This model begins by describing the situation prior to matching. The model says that, prior to matching, the Zik were conditionally independent given F with
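\[
\Pr(Z_{ik} = 1 \mid \mathcal{F}) = \frac{\exp\{\kappa(x_{ik}) + \gamma u_{ik} + \varrho\, r_{Cik}\}}{1 + \exp\{\kappa(x_{ik}) + \gamma u_{ik} + \varrho\, r_{Cik}\}}, \qquad 0 \le u_{ik} \le 1, \tag{3.4.1}
\]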
where κ(·) is an unknown function. In (3.4.1), by Bayes' theorem, the term κ(xik)
permits the distribution of observed covariates xik in Philadelphia to differ from the
distribution among potential controls before matching, as indeed is seen to be the case in Table 3.1; moreover, because year is in xik, (3.4.1) permits this difference in
In (3.4.1), if ϱ ≠ 0 then the response rCik the mother or baby would exhibit
outside Philadelphia is related to whether the mother delivers in Philadelphia; that is, by Bayes' theorem under (3.4.1), birth injuries may be more or less common in Philadelphia than elsewhere. A bias of the form ϱ ≠ 0 would be the worst type of bias if one were comparing Philadelphia to matched controls, but the study compares Philadelphia in two time periods to controls in two time periods, and for this comparison ϱ ≠ 0 is less of a problem. Of course, we cannot estimate ϱ because we observe Rik, not rCik; in particular, we never observe rCik when Zik = 1, so we
could not fit (3.4.1) even if we somehow knew that γ = 0.
If γ ≠ 0 in (3.4.1), then the unobserved (and hence unmatched) covariate uik is
related to whether a mother delivers in Philadelphia. Because 0 ≤ uik ≤ 1 in (3.4.1),
two mothers ik and ik′ with (xik, rCik) = (xik′, rCik′) may differ in their odds of
delivering in Philadelphia by a factor of at most Γ = exp (γ) because uik and uik′
differ. Because uik is otherwise unconstrained, it may differ between Philadelphia
and control in different ways before and after hospital closures. The term γuik
with 0 ≤ uik ≤ 1 introduces a bias of entirely unspecified form but of a magnitude
determined by the magnitude of the sensitivity parameter Γ.
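The bound Γ = exp (γ) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function names and the numerical values of κ(xik), ϱ and rCik are hypothetical, chosen only to show that two mothers who agree on (xik, rCik) but sit at the extremes u = 0 and u = 1 differ in their odds of delivering in Philadelphia by exactly Γ under (3.4.1).

```python
import math


def assignment_prob(kappa_x, gamma, u, rho, r_c):
    """Pr(Z = 1 | F) under the logit model (3.4.1)."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    linear_predictor = kappa_x + gamma * u + rho * r_c
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical values, for illustration only.
kappa_x, rho, r_c = -1.0, 0.3, 1
gamma = math.log(2.0)  # so that Gamma = exp(gamma) = 2

# Same x and r_C, but u at its two extremes.
p_low = assignment_prob(kappa_x, gamma, 0.0, rho, r_c)
p_high = assignment_prob(kappa_x, gamma, 1.0, rho, r_c)
odds_ratio = odds(p_high) / odds(p_low)

# Any other pair of u values in [0, 1] gives a ratio no larger than Gamma.
grid = [j / 10 for j in range(11)]
max_ratio = max(
    odds(assignment_prob(kappa_x, gamma, a, rho, r_c))
    / odds(assignment_prob(kappa_x, gamma, b, rho, r_c))
    for a in grid
    for b in grid
)
print(odds_ratio, max_ratio)
```

Because κ(xik) and ϱ rCik are common to both mothers, they cancel in the odds ratio, which is why the bound depends on γ alone.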
To aid interpretation, it is sometimes convenient to unpack the single parameter Γ into two parameters (∆,Λ) as Γ = (1 + ∆Λ)/(∆ + Λ), where Λ controls the relationship between ui1 − ui2 and Zi1 − Zi2 and ∆ controls the relationship between ui1 − ui2 and rCi1 − rCi2. Here, YCi = (Zi1 − Zi2)(rCi1 − rCi2) is 1 if the Philadelphia baby would have had a birth injury if delivery had occurred outside Philadelphia but the control would not, YCi = −1 if the situation were reversed, and YCi = 0 if both babies would have had the same outcome outside Philadelphia. If ϱ = 0, so that McNemar’s test may be used in a sensitivity analysis comparing Philadelphia babies to controls, a value of Γ = 1.25 unpacks into the curve 1.25 = (1 + ∆Λ)/(∆ + Λ), which includes, for example, (∆,Λ) = (2,2) for a uik that doubles the odds of
delivering in Philadelphia and doubles the odds of a birth injury, but it also includes (∆,Λ) = (1.4,5) and (∆,Λ) = (5,1.4). Analogously, Γ = 2 unpacks into (∆,Λ) = (3,5) and (∆,Λ) = (5,3) and other values on the curve Γ = (1 + ∆Λ)/(∆ + Λ). For discussion of various aspects of this interpretation of the magnitude of Γ, see Gastwirth, Krieger and Rosenbaum (1998, Section 2) and Rosenbaum and Silber (2009a).
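The unpacking of Γ can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical helper names (`gamma_from`, `lam_for`); the second function simply solves Γ = (1 + ∆Λ)/(∆ + Λ) for Λ, which gives Λ = (Γ∆ − 1)/(∆ − Γ) and requires ∆ > Γ ≥ 1 for a finite positive answer.

```python
def gamma_from(delta, lam):
    """Single sensitivity parameter implied by the pair (Delta, Lambda)."""
    return (1.0 + delta * lam) / (delta + lam)


def lam_for(gamma, delta):
    """Lambda on the curve Gamma = (1 + Delta*Lambda)/(Delta + Lambda).

    Obtained by solving the curve for Lambda; assumes Delta > Gamma >= 1.
    """
    if gamma < 1.0 or delta <= gamma:
        raise ValueError("need delta > gamma >= 1")
    return (gamma * delta - 1.0) / (delta - gamma)


# The pairs quoted in the text all lie on the stated curves.
for delta, lam in [(2, 2), (1.4, 5), (5, 1.4)]:
    print(delta, lam, gamma_from(delta, lam))  # each is 1.25 up to rounding

for delta, lam in [(3, 5), (5, 3)]:
    print(delta, lam, gamma_from(delta, lam))  # each is 2 up to rounding

# Tracing the curve for Gamma = 1.25 at a few values of Delta.
for delta in (1.5, 2.0, 3.0, 5.0):
    print(delta, lam_for(1.25, delta))
```

The formula is symmetric in ∆ and Λ, which is why (1.4,5) and (5,1.4) correspond to the same Γ.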
Our analysis eliminates ϱ in (3.4.1) as a nuisance parameter; see Proposition 3.4.1. In one sense the value of ϱ does matter because it affects the patterns of data we see, but in another sense it does not matter because, no matter what value ϱ takes on, the difference-in-differences analysis will fully account for it. Because of this and because (3.4.1) is linear in uik and rCik on the logit scale, we may assume without loss of
generality that the unobserved covariate, uik, is uncorrelated with birth injuries in
the absence of closures, rCik, because if this were not the case, we could replace uik
by its least squares residual ŭik = uik − (ϑ + η rCik), so ŭik and rCik are uncorrelated,
and κ(xik) + γuik + ϱ rCik in (3.4.1) equals {κ(xik) + γϑ} + γŭik + (ϱ + γη) rCik. In
other words, an unobserved covariate uik cannot bias the analysis by virtue of being
related to birth injuries; it must instead in Factor 1 be related to birth injuries in a different way in different years, or in Factor 2 it must be related to birth injuries in a different way in different zip codes. Although this appears to be an attractive
feature of the difference-in-differences analysis, there is a nontrivial price to be paid for it. If ϱ were known to be zero, then Philadelphia and control-Philadelphia could be compared directly, say using McNemar’s test for binary responses in matched pairs, and the bias from uik would be of magnitude γ on the logit scale or Γ =
exp (γ) in terms of odds; see Rosenbaum (2002, Section 4.3.2). In contrast, although the difference-in-differences analysis may take uik to be uncorrelated with rCik, the
analysis faces a bias from uik of magnitude 2γ on the logit scale or Θ = Γ² = exp (2γ)
in terms of odds; again, see Proposition 3.4.1. In brief, the difference-in-differences analysis is completely unaffected by certain unmeasured biases perfectly correlated with rCik, but is twice as sensitive to certain other unmeasured biases uncorrelated
with rCik. A mathematically distinct yet conceptually related phenomenon has been
noted previously, with difference-in-differences studies being more severely affected by errors of measurement (Freeman 1984, Griliches and Hausman 1986).
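The reparametrization used in this paragraph, replacing uik by its least squares residual on rCik, can be checked numerically. The sketch below is only an illustration on simulated data: the sample, the parameter values and the variable names are all assumptions. Writing the substitution out term by term, the intercept shifts by γϑ and the coefficient of rCik by γη, while the linear predictor in (3.4.1) is unchanged.

```python
import random

random.seed(0)
n = 1000

# Simulated binary r_C and a covariate u correlated with it (illustrative only).
r_c = [random.randint(0, 1) for _ in range(n)]
u = [0.3 + 0.4 * r + random.uniform(-0.3, 0.3) for r in r_c]


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


# Least squares regression of u on r_C: u ~ vartheta + eta * r_C.
eta = cov(u, r_c) / cov(r_c, r_c)
vartheta = mean(u) - eta * mean(r_c)
u_res = [ui - (vartheta + eta * ri) for ui, ri in zip(u, r_c)]

# Hypothetical parameter values for the linear predictor in (3.4.1).
kappa_x, gamma, rho = -1.0, 0.7, 0.3

# Largest gap between the original and reparametrized linear predictors.
max_gap = max(
    abs(
        (kappa_x + gamma * ui + rho * ri)
        - ((kappa_x + gamma * vartheta) + gamma * ur + (rho + gamma * eta) * ri)
    )
    for ui, ur, ri in zip(u, u_res, r_c)
)

print(cov(u_res, r_c))  # numerically zero: the residual is uncorrelated with r_C
print(max_gap)          # numerically zero: the linear predictor is unchanged
```

The part of uik that is linearly related to rCik is absorbed into the coefficient of rCik, which the difference-in-differences analysis treats as a nuisance parameter.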
After matching for xik, so that xi1 = xi2 and Zi1 +Zi2 = 1, the model (3.4.1)
implies
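\[
\Pr(Z_{i1} = 1 \mid \mathcal{F},\, Z_{i1} + Z_{i2} = 1) = \frac{\exp(\gamma u_{i1} + \varrho\, r_{Ci1})}{\exp(\gamma u_{i1} + \varrho\, r_{Ci1}) + \exp(\gamma u_{i2} + \varrho\, r_{Ci2})}. \tag{3.4.2}
\]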
In particular, (3.4.2) is 1/2 if γ = ϱ = 0, but otherwise treatment assignment is biased. An alternative but nearly equivalent formulation of the model would omit reference to the population prior to matching, and hence to (3.4.1), and take (3.4.2) as the starting point, that is, as a model for treatment assignment Zik within a given matched pair i. Our sense is that the step from
(3.4.1) to (3.4.2) is useful in making it clear what matching for xik does and what
it fails to do. There is, however, one advantage in beginning with (3.4.2). Once a matched pair is formed, there is one Philadelphia zip code attached to that pair, and by including that zip code in F as an attribute of the pair i (not the mother k), we may understand (3.4.2) as a model for the identity k of the Philadelphia mother in pair i. That is, in this formulation, (3.4.2) asks: Given that pair i contains two mothers, one from Philadelphia zip-code xxxxx and the other from a zip code with similar attributes elsewhere in Pennsylvania, California or Missouri, and given specific values of (ui1, rCi1) and (ui2, rCi2) for these two mothers, what is the chance
that mother i1 is the Philadelphia mother and i2 is the mother from elsewhere? This distinction between starting with (3.4.1) and starting with (3.4.2) is relevant only to comparisons of pairs with a zip code near a hospital closure versus pairs with a zip code remote from closures; in such comparisons, zip code is treated as a fixed attribute of the pair, just as year is treated as a fixed attribute of the pair in temporal comparisons.
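The step from (3.4.1) to (3.4.2) can also be verified numerically. The sketch below, with hypothetical function names and illustrative parameter values, computes the within-pair probability in two ways: directly from (3.4.2), and by conditioning two independent draws from (3.4.1) with a common xik on the event Zi1 + Zi2 = 1. The two agree for any value of κ(xik), which shows what matching on xik removes and what it leaves behind.

```python
import math


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def pair_prob(gamma, rho, u1, r1, u2, r2):
    """Pr(Z_i1 = 1 | F, Z_i1 + Z_i2 = 1) as given by (3.4.2)."""
    a = math.exp(gamma * u1 + rho * r1)
    b = math.exp(gamma * u2 + rho * r2)
    return a / (a + b)


def pair_prob_from_population(kappa_x, gamma, rho, u1, r1, u2, r2):
    """Same quantity, derived from (3.4.1) for two mothers sharing x.

    Uses the conditional independence of Z_i1 and Z_i2 given F and then
    conditions on exactly one of the two delivering in Philadelphia.
    """
    p1 = logistic(kappa_x + gamma * u1 + rho * r1)
    p2 = logistic(kappa_x + gamma * u2 + rho * r2)
    only_first = p1 * (1.0 - p2)
    only_second = (1.0 - p1) * p2
    return only_first / (only_first + only_second)


# Hypothetical values, for illustration only.
gamma, rho = math.log(2.0), 0.5
u1, r1, u2, r2 = 0.9, 1, 0.2, 0

direct = pair_prob(gamma, rho, u1, r1, u2, r2)
for kappa_x in (-2.0, 0.0, 1.5):
    # kappa(x) cancels, so every line prints the same pair of values.
    print(kappa_x, direct, pair_prob_from_population(kappa_x, gamma, rho, u1, r1, u2, r2))

# With gamma = rho = 0, assignment within the pair is a fair coin flip.
print(pair_prob(0.0, 0.0, u1, r1, u2, r2))
```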