Let $z \in \{a, r\}$ index the randomised allocation (active arm or reference arm) for each patient $i$, with follow-up outcome at time $J$ denoted by $Y_{ziJ}$. The follow-up outcome data at the final time point for the reference patients are contained in the vector $Y_{rJ} = (Y_{r1J}, \ldots, Y_{rnJ})^T$. The final visit outcome data for the non-deviating active patients are contained in the vector $Y_{aJ,o} = \{Y_{aiJ};\ i \in \mathcal{O}\}^T$.
We suppose that each deviating patient has two potential outcomes at time J : the one that would occur if they remain on active treatment post-deviation (de-jure) and the other that would occur off-treatment post-deviation (de-facto).
The potentially observable de-jure data for the $n_d$ deviating patients at time $J$ are contained in the vector $Y_{aJ,DJ,d}$ and the alternative de-facto outcome data in the vector $Y_{aJ,DF,d}$. Define $Y = (Y_{rJ}, Y_{aJ,o}, Y_{aJ,DJ,d}, Y_{aJ,DF,d})^T$ as the collection of observed and potentially observable outcome data, which has dimensions $[(n + n_o + 2n_d) \times 1]$.
For each deviating patient we can only observe one of the potential outcomes, either de-jure or de-facto. Consider two $[(n + n_o + 2n_d) \times (n + n_o + 2n_d)]$ matrices, $D_{DJ}$ and $D_{DF}$, of 0’s and 1’s such that $D_{DJ}Y$ gives the $[(n + n_o + 2n_d) \times 1]$ de-jure data and $D_{DF}Y$ gives the $[(n + n_o + 2n_d) \times 1]$ de-facto data at time $J$.
Let $a$ be a $[(n + n_o + 2n_d) \times 1]$ vector such that $a^T D_{DJ} Y$ returns the de-jure treatment estimate and $a^T D_{DF} Y$ returns the de-facto treatment estimate. When the deviating patients experience de-jure behaviour post-deviation and are observed, the expectation of the variance of the de-jure on-treatment estimand can be expressed as in (4.1). We consider settings where the expectation of the variance of our de-facto estimand can then be expressed as in (4.2).
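The role of the selection matrices and the contrast vector can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the treatment estimate is the difference in arm means at time $J$ and that $Y$ is stacked in the order given above; the function name `selection_matrices` and the toy outcome values are hypothetical and not taken from the source.

```python
import numpy as np

def selection_matrices(n, n_o, n_d):
    """Build (D_DJ, D_DF, a) for Y = (Y_r, Y_a_obs, Y_a_dejure, Y_a_defacto).

    Assumption (not from the source): the treatment estimate is the
    difference between the active-arm mean and the reference-arm mean.
    """
    size = n + n_o + 2 * n_d
    keep_dj = np.ones(size)
    keep_dj[n + n_o + n_d:] = 0.0            # zero out the de-facto block
    keep_df = np.ones(size)
    keep_df[n + n_o:n + n_o + n_d] = 0.0     # zero out the de-jure block
    D_DJ = np.diag(keep_dj)
    D_DF = np.diag(keep_df)
    # Weights: -1/n on reference entries, 1/(n_o + n_d) on every active entry.
    # Whichever potential-outcome block is zeroed out drops from the mean.
    a = np.concatenate([
        np.full(n, -1.0 / n),
        np.full(n_o + 2 * n_d, 1.0 / (n_o + n_d)),
    ])
    return D_DJ, D_DF, a

# Toy data (hypothetical values)
n, n_o, n_d = 4, 3, 2
Y = np.concatenate([
    [1.0, 2.0, 3.0, 4.0],   # reference arm
    [5.0, 6.0, 7.0],        # observed (non-deviating) active patients
    [8.0, 9.0],             # de-jure potential outcomes for deviators
    [2.0, 3.0],             # de-facto potential outcomes for deviators
])
D_DJ, D_DF, a = selection_matrices(n, n_o, n_d)
est_dj = a @ D_DJ @ Y   # de-jure treatment estimate
est_df = a @ D_DF @ Y   # de-facto treatment estimate
```

Both estimates use the same vector $a$; only the selection matrix changes which potential outcome enters the active-arm mean.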
We now suppose that post-deviation data are unobserved, i.e. the potentially observable de-facto and de-jure entries in $Y$ are missing for the $n_d$ active patients. We alternatively impute these outcomes, using de-jure imputation and de-facto imputation. This gives $K$ ‘complete’ data samples $Y_k$, of size $[(n + n_o + 2n_d) \times 1]$. For this we need appropriate imputation distributions for each missing data pattern under each scenario, with suitable posteriors for the included parameters.
Under our de-jure assumption (on-treatment MAR), the imputation model for patients deviating following time $j$, for $j = 1, \ldots, J-1$, is formed from the regression of $Y_{aJ,o}$ on $P_{a,o,j}$, where $P_{a,o,j}$ is the design matrix for the imputation model. It contains the values of the $1, \ldots, j$ outcomes and covariates included in the imputation model (excluding treatment) for the $n_o$ observed active patients, along with a vector of 1’s to include an intercept in the model. This is appropriate since we are not imputing any interim missing outcomes here. We only consider monotone missing data patterns. We are interested in the treatment effect at time $J$. As described by Carpenter and Kenward (p. 77–78) [44], under MAR, each of the regressions will be validly estimated from those observed in the data set.
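The parameter estimates for the de-jure imputation model for the $n_{d,j}$ patients missing outcomes $j+1$ to $J$ are found using $\hat{\beta}_{a,o,j} = (P_{a,o,j}^T P_{a,o,j})^{-1} P_{a,o,j}^T Y_{aJ,o}$, with assumed known covariance matrix $V_{a,o,j} = (P_{a,o,j}^T P_{a,o,j})^{-1}\sigma_j^2$. We assume the large sample posterior for the parameter estimates for the de-jure imputation model, denoted $\hat{\beta}_{DJ,j}$, is normal and centered on the ML estimator $\hat{\beta}_{a,o,j}$ with covariance matrix $V_{a,o,j}$. That is,
$$\hat{\beta}_{DJ,j} \mid Y_{aJ,o} \sim N(\hat{\beta}_{a,o,j},\ V_{a,o,j}).$$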
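The estimation step and the posterior draw can be sketched numerically as follows. This is a minimal illustration under stated assumptions: a single baseline covariate plus intercept, a known residual variance, and simulated data; the names `fit_imputation_model` and `draw_posterior` are hypothetical.

```python
import numpy as np

def fit_imputation_model(P_obs, y_obs, sigma2):
    """Least-squares estimate and its covariance, treating sigma2 as known."""
    PtP_inv = np.linalg.inv(P_obs.T @ P_obs)
    beta_hat = PtP_inv @ P_obs.T @ y_obs      # (P'P)^{-1} P'y
    V = PtP_inv * sigma2                      # (P'P)^{-1} sigma^2
    return beta_hat, V

def draw_posterior(beta_hat, V, rng):
    """One draw from the large-sample normal posterior N(beta_hat, V)."""
    return rng.multivariate_normal(beta_hat, V)

# Simulated observed active-arm patients (hypothetical data)
rng = np.random.default_rng(0)
n_o = 50
baseline = rng.normal(size=n_o)
P_obs = np.column_stack([np.ones(n_o), baseline])   # intercept + baseline
y_obs = 1.0 + 0.5 * baseline + rng.normal(scale=1.0, size=n_o)

beta_hat, V = fit_imputation_model(P_obs, y_obs, sigma2=1.0)
beta_draw = draw_posterior(beta_hat, V, rng)
```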
The de-jure imputation model for active patient $i$ deviating following time $j$, for $j = 1, \ldots, J-1$, and imputation $k$ can therefore be expressed as
$$\tilde{Y}_{aiJ,k} \mid Y_{aJ,o} = P_{a,d,j,i}\left[\hat{\beta}_{a,o,j} + b_{a,o,j,k}\right] + e_{i,j,k} \quad \text{for } i \in \mathcal{D}_J,$$
where $b_{a,o,j,k} \sim N(0, V_{a,o,j})$, $e_{i,j,k} \sim N(0, \sigma_j^2)$ and $P_{a,d,j,i}$ contains the values of the $1, \ldots, j$ outcomes and covariates included in the imputation model (excluding treatment, plus a 1 for the intercept) for each deviating active patient $i$, who deviates following time $j$.
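As a rough sketch of how this imputation model generates $K$ imputed final outcomes for one missingness pattern: each imputation uses one parameter perturbation $b$ shared by all deviators in the pattern, plus an independent residual $e$ per patient. The function name `impute_de_jure` and all numerical inputs below are hypothetical.

```python
import numpy as np

def impute_de_jure(P_dev, beta_hat, V, sigma2, K, rng):
    """Return a (K, n_dev) array of imputed time-J outcomes.

    Row k is P_dev @ (beta_hat + b_k) + e_k, with b_k ~ N(0, V) shared across
    patients within imputation k and e_k ~ N(0, sigma2) drawn per patient.
    """
    n_dev = P_dev.shape[0]
    b = rng.multivariate_normal(np.zeros(len(beta_hat)), V, size=K)  # (K, p)
    e = rng.normal(scale=np.sqrt(sigma2), size=(K, n_dev))           # (K, n_dev)
    return (beta_hat + b) @ P_dev.T + e

# Hypothetical inputs: intercept plus one earlier outcome per deviator
rng = np.random.default_rng(0)
P_dev = np.array([[1.0, 0.2], [1.0, -0.5], [1.0, 1.1]])
beta_hat = np.array([1.0, 0.5])
V = np.array([[0.02, 0.0], [0.0, 0.03]])
K = 5
imputed = impute_de_jure(P_dev, beta_hat, V, sigma2=1.0, K=K, rng=rng)
```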
For de-facto imputation we assume the large sample posterior for the imputation parameters for the $n_{d,j}$ patients missing outcomes $j+1$ to $J$, $\hat{\beta}_{DF,j}$, is normal and centered on the ML estimator $\hat{\beta}_{DF,o,j}$ with known covariance matrix $V_{DF,o,j}$, that is, for $j = 1, \ldots, J-1$,
$$\hat{\beta}_{DF,j} \mid Y_{DF,J,o} \sim N(\hat{\beta}_{DF,o,j},\ V_{DF,o,j}),$$
where $Y_{DF,J,o}$ consists of the relevant observed outcome data under the particular de-facto setting of interest. The de-facto imputation model for active patient $i$ deviating following time $j$, for $j = 1, \ldots, J-1$, and imputation $k$ can therefore be expressed as
$$\breve{Y}_{aiJ,k} \mid Y_{DF,J,o} = P_{a,d,j,i}\left[\hat{\beta}_{DF,o,j} + b_{DF,o,j,k}\right] + e_{i,j,k} \quad \text{for } i \in \mathcal{D}_J,$$
where $b_{DF,o,j,k} \sim N(0, V_{DF,o,j})$ and $e_{i,j,k} \sim N(0, \sigma_j^2)$. Under the assumption of an equal variance-covariance matrix of baseline and follow-up by treatment arm, we consequently assume the same variance for the residuals in the de-jure and de-facto imputation models for patients deviating following the same time $j$, for $j = 1, \ldots, J-1$. In Subsection 4.3.5 we consider the impact of relaxing this assumption. We are interested in imputation inference for $\frac{1}{K}\sum_{k=1}^{K} a^T D_{DJ} Y_k$ or $\frac{1}{K}\sum_{k=1}^{K} a^T D_{DF} Y_k$.
Letting the number of imputations $K \to \infty$, the variance of our MI treatment estimate as estimated by Rubin’s rules is $V_{DJ,MI} = \hat{W}_{DJ} + \hat{B}_{DJ}$ or $V_{DF,MI} = \hat{W}_{DF} + \hat{B}_{DF}$ where under the conditions
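Under de-jure,
$$\hat{B}_{DJ} = \frac{1}{K-1}\sum_{k=1}^{K}\left[\sum_{j=1}^{J-1}\left\{\pi_{d,j}\left(\bar{e}_{j,k} - \bar{e}_{j}\right) + \pi_{d,j}\left(\bar{P}_{a,d,j}\, b_{a,o,j,k} - \bar{P}_{a,d,j}\, \bar{b}_{a,o,j}\right)\right\}\right]^2,$$
where $\bar{e}_{j,k} = \frac{1}{n_{d,j}}\sum_{i \in \mathcal{D}_J} e_{i,j,k}$, $\bar{e}_{j} = \frac{1}{K}\sum_{k=1}^{K}\bar{e}_{j,k}$, $\bar{P}_{a,d,j} = \frac{1}{n_{d,j}}\sum_{i \in \mathcal{D}_J} P_{a,d,j,i}$ and $\bar{b}_{a,o,j} = \frac{1}{K}\sum_{k=1}^{K} b_{a,o,j,k}$. This has expectation
$$E\left[\hat{B}_{DJ}\right] = \sum_{j=1}^{J-1}\pi_{d,j}^2\left[\frac{\sigma_j^2 + n_{d,j}\,\bar{P}_{a,d,j} V_{a,o,j} \bar{P}_{a,d,j}^T}{n_{d,j}}\right].$$
Under de-facto,
$$\hat{B}_{DF} = \frac{1}{K-1}\sum_{k=1}^{K}\left[\sum_{j=1}^{J-1}\left\{\pi_{d,j}\left(\bar{e}_{j,k} - \bar{e}_{j}\right) + \pi_{d,j}\left(\bar{P}_{a,d,j}\, b_{DF,o,j,k} - \bar{P}_{a,d,j}\, \bar{b}_{DF,o,j}\right)\right\}\right]^2,$$
where $\bar{b}_{DF,o,j} = \frac{1}{K}\sum_{k=1}^{K} b_{DF,o,j,k}$. This has expectation
$$E\left[\hat{B}_{DF}\right] = \sum_{j=1}^{J-1}\pi_{d,j}^2\left[\frac{\sigma_j^2 + n_{d,j}\,\bar{P}_{a,d,j} V_{DF,o,j} \bar{P}_{a,d,j}^T}{n_{d,j}}\right].$$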
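The expectation of the between-imputation variance can be checked by simulation. The sketch below treats a single missingness pattern (so the sum over $j$ has one term) and compares the simulated between-imputation variance with the closed-form expectation; the function names, the values of `pi_d`, `V`, `sigma2`, and the design rows are all hypothetical.

```python
import numpy as np

def between_variance_single_pattern(pi_d, P_dev, V, sigma2, K, rng):
    """Simulate B-hat for one pattern: variance over k of pi_d*(e_bar_k + P_bar b_k)."""
    n_dev = P_dev.shape[0]
    P_bar = P_dev.mean(axis=0)
    b = rng.multivariate_normal(np.zeros(len(P_bar)), V, size=K)            # (K, p)
    e_bar = rng.normal(scale=np.sqrt(sigma2), size=(K, n_dev)).mean(axis=1)  # (K,)
    dev = pi_d * (e_bar - e_bar.mean()) + pi_d * (b @ P_bar - b.mean(axis=0) @ P_bar)
    return np.sum(dev ** 2) / (K - 1)

def expected_between_variance(pi_d, P_dev, V, sigma2):
    """Closed form: pi_d^2 * (sigma2 + n_dev * P_bar V P_bar') / n_dev."""
    n_dev = P_dev.shape[0]
    P_bar = P_dev.mean(axis=0)
    return pi_d ** 2 * (sigma2 + n_dev * (P_bar @ V @ P_bar)) / n_dev

rng = np.random.default_rng(0)
P_dev = np.column_stack([np.ones(10), rng.normal(size=10)])  # 10 deviators
V = 0.05 * np.eye(2)
pi_d, sigma2, K = 0.2, 1.0, 20000

b_hat = between_variance_single_pattern(pi_d, P_dev, V, sigma2, K, rng)
expected = expected_between_variance(pi_d, P_dev, V, sigma2)
```

With a large number of imputations the simulated value should sit close to the closed-form value, up to Monte Carlo error.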
The information anchored variance is
$$E\left[V_{anchored}\right] = a^T D_{DJ}\Sigma D_{DJ}^T a + O(n^{-2}) + E\left[\hat{B}_{DJ}\right] + \frac{E\left[\hat{B}_{DJ}\right]}{E\left[\hat{W}_{DJ}\right]}O(n^{-2}).$$
As in the previous settings, if Rubin’s rules are information anchoring and preserve the information loss in the primary analysis under MAR, then (4.4) holds, which in this setting is
$$0 \approx \sum_{j=1}^{J-1}\pi_{d,j}^2\,\bar{P}_{a,d,j}\left(V_{a,o,j} - V_{DF,o,j}\right)\bar{P}_{a,d,j}^T + \frac{E\left[\hat{B}_{DJ}\right]}{E\left[\hat{W}_{DJ}\right]}O(n^{-2}). \tag{4.15}$$
This gives the required result in the longitudinal trial setting with monotone missingness in one treatment arm.
The result is a natural extension of that observed for the longitudinal setting with only one pattern of non-response (missingness at the final time point $J$). The approximation of the information anchored variance by Rubin’s variance estimator is sharpened when, for each missing data pattern, the variance of the parameters in the de-facto imputation model matches the variance of the parameters in the de-jure (MAR) imputation model. In practice, however, the approximation is generally excellent regardless of an exact match in these quantities, since this term and the other components in the difference will be of a smaller magnitude, relative to the information anchored variance, for realistic proportions of missing data.
4.3.4 Implementation for improved information anchoring
If the variance of the parameters in the de-jure imputation model corresponds to the variance of the parameters in the de-facto imputation model for each missing data pattern (i.e. $V_{a,o,j} = V_{DF,o,j}$ for $j = 1, \ldots, J-1$), Rubin’s variance estimator will approximate the information anchored variance that preserves the loss of information in the primary design based analysis more closely in the longitudinal trial setting with monotone missingness patterns. The first term on the RHS of (4.15) will then disappear for $j = 1, \ldots, J-1$.
As described in Subsection 4.1.4, when this is not the case we can alter the reference based procedure to achieve improved information anchoring via bootstrapping the observed reference case sample, then drawing the required samples from the estimated reference distribution to construct the imputed models to achieve $V_{a,o,j} = V_{DF,o,j}$ for $j = 1, \ldots, J-1$. Alternatively, we could re-scale the variance of the parameters in the de-facto imputation model to ensure the variance corresponds with the variance of the parameters in the de-jure imputation model. With a large number of missing data patterns the re-scaling will become less trivial. Thus the bootstrap approach may be more desirable for trials with a larger number of missing data patterns. The choice of approach is entirely at the preference of the trialist. However, we note that the first term on the RHS of (4.12) is generally very small, relative to the information anchored variance. Thus, even without this condition, we will still see an excellent approximation between Rubin’s variance estimator and the desired information anchored variance.
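The text does not specify how the re-scaling would be carried out. One possible reading, offered only as an assumption, is a linear transformation of the de-facto parameter perturbations so that their covariance equals the de-jure covariance for a given pattern. The sketch below uses Cholesky factors for this; the function name `rescale_draws` and the example matrices are hypothetical.

```python
import numpy as np

def rescale_draws(b_df, V_df, V_dj):
    """Map draws with covariance V_df to draws with covariance V_dj.

    Assumed approach (not from the source): with V_df = L_df L_df' and
    V_dj = L_dj L_dj', the matrix T = L_dj L_df^{-1} satisfies T V_df T' = V_dj.
    """
    L_df = np.linalg.cholesky(V_df)
    L_dj = np.linalg.cholesky(V_dj)
    T = L_dj @ np.linalg.inv(L_df)
    return b_df @ T.T, T

# Hypothetical covariance matrices for one missingness pattern
V_df = np.array([[0.04, 0.01], [0.01, 0.05]])
V_dj = np.array([[0.02, 0.00], [0.00, 0.03]])

rng = np.random.default_rng(0)
b_df = rng.multivariate_normal(np.zeros(2), V_df, size=1000)  # de-facto perturbations
b_rescaled, T = rescale_draws(b_df, V_df, V_dj)
```

The transformation would have to be repeated for every missingness pattern, which is consistent with the remark that re-scaling becomes less trivial as the number of patterns grows.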
4.3.5 Relaxing the equal variance assumption
When we relax the equal variance assumption we can no longer assume the variance of the residuals in the de-jure imputation model for patients with missingness pattern j matches the variance of the residuals in the de-facto imputation model for patients with missingness pattern j, for each missing data pattern j.
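The condition in (4.15) then becomes
$$0 \approx \sum_{j=1}^{J-1}\pi_{d,j}^2\left[\frac{\sigma_{DJ,j}^2 - \sigma_{DF,j}^2}{n_{d,j}} + \bar{P}_{a,d,j}\left(V_{a,o,j} - V_{DF,o,j}\right)\bar{P}_{a,d,j}^T\right] + \frac{E\left[\hat{B}_{DJ}\right]}{E\left[\hat{W}_{DJ}\right]}O(n^{-2}). \tag{4.16}$$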
For each missingness pattern with deviation following time $j$, an additional component is incorporated. The new components in the difference between Rubin’s variance and the ideal information anchored variance are driven by the degree of difference in the variance structure by trial arm for each missingness pattern. Since the variance structure is not likely to differ too markedly by trial arm for each missingness pattern, and these extra components are each multiplied by $\pi_{d,j}/n$, the overall impact will in practice be relatively small.
So we see that, similar to the longitudinal setting where the last measured variable is subject to non-response, relaxing the equal variance assumption does not greatly affect the approximation between Rubin’s variance estimator and the ideal information anchored variance in the longitudinal setting with monotone non-response in one arm.
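To get a feel for the size of the leading term in (4.16), it can be evaluated directly for given inputs. The sketch below sums the bracketed term over missingness patterns and ignores the $O(n^{-2})$ remainder; the function name `anchoring_gap`, the dictionary keys, and the example numbers are all hypothetical.

```python
import numpy as np

def anchoring_gap(patterns):
    """Leading term of (4.16), summed over missingness patterns.

    Each pattern is a dict with keys: pi, n_dj, sigma2_dj, sigma2_df,
    P_bar (mean design row), V_dj and V_df (parameter covariances).
    """
    total = 0.0
    for p in patterns:
        P_bar = np.asarray(p["P_bar"], dtype=float)
        V_diff = np.asarray(p["V_dj"], dtype=float) - np.asarray(p["V_df"], dtype=float)
        resid_term = (p["sigma2_dj"] - p["sigma2_df"]) / p["n_dj"]
        param_term = P_bar @ V_diff @ P_bar
        total += p["pi"] ** 2 * (resid_term + param_term)
    return total

# Hypothetical single-pattern example
pattern = {
    "pi": 0.1, "n_dj": 10,
    "sigma2_dj": 1.0, "sigma2_df": 1.2,
    "P_bar": [1.0, 0.0],
    "V_dj": 0.02 * np.eye(2), "V_df": 0.03 * np.eye(2),
}
gap = anchoring_gap([pattern])

# With matching variances in both models the leading term vanishes
matched = dict(pattern, sigma2_df=1.0, V_df=0.02 * np.eye(2))
gap_matched = anchoring_gap([matched])
```

In this toy example the squared proportion $\pi_{d,j}^2$ shrinks the contribution considerably, in line with the argument that the overall impact is small for realistic proportions of missing data.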
4.3.6 Extension for deviation in both arms
Suppose among the $n$ reference patients only $n_{r,o}$ are actually observed at all time points without deviation. Among the remaining $n_{r,d}$ deviating reference patients, we observe $n_{r,d,j}$ patients who deviate following time $j$ for $j = 1, \ldots, J-1$. For simplicity we assume there is no interim missing data in the reference arm. Let $\mathcal{RD}$ and $\mathcal{RO}$ define the sets of indices for patients who do and do not deviate in the reference arm respectively. Further, $\mathcal{RD}_J$ denotes the set of indices for deviating reference patients who deviate following time $j$, so that $n_{r,d} = \sum_{j=1}^{J-1} n_{r,d,j}$. Interest still lies in the treatment effect at time $J$.
The outcome data for the observed reference patients at the final time point are contained in the vector $Y_{rJ,o} = \{Y_{riJ};\ i \in \mathcal{RO}\}^T$. The potentially observable de-jure data for the $n_{r,d}$ deviating reference patients are contained in the vector $Y_{rJ,DJ,d}$ and the alternative de-facto outcome data in the vector $Y_{rJ,DF,d}$. The full collection of observed and potentially observable outcome data is now defined as $Y = (Y_{rJ,o}, Y_{rJ,DJ,d}, Y_{rJ,DF,d}, Y_{aJ,o}, Y_{aJ,DJ,d}, Y_{aJ,DF,d})^T$, which has dimensions $[(n_{r,o} + 2n_{r,d} + n_o + 2n_d) \times 1]$. We assume $Y$ is normally distributed and has known variance $\Sigma$.
We redefine the two $[(n_{r,o} + 2n_{r,d} + n_o + 2n_d) \times (n_{r,o} + 2n_{r,d} + n_o + 2n_d)]$ matrices $D_{DJ}$ and $D_{DF}$ of 0’s and 1’s so that $D_{DJ}Y$ and $D_{DF}Y$ now each give the $[(n_{r,o} + 2n_{r,d} + n_o + 2n_d) \times 1]$ de-jure data or $[(n_{r,o} + 2n_{r,d} + n_o + 2n_d) \times 1]$ de-facto data across both treatment arms. We focus on settings where $E\left[V_{DF,full}\right] = a^T D_{DJ}\Sigma D_{DJ}^T a + O(n^{-2})$.
We follow the steps outlined above to establish the de-jure imputation model for active arm patient $i$, deviating following time $j$ for $j = 1, \ldots, J-1$, and imputation $k$ as
$$\tilde{Y}_{aiJ,k} \mid Y_{aJ,o} = P_{a,d,j,i}\left[\hat{\beta}_{a,o,j} + b_{a,o,j,k}\right] + e_{i,j,k} \quad \text{for } i \in \mathcal{D}_J,$$
where $b_{a,o,j,k} \sim N(0, V_{a,o,j})$, $e_{i,j,k} \sim N(0, \sigma_j^2)$ and $P_{a,d,j,i}$ contains the values of the $1, \ldots, j$ outcomes and covariates included in the imputation model (excluding treatment, plus a 1 for the intercept) for each deviating active patient $i$, who deviates following time $j$. For the reference arm, under our de-jure (on-treatment MAR) assumption, the imputation model for patients deviating following time $j$, for $j = 1, \ldots, J-1$, is formed from the regression of $Y_{rJ,o}$ on $P_{r,o,j}$, where $P_{r,o,j}$ is the design matrix for the imputation model, which contains the values of the $1, \ldots, j$ outcomes and covariates included in the imputation model with a vector of 1’s to include an intercept term in the model, for the $n_{r,o}$ observed reference patients.
The parameter estimates for the de-jure reference arm imputation model for the $n_{r,d,j}$ patients missing outcomes $j+1$ to $J$ are found using $\hat{\beta}_{r,o,j} = (P_{r,o,j}^T P_{r,o,j})^{-1} P_{r,o,j}^T Y_{rJ,o}$, with assumed known covariance matrix $V_{r,o,j} = (P_{r,o,j}^T P_{r,o,j})^{-1}\sigma_j^2$. We assume the large sample posterior for the parameter estimates for the de-jure reference arm imputation model, denoted $\hat{\beta}_{DJ,r,j}$, is normal and centered on the ML estimator $\hat{\beta}_{r,o,j}$ with covariance matrix $V_{r,o,j}$. That is,
$$\hat{\beta}_{DJ,r,j} \mid Y_{rJ,o} \sim N(\hat{\beta}_{r,o,j},\ V_{r,o,j}).$$
The de-jure imputation model for reference patient $i$ deviating following time $j$, for $j = 1, \ldots, J-1$, and imputation $k$ can therefore be expressed as
$$\tilde{Y}_{riJ,k} \mid Y_{rJ,o} = P_{r,d,j,i}\left[\hat{\beta}_{r,o,j} + b_{r,o,j,k}\right] + e_{i,j,k} \quad \text{for } i \in \mathcal{RD}_J,$$
where $b_{r,o,j,k} \sim N(0, V_{r,o,j})$, $e_{i,j,k} \sim N(0, \sigma_j^2)$ and $P_{r,d,j,i}$ contains the values of the $1, \ldots, j$ outcomes and covariates included in the imputation model (excluding treatment, plus a 1 for the intercept) for each deviating reference patient $i$, who deviates following time $j$.
Under de-facto imputation for patients in the active arm deviating following time $j$ for $j = 1, \ldots, J-1$, we assume the large sample posterior for the parameters of the imputation model, which we denote by $\hat{\beta}_{DF,a,j}$, is normal and centered on the ML estimator $\hat{\beta}_{DF,a,o,j}$ with known covariance matrix $V_{DF,a,o,j}$.
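That is, for $j = 1, \ldots, J-1$,
$$\hat{\beta}_{DF,a,j} \mid Y_{DF,a,J,o} \sim N\left(\hat{\beta}_{DF,a,o,j},\ V_{DF,a,o,j}\right),$$
where $Y_{DF,a,J,o}$ consists of the relevant observed outcome data under the particular de-facto setting of interest. The de-facto imputation model for active patient $i$ deviating following time $j$, for $j = 1, \ldots, J-1$, and imputation $k$ can therefore be expressed as
$$\tilde{Y}_{aiJ,k} \mid Y_{DF,a,J,o} = P_{a,d,j,i}\left[\hat{\beta}_{DF,a,o,j} + b_{DF,a,o,j,k}\right] + e_{i,j,k} \quad \text{for } i \in \mathcal{D}_J,$$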
where $b_{DF,a,o,j,k} \sim N(0, V_{DF,a,o,j})$ and $e_{i,j,k} \sim N(0, \sigma_j^2)$. Under de-facto imputation for patients in the reference arm deviating following time $j$ for $j = 1, \ldots, J-1$, we assume the large sample posterior for the parameters of the imputation model, which we denote by $\hat{\beta}_{DF,r,j}$, is normal and centered on the ML estimator $\hat{\beta}_{DF,r,o,j}$ with known covariance matrix $V_{DF,r,o,j}$. That is, for $j = 1, \ldots, J-1$,
$$\hat{\beta}_{DF,r,j} \mid Y_{DF,r,J,o} \sim N\left(\hat{\beta}_{DF,r,o,j},\ V_{DF,r,o,j}\right),$$
where $Y_{DF,r,J,o}$ consists of the relevant observed outcome data under the particular de-facto setting of interest. The de-facto imputation model for reference patient $i$ deviating following time $j$, for $j = 1, \ldots, J-1$, and imputation $k$ can therefore be expressed as
$$\tilde{Y}_{riJ,k} \mid Y_{DF,r,J,o} = P_{r,d,j,i}\left[\hat{\beta}_{DF,r,o,j} + b_{DF,r,o,j,k}\right] + e_{i,j,k} \quad \text{for } i \in \mathcal{RD}_J,$$
where $b_{DF,r,o,j,k} \sim N(0, V_{DF,r,o,j})$ and $e_{i,j,k} \sim N(0, \sigma_j^2)$. We are interested in imputation inference for $\frac{1}{K}\sum_{k=1}^{K} a^T D_{DJ} Y_k$ or $\frac{1}{K}\sum_{k=1}^{K} a^T D$