
In Chapter 2, we consider a two-phase ODS design in a cohort study. A two-phase ODS sample consists of complete observations obtained under the ODS scheme in the second phase and the observations from the first phase. To fix notation, let $Y$ denote a continuous outcome variable, $X$ a covariate vector, and $W$ a proxy measure for $X$. Figure 1.3 shows the two-phase ODS design in a cohort study under a linear model. Regarding the type of auxiliary information, our proposed method uses a continuous auxiliary variable, $W$, for a covariate of interest, whereas Weaver and Zhou (2005) considered a categorical auxiliary variable in their discussion section. We assume that the first phase consists of $N$ independent and identically distributed observations from the population. The domain of $Y$ is partitioned into three mutually exclusive intervals, $C_1 \cup C_2 \cup C_3 = (-\infty, c_1] \cup (c_1, c_3] \cup (c_3, \infty)$, where $c_1$ and $c_3$ are fixed constants. In the second phase, the ODS sample of size $n$ consists of three parts: an SRS sample of size $n_0$, a supplemental ODS sample of size $n_1$ from $C_1$, and another supplemental ODS sample of size $n_3$ from $C_3$.

Figure 1.3: Illustration for a two-phase ODS under a linear regression model

[Figure: the population and the second-phase sample split at the cutpoints $c_1$ and $c_3$ into the regions $Y \le c_1$, $c_1 < Y < c_3$, and $Y \ge c_3$, with the SRS and the two supplemental ODS samples marked; legend: incomplete, complete.]

Thus, the two-phase ODS design in our study has the following data structure:

First stage: $\{Y_i, W_i : i = 1, \ldots, N\}$;

Second stage: SRS $\{Y_i, X_i, W_i : i = 1, \ldots, n_0\}$;

ODS from the left tail $\{Y_i, X_i, W_i \mid Y_i \le c_1 : i = 1, \ldots, n_1\}$;

ODS from the right tail $\{Y_i, X_i, W_i \mid Y_i \ge c_3 : i = 1, \ldots, n_3\}$.

The ODS sample in the second phase is a complete sample, whereas the remaining observations in the population are incomplete because their covariate values are missing. In measurement-error terminology, $V$ denotes the validation sample set and $\bar{V}$ denotes the nonvalidation sample set.

Figure 1.4: Conceptual illustration for a general two-phase ODS design

Let $n_V$ be the total size of the ODS sample, which consists of complete observations, and let $n_{\bar{V}} = N - n_V$ be the number of incomplete observations. Here $n_V = n_0 + n_1 + n_3$, where $n_0$ is the size of the SRS sample and $n_k$ denotes the size of the supplemental ODS sample from the $k$th interval, $k = 1, 3$. Figure 1.4 gives a graphical view of the two-phase ODS design: the ellipses represent $V$ of size $n_V$, and the shaded area represents $\bar{V}$ of size $n_{\bar{V}}$.
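To make the sampling scheme concrete, the following is a minimal simulation sketch of how such a two-phase ODS sample could be drawn. Everything in it is an illustrative assumption rather than part of the design itself: the function name `draw_two_phase_ods`, the simulated linear model and its coefficients, the choice of the 10th and 90th percentiles of $Y$ as cutpoints, and the choice to draw the supplemental samples only from units not already selected by the SRS.

```python
import numpy as np


def draw_two_phase_ods(y, c1, c3, n0, n1, n3, rng):
    """Return index arrays for the SRS part and the two supplemental tail samples.

    Hypothetical helper: the supplemental samples are drawn without replacement
    from units not already picked by the SRS (an assumption of this sketch).
    """
    N = len(y)
    # SRS of size n0 from the whole first-phase cohort.
    srs = rng.choice(N, size=n0, replace=False)
    remaining = np.setdiff1d(np.arange(N), srs)
    # Supplemental ODS samples from the left tail (Y <= c1) and right tail (Y >= c3).
    left_pool = remaining[y[remaining] <= c1]
    right_pool = remaining[y[remaining] >= c3]
    left = rng.choice(left_pool, size=n1, replace=False)
    right = rng.choice(right_pool, size=n3, replace=False)
    return srs, left, right


rng = np.random.default_rng(0)
N = 2000
x = rng.normal(size=N)                       # covariate of interest X
w = x + rng.normal(scale=0.5, size=N)        # W: noisy proxy for X (assumed form)
y = 1.0 + 0.5 * x + rng.normal(size=N)       # outcome from an assumed linear model

c1, c3 = np.quantile(y, [0.1, 0.9])          # assumed cutpoints
srs, left, right = draw_two_phase_ods(y, c1, c3, n0=200, n1=50, n3=50, rng=rng)

# Validation set V: units with (Y, X, W) observed.
validation = np.concatenate([srs, left, right])
n_V = validation.size
n_Vbar = N - n_V                             # nonvalidation units: only (Y, W) observed

# X is treated as missing outside the validation set.
x_observed = np.full(N, np.nan)
x_observed[validation] = x[validation]
```

In this sketch the first-phase data are the pairs `(y, w)` for all `N` units, while `x_observed` carries values only for the `n_V` validation units, which mirrors the complete and incomplete parts of the design.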

We combine two methods: (1) a semiparametric empirical likelihood method for the complete observations; and (2) the updating method of Chen & Chen (2000) and Jiang & Zhou (2007), which updates the estimates obtained from the ODS sample. With the complete ODS observations from the second phase, we consider two regression models: one relating the response to the covariates of interest, and one relating the response to the auxiliary variable. Without loss of generality, we consider the following regression model for a covariate of interest and a continuous response variable,

\[
Y = \beta_0 + \beta_1 X + e_x, \tag{1.8}
\]

where the $\beta$'s denote regression parameters and $e_x \sim N(0, \sigma_x^2)$. The regression model for the auxiliary variable is

\[
Y = \gamma_0 + \gamma_1 W + e_w, \tag{1.9}
\]

where the $\gamma$'s denote regression parameters and $e_w \sim N(0, \sigma_w^2)$. Applying the likelihood in Zhou et al. (2002) to the two regression models with respect to $\beta = (\beta_0, \beta_1)'$ and $\gamma = (\gamma_0, \gamma_1)'$, respectively, we obtain two likelihoods for the complete observations in the second stage:

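For the linear model in (1.8) with ODS samples that have the data structure $\{Y_i, X_i\}$, $i = 1, \ldots, n_V$,

\[
L(\beta, G_X) = L_{SRS}(\beta, G_X) \cdot L_{ODS}(\beta, G_X)
= \left\{ \prod_{i=1}^{n_0} f(y_{0i} \mid x_{0i})\, g_X(x_{0i}) \right\}
\times \left\{ \prod_{k=1,3} \prod_{i=1}^{n_k} P(y_{ki}, x_{ki} \mid Y_i \in C_k) \right\},
\]

where $G_X$ and $g_X$ denote the cumulative distribution function and the density function of $X$. For the linear model in (1.9) with ODS samples that have the data structure $\{Y_i, W_i\}$, $i = 1, \ldots, n_V$,

\[
L(\gamma, H_W) = L_{SRS}(\gamma, H_W) \cdot L_{ODS}(\gamma, H_W)
= \left\{ \prod_{i=1}^{n_0} f(y_{0i} \mid w_{0i})\, h_W(w_{0i}) \right\}
\times \left\{ \prod_{k=1,3} \prod_{i=1}^{n_k} P(y_{ki}, w_{ki} \mid Y_i \in C_k) \right\},
\]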

where $H_W$ and $h_W$ denote the cumulative distribution function and the density function of $W$. Using the semiparametric empirical likelihood method, we obtain the estimates $(\hat{\beta}, \hat{\gamma})$ of the true values $(\beta, \gamma)$, subject to constraints that will be given in Chapter 3. Multivariate normal distribution theory provides the asymptotic distribution of $\sqrt{n_V}\,(\hat{\beta} - \beta, \hat{\gamma} - \gamma)$. Since we assume that all values of the auxiliary variable and the response are available in the study population, a regression model for the population dataset is given as

\[
Y = \alpha_0 + \alpha_1 W + e,
\]

where the $\alpha$'s denote regression parameters and $e \sim N(0, \sigma^2)$. The estimate of $\alpha$ is obtained by maximum likelihood under the SRS scheme for the population sample. We will study how to update $\hat{\beta}$ using the updating algorithm of Chen & Chen (2000) and Jiang & Zhou (2007) under the two-phase ODS design. This approach has three advantages: it uses more of the information available in two-phase sampling, it yields more efficient estimators than those in Weaver and Zhou (2005), and it is computationally easy to apply with multiple covariates and auxiliary variables.
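As a rough illustration of what such an updating step can look like, the sketch below implements one commonly used linear correction: the second-phase estimate for the model of interest is shifted by the discrepancy between the auxiliary-model estimate from the second-phase sample and the auxiliary-model estimate from all first-phase data, weighted by their estimated covariances. This specific form is an assumption of the sketch, not a formula stated in this section, and it presumes that both auxiliary-model estimates target the same coefficients. The function names `ols` and `update_estimate`, the argument names, and all numeric values are hypothetical placeholders; in practice the second-phase estimates and covariance blocks would come from the empirical likelihood fit.

```python
import numpy as np


def ols(design, y):
    """Least-squares coefficients (the normal-error MLE of the regression coefficients)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def update_estimate(beta_hat, aux_hat, aux_full, cov_beta_aux, cov_aux):
    """Assumed update: beta_hat - cov_beta_aux @ inv(cov_aux) @ (aux_hat - aux_full).

    beta_hat     : second-phase estimate for the model of interest
    aux_hat      : second-phase estimate for the auxiliary model
    aux_full     : auxiliary-model estimate from all first-phase data
    cov_beta_aux : estimated covariance between beta_hat and aux_hat
    cov_aux      : estimated covariance of aux_hat
    """
    correction = cov_beta_aux @ np.linalg.solve(cov_aux, aux_hat - aux_full)
    return beta_hat - correction


# Auxiliary model fitted to all first-phase data (toy, noise-free data for illustration).
w_all = np.array([0.0, 1.0, 2.0, 3.0])
y_all = 2.0 + 3.0 * w_all
aux_full = ols(np.column_stack([np.ones_like(w_all), w_all]), y_all)

# Placeholder second-phase estimates and covariance blocks (made-up numbers).
beta_hat = np.array([1.0, 0.5])
aux_hat = np.array([1.1, 0.4])
cov_beta_aux = np.array([[1.0, 0.0], [0.0, 2.0]])
cov_aux = np.array([[2.0, 0.0], [0.0, 4.0]])

# Standalone demonstration of the update with a hand-picked full-data estimate.
updated = update_estimate(beta_hat, aux_hat, np.array([1.0, 0.3]), cov_beta_aux, cov_aux)
```

The correction vanishes when the two auxiliary-model estimates agree, and it grows with the estimated covariance between the two second-phase estimates, which is how the first-phase information on $W$ can sharpen the estimate for the model of interest.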

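Figure 1.5: Illustration for ODS with missing data under a linear regression model

[Figure: the population and the SRS and ODS samples split at the cutpoints $c_1$ and $c_3$ into the regions $Y \le c_1$, $c_1 < Y < c_3$, and $Y \ge c_3$.]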

Figure 1.6: Conceptual illustration for the general ODS with missing data

1.5.2 An Estimated likelihood approach to a missing data under an Outcome-dependent