


We formally introduce a semiparametric regression model for m-rep data from $n$ subjects. Suppose we have an exogenous $q \times 1$ covariate vector $x_i$ for the $i$-th subject and m-rep measures, denoted by $M_i = \{m_i(d) : d \in D\}$, for the $i$-th subject, where $d$ represents an atom of an m-rep on $D$, a specific brain region. For notational simplicity, we temporarily drop atom $d$ from our notation.

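The regression model involves modeling a 'conditional mean' of an m-rep response at an atom, $m_i$, given $x_i$, denoted by $\mu_i(\beta) = \mu(x_i, \beta)$, where $\beta$ is a $p \times 1$ vector of regression coefficients in $B \subset \mathbb{R}^p$. Thus, $\mu(\cdot,\cdot)$ is a map from $\mathbb{R}^q \times \mathbb{R}^p$ to $M(1)$ and

$$\mu_i(\beta) = (\mu_{Oi}(\beta), \mu_{ri}(\beta), \mu_{0i}(\beta), \mu_{1i}(\beta))^\top,$$

which is a $10 \times 1$ vector, and $\mu_{Oi}(\beta)$, $\mu_{ri}(\beta)$, $\mu_{0i}(\beta)$, and $\mu_{1i}(\beta)$ are the 'conditional means' of the location $O_i$, the radius $r_i$, and the two spoke directions $n_{0i}$ and $n_{1i}$, respectively, given $x_i$, for the $i$-th subject. Note that we just borrow the term 'conditional mean' from Euclidean space.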

We need to formalize this notion of 'conditional mean' explicitly. For the location component of an m-rep, we may set $\mu_{Oi}(\beta) = (g(x_i,\beta_1), g(x_i,\beta_2), g(x_i,\beta_3))^\top$, where $\beta_k$ ($k = 1, 2, 3$) is a $p_k \times 1$ coefficient vector. There are many different ways of specifying $g(x_i,\beta_k)$. The simplest one is the linear link function $g(x_i,\beta_k) = x_i^\top \beta_k$. We may also represent $g(x_i,\beta_k)$ as a linear combination of basis functions $\{\psi_j(x_i) : j = 1, \ldots, J\}$ (such as B-splines), that is, $g(x_i,\beta_k) = \sum_{j=1}^{J} \psi_j(x_i)\beta_{kj}$. For the radius component, we may use $\mu_{ri}(\beta) = g(x_i,\beta_4)$, where $\beta_4$ is a $p_4 \times 1$ coefficient vector for an m-rep radius. Since a radius is always positive, a natural link function is $g(x_i,\beta_k) = \exp(x_i^\top \beta_k)$, among other possible choices.
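The link functions for the location and radius components can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the covariate values, and the coefficient values are made up for the example, and the basis functions are passed in as plain callables rather than actual B-splines.

```python
import numpy as np

def linear_link(x, beta):
    """Linear link g(x, beta) = x' beta."""
    return float(np.dot(x, beta))

def basis_link(x, beta, basis):
    """Basis expansion g(x, beta) = sum_j psi_j(x) * beta_j.

    `basis` is a list of callables psi_j (hypothetical stand-ins for B-splines).
    """
    return float(sum(psi(x) * b for psi, b in zip(basis, beta)))

def exp_link(x, beta):
    """Exponential link g(x, beta) = exp(x' beta); positive by construction."""
    return float(np.exp(np.dot(x, beta)))

def location_mean(x, beta1, beta2, beta3):
    """mu_O = (g(x, beta1), g(x, beta2), g(x, beta3))' using linear links."""
    return np.array([linear_link(x, b) for b in (beta1, beta2, beta3)])

# Made-up covariates: an intercept and one continuous covariate.
x = np.array([1.0, 0.5])
mu_O = location_mean(x,
                     np.array([0.2, -0.4]),
                     np.array([1.0, 2.0]),
                     np.array([0.0, 0.0]))
mu_r = exp_link(x, np.array([0.2, -0.4]))  # radius mean, always > 0
```

Whatever the sign of $x_i^\top \beta_k$, the exponential link returns a positive value, which is why it suits the radius component.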

As for the two directions on an m-rep, they are more complex and will be our focus here. In the existing literature, the circular regression models (Gould, 1969; Johnson and Wehrly, 1978; Fisher and Lee, 1992) assume that the angular representation of a direction follows the von Mises distribution with either the angular mean $\eta_i$ or the concentration parameter $\kappa_i$ associated with $x_i$. Gould's (1969) regression model for a circular response takes the form $\eta_i = \eta + x_i^\top \beta$, where $\eta_i$ is the angular representation of the circular response for the $i$-th subject and $(\eta, \beta)$ are unknown parameters. A major criticism of Gould's model is its identifiability problem, that is, the likelihood function has infinitely many maxima of comparable size (Fisher and Lee, 1992; Presnell, Morrison, and Littell, 1998). To avoid this problem, Fisher and Lee (1992) replaced the linear link function by a suitable one-to-one function $g : (-\infty, \infty) \to (-\pi, \pi)$ satisfying $g(0) = 0$. Two such link functions are the inverse tangent link, $g(x) = 2\arctan(x)$, and the scaled probit link, $g(x) = 2\pi[\Phi(x) - 0.5]$, where $\arctan$ is the inverse of the tangent function and $\Phi$ is the standard normal cumulative distribution function. Johnson and Wehrly (1978) used the link function $g(x) = 2\pi F(x)$, where $F$ is a cumulative distribution function. These link functions can be generalized to spherical data as detailed below.
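The two circular link functions attributed to Fisher and Lee (1992) are easy to write down directly. The following is a minimal sketch, with function names chosen for this example; $\Phi$ is computed from the error function in the Python standard library.

```python
import math

def arctan_link(x):
    """Inverse tangent link g(x) = 2 * arctan(x), mapping the real line to (-pi, pi)."""
    return 2.0 * math.atan(x)

def scaled_probit_link(x):
    """Scaled probit link g(x) = 2 * pi * (Phi(x) - 0.5), with Phi the standard normal CDF."""
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return 2.0 * math.pi * (phi - 0.5)

# Both links satisfy g(0) = 0 and are increasing in x.
angles = [(arctan_link(t), scaled_probit_link(t)) for t in (-1.0, 0.0, 1.0)]
```

Because both functions are one-to-one and bounded, distinct values of the linear predictor map to distinct angles, which is the property used to avoid the identifiability problem of the linear link.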

We now develop the link functions for a spherical response. For notational simplicity, we use $n_0$ as an example throughout. We need to specify the explicit form of $\mu_{0i}(\beta)$, the 'conditional mean' function of $n_{0i}$ for the $i$-th subject. We can use spherical polar coordinates to represent $\mu_{0i}(\beta)$ as

$$\mu_{0i}(\beta) = \begin{pmatrix} \cos(\phi_i) \\ \sin(\phi_i)\cos(\eta_i) \\ \sin(\phi_i)\sin(\eta_i) \end{pmatrix}, \tag{5.1}$$

where $\phi_i$ denotes the colatitude (so that $\pi/2 - \phi_i$ is the latitude) and $\eta_i$ denotes the longitude for the $i$-th subject. Following Fisher and Lee (1992), we may assume that

$$\phi_i = x_{i,d}^\top \beta_{1d} + \arctan(x_{i,c}^\top \beta_{1c}), \qquad \eta_i = x_{i,d}^\top \beta_{2d} + 2\arctan(x_{i,c}^\top \beta_{2c}), \tag{5.2}$$

where $x_{i,d}$ includes all the discrete covariates and the intercept, and $x_{i,c}$ contains all the centered continuous covariates.
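To make (5.1) and (5.2) concrete, the sketch below computes the mean direction for one subject. It assumes that the arctan terms act on the centered continuous covariates $x_{i,c}$; the function name, argument names, and numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def mean_direction(x_d, x_c, beta_1d, beta_1c, beta_2d, beta_2c):
    """Unit vector mu_0i from (5.1), with angles given by the links in (5.2).

    x_d: discrete covariates plus intercept; x_c: centered continuous covariates
    (assumed split of the covariate vector).
    """
    phi = np.dot(x_d, beta_1d) + np.arctan(np.dot(x_c, beta_1c))        # colatitude
    eta = np.dot(x_d, beta_2d) + 2.0 * np.arctan(np.dot(x_c, beta_2c))  # longitude
    return np.array([np.cos(phi),
                     np.sin(phi) * np.cos(eta),
                     np.sin(phi) * np.sin(eta)])

# Made-up subject: intercept and a group indicator, plus one centered covariate.
x_d = np.array([1.0, 0.0])
x_c = np.array([0.3])
mu_0 = mean_direction(x_d, x_c,
                      np.array([0.8, 0.1]), np.array([0.5]),
                      np.array([0.2, -0.3]), np.array([1.5]))
```

By construction the result has unit length for any coefficient values, so the fitted 'conditional mean' always lies on the sphere.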

So far, we have defined link functions for all the components of an m-rep. Now, we introduce a definition of a 'residual' to ensure that $\mu_i(\beta)$ is the proper 'conditional mean' of $m_i$ given $x_i$. For instance, in the classical linear model, the response is the sum of the regression function and the residual. Then, the regression function is the conditional mean of the response only when the conditional mean of the residual equals zero. Given two points $m_i$ and $\mu_i(\beta)$ on the manifold, we need to define the residual or 'difference' between them. At $\mu_i(\beta)$, we have the tangent space of the manifold, denoted by $T_{\mu_i(\beta)}M(1)$, which is a Euclidean space representing a first-order approximation of the manifold $M(1)$ near $\mu_i(\beta)$. Then we calculate the projection of $m_i$ onto $T_{\mu_i(\beta)}M(1)$, denoted by $\mathrm{Log}_{\mu_i(\beta)}(m_i)$, which is given by

$$\mathrm{Log}_{\mu_i(\beta)}(m_i) = \left(O_i - \mu_{Oi}(\beta),\ \log(r_i/\mu_{ri}(\beta)),\ \mathrm{Log}_{\mu_{0i}(\beta)}(n_{0i}),\ \mathrm{Log}_{\mu_{1i}(\beta)}(n_{1i})\right), \tag{5.3}$$

where $\mathrm{Log}_{\mu_{0i}(\beta)}(n_{0i}) = \arccos(\mu_{0i}(\beta)^\top n_{0i})\, v/\|v\|$, in which $v = n_{0i} - (\mu_{0i}(\beta)^\top n_{0i})\,\mu_{0i}(\beta)$ and $\|\cdot\|$ is the Euclidean norm. Thus, $\mathrm{Log}_{\mu_i(\beta)}(m_i)$ can be regarded as the difference between $m_i$ and $\mu_i(\beta)$ in $T_{\mu_i(\beta)}M(1)$. Since the $\mathrm{Log}_{\mu_i(\beta)}(m_i)$, $i = 1, \ldots, n$, are in different tangent spaces, we must translate them to the same tangent space. We can use a rotation matrix, $R_{\mu_i(\beta)}$, to translate $\mathrm{Log}_{\mu_i(\beta)}(m_i) \in T_{\mu_i(\beta)}M(1)$ into the same tangent space, say $T_{P_0}M(1)$, in which $P_0 = (0,0,0,1,0,0,1,0,0,1)^\top$, and define $E_i(\beta) = R_{\mu_i(\beta)}\,\mathrm{Log}_{\mu_i(\beta)}(m_i)$ for $i = 1, \ldots, n$.
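The residual construction can be illustrated numerically. The sketch below implements the spherical Log map exactly as written in the text, and then moves each directional residual to the tangent space at $(0,0,1)^\top$, the direction block of $P_0$. Two points are assumptions of this example and are not taken from the text: the rotation is built with the Rodrigues formula (the text does not say how $R_{\mu_i(\beta)}$ is constructed), and the location and radius components are left unrotated. The dictionary layout and all function names are hypothetical.

```python
import numpy as np

def sphere_log(mu, n, eps=1e-12):
    """Log map on the unit sphere: tangent vector at mu pointing toward n."""
    c = np.clip(np.dot(mu, n), -1.0, 1.0)
    v = n - c * mu                       # part of n orthogonal to mu
    norm_v = np.linalg.norm(v)
    if norm_v < eps:                     # n equals mu (Log is undefined if antipodal)
        return np.zeros(3)
    return np.arccos(c) * v / norm_v

def rotation_to(a, b, eps=1e-12):
    """Rotation R with R @ a = b for unit vectors a, b (Rodrigues formula; an assumption)."""
    c = np.dot(a, b)
    if c < -1.0 + eps:
        # Antipodal case: rotate by pi about any axis orthogonal to a.
        u = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(u) < 1e-6:
            u = np.cross(a, [0.0, 1.0, 0.0])
        u = u / np.linalg.norm(u)
        return 2.0 * np.outer(u, u) - np.eye(3)
    v = np.cross(a, b)
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + K @ K / (1.0 + c)

def mrep_residual(m, mu):
    """10-vector E = R Log_mu(m) for atoms stored as dicts with keys O, r, n0, n1."""
    p0 = np.array([0.0, 0.0, 1.0])       # direction block of P0
    parts = [m["O"] - mu["O"], [np.log(m["r"] / mu["r"])]]
    for key in ("n0", "n1"):
        R = rotation_to(mu[key], p0)     # carries T_mu S^2 onto T_p0 S^2
        parts.append(R @ sphere_log(mu[key], m[key]))
    return np.concatenate(parts)

# Made-up atom and 'conditional mean' for one subject.
mu = {"O": np.zeros(3), "r": 1.0,
      "n0": np.array([1.0, 0.0, 0.0]), "n1": np.array([0.0, 0.0, 1.0])}
m = {"O": np.array([0.1, 0.0, -0.2]), "r": 2.0,
     "n0": np.array([0.0, 1.0, 0.0]), "n1": np.array([0.0, 0.0, 1.0])}
E = mrep_residual(m, mu)
```

After the rotation, each directional residual has a zero third coordinate, since it lies in the tangent plane at $(0,0,1)^\top$, and its length still equals the angle between the observed direction and its mean.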
