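The random variables $\dot{Y}_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, J$, are latent variables that are modeled multiplicatively as

$$\dot{Y}_{ij} = m_i(\dot{\theta}_i, \boldsymbol{\beta}_j)\, E_{ij}, \tag{2.1}$$

where the function $m_i$ for the $i$th case is defined as

$$m_i(\theta, \boldsymbol{\beta}) = \exp(\theta + \mathbf{x}_i'\boldsymbol{\beta}). \tag{2.2}$$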

Given values of $\dot{\theta}_i$, $\boldsymbol{\beta}_j$ and $\mathbf{x}_i$, $m_i(\dot{\theta}_i, \boldsymbol{\beta}_j)$ is the conditional expectation of $\dot{Y}_{ij}$, which expresses dependence on the predictor variables $X_k$, $k = 1, \ldots, p$, through the log link. Here $E_{ij}$ is the error term associated with $\dot{Y}_{ij}$; $\dot{\theta}_i$ is an unknown (nuisance) parameter that needs to be estimated; $\boldsymbol{\beta}_j$ is a $(p+1)$-vector of coefficients that also needs to be estimated; and $\mathbf{x}_i$ is a $(p+1)$-vector with $x_{i0} = 1$, since the first element corresponds to the intercept and the remaining elements correspond to the observations obtained by the $i$th case in the sample on $X_1, \ldots, X_p$. Note that the intercept is introduced in the model since we are not assuming that the explanatory variables $X_1, \ldots, X_p$ can take some special zero value. The error vectors $\mathbf{E}_i = (E_{i1}, \ldots, E_{iJ})'$ are assumed to be independent of one another, with

$$\mathrm{E}(\mathbf{E}_i) = \mathbf{1}. \tag{2.3}$$

Their covariance matrix, denoted $\dot{\boldsymbol{\Sigma}}$, will be taken to be of the form

$$\dot{\boldsymbol{\Sigma}} = \varphi\, \boldsymbol{\Omega}^{1/2}\, \mathbf{W}\, \boldsymbol{\Omega}^{1/2}, \tag{2.4}$$

where $\varphi$ is a common dispersion parameter, $\boldsymbol{\Omega} = \mathrm{diag}(\omega_1, \ldots, \omega_J)$ with $\omega_1, \ldots, \omega_J$ being relative dispersion parameters attributed to the $J$ random variables $E_{i1}, \ldots, E_{iJ}$, and $\mathbf{W}$ is the correlation matrix.
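The two building blocks of the model, the mean function (2.2) and the error covariance (2.4), can be sketched numerically as follows. This is a minimal illustration only: the function names `mean_function` and `error_covariance` and all numerical values (with $J = 3$ and $p = 2$) are hypothetical and do not come from the thesis.

```python
import numpy as np

def mean_function(theta, beta, x):
    """m_i(theta, beta) = exp(theta + x_i' beta), as in equation (2.2)."""
    return float(np.exp(theta + x @ beta))

def error_covariance(phi, omega, W):
    """phi * Omega^{1/2} W Omega^{1/2}, as in equation (2.4)."""
    root = np.sqrt(np.asarray(omega, dtype=float))
    return phi * root[:, None] * np.asarray(W, dtype=float) * root[None, :]

# Hypothetical inputs: one case i, J = 3 components, p = 2 predictors.
x_i = np.array([1.0, 0.5, -1.2])        # first element is the intercept term
theta_i = 0.3                           # nuisance parameter for case i
betas = np.array([[0.2, -0.1, 0.4],     # row j holds beta_j, a (p+1)-vector
                  [0.0, 0.3, -0.2],
                  [0.1, 0.1, 0.1]])

# Conditional means of the J latent variables for case i.
m_i = np.array([mean_function(theta_i, b, x_i) for b in betas])

phi = 0.5                               # common dispersion parameter
omega = np.array([1.0, 2.0, 0.5])       # relative dispersion parameters
W = np.array([[1.0, 0.3, -0.2],         # correlation matrix (unit diagonal)
              [0.3, 1.0, 0.1],
              [-0.2, 0.1, 1.0]])

Sigma = error_covariance(phi, omega, W)
print(m_i)
print(Sigma)
```

Because $\mathbf{W}$ has a unit diagonal, the diagonal of the resulting matrix is $\varphi\,\omega_j$, which is the variance of $E_{ij}$ implied by (2.4).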

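Letting $\alpha_{jj'}$ denote the elements in the $J \times J$ matrix $\mathbf{W}$ and using equations (2.1) and (2.4), it follows that

$$
\begin{aligned}
\mathrm{E}\,\dot{Y}_{ij} &= m_i(\dot{\theta}_i, \boldsymbol{\beta}_j), \\
\mathrm{Var}\,\dot{Y}_{ij} &= \varphi\,\omega_j \left[ m_i(\dot{\theta}_i, \boldsymbol{\beta}_j) \right]^2, \\
\mathrm{Cov}\left(\dot{Y}_{ij}, \dot{Y}_{ij'}\right) &= \varphi \sqrt{\omega_j} \sqrt{\omega_{j'}}\; m_i(\dot{\theta}_i, \boldsymbol{\beta}_j)\, m_i(\dot{\theta}_i, \boldsymbol{\beta}_{j'})\, \alpha_{jj'}.
\end{aligned}
\tag{2.5}
$$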
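The moments in (2.5) can be assembled in matrix form, since the covariance matrix of the latent vector for case $i$ has $(j, j')$ element $\varphi \sqrt{\omega_j}\sqrt{\omega_{j'}}\, m_{ij}\, m_{ij'}\, \alpha_{jj'}$. The sketch below checks this entry by entry. The helper name `latent_moments` and the numerical values are assumptions made for illustration, not part of the original text.

```python
import numpy as np

def latent_moments(m, phi, omega, W):
    """Mean vector and covariance matrix of the latent vector, per (2.5).

    m     : conditional means m_i(theta_i, beta_j), j = 1, ..., J
    phi   : common dispersion parameter
    omega : relative dispersion parameters omega_1, ..., omega_J
    W     : J x J correlation matrix with elements alpha_{jj'}
    """
    m = np.asarray(m, dtype=float)
    scaled = m * np.sqrt(np.asarray(omega, dtype=float))  # m_j * sqrt(omega_j)
    cov = phi * np.outer(scaled, scaled) * np.asarray(W, dtype=float)
    return m, cov

# Hypothetical values for a single case with J = 3 components.
m_i = np.array([1.5, 0.8, 2.0])
phi = 0.5
omega = np.array([1.0, 2.0, 0.5])
W = np.array([[1.0, 0.3, -0.2],
              [0.3, 1.0, 0.1],
              [-0.2, 0.1, 1.0]])

mean_i, cov_i = latent_moments(m_i, phi, omega, W)

# The diagonal reproduces Var = phi * omega_j * m_j^2.
print(np.diag(cov_i))
```

The variance grows with the square of the mean, so the coefficient of variation of each latent variable, $\sqrt{\varphi\,\omega_j}$, does not depend on the predictors.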

It may be noticed that the MRM is specified in terms of the variables $\dot{Y}_{i1}, \ldots, \dot{Y}_{iJ}$. At this stage this might seem confusing, since our aim is to model the compositional variables $Y_{i1}, \ldots, Y_{iJ}$. However, recall that compositional variables are obtained by performing the closure operation (1.1) on $\dot{Y}_{i1}, \ldots, \dot{Y}_{iJ}$. The variables $\dot{Y}_{i1}, \ldots, \dot{Y}_{iJ}$ will thus be considered as latent variables through which we obtain the observed compositional variables $Y_{i1}, \ldots, Y_{iJ}$. To get a better insight into why the model for the latent variables may be used to model the compositional response variables, consider the following. Let the latent variable $\dot{Y}_{ij}$ and its compositional counterpart $Y_{ij}$ be related through the equation

$$\dot{Y}_{ij} = c_i Y_{ij}. \tag{2.6}$$

On using the closure operation (1.1), it may be noted that $c_i$ is some unknown positive constant defined by

$$c_i = \dot{Y}_{i1} + \cdots + \dot{Y}_{iJ}. \tag{2.7}$$

Since the value of $c_i$ is related solely to case $i$, if its value were to change to, say, $c_i^*$, the only change in the MRM (2.1) would be in the values of the parameters $\dot{\theta}_1, \ldots, \dot{\theta}_n$, which are nuisance parameters. They have been introduced in model (2.1) to cater for the rescaling of $\dot{Y}_{1j}, \ldots, \dot{Y}_{nj}$. So the fact that a change in $c_i$ leads to a change in the values of $\dot{\theta}_1, \ldots, \dot{\theta}_n$ should not be considered a problem. The parameters of interest are $(\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_J)$, and changing the value of $c_i$ does not affect them. An advantage of this is that each of the constants $c_1, \ldots, c_n$ may be taken to be any positive value of choice, including unity. On taking all the $c_i$ to be equal to one, the latent variables $\dot{Y}_{ij}$ will be equal to the compositional variables $Y_{ij}$, so in practice the MRM (2.1) may be used to model the compositional variables $Y_{ij}$ directly.
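The claim that rescaling affects only the nuisance parameters can be made explicit with a short worked step. The following is a sketch that uses only equations (2.1), (2.2) and (2.6); the symbols $\dot{Y}_{ij}^*$ and $\dot{\theta}_i^*$ are introduced here for illustration and denote the latent variable and nuisance parameter after replacing $c_i$ by $c_i^*$.

```latex
\begin{align*}
\dot{Y}_{ij}^{*} = c_i^{*} Y_{ij}
  &= \frac{c_i^{*}}{c_i}\, \dot{Y}_{ij}
   && \text{by (2.6)} \\
  &= \frac{c_i^{*}}{c_i}\,
     \exp\!\left(\dot{\theta}_i + \mathbf{x}_i'\boldsymbol{\beta}_j\right) E_{ij}
   && \text{by (2.1) and (2.2)} \\
  &= \exp\!\left(\dot{\theta}_i + \log\frac{c_i^{*}}{c_i}
       + \mathbf{x}_i'\boldsymbol{\beta}_j\right) E_{ij} \\
  &= m_i\!\left(\dot{\theta}_i^{*}, \boldsymbol{\beta}_j\right) E_{ij},
   \qquad
   \dot{\theta}_i^{*} = \dot{\theta}_i + \log\frac{c_i^{*}}{c_i}.
\end{align*}
```

The rescaled variables therefore follow the same model with the same $\boldsymbol{\beta}_j$ and the same error terms $E_{ij}$; the factor $c_i^*/c_i$ is absorbed entirely into a shift of $\dot{\theta}_i$.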

The interchangeability between the compositional $Y_{ij}$ and the unconstrained $\dot{Y}_{ij}$ may be viewed as analogous to the 'Poisson trick' (Palmgren, 1981; Kosmidis and Firth, 2011), which gives the interchangeability of the Poisson distribution and the multinomial distribution for loglinear models and multinomial logit models respectively, since the multinomial distribution is invoked by conditioning on the observed marginal totals for the predictors of a Poisson sampling model for a contingency table. In the approach proposed here for compositional data, the compositional response variables arise out of the closure operation on the latent variables. Although no detailed distributional specifications will be made in this novel approach, the analogy with a multinomial logit model may be appreciated by considering equation (2.8), which follows shortly.

The interchangeability between the $Y_{ij}$ and the $\dot{Y}_{ij}$ also carries over to the estimation procedure. In order to estimate the model parameters in (2.1), the focus will be on modeling the mean of $\dot{Y}_{ij}$ through the latent multiplicative regression model defined in equations (2.1) and (2.5). More details on this will be given in Section 2.4.3. Some issues related to the identification of the model parameters need to be discussed before turning to how the parameters are estimated.

2.3 Identification of the Parameters of the Latent MRM
