forms what may be called the weak likelihood principle. Less immediately, if two different models, with parameters having the same interpretation but possibly even referring to different observational systems, lead to data $y$ and $y'$ with proportional likelihoods, then again the posterior distributions are identical. This forms the less compelling strong likelihood principle. Most frequentist methods do not obey this latter principle, although the departure is usually relatively minor. If the models refer to different random systems, the implicit prior knowledge may, in any case, be different.

4.3 Frequentist analysis

4.3.1 Extended Fisherian reduction

One approach to simple problems is essentially that of Section 2.5 and can be summarized, as before, in the Fisherian reduction:

• find the likelihood function;

• reduce to a sufficient statistic $S$ of the same dimension as $\theta$;

• find a function of $S$ that has a distribution depending only on $\psi$;

• place it in pivotal form or alternatively use it to derive p-values for null hypotheses;

• invert to obtain limits for $\psi$ at an arbitrary set of probability levels.
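The steps of the reduction can be traced in a small sketch. The setting is an illustrative assumption not taken from the text: independent normal observations with unknown mean $\mu$ and known standard deviation `sigma`, so that $\psi = \theta = \mu$. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def fisherian_limits(ys, sigma, levels=(0.90, 0.95, 0.99)):
    """Limits for mu in N(mu, sigma^2) with sigma known (assumed setting)."""
    n = len(ys)
    # Steps 1-2: the likelihood depends on the data only through the sample
    # mean, a sufficient statistic of dimension one, the same as mu.
    s = fmean(ys)
    se = sigma / sqrt(n)
    limits = {}
    for level in levels:
        # Steps 3-4: (s - mu) / se is standard normal whatever mu is, a pivot.
        z = NormalDist().inv_cdf(0.5 + level / 2)
        # Step 5: invert |s - mu| / se <= z to obtain limits for mu.
        limits[level] = (s - z * se, s + z * se)
    return limits


def p_value(ys, sigma, mu0):
    """Two-sided p-value for the null hypothesis mu = mu0, from the same pivot."""
    z = (fmean(ys) - mu0) / (sigma / sqrt(len(ys)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    for level, (lo, hi) in fisherian_limits(data, sigma=1.0).items():
        print(f"{level:.2f}: ({lo:.3f}, {hi:.3f})")
```

Because the same pivot is inverted at every level, the limits at different probability levels are nested, which is what the final step of the reduction asks for.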

There is sometimes an extension of the method that works when the model is of the $(k, d)$ curved exponential family form. Then the sufficient statistic is of dimension $k$ greater than $d$, the dimension of the parameter space. We then proceed as follows:

• if possible, rewrite the $k$-dimensional sufficient statistic, when $k > d$, in the form $(S, A)$ such that $S$ is of dimension $d$ and $A$ has a distribution not depending on $\theta$;

• consider the distribution of $S$ given $A = a$ and proceed as before.

The statistic $A$ is called ancillary.

There are limitations to these methods. In particular a suitable A may not exist, and then one is driven to asymptotic, i.e., approximate, arguments for problems of reasonable complexity and sometimes even for simple problems.

We give some examples, the first of which is not of exponential family form.

Example 4.1. Uniform distribution of known range. Suppose that $(Y_1, \ldots, Y_n)$ are independently and identically distributed in the uniform distribution over $(\theta - 1, \theta + 1)$. The likelihood takes the constant value $2^{-n}$ provided the smallest and largest values $(y_{(1)}, y_{(n)})$ lie within the range $(\theta - 1, \theta + 1)$ and is zero otherwise. The minimal sufficient statistic is of dimension 2, even though the parameter is only of dimension 1. The model is a special case of a location family and it follows from the invariance properties of such models that $A = Y_{(n)} - Y_{(1)}$ has a distribution independent of $\theta$.

This example shows the imperative of explicit or implicit conditioning on the observed value $a$ of $A$ in quite compelling form. If $a$ is approximately 2, only values of $\theta$ very close to $y^* = (y_{(1)} + y_{(n)})/2$ are consistent with the data. If, on the other hand, $a$ is very small, all values in the range of the common observed value $y^*$ plus and minus 1 are consistent with the data. In general, the conditional distribution of $Y^*$ given $A = a$ is found as follows.

The joint density of $(Y_{(1)}, Y_{(n)})$ is

$$
n(n - 1)(y_{(n)} - y_{(1)})^{n-2}/2^n \tag{4.3}
$$

and the transformation to new variables $(Y^*, A = Y_{(n)} - Y_{(1)})$ has unit Jacobian. Therefore the new variables $(Y^*, A)$ have density $n(n - 1)a^{n-2}/2^n$ defined over the triangular region $(0 \le a \le 2;\ \theta - 1 + a/2 \le y^* \le \theta + 1 - a/2)$ and density zero elsewhere. This implies that the conditional density of $Y^*$ given $A = a$ is uniform over the allowable interval $\theta - 1 + a/2 \le y^* \le \theta + 1 - a/2$.

Conditional confidence interval statements can now be constructed although they add little to the statement just made, in effect that every value of $\theta$ in the relevant interval is in some sense equally consistent with the data. The key point is that an interval statement assessed by its unconditional distribution could be formed that would give the correct marginal frequency of coverage but that would hide the fact that for some samples very precise statements are possible whereas for others only low precision is achievable.
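A small simulation may make the point concrete. The sketch below is an illustration under stated assumptions, not a construction given in the text. It uses the fact that, given the range $a$, the midrange of the sample is uniform over an interval of half-width $1 - a/2$ about $\theta$, so an interval of half-width `level * (1 - a/2)` about the observed midrange has conditional coverage `level`. The function names, the sample size and the cut-off separating "small" from "large" values of $a$ are arbitrary choices.

```python
import random


def conditional_interval(ys, level=0.95):
    """Interval for theta with conditional coverage `level` given the range a."""
    y1, yn = min(ys), max(ys)
    a = yn - y1                 # ancillary statistic: the sample range
    midrange = (y1 + yn) / 2
    # Given A = a, the midrange is uniform on theta +/- (1 - a/2).
    half = level * (1 - a / 2)
    return midrange - half, midrange + half, a


def simulate_coverage(theta=0.0, n=5, reps=20000, level=0.95, seed=1):
    """Estimate coverage separately for samples with small and large range."""
    rng = random.Random(seed)
    hits = {"small a": [0, 0], "large a": [0, 0]}
    for _ in range(reps):
        ys = [rng.uniform(theta - 1, theta + 1) for _ in range(n)]
        lo, hi, a = conditional_interval(ys, level)
        key = "small a" if a < 1.0 else "large a"   # arbitrary cut-off
        hits[key][0] += lo <= theta <= hi
        hits[key][1] += 1
    return {k: c / m for k, (c, m) in hits.items() if m}


if __name__ == "__main__":
    print(conditional_interval([0.1, 1.9]))   # wide range: a short interval
    print(conditional_interval([0.9, 1.1]))   # narrow range: a long interval
    print(simulate_coverage())
```

The estimated coverage should be close to the nominal level in both groups, while the interval length changes sharply with $a$. That variation in achievable precision is what an unconditional assessment would hide.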

Example 4.2. Two measuring instruments. A closely related point is made by the following idealized example. Suppose that a single observation $Y$ is made on a normally distributed random variable of unknown mean $\mu$. There are available two measuring instruments, one with known small variance, say $\sigma_0^2 = 10^{-4}$, and one with known large variance, say $\sigma_1^2 = 10^4$. A randomizing device chooses an instrument with probability 1/2 for each possibility and the full data consist of the observation $y$ and an identifier $d = 0, 1$ to show which variance is involved. The log likelihood is

$$
-\log \sigma_d - (y - \mu)^2/(2\sigma_d^2), \tag{4.4}
$$

so that $(y, d)$ forms the sufficient statistic and $d$ is ancillary, suggesting again that the formation of confidence intervals or the evaluation of a p-value should use the variance belonging to the apparatus actually used. If the sensitive apparatus, $d = 0$, is in fact used, why should the interpretation be affected by the possibility that one might have used some other instrument?

There is a distinction between this and the previous example in that in the former the conditioning arose out of the mathematical structure of the problem, whereas in the present example the ancillary statistic arises from a physical distinction between two measuring devices.
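The contrast can be made numerical with a short sketch. The conditional half-width uses the standard deviation of the instrument actually chosen. For comparison, the sketch also computes one particular unconditional procedure, chosen here purely for illustration and not taken from the text: a single fixed half-width `c`, used whichever instrument was selected, with coverage 0.95 when averaged over the randomization. The function names are hypothetical.

```python
from statistics import NormalDist

# Standard deviations of the two instruments (variances 1e-4 and 1e4).
SIGMA = {0: 1e-2, 1: 1e2}


def conditional_half_width(d, level=0.95):
    """Half-width of the interval for mu using the instrument actually used."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * SIGMA[d]


def unconditional_half_width(level=0.95, tol=1e-9):
    """One fixed half-width c with average coverage `level` over d (bisection)."""
    std = NormalDist()

    def coverage(c):
        # Each instrument is chosen with probability 1/2.
        return sum(0.5 * (2 * std.cdf(c / s) - 1) for s in SIGMA.values())

    lo, hi = 0.0, 10 * max(SIGMA.values())
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if coverage(mid) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print("conditional, d = 0:", conditional_half_width(0))
    print("conditional, d = 1:", conditional_half_width(1))
    print("fixed unconditional:", unconditional_half_width())
```

Under these assumptions the fixed half-width is several thousand times wider than the conditional one for the sensitive instrument, and narrower than the conditional one for the insensitive instrument. Its average coverage is correct, yet it is unreasonable in both cases.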

There is a further important point suggested by this example. The fact that the randomizing probability is assumed known does not seem material. The argument for conditioning is equally compelling if the choice between the two sets of apparatus is made at random with some unknown probability, provided only that the value of the probability, and of course the outcome of the randomization, is unconnected with $\mu$.

More formally, suppose that we have a factorization in which the distribution of $S$ given $A = a$ depends only on $\theta$, whereas the distribution of $A$ depends on an additional parameter $\gamma$ such that the parameter space becomes $\Omega_\theta \times \Omega_\gamma$, so that $\theta$ and $\gamma$ are variation-independent in the sense introduced in Section 1.1. Then $A$ is ancillary in the extended sense for inference about $\theta$. The term S-ancillarity is often used for this idea.
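Written out, the factorization just described takes the following form. The density symbols $f_{S,A}$, $f_{S \mid A}$ and $f_A$, and the symbols $\Omega_\theta$ and $\Omega_\gamma$ for the two parameter spaces, are notation assumed here for illustration.

```latex
f_{S,A}(s, a; \theta, \gamma)
  = f_{S \mid A}(s \mid a; \theta)\, f_A(a; \gamma),
\qquad (\theta, \gamma) \in \Omega_\theta \times \Omega_\gamma .
```

The second factor involves only $\gamma$, and because the parameter space is a product, knowing $\gamma$ places no restriction on $\theta$. Inference about $\theta$ can therefore be based on the first factor alone.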

Example 4.3. Linear model. The previous examples are in a sense a preliminary to the following widely occurring situation. Consider the linear model of Examples 1.4 and 2.2 in which the $n \times 1$ vector $Y$ has expectation $E(Y) = z\beta$, where $z$ is an $n \times q$ matrix of full rank $q < n$ and where the components are independently normally distributed with variance $\sigma^2$. Suppose that, instead of the assumption of the previous discussion that $z$ is a matrix of fixed constants, we suppose that $z$ is the realized value of a random matrix $Z$ with a known probability distribution.

The log likelihood is by (2.17)

$$
-n \log \sigma - \{(y - z\hat{\beta})^T (y - z\hat{\beta}) + (\hat{\beta} - \beta)^T (z^T z)(\hat{\beta} - \beta)\}/(2\sigma^2) \tag{4.5}
$$

plus in general functions of $z$ arising from the known distribution of $Z$. Thus the minimal sufficient statistic includes the residual sum of squares and the least squares estimates as before but also functions of $z$, in particular $z^T z$. Thus conditioning on $z$, or especially on $z^T z$, is indicated. This matrix, which specifies the precision of the least squares estimates, plays the role of the distinction between the two measuring instruments in the preceding example.
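A minimal sketch of this point, under simplifying assumptions that are not in the text: a single explanatory variable with no intercept ($q = 1$), a known error standard deviation `sigma`, and explanatory values drawn from a normal distribution with one of two spreads. All function names are hypothetical. With $q = 1$ the matrix $z^T z$ is the scalar $\sum z_i^2$, and the conditional standard error of the least squares estimate is $\sigma / \sqrt{z^T z}$.

```python
import random
from math import sqrt


def least_squares_through_origin(z, y):
    """Least squares estimate of beta and the scalar z'z for the model y = z * beta + error."""
    ztz = sum(zi * zi for zi in z)
    beta_hat = sum(zi * yi for zi, yi in zip(z, y)) / ztz
    return beta_hat, ztz


def conditional_se(ztz, sigma):
    """Standard error of beta_hat given the realized z, for known sigma."""
    return sigma / sqrt(ztz)


def demo(beta=2.0, sigma=1.0, n=10, seed=0):
    """Compare the precision achieved by tightly and widely spread realized z."""
    rng = random.Random(seed)
    out = []
    for spread in (0.1, 10.0):
        z = [rng.gauss(0.0, spread) for _ in range(n)]
        y = [beta * zi + rng.gauss(0.0, sigma) for zi in z]
        beta_hat, ztz = least_squares_through_origin(z, y)
        out.append((spread, beta_hat, conditional_se(ztz, sigma)))
    return out


if __name__ == "__main__":
    for spread, beta_hat, se in demo():
        print(f"spread={spread}: beta_hat={beta_hat:.3f}, conditional se={se:.4f}")
```

The realized $z^T z$ acts like the instrument identifier $d$ of Example 4.2. A tightly clustered $z$ gives an imprecise estimate and a widely spread $z$ a precise one, and the standard error quoted should be the one for the $z$ actually observed.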

As noted in the previous discussion and using the extended definition of ancillarity, the argument for conditioning is unaltered if $Z$ has a probability distribution $f_Z(z; \gamma)$, where $\gamma$ and the parameter of interest are variation-independent.

In many experimental studies the explanatory variables would be chosen by the investigator systematically and treating $z$ as a fixed matrix is totally appropriate. In observational studies in which study individuals are chosen in a random way all variables are random and so modelling $z$ as a random variable might seem natural. The discussion of extended ancillarity shows when that is unnecessary and the standard practice of treating the explanatory variables as fixed is then justified. This does not mean that the distribution of the explanatory variables is totally uninformative about what is taken as the primary focus of concern, namely the dependence of $Y$ on $z$. In addition to specifying via the matrix $(z^T z)^{-1}$ the precision of the least squares estimates, comparison of the distribution of the components of $z$ with their distribution in some target population may provide evidence of the reliability of the sampling procedure used and of the security of any extrapolation of the conclusions. In a comparable Bayesian analysis the corresponding assumption would be the stronger one that the prior densities of $\theta$ and $\gamma$ are independent.
