$(y_{(1)}, y_{(2)}, a^*)$ is $(y_{(2)} - y_{(1)})^{n-2}$, so that all required distributions are directly derivable from the likelihood function.
A different and in some ways more widely useful role of transformation models is to suggest reduction of a sufficient statistic to focus on a particular parameter of interest.
Example 4.11. Normal mean, variance unknown (ctd). To test the null hypothesis $\mu = \mu_0$ when the variance $\sigma^2$ is unknown using the statistics $(\bar{Y}, s^2)$ we need a function that is unaltered by the transformations of origin and units of measurement encapsulated in
$$(\bar{Y}, s^2) \ \text{to} \ (a + b\bar{Y}, b^2 s^2) \tag{4.27}$$
and
$$\mu \ \text{to} \ a + b\mu. \tag{4.28}$$
This is necessarily a function of the standard Student t statistic.
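The invariance can be checked numerically. The following is a minimal sketch, assuming a small made-up sample and a positive scale factor b (a negative b would flip the sign of the statistic); the function name `t_statistic` and the data values are illustrative only.

```python
import math
import statistics


def t_statistic(y, mu0):
    """Standard Student t statistic for testing mu = mu0."""
    n = len(y)
    ybar = statistics.fmean(y)
    s2 = statistics.variance(y)  # sample variance with divisor n - 1
    return (ybar - mu0) * math.sqrt(n) / math.sqrt(s2)


# Illustrative data and null value (assumptions, not from the text).
y = [4.1, 5.3, 3.8, 6.0, 5.2]
mu0 = 4.0

# Change of origin a and of units b > 0, applied to data and to mu0 alike.
a, b = 2.5, 3.0
y_new = [a + b * v for v in y]

t_old = t_statistic(y, mu0)
t_new = t_statistic(y_new, a + b * mu0)
print(t_old, t_new)  # the two values agree up to rounding error
```

Under the transformation the numerator and the denominator are both multiplied by b, so the ratio is unchanged.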
4.5 Some further Bayesian examples
In principle the prior density in a Bayesian analysis is an insertion of additional information and the form of the prior should be dictated by the nature of that evidence. It is useful, at least for theoretical discussion, to look at priors which lead to mathematically tractable answers. One such form, useful usually only for nuisance parameters, is to take a distribution with finite support, in particular a two-point prior. This has in one dimension three adjustable parameters, the position of two points and a probability, and for some limited purposes this may be adequate. Because the posterior distribution remains concentrated on the same two points computational aspects are much simplified.
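As an illustration of why a two-point prior keeps the computation simple, here is a minimal sketch assuming independent normal observations with known standard deviation and a mean restricted to two candidate values. The function name, the choice of model and the numerical values are assumptions made for the example.

```python
import math


def two_point_posterior(y, theta1, theta2, p1, sigma=1.0):
    """Posterior probability of theta1 under a two-point prior.

    The prior puts probability p1 on theta1 and 1 - p1 on theta2;
    the observations are assumed normal with known standard deviation sigma.
    """
    def log_lik(theta):
        return -sum((v - theta) ** 2 for v in y) / (2 * sigma ** 2)

    l1 = math.log(p1) + log_lik(theta1)
    l2 = math.log(1 - p1) + log_lik(theta2)
    top = max(l1, l2)  # subtract the maximum for numerical stability
    w1, w2 = math.exp(l1 - top), math.exp(l2 - top)
    return w1 / (w1 + w2)


# Data midway between the two support points leave equal prior odds unchanged.
post_sym = two_point_posterior([0.5], theta1=0.0, theta2=1.0, p1=0.5)
# Data close to theta1 shift the posterior probability towards theta1.
post_near = two_point_posterior([0.1, -0.2, 0.3], theta1=0.0, theta2=1.0, p1=0.5)
print(post_sym, post_near)
```

The posterior is fully described by a single number, the updated probability of one of the two points.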
We shall not develop that idea further and turn instead to other examples of parametric conjugate priors which exploit the consequences of exponential family structure as exemplified in Section 2.4.
Example 4.12. Normal variance. Suppose that $Y_1, \ldots, Y_n$ are independently normally distributed with known mean, taken without loss of generality to be zero, and unknown variance $\sigma^2$. The likelihood is, except for a constant factor,
$$\frac{1}{\sigma^n} \exp\Bigl\{-\sum y_k^2/(2\sigma^2)\Bigr\}. \tag{4.29}$$
The canonical parameter is $\phi = 1/\sigma^2$ and simplicity of structure suggests taking $\phi$ to have a prior gamma density which it is convenient to write in the form
$$\pi(\phi; g, n_\pi) = g(g\phi)^{n_\pi/2 - 1} e^{-g\phi} / \Gamma(n_\pi/2), \tag{4.30}$$
defined by two quantities assumed known. One is $n_\pi$, which plays the role of an effective sample size attached to the prior density, by analogy with the form of the chi-squared density with $n$ degrees of freedom. The second defining quantity is $g$. Transformed into a distribution for $\sigma^2$ it is often called the inverse gamma distribution. Also $E_\pi(\phi) = n_\pi/(2g)$.
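On multiplying the likelihood by the prior density, the posterior density of $\phi$ is proportional to
$$\phi^{(n + n_\pi)/2 - 1} \exp\Bigl[-\phi\Bigl\{\sum y_k^2 + n_\pi/E_\pi(\phi)\Bigr\}/2\Bigr]. \tag{4.31}$$
The posterior distribution is in effect found by treating
$$\phi\Bigl\{\sum y_k^2 + n_\pi/E_\pi(\phi)\Bigr\} \tag{4.32}$$
as having a chi-squared distribution with $n + n_\pi$ degrees of freedom.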
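The conjugate update can be sketched in a few lines. This is a minimal illustration, assuming made-up data and prior settings: the posterior for the canonical parameter is taken as a gamma distribution with shape (n + nπ)/2 and rate Σy²/2 + g, and a Monte Carlo check confirms that the quantity in (4.32) has mean close to n + nπ, as a chi-squared variable with that many degrees of freedom should. The names `posterior_gamma`, `scale_factor` and `mc_mean` are introduced here for illustration.

```python
import random


def posterior_gamma(y, g, n_pi):
    """Shape and rate of the gamma posterior for phi = 1/sigma^2."""
    n = len(y)
    sum_sq = sum(v * v for v in y)
    shape = (n + n_pi) / 2
    rate = sum_sq / 2 + g
    return shape, rate


# Illustrative data (mean known to be zero) and prior settings (assumptions).
y = [1.2, -0.7, 0.3, 2.1, -1.5, 0.4]
g, n_pi = 2.0, 4

shape, rate = posterior_gamma(y, g, n_pi)
prior_mean = n_pi / (2 * g)  # E_pi(phi)

# The multiplier of phi in (4.32): sum of squares plus n_pi / E_pi(phi).
scale_factor = sum(v * v for v in y) + n_pi / prior_mean

# random.gammavariate takes (shape, scale), so the scale is 1 / rate.
rng = random.Random(0)
draws = [rng.gammavariate(shape, 1 / rate) * scale_factor for _ in range(20000)]
mc_mean = sum(draws) / len(draws)
print(shape, rate, mc_mean)  # mc_mean should be near n + n_pi = 10
```

Because nπ/Eπ(φ) equals 2g, the multiplier is exactly twice the posterior rate, which is what turns the gamma posterior into a chi-squared statement.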
Formally, frequentist inference is based on the pivot $\sum Y_k^2/\sigma^2$, the pivotal distribution being the chi-squared distribution with $n$ degrees of freedom. There is formal, although of course not conceptual, equivalence between the two methods when $n_\pi = 0$. This arises from the improper prior $d\phi/\phi$, equivalent to $d\sigma/\sigma$ or to a uniform improper prior for $\log \sigma$. That is, while there is never in this setting exact agreement between Bayesian and frequentist solutions, the latter can be approached as a limit as $n_\pi \to 0$.
Example 4.13. Normal mean, variance unknown (ctd). In Section 1.5 and Example 4.12 we have given simple Bayesian posterior distributions for the mean of a normal distribution with variance known and for the variance when the mean is known.
Suppose now that both parameters are unknown and that the mean is the parameter of interest. The likelihood is proportional to
$$\frac{1}{\sigma^n} \exp\Bigl[-\Bigl\{n(\bar{y} - \mu)^2 + \sum (y_k - \bar{y})^2\Bigr\}/(2\sigma^2)\Bigr]. \tag{4.33}$$
In many ways the most natural approach may seem to be to take the prior distributions for mean and variance to correspond to independence with the forms used in the single-parameter analyses. That is, an inverse gamma distribution for $\sigma^2$ is combined with a normal distribution of mean $m$ and variance $v$ for $\mu$.
We do not give the details.
A second possibility is to modify the above assessment by replacing $v$ by $b\sigma^2$, where $b$ is a known constant. The mathematical simplification that results stems from $\mu/\sigma^2$ rather than $\mu$ itself being a component of the canonical parameter. A statistical justification might sometimes be that $\sigma$ establishes the natural scale for measuring random variability and that the amount of prior
information is best assessed relative to that. Prior independence of mean and variance has been abandoned.
The product of the likelihood and the prior is then proportional to
$$\phi^{n/2 + n_\pi/2 - 1} e^{-w\phi}, \tag{4.34}$$
where
$$w = \mu^2\{n/2 + 1/(2b)\} - \mu(n\bar{y} + m/b) + g + n\bar{y}^2/2 + \sum (y_k - \bar{y})^2/2 + m^2/(2b). \tag{4.35}$$
To obtain the marginal posterior distribution of $\mu$ we integrate with respect to $\phi$ and then normalize the resulting function of $\mu$. The result is proportional to $w^{-n/2 - n_\pi/2}$. Now note that the standard Student t distribution with $d$ degrees of freedom has density proportional to $(1 + t^2/d)^{-(d+1)/2}$. Comparison with $w$ shows without detailed calculation that the posterior density of $\mu$ involves the Student t distribution with degrees of freedom $n + n_\pi - 1$. More detailed calculation shows that $(\mu - \tilde{\mu}_\pi)/\tilde{s}_\pi$ has a posterior t distribution, where
$$\tilde{\mu}_\pi = (n\bar{y} + m/b)/(n + 1/b) \tag{4.36}$$
is the usual weighted mean and $\tilde{s}_\pi$ is a composite estimate of precision obtained from $\sum (y_k - \bar{y})^2$, the prior estimate of variance and the contrast $(m - \bar{y})^2$.
Again the frequentist solution based on the pivot
$$\frac{(\mu - \bar{Y})\sqrt{n}}{\bigl\{\sum (Y_k - \bar{Y})^2/(n - 1)\bigr\}^{1/2}}, \tag{4.37}$$
the standard Student t statistic with degrees of freedom $n - 1$, is recovered but only as a limiting case.
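The structure of w in (4.35) can be explored numerically. The sketch below, using made-up data and prior constants, evaluates w as a function of the mean, confirms that it is a quadratic centred on the weighted mean (4.36), and derives a scale by completing the square and matching the stated posterior kernel to a Student t density with n + nπ − 1 degrees of freedom. The scale formula in `posterior_summary` is this sketch's own derivation from the expressions as stated, not a formula quoted from the text, and all function names are illustrative.

```python
import math


def w(mu, y, m, b, g):
    """The quantity w of (4.35) as a function of mu."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)
    return (mu ** 2 * (n / 2 + 1 / (2 * b)) - mu * (n * ybar + m / b)
            + g + n * ybar ** 2 / 2 + ss / 2 + m ** 2 / (2 * b))


def posterior_summary(y, m, b, g, n_pi):
    """Weighted mean (4.36), an implied scale, and the degrees of freedom."""
    n = len(y)
    ybar = sum(y) / n
    mu_tilde = (n * ybar + m / b) / (n + 1 / b)
    quad = (n + 1 / b) / 2              # coefficient of (mu - mu_tilde)^2 in w
    w_min = w(mu_tilde, y, m, b, g)     # value of w at its minimum
    d = n + n_pi - 1
    # Matching w^{-(d+1)/2} to (1 + t^2/d)^{-(d+1)/2} gives this scale.
    s_tilde = math.sqrt(w_min / (quad * d))
    return mu_tilde, s_tilde, d


# Illustrative data and prior constants (assumptions, not from the text).
y = [4.1, 5.3, 3.8, 6.0, 5.2]
m, b, g, n_pi = 4.0, 2.0, 1.5, 3

mu_tilde, s_tilde, d = posterior_summary(y, m, b, g, n_pi)
h = 0.3
left, right = w(mu_tilde - h, y, m, b, g), w(mu_tilde + h, y, m, b, g)
print(mu_tilde, s_tilde, d)
```

The symmetry of w about the weighted mean is what makes the marginal posterior a shifted and scaled t distribution.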
Example 4.14. Components of variance. We now consider an illustration of a wide class of problems introduced as Example 1.9, in which the structure of unexplained variation in the data is built up from more than one component random variable. We suppose that the observed random variable $Y_{ks}$ for $k = 1, \ldots, m$; $s = 1, \ldots, t$ has the form
$$Y_{ks} = \mu + \eta_k + \epsilon_{ks}, \tag{4.38}$$
where $\mu$ is an unknown mean and the $\eta_k$ and the $\epsilon_{ks}$ are mutually independent normally distributed random variables with zero mean and variances, respectively, $\tau_\eta$ and $\tau_\epsilon$, called components of variance.
Suppose that interest focuses on the components of variance. To simplify some details suppose that $\mu$ is known. In the balanced case studied here, and only then, there is a reduction to a (2, 2) exponential family with canonical parameters $(1/\tau_\epsilon, 1/(\tau_\epsilon + t\tau_\eta))$ so that a simple analytical form would emerge from giving these two parameters independent gamma prior distributions.
While this gives an elegant result, the dependence on the subgroup size $t$ is for most purposes likely to be unreasonable. The restriction to balanced problems, i.e., those in which the subgroup size $t$ is the same for all $k$, is also limiting, at least so far as exact analytic results are concerned.
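The reduction in the balanced case can be illustrated by simulation. The following is a minimal sketch, assuming made-up parameter values and a known mean: it generates data from the two-component model and computes the within-group and between-group sums of squares which, in the log likelihood, multiply the two canonical parameters (up to a factor of −1/2). The function names and all numerical settings are assumptions for the example.

```python
import random


def simulate(m, t, mu, tau_eta, tau_eps, seed=0):
    """Balanced data: m groups of size t with a shared random group effect."""
    rng = random.Random(seed)
    data = []
    for _ in range(m):
        eta = rng.gauss(0.0, tau_eta ** 0.5)  # group-level component
        data.append([mu + eta + rng.gauss(0.0, tau_eps ** 0.5) for _ in range(t)])
    return data


def sufficient_statistics(data, mu):
    """Within-group and between-group sums of squares when mu is known."""
    t = len(data[0])
    ss_within = 0.0
    ss_between = 0.0
    for group in data:
        gbar = sum(group) / t
        ss_within += sum((v - gbar) ** 2 for v in group)
        ss_between += t * (gbar - mu) ** 2
    return ss_within, ss_between


mu = 10.0
data = simulate(m=8, t=5, mu=mu, tau_eta=4.0, tau_eps=1.0)
ss_within, ss_between = sufficient_statistics(data, mu)

# The two statistics decompose the total sum of squares about the known mean.
total = sum((v - mu) ** 2 for group in data for v in group)
print(ss_within, ss_between, total)
```

Only with equal group sizes does the between-group term involve the single combination of the two variances with a common t, which is why the reduction fails for unbalanced data.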
A second possibility is to assign the separate variance components independent inverse gamma priors and a third is to give one variance component, say $\tau_\epsilon$, an inverse gamma prior and to give the ratio $\tau_\eta/\tau_\epsilon$ an independent prior distribution. There is no difficulty in principle in evaluating the posterior distribution of the unknown parameters and any function of them of interest.
An important more practical point is that the relative precision for estimating the upper variance component τη is less, and often much less, than that for estimating a variance from a group of m observations and having m−1 degrees of freedom in the standard terminology. Thus the inference will be more sensitive to the form of the prior density than is typically the case in simpler examples.
The same remark applies even more strongly to complex multilevel models; it is doubtful whether such models should be used for a source of variation for which the effective replication is less than 10–20.
Notes 4
Section 4.2. For a general discussion of nonpersonalistic Bayesian theory and