is the goal, a hyperprior that places too much mass on large

result in intervals that are a bit too long. A lighter-tailed hyperprior for such as an inverse gamma, or the bias correction method for EB confidence intervals described in Section 3.5, would likely produce better results.

(4.17)

= 0 versus > 0. Standard the n that achieves Type I error α satisfies

values will

where

there is no fixed value of n that guarantees

Example 4.1 Consider comparison of two Poisson distributions, one with rate

For all α and

Formulas (4.17) and (4.19) produce a sample size that ensures expected power is high. Alternatively, one could find a sample size that, for example, makes P[power >

Shih (1995) applies a generalization of this approach to clinical trial design when the endpoint is the time to an event. Prior distributions can be placed on the treatment effect, event rates, and other inputs.

Solving for

approach. However, if G is Since the required n depends on

based on either a typical or a conservative value of

can produce too small a sample size by underrepresenting uncertainty in while the latter can be

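Now consider the Bayesian design for this frequentist analysis. Given a prior for the treatment effect, Spiegelhalter (2000) suggests a simple Monte Carlo approach wherein we draw values of the treatment effect from this prior and subsequently apply formula (4.17) to each outcome. This produces a collection of sample sizes, the distribution of which reflects the uncertainty in the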

optimal sample size resulting from prior uncertainty. Adding a prior for as well creates no further complication.
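As a rough illustration of this Monte Carlo design idea, the sketch below draws treatment effects from a prior and applies a sample size formula to each draw. Formula (4.17) did not survive in the text above, so the code assumes the standard one-sided normal-theory formula n = [sigma (z_alpha + z_beta) / effect]^2, which may differ in detail from the book's version. The lognormal prior, its parameters, and the function names `required_n` and `sample_size_distribution` are hypothetical choices made only for this example.

```python
import math
import random
from statistics import NormalDist


def required_n(effect, sigma=1.0, alpha=0.05, beta=0.20):
    """Sample size for a one-sided test of a normal mean (ASSUMED formula).

    n = (sigma * (z_alpha + z_beta) / effect)^2, rounded up, where z_alpha is
    the upper alpha-point of the standard normal distribution.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha)
    z_beta = z.inv_cdf(1.0 - beta)
    return math.ceil((sigma * (z_alpha + z_beta) / effect) ** 2)


def sample_size_distribution(draw_effect, n_draws=5000, seed=0, **kwargs):
    """Draw effects from the prior and apply required_n to each draw."""
    rng = random.Random(seed)
    return sorted(required_n(draw_effect(rng), **kwargs) for _ in range(n_draws))


def lognormal_effect(rng):
    """Hypothetical prior: lognormal with median effect 0.5 (an assumption)."""
    return rng.lognormvariate(math.log(0.5), 0.25)


sizes = sample_size_distribution(lognormal_effect)
median_n = sizes[len(sizes) // 2]
upper_n = sizes[int(0.9 * len(sizes))]
print("n at the prior median effect:", required_n(0.5))
print("median and 90th percentile of n over the prior:", median_n, upper_n)
```

The spread between the median and an upper percentile of `sizes` is one way to see how much prior uncertainty about the effect translates into uncertainty about the required sample size.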

A slightly more formal approach would be to set the power under the chosen prior g equal to the desired level, and solve for

where

of this distribution. The standard frequentist design produces

For this conditional model, given respectively,

and

for

Irrespective of whether one is a Bayesian or frequentist, inference can be based on the conditional distribution of

is ancillary for (and Assume interest lies in the parameter and the other with rate

or a monotone transformation thereof. With data and if

satisfies

in (4.19) requires interval-halving or some other recursive the distribution of which reflects the uncertainty in the

i.e.,

for the treatment effect, Spiegelhalter (2000) suggests a simple j = 1, ... and sub- a frequentist design will typically be However, the former is the standard normal cdf and, as usual, is the upper -point

(4.18)

(4.19)

(4.20)

= 0. then

is the sample size and

will be large.

However. Bayesian design for Bayesian analysis puts priors on both and finds a sample size that controls preposterior performance. Bayesian design for frequentist analysis puts a prior on

which produces acceptable frequentist performance in the conditional binomial distribution with parameters

or otherwise using the distribution of

and Louis (1985) and Louis and Bailey (1990) apply these ideas to reducing problems of multiple comparisons in the analysis of rodent tumorigenicity experiments.

If sequential stopping is a design option, one can stop at such that

prior on

4.5.2 Bayesian design for Bayesian analysis

Having decided that Bayesian methods are appropriate for use at the design stage, it becomes natural to consider them for the analysis stage as well.

But in this case, we need to keep this new analysis plan in mind as we design the experiment. Consider for instance estimation under SEL for the model

where

statistic

with posterior and preposterior risk by

As expected, increasing C decreases n, with the opposite relation for the usual frequentist design.

For many models, the calculations can often be accomplished in closed form, as the following two examples illustrate.

Example 4.2 Consider first the model of Example 2.2, where both G and f are normal with variances

where

neither on the prior mean nor the observed data and (4.21) produces

where this last expectation is taken with respect to the marginal distribution of y. We begin with the following simple design goal: find the smallest sample size n such that

averaged over the distribution of and finds a sample size n

and

to pick a sample size. Carlin

the first n gives acceptable frequentist or Bayesian performance. The allows computation of the distribution of

has prior distribution G, and f is an exponential family with The Bayes estimate is then

and respectively. Then This risk depends

Thus

As (4.21)

Example 4.3 Now consider the Poisson/gamma model of Example 2.5, where we assume that G is gamma with mean

Then

and

Plugging this into (4.21), we obtain

For this model, the optimal Bayesian design depends on both the prior mean and variance. As

which is the frequentist design for

Using preposterior Bayes risk as the design criterion produces a sufficient sample size for controlling the double integral over the sample and parameter spaces. Integrating first over the sample space represents the preposterior risk as the frequentist risk averaged over the parameter space. The Bayesian formalism brings in uncertainty, so that generally the sample size will be larger than that obtained by a frequentist design based on a "typical" (i.e., expected) parameter value, but perhaps smaller than that based on a "conservative" value.
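The closed-form designs of Examples 4.2 and 4.3 can be sketched in a few lines. The design inequality (4.21) is not legible above, so this sketch assumes the goal is the smallest n whose preposterior risk under SEL (the expected posterior variance) is at most C. For the normal/normal model with sampling variance sigma2 and prior variance tau2, the posterior variance is 1 / (1/tau2 + n/sigma2). For the Poisson/gamma model with prior mean mu and prior variance tau2, the expected posterior variance works out to mu / (mu/tau2 + n). The function and argument names are hypothetical.

```python
import math


def n_normal_normal(sigma2, tau2, C):
    """Smallest n with posterior variance 1 / (1/tau2 + n/sigma2) <= C.

    Assumes a normal prior with variance tau2 and normal data with known
    sampling variance sigma2. The result does not depend on the prior mean.
    """
    return max(0, math.ceil(sigma2 * (1.0 / C - 1.0 / tau2)))


def n_poisson_gamma(mu, tau2, C):
    """Smallest n with expected posterior variance mu / (mu/tau2 + n) <= C.

    Assumes a gamma prior with mean mu and variance tau2 on the Poisson rate.
    The result depends on both the prior mean and the prior variance.
    """
    return max(0, math.ceil(mu / C - mu / tau2))


# A vague prior (tau2 -> infinity) gives sigma2 / C and mu / C respectively.
for tau2 in (1.0, 4.0, math.inf):
    print(tau2, n_normal_normal(1.0, tau2, 0.3), n_poisson_gamma(2.0, tau2, 0.3))
```

Under these assumptions an informative prior lowers the required n, and letting the prior variance grow without bound returns the design one would use with no prior information.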

A fully Bayesian decision-theoretic approach to the design problem invokes the notions of expected utility and the cost of sampling. While many of the associated theoretical considerations are beyond the scope of our book, the basic ideas are not difficult to explain or understand; our discussion here follows that in Müller and Parmigiani (1995). Suppose

U(n, y, θ) is the payoff from an experiment having sample size n, data y, and parameter θ, while C(n, y, θ) is the cost associated with this experiment. A reasonable choice for U(n, y, θ) might be the negative of the loss function; i.e., under SEL we would have U(n, y, θ) equal to the negative of the squared estimation error. Cost might be measured simply on a per sample basis, i.e., C(n, y, θ) = cn for some c > 0. Then a plausible way of choosing the best sample size n would be to maximize the expected utility associated with this experiment,

U(n) = ∫∫ [U(n, y, θ) − C(n, y, θ)] f(y | θ) g(θ) dy dθ,

where, as usual, f(y | θ) is the likelihood and g(θ) is the chosen prior. That is, we are choosing the design which maximizes the net payoff we expect using this sample size n, where as before we take the preposterior expectation over both the parameter θ and the (as yet unobserved) data y. Provided the per sample cost c > 0, U(n) will typically be concave down, with

and

the prior variance increases to infinity and

S„.) averaged over the

and variance

Example 4.4 Müller and Parmigiani (1995) consider optimal selection of the sample size for a binomial experiment. Let

likelihood

(a noninformative choice). If we adopt absolute error loss (AEL), an appropriate choice of payoff function is

where

pendix B, problem 3. Adopting a sampling cost of .0008 per observation, Repeating this calculation a grid of n values {

obtain a pointwise Monte Carlo estimate of the true curve. The optimal sample size n is the one for

When exact pointwise evaluation of

ing due to the size of I or the dimension of (y, θ), Müller and Parmigiani (1995) recommend simply drawing a single

distribution for a randomly selected design

metric (e.g., nonlinear regression) or nonparametric (e.g., loess smoothing) methods, we may then fit a smooth curve

and finally take

(4.22) we are essentially using a Monte Carlo sample size of just N = 1.

However, results are obtained much faster, due to the dramatic reduction in function evaluations, and may even be more accurate in higher dimensions, where the "brute force" Monte Carlo approach (4.23) may not produce a

curve smooth enough for a deterministic optimization subroutine.

expected utility at first increasing in n but eventually declining as the cost of continued sampling becomes prohibitive. Of course, since it may be difficult to assess c in practice, one might instead simply set c = 0, meaning that

first sample size that delivers a given utility. to the constrained minimization in condition (4.21) above

Another issue is that one or both of the integrals in (4.22) may be intractable and hence require numerical integration methods. Fortunately, Monte Carlo draws are readily available by simple composition: similar to the process outlined in Figure 3.3, we repeatedly draw

followed by

j= 1,... The associated Monte Carlo estimate of

is the posterior median, the Bayes rule under AEL; see Ap- and be the chosen prior - say, a Unif(0, 1) by the usual binomial Notice that by ignoring the integrals in to the resulting point cloud, versus Using either traditional para- and plotting the integrands producing a joint preposterior sample from

from would increase without bound. could then select n as the

(4.23)


4.6 Exercises

1. In the bivariate Gaussian example of Subsection 4.1.2, show that if I, then B has the form (4.2).

2. In the beta/binomial point estimation setting of Subsection 4.2.2, where

posterior. Again,

Müller and Parmigiani (1995) actually note that in this case the θ integral can be done in closed form after interchanging the order of the integral and the sum in (4.24), so Monte Carlo methods are not actually needed at all.

Alternatively, the quick estimate using N = 1 and smoothing the resulting point cloud performs quite well in this example.
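The Monte Carlo calculation for the binomial design of Example 4.4 can be sketched as follows. The sketch assumes what the text states: a Unif(0, 1) prior, a binomial likelihood, payoff equal to minus the absolute error of the posterior median, and a sampling cost of .0008 per observation. The function names, the coarse grid (steps of 20 from 0 to 120), the number of draws, and the bisection used to find the posterior median are assumptions of this example and are not taken from Müller and Parmigiani (1995).

```python
import math
import random


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b (binomial tail identity)."""
    m = a + b - 1
    return sum(math.comb(m, k) * x**k * (1.0 - x) ** (m - k) for k in range(a, m + 1))


def posterior_median(y, n, tol=1e-8):
    """Median of the Beta(y + 1, n - y + 1) posterior, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf_int(mid, y + 1, n - y + 1) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def expected_utility(n, n_draws, rng, cost=0.0008):
    """Monte Carlo estimate of expected utility at sample size n, by composition."""
    medians = [posterior_median(y, n) for y in range(n + 1)]
    total = 0.0
    for _ in range(n_draws):
        theta = rng.random()                              # theta ~ Unif(0, 1)
        y = sum(rng.random() < theta for _ in range(n))   # y | theta ~ Bin(n, theta)
        total -= abs(theta - medians[y])                  # payoff under AEL
    return total / n_draws - cost * n


rng = random.Random(1)
grid = range(0, 121, 20)
curve = {n: expected_utility(n, 2000, rng) for n in grid}
best_n = max(curve, key=curve.get)
print("estimated expected utility by n:", curve)
print("best n on this coarse grid:", best_n)
```

A finer grid around `best_n`, or more draws per grid point, would sharpen the estimate of the optimum. With only one draw per design point the same code yields the point cloud that the quick N = 1 approach would then smooth.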

In summary, our intent in this chapter has been to show that procedures derived using Bayesian machinery and vague priors typically enjoy good frequentist and EB (preposterior) risk properties as well. The Bayesian approach also allows the proper propagation of uncertainty throughout the various stages of the model, and provides a convenient roadmap for obtaining estimates in complicated models. By contrast, frequentist approaches tend to rely on clever "tricks," such as statistics and pivotal quantities, that are often precluded in complex or high-dimensional models. All of this suggests the use of Bayesian methodology as an engine for producing procedures that will work well in a variety of situations.

However, there is an obvious problem with this "Bayes as procedure engine" school of thought: our ability (or lack thereof) to obtain posterior distributions in closed form. Fortunately, the recent widespread availability of high-speed computers combined with the simultaneous rapid methodological development in integration via approximate and Monte Carlo methods has provided an answer to this question. These modern computational approaches, along with guidance on their proper use and several illustrative examples, are the subject of the next chapter.

(4.22) then becomes

(4.25) can be derived numerically from the

I,j = I.... N, and obtain

Here the y sum is over a finite range, so Monte Carlo integration is needed only for

draw

(4.24)

We thus set up a grid of n values from 0 to 120, and for each,

(a) For

has smaller risk than the MLE.

(b) Find the Bayes rule when the loss function is

3. In the beta/binomial interval estimation setting of Subsection 4.3.1, (a) Under what condition on the prior will the HPD interval for

one-sided when X = n? Find this

(b) Outline a specific algorithm to find the HPD interval a = b = 1, n = 5, and x = 1. What percentage reduction in width does the HPD interval offer over the equal tail interval in this case?

4. Repeat the analysis plan presented for the beta/binomial in Subsection 4.3.1 for the Gaussian/Gaussian model.

(a) Show that without loss of generality you can take the prior mean

= 0) and the sampling

(b) Show that the highest posterior probability interval is also equal-tailed.

(c) Evaluate the frequentist coverage for combinations of the prior variance

(d) Also, evaluate the Bayesian posterior coverage and preposterior coverage of the interval based on

equal to 0.

(Hint: All probabilities can be represented as Gaussian integrals, and the interval endpoints come directly from the Gaussian cumulative distribution function.)

5. Show that if the sampling variance is unknown and has an inverse gamma prior distribution, then in the limit as information in this inverse gamma

to 0 and as

6. Actually carry out the analysis in Example 4.4, where the prior (4.22) is

(a) the Unif(0, 1) prior used in the example, (b) a mixture of two Beta priors,

What is the optimal sample size n in each case? Is either prior amenable to a fully analytical solution (i.e., without resort to Monte Carlo sampling)?

7. Using equations (4.18) and (4.20), verify that hold for all G? (Hint: see equation (4.19)).

be

Use your algorithm to find the HPD interval when

=1).

and the parameter of interest

= 0, when the true prior mean is not

Bayesian intervals are Student's t-intervals.

in

= 1;

Will this relation

= .5 and general n and M, find the region where the Bayes rule
