

...

step 3: Decide on a one-tail or two-tail test. If the hypothesis being tested is that the average has or has not increased or decreased, choose a one-tail test. If the hypothesis being tested is that the average has or has not changed, choose a two-tail test.

step 4: Use Table 11.2 or the standard normal table to determine the z-value corresponding to the confidence level and number of tails.

step 5: Calculate the actual standard normal variable, $z'$.

$$z' = \left| \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \right| \tag{11.88}$$

step 6: If $z' \ge z$, the average can be assumed (with confidence level C) to have come from a different distribution.

Example 11.16

When it is operating properly, a cement plant has a daily production rate that is normally distributed with a mean of 880 tons/day and a standard deviation of 21 tons/day. During an analysis period, the output is measured on 50 consecutive days, and the mean output is found to be 871 tons/day. With a 95% confidence level, determine whether the plant is operating properly.

Solution

step 1: Given.

step 2: C = 95% is given.

step 3: Since a specific direction in the variation is not given (i.e., the example does not ask whether the average has decreased), use a two-tail hypothesis test.

step 4: The population mean and standard deviation are known. The standard normal distribution may be used. From Table 11.2, z = 1.96.

step 5: From Eq. 11.88,

$$z' = \left| \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \right| = \left| \frac{871 - 880}{21 / \sqrt{50}} \right| = 3.03$$

Since 3.03 > 1.96, the distributions are not the same.

There is at least a 95% probability that the plant is not operating correctly.
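The six steps can be checked numerically. The following is a minimal Python sketch, assuming a known population standard deviation; the function name `z_test` and its parameters are illustrative and do not come from the text. The critical value is computed with the standard library's `NormalDist` instead of being read from Table 11.2.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_std, n, confidence=0.95, tails=2):
    """Return (z_actual, z_critical, differs) for a known-sigma test of a sample mean.

    Hypothetical helper for illustration; the names are not from the text.
    """
    # Step 4: critical z for the chosen confidence level and number of tails.
    alpha = 1.0 - confidence
    z_critical = NormalDist().inv_cdf(1.0 - alpha / tails)
    # Step 5: actual standard normal variable (Eq. 11.88), taken as a magnitude.
    z_actual = abs(sample_mean - pop_mean) / (pop_std / sqrt(n))
    # Step 6: the sample is judged to come from a different distribution
    # when the actual value reaches the critical value.
    return z_actual, z_critical, z_actual >= z_critical


# Data from Example 11.16: mean 880, standard deviation 21, n = 50, sample mean 871.
z_actual, z_critical, differs = z_test(871, 880, 21, 50)
print(f"z' = {z_actual:.2f}, z = {z_critical:.2f}, different distribution: {differs}")
```

Passing `tails=1` gives the one-tail critical value for the same confidence level.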

32. APPLICATION: STATISTICAL PROCESS CONTROL

All manufacturing processes contain variation due to random and nonrandom causes. Random variation cannot be eliminated. Statistical process control (SPC) is the act of monitoring and adjusting the performance of a process to detect and eliminate nonrandom variation.

Statistical process control is based on taking regular (hourly, daily, etc.) samples of n items and calculating the mean, $\bar{x}$, and range, R, of the sample. To simplify the calculations, the range is used as a measure of the dispersion. These two parameters are graphed on their respective x-bar and R-control charts, as shown in Fig. 11.7.[7] Confidence limits are drawn at $\pm 3\sigma/\sqrt{n}$. From a statistical standpoint, the control chart tests a hypothesis each time a point is plotted. When a point falls outside these limits, there is a 99.73% probability that the process is out of control. Until a point exceeds the control limits, no action is taken.[8]
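As a rough illustration of the x-bar chart check, the Python sketch below computes limits at three standard errors either side of the process mean and flags sample means that fall outside them. It assumes the process mean and standard deviation are already known, whereas the text estimates dispersion from the range. The data and the function names `xbar_limits` and `summarize` are hypothetical.

```python
from math import sqrt


def xbar_limits(process_mean, process_std, n):
    """Return (LCL, UCL) for an x-bar chart with limits at +/- 3 sigma / sqrt(n)."""
    half_width = 3.0 * process_std / sqrt(n)
    return process_mean - half_width, process_mean + half_width


def summarize(sample):
    """Return the sample mean and range, the two quantities plotted on the charts."""
    return sum(sample) / len(sample), max(sample) - min(sample)


# Hypothetical data: three samples of n = 4 items from a process that is
# assumed to have mean 10.0 and standard deviation 0.2.
samples = [
    [10.1, 9.9, 10.0, 10.2],
    [9.8, 10.0, 10.1, 9.9],
    [10.5, 10.4, 10.6, 10.3],
]

lcl, ucl = xbar_limits(10.0, 0.2, n=4)
flags = []
for sample in samples:
    mean, sample_range = summarize(sample)
    out_of_control = not (lcl <= mean <= ucl)
    flags.append(out_of_control)
    print(f"mean = {mean:.3f}, range = {sample_range:.3f}, outside limits: {out_of_control}")
```

Only the third sample mean lies outside the limits, so it is the only point that would prompt action under the rule stated above.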

33. LINEAR REGRESSION

If it is necessary to draw a straight line ($y = mx + b$) through n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, the following method based on the method of least squares can be used.

step 1: Calculate the following nine quantities.

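$$\sum x_i \qquad \sum x_i^2 \qquad \left(\sum x_i\right)^2 \qquad \bar{x} = \frac{\sum x_i}{n} \qquad \sum x_i y_i$$

$$\sum y_i \qquad \sum y_i^2 \qquad \left(\sum y_i\right)^2 \qquad \bar{y} = \frac{\sum y_i}{n}$$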

[7] Other charts (e.g., the sigma chart, p-chart, and c-chart) are less common but are used as required.

[8] Other indications that a correction may be required are seven measurements on one side of the average and seven consecutively increasing measurements. Rules such as these detect shifts and trends.

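Figure 11.7 Typical Statistical Process Control Charts

[Figure: an x-bar chart plotting $\bar{x}$ against time t between the limits UCL and LCL, and an R chart plotting R against time t below the limit UCL.]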

Background and Support

step 2: Calculate the slope, m, of the line.

$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} \tag{11.89}$$

step 3: Calculate the y-intercept, b.

$$b = \bar{y} - m\bar{x} \tag{11.90}$$

step 4: To determine the goodness of fit, calculate the correlation coefficient, r.

$$r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left(n \sum x_i^2 - \left(\sum x_i\right)^2\right)\left(n \sum y_i^2 - \left(\sum y_i\right)^2\right)}} \tag{11.91}$$

If m is positive, r will be positive; if m is negative, r will be negative. As a general rule, if the absolute value of r exceeds 0.85, the fit is good; otherwise, the fit is poor. The absolute value of r equals 1.0 if the fit is a perfect straight line.

A low value of r does not eliminate the possibility of a nonlinear relationship existing between x and y. It is possible that the data describe a parabolic, logarithmic, or other nonlinear relationship. (Usually this will be apparent if the data are graphed.) It may be necessary to convert one or both variables to new variables by taking squares, square roots, cubes, or logarithms, to name a few of the possibilities, in order to obtain a linear relationship. The apparent shape of the line through the data will give a clue to the type of variable transformation that is required. The curves in Fig. 11.8 may be used as guides to some of the simpler variable transformations.

Figure 11.9 illustrates several common problems encountered in trying to fit and evaluate curves from experimental data. Figure 11.9(a) shows a graph of clustered data with several extreme points. There will be moderate correlation due to the weighting of the extreme points, although there is little actual correlation at low values of the variables. The extreme data should be excluded, or the range should be extended by obtaining more data.

Figure 11.9(b) shows that good correlation exists in general, but extreme points are missed, and the overall correlation is moderate. If the results within the small linear range can be used, the extreme points should be excluded. Otherwise, additional data points are needed, and curvilinear relationships should be investigated.

Figure 11.9(c) illustrates the problem of drawing conclusions of cause and effect. There may be a predictable relationship between variables, but that does not imply a cause and effect relationship. In the case shown, both variables are functions of a third variable, the city population. But there is no direct relationship between the plotted variables.

Figure 11.8 Nonlinear Data Curves

[Figure: sketches of y versus x for common nonlinear forms, labeled $y = ce^{-bx}$; $y = ae^{bx}$; $y = \frac{1}{a + bx}$; $y = a + \frac{b}{x}$; $y = a + bx^2$; $y = a + bx + cx^2$; $y = a + bx + cx^2 + dx^3$; $\log y = a + bx + cx^2$; $\log y = a + bx + cx^2 + dx^3$; $y = a + b \log x$.]

Figure 11.9 Common Regression Difficulties

[Figure: three scatter plots, (a), (b), and (c). Panel (c) plots the amount of whiskey consumed in the city against the number of elementary school teachers in the city.]

PROBABILITY AND STATISTICAL ANALYSIS OF DATA


Example 11.17

An experiment is performed in which the dependent variable, y, is measured against the independent variable, x. The results are as follows.

x       y
1.2     0.602
4.7     5.107
8.3     6.984
20.9    10.031

(a) What is the least squares straight line equation that best represents this data? (b) What is the correlation coefficient?

Solution

(a) Calculate the following quantities.

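$$\sum x_i = 35.1 \qquad \sum y_i = 22.72$$

$$\sum x_i^2 = 529.23 \qquad \sum y_i^2 = 175.84$$

$$\left(\sum x_i\right)^2 = 1232.01 \qquad \left(\sum y_i\right)^2 = 516.38$$

$$\bar{x} = 8.775 \qquad \bar{y} = 5.681$$

$$\sum x_i y_i = 292.34 \qquad n = 4$$

From Eq. 11.89, the slope is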

$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{(4)(292.34) - (35.1)(22.72)}{(4)(529.23) - (35.1)^2} = 0.42$$

From Eq. 11.90, the y-intercept is

$$b = \bar{y} - m\bar{x} = 5.681 - (0.42)(8.775) = 2.0$$

The equation of the line is

$$y = 0.42x + 2.0$$

(b) From Eq. 11.91, the correlation coefficient is

$$r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left(n \sum x_i^2 - \left(\sum x_i\right)^2\right)\left(n \sum y_i^2 - \left(\sum y_i\right)^2\right)}} = \frac{(4)(292.34) - (35.1)(22.72)}{\sqrt{\left((4)(529.23) - 1232.01\right)\left((4)(175.84) - 516.38\right)}} = 0.914$$
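The arithmetic in this example can be reproduced with a short Python sketch that follows Eq. 11.89 through Eq. 11.91 directly. The function name `least_squares` is illustrative, not from the text. Because the sums are carried at full precision here, the last digit of each result may differ slightly from the rounded hand calculation.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return slope m, intercept b, and correlation coefficient r for paired data."""
    n = len(xs)
    # Step 1: the sums needed by the formulas.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Shared numerator and the two variance-like terms.
    numerator = n * sum_xy - sum_x * sum_y
    sxx = n * sum_x2 - sum_x ** 2
    syy = n * sum_y2 - sum_y ** 2
    m = numerator / sxx                    # Eq. 11.89
    b = sum_y / n - m * (sum_x / n)        # Eq. 11.90
    r = numerator / sqrt(sxx * syy)        # Eq. 11.91
    return m, b, r


# Data from Example 11.17.
xs = [1.2, 4.7, 8.3, 20.9]
ys = [0.602, 5.107, 6.984, 10.031]
m, b, r = least_squares(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}, r = {r:.3f}")
```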

Example 11.18

Repeat Ex. 11.17 assuming the relationship between the variables is nonlinear.

Solution

The first step is to graph the data. Since the graph has the appearance of the fourth case in Fig. 11.8, it can be assumed that the relationship between the variables has the form of $y = a + b \log x$. Therefore, the variable change z = log x is made, resulting in the following set of data.

z         y
0.0792    0.602
0.672     5.107
0.919     6.984
1.32      10.031

If the regression analysis is performed on this set of data, the resulting equation and correlation coefficient are

$$y = 7.599z + 0.000247$$

$$r = 0.999$$

This is a very good fit. The relationship between the variable x and y is approximately

$$y = 7.599 \log x + 0.000247$$


...

...

...