Hypothesis Testing: Statistical Inference about Means and Proportions

(1)

Introduction to Hypothesis Testing: Statistical Inference about Means and Proportions

Senior Lectures by:

(2)

RESEARCH METHOD

Statement of the PROBLEM

1. Recognition of the FACTS 2. Discovery of the Problem 3. Problem Formulation

Formulating the HYPOTHESIS

Test of the Hypothesis

1. Design Test

(3)

What is a Hypothesis?

Hypothesis Testing for One Population Value:

– Population mean (μ)

a. σ (population standard deviation) is given (known):

→ Use the z (standard normal, bell-shaped) distribution

b. σ (population standard deviation) is not given but s (sample standard deviation) is given:

→ Use Student’s t distribution

Example: The mean monthly cell phone bill of AUCA students is μ = 10000 Rwf

– Population proportion (p)

→ Use the z (standard normal, bell-shaped) distribution

Example: The proportion of AUCA students with cell phones is p = .80

– Population variance (σ²)

→ Use the χ² (chi-square) distribution. Population standard deviation = σ

(4)

Hypothesis

Hypothesis: A statistical hypothesis is a statement about a probabilistic model, and a hypothesis test is a method for judging the plausibility of that statement based on a sample.

Presumptions thus often provide the occasion for an investigation. For this reason such a presumption is called a research hypothesis.

I assume the mean monthly cell phone bill of the students of AUCA …

(5)

Purpose of hypothesis testing

• The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.

Example: Is there statistical evidence in a random sample of potential customers that supports the hypothesis that more than 10% of the potential customers will purchase a new product?

(6)

The Null Hypothesis, H₀

• States the (numerical) assumption to be tested. This hypothesis is assumed to be true, and the collected data are analyzed to see whether they contradict it.

Research hypothesis: “The mean monthly cell phone bill of AUCA students is less than 10000 Rwf”

Example of the null hypothesis: The mean monthly cell phone bill of AUCA students is at least ten thousand Rwf.

H₀: μ ≥ 10000

• Always contains the “=”, “≤” or “≥” sign

• May or may not be rejected

(7)

The Alternative Hypothesis, Hₐ or H₁

• Is the opposite of the null hypothesis

Example: The mean monthly cell phone bill of AUCA students is less than 10000 Rwf (Hₐ: μ < 10000)

• Never contains the “=”, “≤” or “≥” sign

• May or may not be accepted

• Is generally the hypothesis that is believed (or needs to be supported) by the researcher.

This is what you want to …

(8)

Examples: Give the null hypothesis and the alternative

hypothesis

• Is there statistical evidence in a random sample of potential customers that supports the hypothesis that more than 10% of the potential customers will purchase a new product?

• You want to show that people find the new design for a recliner

chair more comfortable than the old design.

• You are trying to show that cigarette smoke has an effect on the

quality of a person’s life.

• The mean age of the students enrolled in evening classes at a

certain college is greater than 36 years.

• The mean weight of packages shipped on Air Express during the

past month was less than 36.7 lb.

(9)

The critical concepts are these:

1. There are two hypotheses: the null and the alternative hypotheses.

2. The procedure begins with the assumption that the null hypothesis

is true.

3. The goal is to determine whether there is enough evidence to infer

that the alternative hypothesis is true, or the null is not likely to be

true.

4. There are two possible decisions:

Reject the null: To conclude that there is enough evidence to

infer that the alternative hypothesis is true.

Fail to reject the null: To conclude that there is insufficient

evidence to support the alternative hypothesis.

(10)

Hypothesis Testing Process

Population

Claim: the mean life expectancy in Africa is over 50.1 years

H₀: μ ≤ 50.1

Hₐ: μ > 50.1

Sample

Suppose the sample mean of the life expectancy is 60: x̄ = 60 years

Is x̄ = 60 likely if H₀: μ ≤ 50.1 is true?

If not likely, REJECT the null hypothesis.

(11)

Sampling Distribution of x̄

[Figure: sampling distribution of x̄ if H₀ is true (μ ≤ 50.1), with regions marked “Reject H₀” and “Do not reject H₀” and the sample value x̄ = 60.]

If it is unlikely that we would get a sample mean of this value (x̄ = 60) if in fact μ ≤ 50.1 were the population mean, then we reject the null hypothesis that μ ≤ 50.1.

(12)

Level of Significance, α

• Defines unlikely values of the sample statistic if the null hypothesis is true

– Defines the rejection region of the sampling distribution

• Is designated by α (level of significance)

– Typical values are .01, .05, or .10

• Is selected by the researcher at the beginning

• Provides the critical value(s) of the test

Critical values of the normal distribution (critical region of area α):

If α equals    One-tailed    Two-tailed
0.10           1.28          1.64
0.05           1.64          1.96
0.01           2.33          2.58
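The critical values for a given α can be reproduced without a table. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is a hypothetical choice for illustration, not something from the lecture. Note that the one-tailed value for α = .05 is 1.645, which the slides round to 1.64.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool = False) -> float:
    """Return the positive critical z value for significance level alpha.

    For a two-tailed test the area alpha is split equally between both tails.
    (Hypothetical helper, written for illustration only.)
    """
    tail_area = alpha / 2 if two_tailed else alpha
    # inv_cdf gives the z value with the requested area to its left.
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    one = z_critical(alpha)
    two = z_critical(alpha, two_tailed=True)
    print(f"alpha={alpha:.2f}  one-tailed z={one:.3f}  two-tailed z={two:.3f}")
```

For a lower tail test the critical value is the negative of the one-tailed value, because the standard normal distribution is symmetric about 0.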

(13)

[Figure: rejection regions (shaded). Lower tail test, e.g. H₀: μ ≥ 50.1, Hₐ: μ < 50.1: area α in the lower tail. Upper tail test: area α in the upper tail. Two-tailed test: area α/2 in each tail.]

(14)

Interpreting the p-value…

The smaller the p-value, the more statistical evidence exists to support the alternative hypothesis.

• If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.

• If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.

• If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.

• If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.
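The p-value bands above can be expressed as a small lookup. This is only a sketch: the function name `evidence_label` is hypothetical, and since the slides do not say which band a p-value of exactly 1%, 5% or 10% belongs to, assigning each boundary to the weaker category is an assumption.

```python
def evidence_label(p_value: float) -> str:
    """Map a p-value to the verbal scale used in the lecture.

    Assumption: boundary values (exactly .01, .05, .10) fall into the
    weaker-evidence band; the slides do not specify this.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "overwhelming evidence"
    if p_value < 0.05:
        return "strong evidence"
    if p_value < 0.10:
        return "weak evidence"
    return "no evidence"


for p in (0.00255, 0.024, 0.0764, 0.25):
    print(p, "->", evidence_label(p))
```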

(15)

Interpreting the p-value…

[Figure: p-value scale, from smallest to largest p-value: Overwhelming Evidence (Highly Significant), Strong Evidence (Significant), Weak Evidence (Not Significant), No Evidence (Not Significant).]

(16)

Our decision \ Actual situation           Null (H₀) hypothesis is false    Null (H₀) hypothesis is true
Reject the null (H₀) hypothesis           Correct decision (1 − β)         Type I error (α), called the level of significance
Do not reject the null (H₀) hypothesis    Type II error (β)                Correct decision (1 − α)
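A simulation can make the meaning of α concrete: when the null hypothesis is actually true, a test run at level α should reject it in roughly a fraction α of repeated samples. The sketch below assumes a normal population with μ = 10000 and σ = 800 and samples of size 20 (values borrowed from the cell phone bill example); the function name, the number of trials and the seed are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0=10000.0, sigma=800.0, n=20, alpha=0.05,
                      trials=10000, seed=1):
    """Estimate how often a lower tail z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # negative lower-tail critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z < z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```

The estimate fluctuates from run to run with a different seed, but it should stay close to the chosen α.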

(17)

Conclusions of a Test of Hypothesis…

If we reject the null hypothesis, we conclude

that there is enough evidence to infer that the

alternative hypothesis is true.

If we fail to reject the null hypothesis, we

conclude that there is not enough statistical

evidence to infer that the alternative

hypothesis is true.

This does not mean that we

(18)

Using Small Samples (the population must be approximately normal)

The test statistic is:

$$t_{n-1} = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

(19)

Review: Steps in Hypothesis Testing

• Specify the population value of interest.

• Assumptions: randomization, quantitative variable, normal population distribution (robustness?)

• Formulate the appropriate null and alternative hypotheses.

• Specify the desired level of significance.

• Determine the rejection region or p-value

(20)

Example: Lower Tail z Test for Mean

Test the claim that the true mean monthly cell phone bill of AUCA students is less than ten thousand Rwf.

Assume σ = 800

1. Specify the population value of interest

Mean monthly cell phone bill of AUCA students

2. Formulate the appropriate null and alternative hypotheses

H₀: μ ≥ 10000    Hₐ: μ < 10000 (This is a lower tail test)

3. Specify the desired level of significance

(21)

4. Determine the rejection region

This is a one-tailed test with α = .05

Since σ is known, the cutoff value is a z value: −z_α = −1.64

Reject H₀ if z < −z_α = −1.64; otherwise do not reject H₀

In Excel: insert function (fx) > select category Statistical > select function NORMSINV > in Probability enter 0.05; the result is −1.645, rounded here to −1.64

(22)

Example: Lower Tail z Test for Mean

5. Obtain sample evidence and compute the test statistic

Suppose a sample is taken with the following results:

n = 20, x̄ = 9500 (σ = 800 is assumed known)

Then the test statistic is:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{9500 - 10000}{800 / \sqrt{20}} = \frac{-500}{178.89} = -2.795 \approx -2.80$$

(23)

6. Reach a decision and interpret the result

[Figure: standard normal curve with the “Reject H₀” region of area α = .05 to the left of z = −1.64; the test statistic z = −2.80 falls in this region, with p-value = .00255.]

The z value (z = −2.80) is in the rejection region. We can reject the null hypothesis in favor of Hₐ.

In Excel: insert function (fx) > select category Statistical > select function NORMSDIST > enter the value −2.80; the result is 0.00255

We can conclude that the p-value (0.00255) < α = .05

There is evidence that supports the alternative hypothesis, i.e. the mean monthly cell phone bill is less than ten thousand Rwf.

(24)

p-value example

Obtain the p-value from a table or computer

• Compare the p-value with α

– If p-value < α, reject H₀

– If p-value ≥ α, do not reject H₀

Here: p-value = .00255, α = .05

Since .00255 < .05, we reject the null hypothesis

[Figure: the p-value (.00255) is the area to the left of z = −2.80; it is smaller than the area α = .05 to the left of −1.64, so z lies in the “Reject H₀” region.]
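The lower tail z test for the cell phone bill example can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `lower_tail_z_test` is hypothetical. It works with the unrounded statistic z ≈ −2.795, so its p-value (about .0026) differs slightly from the .00255 obtained from the rounded z = −2.80.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Lower tail z test for a mean when sigma is known.

    Returns the test statistic, the p-value, the critical value
    and the decision (True means: reject H0).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = NormalDist().cdf(z)         # area to the left of z
    z_crit = NormalDist().inv_cdf(alpha)  # negative cutoff, about -1.645 for alpha=.05
    return z, p_value, z_crit, p_value < alpha


z, p_value, z_crit, reject = lower_tail_z_test(
    x_bar=9500, mu0=10000, sigma=800, n=20, alpha=0.05
)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.5f}")
print("Reject H0" if reject else "Do not reject H0")
```

Comparing the p-value with α and comparing z with the critical value always lead to the same decision; the code uses the p-value rule.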

(25)

Example: Upper Tail z Test for Mean (σ Known)

A manager of mobile telephony in Peru states that the mean monthly phone bill has increased to more than 55 soles per month. The company wishes to test this claim. (Assume σ = 25 is known)

Form hypothesis test:

H₀: μ ≤ 55 (the average is not over S/ 55 per month)

Hₐ: μ > 55 (the average is greater than S/ 55 per month)

(26)

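• Suppose that α = .10 is chosen for this test

Find the rejection region:

[Figure: standard normal curve with the “Reject H₀” region of area α = .10 to the right of z_α = 1.28.]

Reject H₀ if z > 1.28

In Excel: insert function (fx) > select category Statistical > select function NORMSINV > in Probability enter 0.90 (that is, 1 − α); the result is 1.28. Entering 0.10 returns −1.28, the same cutoff with the opposite sign.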

(27)

Obtain sample evidence and compute the test statistic

Suppose a sample is taken with the following results:

n = 80, x̄ = 59 (σ = 25 was assumed known)

– Then the test statistic is:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{59 - 55}{25 / \sqrt{80}} = 1.43$$

(28)

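Example: Decision

[Figure: standard normal curve with the “Reject H₀” region of area α = .10 to the right of 1.28; the test statistic z = 1.43 falls in this region.]

Reject H₀ since z = 1.43 > 1.28

i.e.: there is enough evidence to say that the mean monthly phone bill in Peru is greater than 55 soles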

(29)

p-Value Solution

Calculate the p-value and compare to α

[Figure: the p-value is the area to the right of z = 1.43; the “Reject H₀” region of area α = .10 lies to the right of 1.28.]

Reject H₀ since p-value = .0764 < α = .10

In Excel: insert function(fx)<select category Statistical<select function
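The upper tail test for the Peru phone bill example can be reproduced in a few lines. This is a sketch under the slide's assumptions (σ = 25 known, n = 80, x̄ = 59, α = .10); the function name `upper_tail_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Upper tail z test for a mean when sigma is known.

    Returns the test statistic, the p-value, the critical value
    and the decision (True means: reject H0).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)         # area to the right of z
    z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.28 for alpha = .10
    return z, p_value, z_crit, z > z_crit


z, p_value, z_crit, reject = upper_tail_z_test(
    x_bar=59, mu0=55, sigma=25, n=80, alpha=0.10
)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Here the decision is made by comparing z with the critical value; comparing the p-value with α gives the same answer.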

(30)

Example: Two-Tail Test (σ Unknown)

The average cost per room per night for Serena hotels in Africa is said to be $200. A random sample of 20 hotels resulted in x̄ = $250 and s = $115. Test at the α = 0.05 level.

(Assume the population distribution is normal)

H₀: μ = 200

Hₐ: μ ≠ 200

(31)

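Example Solution: Two-Tail Test

H₀: μ = 200

Hₐ: μ ≠ 200

• α = 0.05

• n = 20

• σ is unknown, so use a t statistic

• Critical value: t₁₉ = ± 2.093

$$t_{n-1} = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{250 - 200}{115 / \sqrt{20}} = 1.94$$

[Figure: t distribution with 19 degrees of freedom; “Reject H₀” regions of area α/2 = .025 below −t_{α/2} = −2.093 and above t_{α/2} = 2.093; the test statistic t = 1.94 falls in the “Do not reject H₀” region.]

Since t = 1.94 is neither greater than 2.093 nor less than −2.093, we cannot reject the null hypothesis in favor of Hₐ. That is, there is insufficient evidence that the true mean cost is different from $200.

In Excel: insert function (fx) > select category Statistical > select function TINV > OK. Enter 0.05 in Probability and 19 in Deg_freedom > OK. This returns the two-tailed inverse of Student's t distribution, 2.093.

• p-value: between .05 and .10, since t = 1.94 lies between the two-tailed critical values of t₁₉ for α = .10 (1.729) and α = .05 (2.093); hence p-value > α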
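The t statistic for the hotel example can be verified with a short calculation. Python's standard library has no Student's t distribution, so this sketch takes the critical value 2.093 (19 degrees of freedom, two-tailed α = .05) from the slide rather than computing it; the function and constant names are hypothetical.

```python
from math import sqrt


def one_sample_t(x_bar, mu0, s, n):
    """One-sample t statistic with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / sqrt(n))


# Two-tailed critical value for alpha = .05 and 19 df, copied from the slide
# (assumption: not computed here, because the stdlib has no t distribution).
T_CRITICAL_19_DF = 2.093

t_stat = one_sample_t(x_bar=250, mu0=200, s=115, n=20)
reject = abs(t_stat) > T_CRITICAL_19_DF

print(f"t = {t_stat:.2f}, critical values = ±{T_CRITICAL_19_DF}")
print("Reject H0" if reject else "Do not reject H0")
```

In a two-tailed test the absolute value of t is compared with the positive critical value, which covers both rejection regions at once.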

(32)

Example:

A pharmaceutical company conducts research on the efficacy of a vaccine against measles. The variable considered is the antibody titer produced by the vaccine. The vaccine produced by another laboratory reports an average antibody titer of 1.9. To test whether the new vaccine is more effective than the older vaccine, the shot was given to 16 volunteers, with the following results:

Average titer of antibody: 3, 2.5, 2.4, 1.9, 1.8, 1.5, 2.6, 2.7, 3.1, 1.7, 2.3, 2.2

Steps using SPSS

To do this, click Analyze, then Compare Means, followed by One-Sample T Test.

(33)

SPSS Report (recommendation: also work the example by hand and check that you obtain the same results).

One-Sample Statistics

                             N     Mean     Std. Deviation    Std. Error Mean
Average titer of antibody    16    2.225    0.5183            0.1296

One-Sample Test (Test Value = 1.9)

                             t        df    Sig. (2-tailed)    Mean Difference    95% Confidence Interval of the Difference (Lower, Upper)
Average titer of antibody    2.508    15    0.024              0.325              0.049, 0.601

Note 1: The p-value (Sig.) that SPSS reports by default is two-tailed; for a one-tailed test, use Sig./2 (.024/2 = 0.012).

Interpretation: This result indicates that the data are consistent with an average titer greater than 1.9, because the difference found is significant (one-tailed p-value = 0.012 < 0.05).
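The SPSS output can be checked by hand from the summary statistics alone. The sketch below recomputes the standard error, the t statistic and the 95% confidence interval of the mean difference; the critical value 2.131 (15 degrees of freedom, two-tailed α = .05) is taken from a standard t table and is an assumption not shown on the slides.

```python
from math import sqrt

# Summary statistics copied from the SPSS report.
n, mean, sd, test_value = 16, 2.225, 0.5183, 1.9

# Assumption: two-tailed 95% critical value of Student's t with 15 df,
# looked up in a t table (the stdlib cannot compute it).
T_CRITICAL_15_DF = 2.131

std_error = sd / sqrt(n)
mean_diff = mean - test_value
t_stat = mean_diff / std_error
ci_lower = mean_diff - T_CRITICAL_15_DF * std_error
ci_upper = mean_diff + T_CRITICAL_15_DF * std_error

print(f"Std. error = {std_error:.4f}")
print(f"t = {t_stat:.3f} with {n - 1} df")
print(f"95% CI of the difference: ({ci_lower:.3f}, {ci_upper:.3f})")
```

Because the interval for the difference does not contain 0, the two-tailed test at α = .05 rejects H₀, in agreement with Sig. = 0.024.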

(34)

Hypothesis Tests for Proportions

• Involves categorical values

• Two possible outcomes

– “Success” (possesses a certain characteristic)

– “Failure” (does not possess that characteristic)

• Fraction or proportion of the population in the “success” category is denoted by p

• Sample proportion in the success category is denoted by p̂:

$$\hat{p} = \frac{x}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$

• When both np and n(1−p) are at least 5, p̂ can be approximated by a normal distribution with mean and standard deviation

$$\mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

(35)

Example: z Test for Proportion

A year ago the proportion of students who had cell phones was 80%. It is believed that this proportion has increased this year. To check this hypothesis, a random sample of 144 students was taken, and it was found that 89% had cell phones. Test at the α = .05 significance level.

Check:

np = (144)(.80) = 115.2

n(1−p) = (144)(.20) = 28.8

Both values are at least 5, so the normal approximation applies.

(36)

z Test for Proportion: Solution

H₀: p ≤ .80

Hₐ: p > .80

α = .05

n = 144, p̂ = .89

Critical value: 1.64 (upper tail test)

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.89 - .80}{\sqrt{.80(1 - .80)/144}} = 2.70$$

[Figure: standard normal curve with the “Reject H₀” region of area α = .05 to the right of 1.64.]

Decision: Reject H₀ at α = .05

Conclusion: There is enough evidence to reject H₀, i.e. the proportion of students who have mobile phones has increased.

(37)

p-Value Solution

[Figure: the p-value is the area to the right of z = 2.70; the “Reject H₀” region of area α = .05 lies to the right of 1.64.]

Reject H₀ since p-value = .0035 < α = .05

In Excel: insert function(fx)<select category Statistical<select function
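The upper tail test for a proportion can be reproduced as follows. This is a sketch based on the normal approximation described earlier, applied to the cell phone example (p̂ = .89, hypothesized p = .80, n = 144, α = .05); the function name `upper_tail_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_proportion_z_test(p_hat, p0, n, alpha):
    """Upper tail z test for a proportion, using the normal approximation."""
    # The approximation requires both n*p0 and n*(1 - p0) to be at least 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation is not appropriate")
    std_error = sqrt(p0 * (1 - p0) / n)  # standard deviation of p-hat under H0
    z = (p_hat - p0) / std_error
    p_value = 1 - NormalDist().cdf(z)    # area to the right of z
    return z, p_value, p_value < alpha


z, p_value, reject = upper_tail_proportion_z_test(
    p_hat=0.89, p0=0.80, n=144, alpha=0.05
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Note that the standard error uses the hypothesized proportion p, not the sample proportion p̂, because the test statistic is computed under the assumption that H₀ is true.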

(38)

Test for Proportion: Example

A marketing company claims that it receives a p = 4% response rate from its mailing. To test this claim, a random sample of n = 500 was surveyed, with x = 25 responses. Test at the α = .05 significance level.

H₀: p = .04

Hₐ: p ≠ .04

Critical values: ± 1.96

Decision: Do not reject H₀ at α = .05
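The slide states the decision without showing the test statistic. A short calculation under the same normal approximation fills in that step: with x = 25 responses out of n = 500, p̂ = .05 and z comes out near 1.14, which lies inside ±1.96. The function name `two_tailed_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_proportion_z_test(x, n, p0, alpha):
    """Two-tailed z test for a proportion, using the normal approximation."""
    # The approximation requires both n*p0 and n*(1 - p0) to be at least 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation is not appropriate")
    p_hat = x / n
    std_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_error
    # Two-tailed p-value: twice the area beyond |z| in one tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, z, p_value, z_crit, abs(z) > z_crit


p_hat, z, p_value, z_crit, reject = two_tailed_proportion_z_test(
    x=25, n=500, p0=0.04, alpha=0.05
)
print(f"p-hat = {p_hat:.2f}, z = {z:.2f}, critical values = ±{z_crit:.2f}")
print(f"p-value = {p_value:.3f}")
print("Reject H0" if reject else "Do not reject H0")
```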

(39)

Summary: Steps for a t test for a single sample

Restate the question as a research hypothesis and a null hypothesis about the populations.

Collect data.

Determine the characteristics of the comparison distribution. Choose the method (z test or t test).

The hypothesized mean or proportion of the population is known. Obtain the standard deviation or proportion from:

* Previous studies or research, or

* A pilot sample, or

* The sample itself: calculate the variance of the distribution of means (S²/n) and take the square root to get the standard error (SE).

Note: we are calculating t with N − 1 df.

Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.

Decide on an alpha and on a one-tailed vs. two-tailed test. Look up the critical value in the table.

Determine your sample’s t score or z score.

Decide whether to reject or not reject the null hypothesis. (If the observed value of t or z falls beyond the table value, that is, in the rejection region, reject.)