Hypothesis tests
1. What is a hypothesis test?
2. Elements of a test: null and alternative hypotheses, types of error, significance level, critical region
3. Tests for the mean of a normal population
4. Tests for a proportion
5. Other tests
Recommended reading:
Chapters 22 and 23 of Peña and Romo (1997)
8.1: What is a hypothesis test?
A hypothesis is a statement made about the population.
The hypothesis is parametric if it refers to the values taken by one of the population parameters.
For example: “the population mean is positive” (μ > 0).
A hypothesis test is a statistical technique for judging whether or not the data provide evidence to confirm or reject the hypothesis.
Example
Following the decision to withdraw from Kosovo, it is natural to think that the popularity of the Minister of Defence has fallen.
The ratings given to Carme by 10 students were measured before and after the crisis, and the differences are:
-2, -0.4, -0.7, -2, +0.4, -2.2, +1.3, -1.2, -1.1, -2.3
Most of the values are negative, but do these data provide evidence that Carme's mean popularity level has fallen?
The mean estimated from the data is x̄ = -1.02.
Does this estimate reflect a genuine fall in the mean popularity level, or is the result due purely to chance?
8.2: Elements of a hypothesis test
The hypothesis for which we wish to find evidence is called the alternative or experimental hypothesis. This is denoted by H1. In the example, H1 : μ < 0.
The contrary hypothesis to H1 is called the null hypothesis. This is denoted by H0. In the example, H0 : μ = 0.
As we want to see whether Carme's mean rating really has gone down, we wish to test H0 : μ = 0 versus H1 : μ < 0.
The basic reasoning for carrying out a hypothesis test is as follows:
1. Suppose that H0 is true, μ = 0.
2. Is the result obtained from the data (x̄ = -1.02) very unlikely given this hypothesis?
3. If it is unlikely, the data provide evidence against H0 and in favour of H1.
To carry out this analysis, we need to see which values we would expect x̄ to take if H0 were true.
To simplify the problem, suppose that the population is normally distributed with known variance σ² = 1.
Recall that, for a sample of size n from a normal population,
X̄ ~ N(μ, σ²/n)
If H0 is true, we have
X̄ ~ N(0, 1/10)
To see if the observed mean is compatible with μ = 0, calculate
z = (x̄ - 0) / (1/√10) = -1.02 × √10 = -3.2255
and compare this value with the standard normal distribution.
As -3.2255 is a very improbable value for the N(0, 1) distribution (from the tables, P(Z < -3.2255) < 0.001), the data provide strong evidence against H0 and in favour of H1.
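The calculation above can be reproduced in a few lines of Python. This is a minimal sketch under the same simplifying assumptions as the text (normal population, known variance equal to 1). The variable names are illustrative, and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist, mean

# Differences in rating reported in the example (10 students).
differences = [-2.0, -0.4, -0.7, -2.0, 0.4, -2.2, 1.3, -1.2, -1.1, -2.3]

mu_0 = 0.0    # value of the population mean under H0
sigma = 1.0   # population standard deviation, assumed known

n = len(differences)
x_bar = mean(differences)

# Standardised sample mean: follows N(0, 1) if H0 is true.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Probability, under H0, of a standardised mean below the observed one.
tail_probability = NormalDist().cdf(z)

print(f"sample mean = {x_bar:.2f}")
print(f"z = {z:.4f}")
print(f"P(Z < z) = {tail_probability:.5f}")
```

A tail probability this small is what the text means by "a very improbable value" under H0.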
Types of error
                   H0 is true          H1 is true
Don't reject H0    Correct decision    Type II error
Reject H0          Type I error        Correct decision
Which of the two errors is more serious?
The significance level and the critical region
We can control the type I error by fixing (a priori) the significance level, α = P(reject H0 | H0 is true).
Typical values are α = 0.1, 0.05 or 0.01.
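The meaning of the significance level as a probability can be checked by simulation. The sketch below is illustrative only: it assumes the setting of the running example (samples of size 10 from a normal population with known variance 1, testing H0 : μ = 0 against H1 : μ < 0) and rejects H0 when the standardised sample mean falls below the lower 5% point of the N(0, 1) distribution. The number of simulations and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so that the run is repeatable

alpha = 0.05            # significance level
n = 10                  # sample size, as in the example
sigma = 1.0             # known standard deviation (assumption)
n_simulations = 20_000  # arbitrary number of repetitions

# H0 is rejected when the standardised mean falls below this value.
z_critical = NormalDist().inv_cdf(alpha)

rejections = 0
for _ in range(n_simulations):
    # Generate a sample for which H0 really is true (mu = 0).
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    z = (sum(sample) / n) / (sigma / sqrt(n))
    if z < z_critical:
        rejections += 1  # H0 is true but rejected: a type I error

type_i_rate = rejections / n_simulations
print(f"estimated P(reject H0 | H0 is true) = {type_i_rate:.3f}")
```

The estimated proportion of wrongly rejected samples should be close to the chosen value of alpha.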
Given the significance level, the critical or rejection region is the set of values of the test statistic such that H0 is rejected.
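Let α = 0.05. Then H0 is rejected if z = x̄ / (1/√10) < -1.645, that is, if the sample mean is below -0.52. Setting α = 0.025, H0 is rejected if z < -1.96, that is, if the sample mean is below -0.62.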
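The rejection threshold for the sample mean can be computed directly from the significance level. The following sketch assumes the setting of the example (n = 10, known variance 1, H1 : μ < 0). The function name critical_mean is hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 10        # sample size
sigma = 1.0   # known standard deviation (assumption)
mu_0 = 0.0    # mean under H0


def critical_mean(alpha):
    """Return the value below which the sample mean leads to rejecting H0."""
    z_alpha = NormalDist().inv_cdf(alpha)  # lower alpha point of N(0, 1)
    return mu_0 + z_alpha * sigma / sqrt(n)


for alpha in (0.05, 0.025):
    print(f"alpha = {alpha}: reject H0 if the sample mean is below "
          f"{critical_mean(alpha):.2f}")
```

A smaller significance level moves the threshold further from 0, so a more extreme sample mean is needed to reject H0.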
The p-value
For small values of α, it is more difficult to reject the null hypothesis. The minimum value of α for which H0 would be rejected given the data is called the p-value.
The p-value is interpreted as a measure of the statistical evidence in favour of H1 (or against H0): when the p-value is small, this represents strong evidence in favour of H1.
The observed value z = -3.2255 gives a p-value of p = 0.00063. There is strong evidence against H0 and in favour of H1.
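The link between the p-value and the significance level can be illustrated as follows. This is a sketch for the one-sided test of the example (H1 : μ < 0), where the p-value is the lower-tail probability of the standardised sample mean. The list of significance levels is an arbitrary choice.

```python
from statistics import NormalDist

z_observed = -3.2255  # standardised sample mean from the example

# For H1: mu < 0, the p-value is the probability under H0 of a
# standardised mean at least as small as the one observed.
p_value = NormalDist().cdf(z_observed)
print(f"p-value = {p_value:.5f}")

# H0 is rejected at every significance level above the p-value.
for alpha in (0.1, 0.05, 0.01, 0.001, 0.0005):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```

H0 is rejected at all the typical significance levels, and only fails to be rejected once alpha drops below the p-value.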
8.3: Tests for the mean of a normal population (known variance)
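H0         H1         Rejection region
μ = μ0     μ < μ0     z < -z_α           (one-sided test)
μ = μ0     μ > μ0     z > z_α            (one-sided test)
μ = μ0     μ ≠ μ0     |z| > z_{α/2}      (two-sided test)
Here z = (x̄ - μ0) / (σ/√n), and z_α is the value such that P(Z > z_α) = α for Z ~ N(0, 1).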
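The one-sided and two-sided tests for the mean with known variance can be gathered into one small function. This is a sketch, not a reference implementation: the function name z_test_mean and the labels "less", "greater" and "two-sided" are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu_0, sigma, n, alpha=0.05, alternative="less"):
    """Test H0: mu = mu_0 for a normal population with known sigma.

    Returns the standardised statistic z and whether H0 is rejected.
    """
    std_normal = NormalDist()
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    if alternative == "less":          # H1: mu < mu_0
        reject = z < std_normal.inv_cdf(alpha)
    elif alternative == "greater":     # H1: mu > mu_0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":   # H1: mu != mu_0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, reject


# Usage with the popularity example: x_bar = -1.02, sigma = 1, n = 10.
z, reject = z_test_mean(-1.02, 0.0, 1.0, 10, alpha=0.05, alternative="less")
print(f"z = {z:.4f}, reject H0: {reject}")
```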
8.4: Tests for a proportion
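H0         H1         Rejection region
p = p0     p < p0     z < -z_α           (one-sided test)
p = p0     p > p0     z > z_α            (one-sided test)
p = p0     p ≠ p0     |z| > z_{α/2}      (two-sided test)
Here z = (p̂ - p0) / √(p0(1 - p0)/n), where p̂ is the sample proportion and n is large enough for the normal approximation to hold.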
Example
In the last elections, 40% of voters in Getafe voted for the PSOE. In a survey of 100 people, 37 said they would vote PSOE next time.
Calculate a 95% confidence interval for the probability that a Getafe resident says they will vote PSOE in the next election.
Is there any evidence to suggest that this probability is below 0.4? Use a 5% significance level.
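One way of working through this example is sketched below. It assumes the usual normal approximation for a sample proportion, with the estimated proportion in the standard error of the confidence interval and the null value 0.4 in the standard error of the test statistic. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100          # number of people surveyed
p_hat = 37 / n   # sample proportion who say they will vote PSOE
p_0 = 0.40       # proportion under H0 (result of the last elections)
alpha = 0.05     # significance level

std_normal = NormalDist()

# 95% confidence interval, using the estimated standard error.
z_half = std_normal.inv_cdf(1 - alpha / 2)
se_hat = sqrt(p_hat * (1 - p_hat) / n)
ci_lower = p_hat - z_half * se_hat
ci_upper = p_hat + z_half * se_hat
print(f"95% confidence interval: ({ci_lower:.3f}, {ci_upper:.3f})")

# One-sided test of H0: p = 0.4 against H1: p < 0.4,
# using the standard error calculated under H0.
se_0 = sqrt(p_0 * (1 - p_0) / n)
z = (p_hat - p_0) / se_0
p_value = std_normal.cdf(z)
reject = z < std_normal.inv_cdf(alpha)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Under these assumptions the confidence interval contains 0.4 and the test does not reject H0, so a sample of this size gives no real evidence that the probability has fallen below 0.4.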
8.5: Other tests
1. Mean of a normal population (unknown variance)
2. Difference in the means of 2 normal populations
a) Known variances
b) Unknown but equal variances
c) Unknown variances
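As an illustration of the first case in this list, the popularity example can be repeated without assuming that the variance is known. The sketch below assumes the standard approach for this case: the sample standard deviation replaces σ, and the statistic is compared with Student's t distribution with n - 1 degrees of freedom. The critical value 1.833 is an assumption taken from standard t tables (9 degrees of freedom, 5% in one tail), since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Differences in rating reported in the earlier example (10 students).
differences = [-2.0, -0.4, -0.7, -2.0, 0.4, -2.2, 1.3, -1.2, -1.1, -2.3]

mu_0 = 0.0  # value of the population mean under H0
n = len(differences)
x_bar = mean(differences)
s = stdev(differences)  # sample standard deviation (divisor n - 1)

# t statistic: as the z statistic, but with s in place of sigma.
t = (x_bar - mu_0) / (s / sqrt(n))

# Assumed table value: upper 5% point of Student's t with 9 degrees of freedom.
t_critical = 1.833

# One-sided test of H0: mu = 0 against H1: mu < 0.
reject = t < -t_critical
print(f"s = {s:.4f}, t = {t:.4f}, reject H0: {reject}")
```

With the variance estimated from the data the statistic is less extreme than in the known-variance calculation, but it still falls in the rejection region at the 5% level.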