Sunday, February 17, 2008

Lecture 6 - Ch 9 - Hypothesis Testing

Null and Alternative Hypotheses

Hypothesis testing involves first creating a hypothesis (usually called the "null hypothesis"), H0, and an alternative hypothesis, Ha (also sometimes denoted H1) which is the opposite of the null hypothesis.

For example, we may have historical data about the mean of the population, μ. Our null hypothesis may be that the population mean is still μ. In this case, the alternative hypothesis is that the mean is not equal to μ.

The null hypothesis is always one of status quo - that a parameter is equal to a known, historical value. (As we'll see later, the null hypothesis may be that a parameter is equal to or less than or equal to or greater than a value. In any case, there's always an equal sign in the null hypothesis and never in the alternative hypothesis.) Both hypotheses are always stated about a population parameter, not a sample statistic.

Confidence Level

After stating the null and alternate hypotheses to be tested, our next step is to determine the confidence level of the hypothesis test. This is usually 90, 95 or 99%, depending on how certain we want to be about our rejection or non-rejection of the null hypothesis. In order to determine the confidence level, we consider the level of significance, alpha. The level of significance, alpha, is the probability that we will reject the null hypothesis even though it is true. Not good! This is known as a Type I error and we typically want to minimize it to 0.10, 0.05, or 0.01. The complement of the level of significance is the confidence coefficient, 1-alpha, usually 0.90, 0.95 or 0.99, and represents the probability that we will not reject the null hypothesis when it is true. That would be good! The confidence level is the confidence coefficient stated as a percentage - 90%, 95% or 99%.

In order to decide whether to accept or reject the null hypothesis, we take a sample and calculate its mean. Then we construct a confidence interval around the population mean with the given confidence level. If the sample mean falls within the confidence interval, we do not reject the null hypothesis. If the sample mean falls outside the confidence interval, we reject the null hypothesis in favor of the alternative hypothesis.

Waveland ClocktowerCritical Value Method

Example:
H0: population mean (mu) = 160
Ha: mu does not equal 160
level of significance = 0.10
confidence coefficient = 0.90
confidence level = 90%
sample size (n) = 36
sample mean (xbar) = 172
population std dev (sigma) = 30

Rather than go through all the calculations of constructing the confidence interval, we can just look at the z value of the sample mean compared to the z value of the confidence interval. This is known as getting the critical value. Big note: This only works if you know the population standard deviation! If you don't, you'll need to use the t-distribution and get the t-value using the sample std dev.

For the 90% confidence interval: z0.05 = -1.65
For the sample mean: z = (160-172)/(30/sqrt(36)) = -12/5 = -2.4

The important part to remember here is that the z-score that we calculate is zalpha/2. We use alpha/2 because the rejection region is divided into two halves on either side of the distribution. I.e. we would reject the null hypothesis is our sample mean was too high or too low.

In this case, the non-rejection region is from -1.65 to +1.65. Since the z score of the sample mean is outside the non-rejection region of the 90% interval (it's less than -1.65), we say that the sample leads us to reject the null hypothesis.

Example 2:
same as example 1 except - sample mean (xbar) = 165

In this case, the z score of the sample mean is (160-165)/(30/sqrt(36)) = -5/5 = -1

Since the z score of the sample mean is within the non-rejection region of the 90% interval (i.e. it's between -1.65 and +1.65), we say that we cannot reject the null hypothesis. It doesn't really tell us that we should accept the null hypothesis, but we don't have sufficient evidence to reject it.

p-value Method
Another equivalent way of evaluating the null hypothesis is the p-value method. The p-value is the area under the normal distribution curve over a given value.

For example 2 above, we find that from the normal distribution table the area under the curve from the mean to 1.0 (the z-score of the sample mean) is 0.34. Therefore, the p-value of the sample mean (1.0) is 0.5-0.34 = 0.16

We compare this value to the area under the curve above our confidence interval, which is alpha/2 = 0.05. Since the p-value of the sample value is greater then alpha/2, we do not reject the null hypothesis, H0. If we find that the p-value of the sample mean is less than alpha/2, we would reject H0.

As our book says: If the p-value is low, H0 must go!

It seems that the advantage of using the p-value method is that it only requires one lookup into the normal distribution table, whereas the critical value method requires two lookups.

No comments: