GSB420 - Business Statistics: Final Exam Study Guide

Question 1: A population has a standard deviation of 16. If a sample of size 64 is selected from this population, what is the probability that the sample mean will be within ±2 of the population mean?
a. 0.6826
b. 0.3413
c. -0.6826
d. Since the mean is not given, there is no answer to this question.

Answer:
We need to calculate the z-score for the ±2 interval. In order to do that, we need the standard error of the mean, σ/√n = 16/√64 = 2.
So when we're asked for the probability that the sample mean is within ±2 of the population mean, we're being asked for the probability that the sample mean falls within 1 standard error of the population mean. Even without looking it up in the table, we know that the answer must be A - both from our experience that about 68% of a normal distribution falls within 1 standard deviation of the mean, and because the other answers are unreasonable (B covers only one side of the mean, and a probability can't be negative).
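If you want to check the table lookup by machine, here is a minimal sketch using only Python's standard library. The variable names are mine, not from the course materials, and it assumes the sample mean is normally distributed (reasonable here, since n = 64).

```python
from math import sqrt
from statistics import NormalDist

sigma = 16       # population standard deviation (from the question)
n = 64           # sample size (from the question)
half_width = 2   # "within ±2 of the population mean"

standard_error = sigma / sqrt(n)   # 16 / 8 = 2
z = half_width / standard_error    # ±2 is exactly 1 standard error

std_normal = NormalDist()          # standard normal: mean 0, sd 1
prob_within = std_normal.cdf(z) - std_normal.cdf(-z)

print(f"standard error = {standard_error}, z = {z}")
print(f"P(sample mean within ±2 of mu) = {prob_within:.4f}")
```

The computed value rounds to 0.6827; the table answer of 0.6826 comes from doubling the rounded table entry 0.3413.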

Question 2: The fact that the sampling distribution of sample means can be approximated by a normal probability distribution whenever the sample size is large is based on the
a. central limit theorem
b. fact that we have tables of areas for the normal distribution
c. assumption that the population has a normal distribution
d. None of these alternatives is correct.

Answer: There's not much to say here. The statement is essentially the definition of the Central Limit Theorem (see page 213), so answer A is correct. As a rule of thumb, a sample size of at least 30 is large enough for the approximation to hold for most population distributions.

Question 3: A population has a mean of 53 and a standard deviation of 21. A sample of 49 observations will be taken. The probability that the sample mean will be greater than 57.95 is
a. 0
b. .0495
c. .4505
d. .9505

Answer: Find the z-score of this sample mean: (57.95-53)/(21/√49) = 4.95/3 = 1.65. So the question becomes: What's the probability of a sample mean being more than 1.65 standard errors above the population mean? You know it can't be much, but it's greater than 0. Answer B is the only logical one. Of course, when we go to the cumulative normal distribution table, we find that z = 1.65 has 0.9505 area to its left, so the area to the right of 1.65 is 1 - 0.9505 = 0.0495.
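The same upper-tail lookup can be sketched in Python's standard library. This is just a check on the table, assuming the sample mean is approximately normal (n = 49); the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 53, 21, 49   # population mean, population sd, sample size
xbar = 57.95                # sample mean we are asking about

standard_error = sigma / sqrt(n)     # 21 / 7 = 3
z = (xbar - mu) / standard_error     # 4.95 / 3 = 1.65

# Area to the right of z under the standard normal curve
upper_tail = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P(sample mean > {xbar}) = {upper_tail:.4f}")
```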

Question 4: Suppose a sample of n = 50 items is drawn from a population of manufactured products and the weight, X, of each item is recorded. Prior experience has shown that the weight has a probability distribution with mu = 6 ounces and sigma = 2.5 ounces. Which of the following is true about the sampling distribution of the sample mean if a sample of size 50 is selected?
a) The mean of the sampling distribution is 6 ounces.
b) The standard deviation of the sampling distribution is 2.5 ounces.
c) The shape of the sample distribution is approximately normal.
d) All of the above are correct.

Answer:
A is true. A single sample's mean is not necessarily equal to the population mean, but the mean of the sampling distribution (the distribution of the means of all possible samples of size 50) equals the population mean, 6 ounces. This holds for any sample size.
B is not true. The standard deviation of the sampling distribution (the standard error) is not equal to the population standard deviation. It is smaller by a factor of 1/√n: σ/√n = 2.5/√50 ≈ 0.35 ounces.
C is not true as worded. The central limit theorem tells us that when the sample size is ≥ 30, the sampling distribution of the sample mean is approximately normal. However, the option refers to the shape of the sample distribution (the 50 recorded weights themselves), which is not necessarily normal.
D is not true since B and C are not true. That leaves answer A as the correct one.
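To see the difference between the population's spread and the sampling distribution's spread, here is a small simulation sketch. Big assumption: the question doesn't say what the population looks like, so I picked a skewed gamma distribution with mean 6 and standard deviation 2.5 purely for illustration. The seed and the number of repetitions are arbitrary choices too.

```python
import random
from math import sqrt
from statistics import fmean, stdev

MU, SIGMA, N = 6.0, 2.5, 50   # values from the question

# Hypothetical population: gamma with mean MU and sd SIGMA (my assumption).
# For a gamma distribution, mean = shape * scale and variance = shape * scale**2.
shape = (MU / SIGMA) ** 2     # 5.76
scale = SIGMA ** 2 / MU       # about 1.04

rng = random.Random(420)      # fixed seed so reruns give the same numbers

def sample_mean():
    """Draw one sample of size N and return its mean."""
    return fmean(rng.gammavariate(shape, scale) for _ in range(N))

# Approximate the sampling distribution with 5000 repeated samples
sample_means = [sample_mean() for _ in range(5000)]

mean_of_means = fmean(sample_means)
sd_of_means = stdev(sample_means)
theoretical_se = SIGMA / sqrt(N)   # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean is {MU})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
```

The mean of the sample means should land close to 6, while their standard deviation should land close to 0.35, not 2.5.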

Question 5: The owner of a fish market has an assistant who has determined that the weights of catfish are normally distributed, with mean of 3.2 pounds and standard deviation of 0.8 pound. If a sample of 25 fish yields a mean of 3.6 pounds, what is the Z-score for this observation?
a) 18.750
b) 2.500
c) 1.875
d) 0.750

Answer:
When evaluating the sample mean,
z = (xbar-μ)/(σ/√n) Note: This formula is not on the sheet.
= (3.6-3.2)/(0.8/√25)
= 0.4/0.16
= 2.5
So, answer B is correct.

Question 6: A 95% confidence interval for a population mean is determined to be 100 to 120. If the confidence coefficient is reduced to 0.90, the interval for mu
a. becomes narrower
b. becomes wider
c. does not change
d. becomes 0.1

Answer: No calculations are necessary here. It's completely conceptual. The general rule is: A higher level of confidence requires a wider confidence interval. Therefore, if we reduce the level of confidence to 90%, the confidence interval can be narrower. Answer A is the correct answer.
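If you'd like to see numbers anyway, here is a rough sketch. It assumes the original interval was built with a z critical value (the question doesn't say whether z or t was used), backs out the standard error from the 95% margin, and rebuilds the interval at 90%.

```python
from statistics import NormalDist

lower, upper = 100.0, 120.0      # the 95% interval from the question
center = (lower + upper) / 2     # 110, the sample mean
margin_95 = (upper - lower) / 2  # 10

def z_critical(confidence):
    """Two-sided z critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Assumption: margin = z * standard error, so back out the standard error
standard_error = margin_95 / z_critical(0.95)

margin_90 = z_critical(0.90) * standard_error
interval_90 = (center - margin_90, center + margin_90)

print(f"95% margin: {margin_95:.2f}")
print(f"90% margin: {margin_90:.2f}")
print(f"90% interval: {interval_90[0]:.2f} to {interval_90[1]:.2f}")
```

Under that assumption the margin shrinks from 10 to roughly 8.4, which matches the conceptual answer: lower confidence, narrower interval.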

Exhibit 8-3
The manager of a grocery store has taken a random sample of 100 customers. The average length of time it took these 100 customers to check out was 3.0 minutes. It is known that the standard deviation of the population of checkout times is 1 minute.

Question 7: Refer to Exhibit 8-3. The standard error of the mean equals
a. 0.001
b. 0.010
c. 0.100
d. 1.000

Answer: The standard error of the mean is:
σ/√n = 1/√100 = 1/10 = 0.1
The correct answer is C.

Question 8: Refer to Exhibit 8-3. With a .95 probability, the sample mean will provide a margin of error of
a. 1.96
b. 0.10
c. 0.196
d. 1.64

Answer: The margin of error is the plus/minus term in the confidence interval. In this case, since we know the population standard deviation, the margin of error term is:
z_α/2(σ/√n)
From the z-table, we find that z_0.025 = 1.96
Therefore,
margin of error, E = 1.96(1/√100) = 0.196
Answer C is correct.
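The Exhibit 8-3 calculations can be reproduced with a short standard-library sketch. The variable names are mine; the resulting interval is just the sample mean plus or minus the margin of error.

```python
from math import sqrt
from statistics import NormalDist

n = 100        # sample size (Exhibit 8-3)
xbar = 3.0     # sample mean checkout time, in minutes
sigma = 1.0    # known population standard deviation, in minutes

confidence = 0.95
alpha = 1 - confidence

z = NormalDist().inv_cdf(1 - alpha / 2)   # z for an upper-tail area of alpha/2
standard_error = sigma / sqrt(n)          # 1 / 10 = 0.1
margin = z * standard_error

interval = (xbar - margin, xbar + margin)

print(f"z = {z:.2f}, standard error = {standard_error}")
print(f"margin of error = {margin:.3f}")
print(f"95% interval: {interval[0]:.3f} to {interval[1]:.3f} minutes")
```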

Question 12: When the following hypotheses are being tested at a level of significance of α
H₀: μ ≥ 100 H_a: μ < 100
the null hypothesis will be rejected if the p-value is
a. < α
b. > α
c. > α/2
d. < α/2

Answer: First, we notice that this is a one-tailed hypothesis test. The rejection region is entirely to one side of the mean (here, the left-hand tail).
Our general rule is: "If p is low, H₀ must go." So, if p is less than α, we reject the null hypothesis. Answer A is correct.

Question 13: In order to test the following hypotheses at an α level of significance
H₀: μ ≤ 100 H_a: μ > 100
the null hypothesis will be rejected if the test statistic Z is
a. > Z_α
b. < Z_α
c. < -Z_α
d. > Z_α/2

Answer: We've got a one-tailed hypothesis again. This time, the rejection region is in the right-hand tail. Therefore, we reject H₀ if the test statistic is more extreme (i.e. further to the right) than Z_α. So answer A is correct.

Question 14: Your investment executive claims that the average yearly rate of return on the stocks she recommends is more than 10.0%. She takes a sample to prove her claim. The correct set of hypotheses is
a. H₀: μ = 10.0% H_a: μ ≠ 10.0%
b. H₀: μ ≤ 10.0% H_a: μ > 10.0%
c. H₀: μ ≥ 10.0% H_a: μ < 10.0%

Answer: I don't really like this question because it sounds like she's making a claim based on a status quo of the return rate being > 10%. Since the null hypothesis is about the status quo, I'm tempted to pick answer C. Unfortunately, that's not the right way to look at it in this case.

Rather, since her claim is that the return is greater than 10%, which does not contain an equal sign, that must be the alternative hypothesis, H_a. Therefore, the null hypothesis, H₀, is μ ≤ 10%. Answer B is correct.

Question 15: A soft drink filling machine, when in perfect adjustment, fills the bottles with 12 ounces of soft drink. Any over filling or under filling results in the shutdown and readjustment of the machine. To determine whether or not the machine is properly adjusted, the correct set of hypotheses is
a. H₀: μ > 12 H_a: μ ≤ 12
b. H₀: μ ≤ 12 H_a: μ > 12
c. H₀: μ = 12 H_a: μ ≠ 12

Answer: This one's a gimme. The null hypothesis H₀ is that the machine is continuing to work properly and μ = 12. The alternative hypothesis, H_a is that it is filling with some other mean volume and μ ≠ 12. Correct answer is C.

Question 16: A two-tailed test is performed at 95% confidence. The p-value is determined to be 0.11. The null hypothesis
a. must be rejected
b. should not be rejected
c. could be rejected, depending on the sample size
d. has been designed incorrectly

Answer: Since the level of significance is 5%, the combined area of the two-tailed rejection region is 0.05, i.e., 0.025 in each tail. The p-value is 0.11. We remember our mantra: If p is low, H₀ must go! But p is not lower than 0.05. Therefore, we do not reject H₀ and answer B is correct.

Question 17: For a one-tailed hypothesis test (upper tail) the p-value is computed to be 0.034. If the test is being conducted at 95% confidence, the null hypothesis
a. could be rejected or not rejected depending on the sample size
b. could be rejected or not rejected depending on the value of the mean of the sample
c. is not rejected
d. is rejected

Answer: Level of significance is 5% = 0.05. p is 0.034. Repeat after me: If p is low, H₀ must go! In this case, yes, p is lower than the level of significance and therefore H₀ is rejected. Answer D is correct.

Note: If this had been a two-tailed test, then the 0.05 rejection region would have been split between the two tails, each having 0.025. In that case, it's not clear whether p = 0.034 is lower than 0.025 unless we know whether p was calculated on one side (as we did in class) or on both sides (as is done in the textbook). I asked Prof. Selcuk about this in an email and he replied that he would avoid such ambiguous cases on the final exam.
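The mantra boils down to a single comparison, sketched below. The function name `reject_null` is my own invention. The last part illustrates the ambiguity in the note: if 0.034 is the area in one tail only, the matching two-sided p-value for a symmetric test statistic is twice that.

```python
def reject_null(p_value, alpha):
    """If p is low, H0 must go: reject when the p-value is below alpha."""
    return p_value < alpha

alpha = 0.05   # 95% confidence means a 5% level of significance

q16 = reject_null(0.11, alpha)    # Question 16: two-tailed, p = 0.11
q17 = reject_null(0.034, alpha)   # Question 17: upper-tail, p = 0.034

# The ambiguous two-tailed case from the note:
one_tail_p = 0.034
two_sided_p = 2 * one_tail_p      # doubling assumes a symmetric z or t statistic
two_tailed_reading = reject_null(two_sided_p, alpha)

print(f"Question 16 reject H0? {q16}")
print(f"Question 17 reject H0? {q17}")
print(f"If 0.034 is a one-tail area in a two-tailed test: p = {two_sided_p:.3f}, reject H0? {two_tailed_reading}")
```

Comparing the one-tail area 0.034 against 0.025 gives the same decision as comparing the doubled value 0.068 against 0.05.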

Exhibit 9-1
n = 36
xbar = 24.6
S = 12
H₀: μ ≤ 20
H_a: μ > 20

Question 18: Refer to Exhibit 9-1. The test statistic (t-score of xbar) is
a. 2.3
b. 0.38
c. -2.3
d. -0.38

Answer: The formula (on the formula sheet) for the t test statistic is:
t = (xbar - μ₀)/(s/√n)
= (24.6-20)/(12/√36)
= 4.6/2 = 2.3
A is the correct answer.

Question 19: Refer to Exhibit 9-1. If the test is done at 95% confidence, the null hypothesis should
a. not be rejected
b. be rejected
c. Not enough information is given to answer this question.
d. None of these alternatives is correct.

Answer: The hypotheses in Exhibit 9-1 (H₀: μ ≤ 20, H_a: μ > 20) make this a one-tail (upper-tail) test, so the entire rejection region is in the right-hand tail. Refer to the t distribution table and look up the t value for n - 1 = 35 degrees of freedom and a 0.05 area in the tail. We find that t value to be approximately 1.69. Our t test statistic is 2.3, which is greater than 1.69, indicating that we should reject the null hypothesis, H₀.

Just to be sure, let's see what would happen if it were a two-tail test, so the rejection region is only 0.025 on each side. Referring to the t distribution table again, we find the t value for 35 degrees of freedom and a 0.025 area is approximately 2.03. Again, our t test statistic is more extreme than the critical t value. Therefore, reject the null hypothesis, H₀.

Answer B is correct.
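Here is a minimal sketch of the Exhibit 9-1 test. Python's standard library has no t distribution, so the critical values are simply typed in from the table lookups quoted above (1.69 and 2.03); they are not computed.

```python
from math import sqrt

# Exhibit 9-1
n = 36        # sample size
xbar = 24.6   # sample mean
s = 12        # sample standard deviation
mu0 = 20      # hypothesized mean from H0

t_stat = (xbar - mu0) / (s / sqrt(n))   # 4.6 / 2 = 2.3
df = n - 1                              # 35 degrees of freedom

# Critical values copied from the t table, not computed here
T_CRIT_ONE_TAIL = 1.69   # 0.05 in the upper tail, 35 df
T_CRIT_TWO_TAIL = 2.03   # 0.025 in each tail, 35 df

reject_one_tail = t_stat > T_CRIT_ONE_TAIL
reject_two_tail = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print(f"reject H0 (upper-tail test)? {reject_one_tail}")
print(f"reject H0 (two-tail check)?  {reject_two_tail}")
```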

Question 20: In regression analysis if the dependent variable is measured in dollars, the independent variable
a. must also be in dollars
b. must be in some units of currency
c. can be any units
d. can not be in dollars

Answer: This is entirely conceptual. The units of the dependent variable and the units of the independent variable have nothing to do with each other. Think of the site.mtw example that we were using extensively in class. The dependent variable was store sales (measured in dollars) and the independent variable was the size of the store (measured in square feet). The correct answer is C - the independent variable can be in any units.

Question 21: In a regression analysis, if SST=4500 and SSE=1575, then the coefficient of determination (R²) is
a. 0.35
b. 0.65
c. 2.85
d. 0.45

Answer: Since SST=SSE+SSR, SSR=4500-1575=2925. And R²=SSR/SST=2925/4500=0.65. Therefore, answer B is correct.
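The arithmetic is small enough to do by hand, but as a quick sketch it looks like this. It also shows the equivalent shortcut R² = 1 - SSE/SST, which follows directly from SST = SSE + SSR.

```python
sst = 4500   # total sum of squares (from the question)
sse = 1575   # error (residual) sum of squares (from the question)

ssr = sst - sse                # regression sum of squares: 2925
r_squared = ssr / sst          # 2925 / 4500
r_squared_alt = 1 - sse / sst  # same quantity, computed from SSE directly

print(f"SSR = {ssr}")
print(f"R^2 = {r_squared:.2f}")
```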

Question 22: Regression analysis was applied between sales (Y in $1,000) and advertising (X in $100), and the following estimated regression equation was obtained.
Y-hat = 80 + 6.2 X
Based on the above estimated regression line, if advertising is $10,000, then the point estimate for sales (in dollars) is
a. $62,080
b. $142,000
c. $700
d. $700,000

Answer: When a question is this easy, you know there's some sort of trick. Watch your units!! Since X is in hundreds of dollars, plug in 100 in the regression equation. Y = 80 + 6.2(100) = 700. Y is in thousands of dollars. Therefore, the point estimate for sales in dollars is $700,000 - answer D.
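One way to keep the units straight is to do the conversions explicitly, as in this sketch. The function name is hypothetical; the coefficients 80 and 6.2 are from the question. The last lines show the trap: plugging 10,000 straight into the equation produces the number behind wrong answer A.

```python
def predict_sales_dollars(advertising_dollars):
    """Point estimate of sales in dollars for advertising given in dollars."""
    x = advertising_dollars / 100   # X is measured in units of $100
    y_hat = 80 + 6.2 * x            # Y-hat is measured in units of $1,000
    return y_hat * 1000             # convert thousands of dollars to dollars

estimate = predict_sales_dollars(10_000)
print(f"point estimate for sales: ${estimate:,.0f}")

# The trap: forgetting to convert advertising into hundreds of dollars
naive_y_hat = 80 + 6.2 * 10_000
print(f"unconverted plug-in gives Y-hat = {naive_y_hat:,.0f}")
```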

Question 23: If the coefficient of correlation is a positive value, then
a. the intercept must also be positive
b. the coefficient of determination (R²) can be either negative or positive, depending on the value of the slope
c. the regression equation could have either a positive or a negative slope
d. the slope of the line must be positive

Answer: We learned about the coefficient of correlation way back in Chapter 3. It's a measure of the strength of the linear relationship between x and y. Its values range from -1 to 1. Values close to -1 or 1 indicate a strong linear relationship, either negative or positive.

Answer A is incorrect because the coefficient of correlation tells us nothing about the intercept.
Answer B is incorrect because the coefficient of determination (r²) can never be negative. r² = SSR/SST, and SSR and SST are both sums of squares, so neither can be negative and r² must fall between 0 and 1.
Answer C is incorrect because a positive coefficient of correlation indicates a positive relationship which would be modeled with a positive slope.
Answer D is correct.

Exhibit 14-10
The following information regarding a dependent variable Y and an independent variable X is provided.
n = 4
∑ X = 16
∑ Y = 28
∑ (x-xbar)(y-ybar) = -8
∑ (x-xbar)² = 8

Question 24: Refer to Exhibit 14-10. The slope of the regression function is
a. -1
b. 1.0
c. 11
d. 0.0

Answer: On the formula sheet we have the formula for the regression slope, b₁:
b₁ = ∑ (x-xbar)(y-ybar) / ∑ (x-xbar)² = -8/8 = -1.
So answer A is correct.

Question 25: Refer to Exhibit 14-10. The intercept of the regression line is
a. -1
b. 1.0
c. 11
d. 0.0

Answer: Again, the formula sheet gives us the computation for the intercept, b₀:
b₀ = ybar - b₁xbar = (28/4) - (-1)(16/4) = 7 + 4 = 11.
So answer C is correct.
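Both Exhibit 14-10 answers can be reproduced from the summary statistics alone, as in this sketch. The names `sxy` and `sxx` are my shorthand for the two sums, not notation from the formula sheet.

```python
# Exhibit 14-10 summary statistics
n = 4
sum_x = 16
sum_y = 28
sxy = -8   # sum of (x - xbar)(y - ybar)
sxx = 8    # sum of (x - xbar)^2

xbar = sum_x / n   # 4
ybar = sum_y / n   # 7

b1 = sxy / sxx          # slope: -8 / 8 = -1
b0 = ybar - b1 * xbar   # intercept: 7 - (-1)(4) = 11

print(f"slope b1 = {b1}")
print(f"intercept b0 = {b0}")
print(f"estimated regression line: y-hat = {b0} + ({b1})x")
```

A handy sanity check: the least-squares line always passes through (xbar, ybar), and 11 + (-1)(4) = 7.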

More answers to sample problems to come. (I'm kinda jumping around for now.)

Monday, March 10, 2008

Final Exam Study Guide - Practice Questions
