GSB420 - Business Statistics: Final Exam Study Guide - Practice Questions

In this post, I'll go over the answers to the "regular" questions from last quarter's final. I'll also note which chapter each question is from.

Question 1 (Chapter 12): You would like to estimate the income of a person based on his age. The following data shows the yearly income (in $1,000) and age of a sample of seven individuals.
Income (in $1,000) Age
20                 18
24                 20
24                 23
25                 34
26                 24
27                 27
34                 27
a. Develop the least squares regression equation.
b. Estimate the yearly income of a 30-year-old individual.

Answer:
a. In order to calculate b₀ and b₁, we need to first calculate the mean of X (age) and Y (income). For xbar, I calculated 24.71 and for ybar, I got 25.71. To calculate b₁, we need to calculate x_i-xbar and y_i-ybar for each i:

Income  Age  x_i-xbar  y_i-ybar  (x_i-xbar)(y_i-ybar)  (x_i-xbar)²
20      18   -6.71     -5.71      38.31                45.02
24      20   -4.71     -1.71       8.05                22.18
24      23   -1.71     -1.71       2.92                 2.92
25      34    9.29     -0.71      -6.60                86.30
26      24   -0.71      0.29      -0.21                 0.50
27      27    2.29      1.29       2.95                 5.24
34      27    2.29      8.29      18.98                 5.24

The sum of the (x_i-xbar)(y_i-ybar) column is 64.4. The sum of the (x_i-xbar)² column is 167.4. Therefore, b₁ is 64.4/167.4 = 0.3848, or 0.38 rounded.
We can also calculate b₀ = ybar - b₁xbar = 25.71 - (0.3848)(24.71) = 16.2. (Use the unrounded slope here; plugging in the rounded 0.38 gives 16.32 instead.)
Therefore, the regression equation is y = 16.2 + 0.38x.

b. Use the equation to estimate y for x=30:
y = 16.2 + 0.38(30) = 27.6, which is $27,600 annual income.
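
If you want to double-check the hand calculation, here is a minimal Python sketch of the same steps. It is not part of the exam solution, and the variable names are just my own choices. Because it carries full precision instead of rounding b₁ to 0.38, its prediction comes out a little higher than 27.6.

```python
# Least squares fit for Question 1, using only the standard library.
ages = [18, 20, 23, 34, 24, 27, 27]        # x: age
incomes = [20, 24, 24, 25, 26, 27, 34]     # y: income in $1,000

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(incomes) / n

# Sum of cross-products and sum of squared x deviations
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, incomes))
s_xx = sum((x - x_bar) ** 2 for x in ages)

b1 = s_xy / s_xx              # slope
b0 = y_bar - b1 * x_bar       # intercept

prediction = b0 + b1 * 30     # estimated income (in $1,000) at age 30

print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")
print(f"Predicted income at age 30: {prediction:.2f} (in $1,000)")
```

Unrounded, the slope is roughly 0.385 and the prediction roughly 27.75, so the gap from 27.6 is purely a rounding effect.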

Question 2 (Chapter 12): Below you are given a partial computer output based on a sample of 8 observations, relating an independent variable (x) and a dependent variable (y).
              Coefficient Standard Error
Intercept     13.251      10.77
X             0.803       0.385

Analysis of Variance
SOURCE            SS
Regression
Error (Residual)  41.674
Total             71.875
a. Develop the estimated regression line.
b. At α = 0.05, test for the significance of the slope.
c. Determine the coefficient of determination (R²).

Answer:
a. This one's a lot easier than #1. No calculations necessary, just the ability to pull b₀ and b₁ out of the computer output. They're the coefficients of the intercept and X. So the regression equation becomes:
y = 13.251 + 0.803x

b. The t score for the slope is t = b₁/s_b₁.
From part a, we know that b₁ = 0.803.
s_b₁ is given in the computer output as the standard error of the X coefficient = 0.385.
Therefore, t = 0.803/0.385 = 2.086.
Looking at the t distribution table for n-2=6 degrees of freedom and α/2=0.025, we find a critical t value of 2.447. Since the t score of 2.086 is less than 2.447, we do not reject the null hypothesis H₀: β₁ = 0. At α = 0.05, the sample does not provide enough evidence of a linear relationship between x and y.
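
The slope test can be sketched in Python as below. This is only an illustration: the critical value is copied from the t table rather than computed, and the variable names are mine.

```python
# t test for the slope in Question 2.
b1 = 0.803          # slope coefficient from the computer output
se_b1 = 0.385       # standard error of the slope from the computer output
n = 8               # sample size

t_stat = b1 / se_b1
df = n - 2

# Critical value copied from a t table for df = 6, alpha/2 = 0.025
t_crit = 2.447

# Two-tailed test: reject H0 (slope = 0) only if |t| exceeds the critical value
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```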

c. R² = SSR/SST. But SSR was conveniently removed from the computer output, so we need to calculate it from SSR = SST-SSE = 71.875-41.674 = 30.201.
Therefore, R² = 30.201/71.875 = 0.42. In other words, about 42% of the variation in y is explained by x.
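
As a quick arithmetic check, here is the same calculation as a short Python sketch (variable names are my own):

```python
# Coefficient of determination for Question 2 from the ANOVA sums of squares.
sst = 71.875        # total sum of squares (given)
sse = 41.674        # error (residual) sum of squares (given)

ssr = sst - sse     # regression sum of squares, missing from the output
r_squared = ssr / sst

print(f"SSR = {ssr:.3f}, R^2 = {r_squared:.2f}")
```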

Question 3 (Chapter 9): A sample of 81 account balances of a credit company showed an average balance of $1,200 with a standard deviation of $126.
a. Formulate the hypotheses that can be used to determine whether the mean of all account balances is significantly different from $1,150.
b. Let α = .05. Using the critical value approach what is your conclusion?

Answer:
a. Since we want to know if the mean is "significantly different" from $1,150, the null hypothesis is that it is $1,150.
H₀: μ = 1150
H₁: μ ≠ 1150

b. Since we don't have the population standard deviation, use the t test statistic.
t = (xbar-μ₀)/(s/√n)
= (1200-1150)/(126/√81)
= 50/14
= 3.57
The critical value for t for 80 degrees of freedom and α/2=0.025 is 1.990.
Since the t-value=3.57 is greater than the critical value of 1.990, we reject H₀ and conclude that the mean is significantly different from $1,150.
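
Here is a minimal Python sketch of the same two-tailed test. The critical value is taken from the t table rather than computed, and the variable names are my own.

```python
import math

# One-sample, two-tailed t test for Question 3.
n = 81              # sample size
x_bar = 1200.0      # sample mean balance
s = 126.0           # sample standard deviation
mu_0 = 1150.0       # hypothesized population mean

std_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / std_error

# Critical value copied from a t table for df = 80, alpha/2 = 0.025
t_crit = 1.990

reject_h0 = abs(t_stat) > t_crit
print(f"standard error = {std_error:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```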

Question 4 (Chapter 8): A statistician selected a sample of 16 accounts receivable and determined the mean of the sample to be $5,000 with a sample standard deviation of $400. He reported that the sample information indicated the mean of the population ranges from $4,739.80 to $5,260.20. He neglected to report what confidence level (1-α) he had used. Based on the above information, determine the confidence level that was used.

Answer: The statistician is reporting a confidence interval of 5000 ± 260.20. He only mentions the sample standard deviation (not the population std dev), so he must be using the t-distribution and the formula: xbar ± t_{n-1, α/2}(s/√n).

So we have:
260.2 = t(s/√n)
260.2 = t (400/√16)
260.2 = 100t
t = 2.602

Looking across the row for 15 degrees of freedom in the t distribution table, we find that t = 2.602 corresponds to α/2 = 0.01. So α = 0.02 and the confidence level is 1-0.02 = 0.98 = 98%.
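
On the exam you would use the table, but the lookup can also be checked numerically. The sketch below is my own illustration, not a method from the course: it integrates the t density with Simpson's rule to find the area between -t and +t for 15 degrees of freedom. The helper names `t_pdf` and `central_area` are made up for this example.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def central_area(t, df, steps=2000):
    """Area under the t density between -t and +t (Simpson's rule, steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    half = total * h / 3               # area from 0 to t
    return 2 * half                    # the density is symmetric about 0

n = 16
s = 400.0
margin = 5260.20 - 5000.0              # half-width of the reported interval

t_mult = margin / (s / math.sqrt(n))   # solve margin = t * s / sqrt(n) for t
confidence = central_area(t_mult, n - 1)

print(f"t = {t_mult:.3f}, confidence level = {confidence:.3f}")
```

The area comes out very close to 0.98, which agrees with the table lookup.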

Question 5 (Chapter 12): The director of graduate studies at a college of business would like to predict the grade point index (GPI) of students in an MBA program based on their GMAT scores. A sample of 20 students is selected. The result of the regression is summarized in the following Minitab output.
Regression Analysis: GPI versus GMAT

The regression equation is
GPI = 0.300 + 0.00487 GMAT

Predictor         Coef         SE Coef         T
Constant        0.3003          0.3616      0.83
GMAT         0.0048702           [ N ]     [ M ]

S = 0.155870 R-Sq = 79.8%

Analysis of Variance

Source             DF         SS         MS         F         P
Regression          1     1.7257     1.7257     71.03     0.000
Residual Error     18     0.4373     0.0243
Total              19     2.1631
a) Given that Σ(X_i-xbar)² = 72757.2 , where X = GMAT, compute N.
b) Compute M and interpret the result. In particular do we reject the underlying hypothesis (which hypothesis) or not?

Answer:
a. N is what we usually call the standard error of the slope, s_b₁. (This is the hardest part of the problem - figuring out what's missing in the Minitab output.) From the formula sheet, we know:
s_b₁ = S_XY/√SSX

We're given SSX, but we need to calculate S_XY from the formula:
S_XY = √(SSE/(n-2)).

We have SSE from the output: SSE = 0.4373. So,
S_XY = √(0.4373/18) = 0.156
(This matches the S = 0.155870 printed in the Minitab output, which is a handy check.)

Therefore,
s_b₁ = 0.156/√72757.2 = 0.156/269.7 = 0.00058
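
The two formulas chain together as in this short Python sketch (variable names are my own, not Minitab's):

```python
import math

# Standard error of the slope for Question 5 (the missing value N).
sse = 0.4373        # residual sum of squares from the ANOVA table
n = 20              # sample size
ssx = 72757.2       # sum of squared deviations of GMAT, given in the question

s_xy = math.sqrt(sse / (n - 2))     # standard error of the estimate
se_b1 = s_xy / math.sqrt(ssx)       # standard error of the slope

print(f"S_XY = {s_xy:.4f}, s_b1 = {se_b1:.5f}")
```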

b. M is the t-score for the slope which is given by:
t = b₁/s_b₁
= 0.0048702/0.00058
= 8.4

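The question doesn't specify α, so I used α = 0.01. The critical value for t for 18 degrees of freedom and α/2=0.005 is 2.878. Therefore, since our t-score of 8.4 is greater than the critical t-value, we reject the null hypothesis H₀: β₁ = 0 (no linear relationship between GMAT and GPI) and conclude that GMAT score is a significant predictor of GPI.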
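
One way to sanity-check M is to use the fact that, in a simple linear regression, the square of the slope's t statistic equals the F statistic in the ANOVA table. The Python sketch below is my own illustration of that check; it recomputes s_b₁ without rounding, so t comes out near 8.43 rather than 8.4.

```python
import math

# t statistic for the slope in Question 5 (the missing value M).
b1 = 0.0048702      # slope coefficient from the Minitab output
sse = 0.4373        # residual sum of squares
ssx = 72757.2       # sum of squared deviations of GMAT
n = 20              # sample size
f_stat = 71.03      # F statistic from the ANOVA table

se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(ssx)
t_stat = b1 / se_b1

# Critical value copied from a t table for df = 18, alpha/2 = 0.005
t_crit = 2.878
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat}")
print(f"reject H0 (slope = 0): {reject_h0}")
```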

Wednesday, March 12, 2008
