Friday, February 29, 2008

Lecture 8 - Residual Analysis - Definition

Residual Analysis
In Lecture 7 we discussed how to use the method of least squares to perform simple linear regression on a set of data. We also discussed the four assumptions we must make about our data in order to use least-squares regression:
1. Linearity
2. Independence of errors
3. Normality of error
4. Equal variance of errors

The error, also known as the residual, is the difference between the observed value Yi, for a particular Xi, and the value predicted for it by our regression model, which is usually symbolized by Ŷi (read "y hat sub i"). The residual is symbolized by the lowercase Greek letter epsilon, εi.

εi = Yi - Ŷi
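As a concrete illustration, here is a minimal Python sketch that fits a least-squares line and computes the residuals from this definition. The use of NumPy and the small data set are my own assumptions for the example; they are not from the lecture:

```python
import numpy as np

# Hypothetical example data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit Y = b0 + b1*X by least squares
# (np.polyfit returns coefficients from highest degree down)
b1, b0 = np.polyfit(x, y, 1)

# Predicted values: Ŷi
y_hat = b0 + b1 * x

# Residuals: εi = Yi - Ŷi
residuals = y - y_hat
print(residuals)
```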

We perform a four-part residual analysis on our data to evaluate whether each of the four assumptions holds and, based on the outcome, determine whether a linear regression model is appropriate for the data.

It's called a residual analysis because three of the four assumptions (independence, normality, and equality of variance) relate directly to the errors (the residuals), and the remaining assumption (linearity) is also tested by examining the residuals.
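As a rough sketch of what such an analysis can look like in practice, the Python code below (assuming NumPy and Matplotlib; the particular plots are a common choice of mine, not prescribed by the lecture) inspects the residuals from the earlier example. Residuals plotted against the fitted values speak to linearity and equal variance, residuals plotted in observation order speak to independence, and a histogram speaks to normality:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data and fit, as in the previous sketch
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x
residuals = y - y_hat

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Linearity / equal variance: residuals vs. fitted values should show
# no curvature and a roughly constant spread around zero
axes[0].scatter(y_hat, residuals)
axes[0].axhline(0, color="gray")
axes[0].set(xlabel="Fitted value (Ŷi)", ylabel="Residual (εi)",
            title="Residuals vs. fitted")

# Independence: residuals in observation order should show no pattern
axes[1].plot(residuals, marker="o")
axes[1].axhline(0, color="gray")
axes[1].set(xlabel="Observation order", title="Residuals vs. order")

# Normality: the distribution of residuals should look roughly normal
axes[2].hist(residuals, bins=10)
axes[2].set(xlabel="Residual", title="Histogram of residuals")

plt.tight_layout()
plt.show()
```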
