Monday, February 25, 2008

The Least-Squares Method

The Method of Least Squares
As described in the previous post, the least-squares method minimizes the sum of the squared differences between the y-values predicted by the model and the observed y-values.

In mathematical terms, we need to minimize the following:
∑ (yi - (β0 + β1xi))²

All the yi and xi are known and constant, so this can be looked at as a function of β0 and β1. We need to find the β0 and β1 that minimize the total sum.
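To make that concrete, here is a minimal Python sketch of the quantity we are minimizing, written as a function of the two coefficients (the names sse, xs, and ys are just illustrative):

    def sse(b0, b1, xs, ys):
        # Sum of squared errors for one candidate pair of coefficients.
        return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

    # xs and ys are fixed data, so sse varies only with b0 and b1;
    # least squares looks for the pair that makes it as small as possible.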

From calculus we remember that to minimize a function, we take the derivative of the function, set it to zero and solve. Since this is a function of two variables, we take two derivatives - the partial derivative with respect to β0 and the partial derivative with respect to β1.
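If you're curious what that step looks like in practice, here is a rough sketch using Python's sympy library to do the differentiation and solving symbolically; the data values are made up purely for illustration:

    import sympy as sp

    b0, b1 = sp.symbols('b0 b1')
    xs = [1, 2, 3, 4]
    ys = [2.1, 3.9, 6.2, 8.1]

    # The sum of squared errors as a symbolic expression in b0 and b1.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

    # Set both partial derivatives to zero and solve for b0 and b1.
    print(sp.solve([sp.diff(sse, b0), sp.diff(sse, b1)], [b0, b1]))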

Don't worry! We won't need to do any of this in practice - it was all worked out years ago and the generalized solutions are well known.

To find b0 and b1 (a short code sketch follows these steps):
1. Calculate xbar and ybar, the mean values for x and y.
2. Calculate the difference between each x and xbar. Call it xdiff.
3. Calculate the difference between each y and ybar. Call it ydiff.
4. b1 = [∑(xdiff)(ydiff)] / [∑(xdiff²)]
5. b0 = ybar - b1xbar
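Here is a minimal Python sketch of steps 1 through 5; the function and variable names (least_squares, xs, ys) are just illustrative:

    def least_squares(xs, ys):
        xbar = sum(xs) / len(xs)          # step 1: mean of x
        ybar = sum(ys) / len(ys)          # step 1: mean of y
        xdiff = [x - xbar for x in xs]    # step 2
        ydiff = [y - ybar for y in ys]    # step 3
        b1 = sum(xd * yd for xd, yd in zip(xdiff, ydiff)) / sum(xd ** 2 for xd in xdiff)  # step 4
        b0 = ybar - b1 * xbar             # step 5
        return b0, b1

    # Made-up points lying near the line y = 1 + 2x.
    print(least_squares([1, 2, 3, 4], [3.1, 4.9, 7.2, 9.0]))

For these made-up points the estimates come out to roughly b0 ≈ 1.05 and b1 ≈ 2.0, close to the line the data were built around.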

Notice that we switched from using β to using b? That's because β is used for the regression coefficients of the actual linear relationship. b is used to represent our estimate of the coefficients determined by the least squares method. We may or may not be correctly estimating β with our b. We can only hope!
