Let's derive least squares regression because I'm rusty.
Our model for the data is a linear equation with two parameters, #alpha# and #beta#.
#hat{y} = alpha x + beta #
Our total error is the sum of the squared residuals for each data point.
# E=sum_{i=1}^n ( y_i - hat{y_i})^2 =sum (y_i - alpha x_i - beta )^2 #
To control clutter I'll just write #sum# for #sum_{i=1}^n.#
We minimize #E# by setting the partials to zero:
#0 = {partial E}/{partial alpha} = -2 sum x_i(y_i - alpha x_i - beta ) #
# sum x_i y_i = alpha sum x_i^2 + beta sum x_i #
#0 = {partial E}/{partial beta } = -2 sum (y_i - alpha x_i - beta ) #
# sum y_i = alpha sum x_i + n beta #
The #n beta# in that last one comes from #sum_{i=1}^n beta = n beta .#
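If you want a sanity check on those partials before trusting the algebra, here is a minimal Python sketch that compares them with central finite differences. The function names, the sample points and the trial parameter values are all made up for illustration. Note that it keeps the overall factor of -2, which disappears once the partials are set to zero.

```python
def total_error(alpha, beta, points):
    """Sum of squared residuals E for the line y = alpha*x + beta."""
    return sum((y - alpha * x - beta) ** 2 for x, y in points)


def analytic_partials(alpha, beta, points):
    """Partials of E with respect to alpha and beta, factor of -2 included."""
    d_alpha = -2 * sum(x * (y - alpha * x - beta) for x, y in points)
    d_beta = -2 * sum(y - alpha * x - beta for x, y in points)
    return d_alpha, d_beta


def numeric_partials(alpha, beta, points, h=1e-6):
    """Central finite differences of E in alpha and in beta."""
    d_alpha = (total_error(alpha + h, beta, points)
               - total_error(alpha - h, beta, points)) / (2 * h)
    d_beta = (total_error(alpha, beta + h, points)
              - total_error(alpha, beta - h, points)) / (2 * h)
    return d_alpha, d_beta


# Hypothetical sample points and trial parameters, chosen only for illustration.
sample_points = [(0.0, 1.0), (1.0, 3.0), (2.0, 4.0), (4.0, 9.0)]
analytic = analytic_partials(0.5, -1.0, sample_points)
numeric = numeric_partials(0.5, -1.0, sample_points)
print(analytic)
print(numeric)
```

The two printed pairs should agree to several decimal places, since #E# is quadratic in both parameters.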
We have two equations in two unknowns. I remember #MA=S# has the solution #A=M^{-1} S# (as long as #M# is invertible), where here #A=(alpha, beta)^T#. For a two-by-two matrix #M=((a, b),(c, d))# and #S=(s,t)^T# we get
#M^{-1}S = 1/{ad-bc}(ds -bt, -cs+at )^T#
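The two-by-two formula is easy to get backwards, so here is a small Python sketch of it. The function name `solve_2x2` and the example system are my own inventions for illustration; it uses `fractions.Fraction` to keep the arithmetic exact.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d, s, t):
    """Solve [[a, b], [c, d]] (p, q)^T = (s, t)^T with the explicit 2x2 inverse."""
    a, b, c, d, s, t = (Fraction(v) for v in (a, b, c, d, s, t))
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so there is no unique solution")
    p = (d * s - b * t) / det
    q = (-c * s + a * t) / det
    return p, q


# Hypothetical system, chosen only for illustration:
#   2p +  q = 5
#    p + 3q = 10
solution = solve_2x2(2, 1, 1, 3, 5, 10)
print(solution[0], solution[1])  # prints "1 3"
```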
Back to the problem. Let's declutter even more and write
#sum x_i = n bar{x}, sum x_i y_i = n bar{xy}, quad sum x_i ^2 = n bar{x^2},# etc. We rewrite our system, cancelling the #n#s:
# bar{xy} = alpha bar{x^2} + beta bar{x} #
#bar{y} = alpha bar{x} + beta #
Applying the two-by-two formula, we substitute
#a=bar{x^2}, b=c=bar{x}, d=1, s=bar{xy}, t=bar{y}#
giving
#alpha = { bar{xy} - bar{x} \ bar{y}}/{ bar{x^2} - bar{x}^2 }#
#beta = { - bar{x} \ bar{xy} + bar{x^2} \ bar{y}}/{ bar{x^2} - bar{x}^2 }#
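Those two formulas translate almost directly into code. This is a minimal Python sketch, not a library routine: `fit_line` is a name I made up, and the test points are hypothetical ones placed exactly on #y = 2x + 1# so we know what the answer should be.

```python
from fractions import Fraction


def fit_line(xs, ys):
    """Return (alpha, beta) for the least squares line y = alpha*x + beta."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need equally many x and y values, and at least one")
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_x2 = sum(x * x for x in xs) / n
    denom = mean_x2 - mean_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    alpha = (mean_xy - mean_x * mean_y) / denom
    beta = (mean_x2 * mean_y - mean_x * mean_xy) / denom
    return alpha, beta


# Hypothetical points lying exactly on y = 2x + 1.
alpha, beta = fit_line([0, 1, 2, 5], [1, 3, 5, 11])
print(alpha, beta)  # prints "2 1"
```

The denominator is zero exactly when every #x_i# is the same, which is the case where no unique line exists.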
I don't know if that's right, but it's giving me flashbacks. Let's try our numbers.
#{:(x, y, x^2, y^2, xy),(1, 8, 1, 64, 8),(2, 7, 4, 49, 14),(3, 5, 9, 25, 15),(6, 20, 14, 138, 37):}#
The last row holds the column totals.
#alpha = { 37/3 -(6/3)(20/3) } /{ (14/3) -(6/3)^2 } = -3/2 #
# beta = { - (6/3) (37/3) + (14/3)(20/3) }/{ (14/3) -(6/3)^2 } = 29/3 #
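The fractions above are fiddly to do by hand, so here is a short Python sketch that rebuilds the column totals from the three data points and plugs them into the formulas for #alpha# and #beta#. The variable names are mine; `fractions.Fraction` keeps everything exact.

```python
from fractions import Fraction

# The three (x, y) data points from the table.
points = [(1, 8), (2, 7), (3, 5)]
n = len(points)

# Column totals: x, y, x^2, y^2, xy.
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
sum_x2 = sum(x * x for x, _ in points)
sum_y2 = sum(y * y for _, y in points)
sum_xy = sum(x * y for x, y in points)
totals = (sum_x, sum_y, sum_x2, sum_y2, sum_xy)

# Averages, as exact fractions.
mean_x = Fraction(sum_x, n)
mean_y = Fraction(sum_y, n)
mean_x2 = Fraction(sum_x2, n)
mean_xy = Fraction(sum_xy, n)

denom = mean_x2 - mean_x ** 2
slope = (mean_xy - mean_x * mean_y) / denom
intercept = (mean_x2 * mean_y - mean_x * mean_xy) / denom

print(totals)            # prints "(6, 20, 14, 138, 37)"
print(slope, intercept)  # prints "-3/2 29/3"
```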
Model:
#hat{y} = -3/2 x + 29/3#
Check:
Let's calculate the squared error:
# ( 8 - (-3/2(1) + (29/3) ) )^2 + ( 7 - (-3/2(2) + (29/3) ) )^2 + ( 5 - (-3/2(3) + (29/3) ) )^2 = 1/6 #
The theory says if we change #alpha# or #beta# by a little bit (or a lot) we'll always get a bigger error. Let's pop a few into the computer:
# ( 8 - (- 1.501 (1) + (29/3)) )^2 + ( 7 - (-1.501(2) + (29/3)) )^2 + ( 5 - (-1.501(3) + (29/3)) )^2 approx 0.16668067 #
# ( 8 - (-3/2(1) + 10 ) )^2 + ( 7 - (-3/2(2) + 10) )^2 + ( 5 - (-3/2(3) + 10 ) )^2 = 1/2#
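The same check can be scripted. This is a minimal Python sketch with names of my own choosing: it recomputes the three errors above exactly, then tries a small grid of other nudges to #alpha# and #beta#. The grid spacing of 1/10 is an arbitrary choice for illustration.

```python
from fractions import Fraction

# The three (x, y) data points from the table.
points = [(1, 8), (2, 7), (3, 5)]


def squared_error(alpha, beta):
    """Total squared error of the line y = alpha*x + beta on the data."""
    return sum((y - (alpha * x + beta)) ** 2 for x, y in points)


best_alpha = Fraction(-3, 2)
best_beta = Fraction(29, 3)

best = squared_error(best_alpha, best_beta)
nudged_alpha = squared_error(Fraction(-1501, 1000), best_beta)
nudged_beta = squared_error(best_alpha, Fraction(10))

# Try every combination of nudges -3/10, ..., 3/10, skipping the unchanged pair.
steps = [Fraction(k, 10) for k in range(-3, 4)]
all_worse = all(
    squared_error(best_alpha + da, best_beta + db) > best
    for da in steps
    for db in steps
    if (da, db) != (0, 0)
)

print(best, nudged_beta)  # prints "1/6 1/2"
print(float(nudged_alpha))
print(all_worse)          # prints "True"
```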
Let's call that checked.