Least Squares Regression Line (LSRL)

Key Questions

  • Equation for least-squares linear regression:

    y = m x + b

    where
    m = (sum x_i y_i - (sum x_i sum y_i)/n)/(sum x_i^2 - (sum x_i)^2/n)

    and
    b = (sum y_i - m sum x_i)/n

    for a collection of n pairs (x_i,y_i)

    This looks horrible to evaluate (and it is, if you are doing it by hand), but with a computer (for example, a spreadsheet with columns for y, x, xy, and x^2) it isn't too bad.
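
    As a minimal sketch of the same calculation outside a spreadsheet, the Python below builds the four sums and plugs them into the formulas for m and b. The function name `least_squares_line` and the sample data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for y = m*x + b using the sum formulas for m and b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # The same quantities a spreadsheet would total column by column.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = sum_x2 - sum_x ** 2 / n
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = (sum_xy - sum_x * sum_y / n) / denom
    b = (sum_y - m * sum_x) / n
    return m, b


# Hypothetical data lying close to y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]

m, b = least_squares_line(xs, ys)
print(f"y = {m:.3f} x + {b:.3f}")  # prints: y = 1.950 x + 1.150
```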

  • The primary use of linear regression is to fit a line to two sets of paired data and to determine how strongly they are related.

    Examples are:

    2 sets of stock prices

    rainfall and crop output

    study hours and grades

    With respect to correlation, a commonly quoted rule of thumb (the exact cutoffs vary by field) is:

    Correlation values of 0.8 or higher (in absolute value) denote a strong correlation
    Correlation values from 0.5 up to, but not including, 0.8 denote a weak correlation
    Correlation values less than 0.5 denote a very weak correlation
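
    To make these cutoffs concrete, here is a small sketch that computes the Pearson correlation coefficient r for paired data and labels it using the thresholds listed here. The helper names (`pearson_r`, `describe_strength`) and the study-hours data are hypothetical, and using the absolute value of r (so that negative correlations are graded the same way) is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Label r using the rule-of-thumb cutoffs (0.8 and 0.5)."""
    size = abs(r)  # assumption: only the magnitude matters for strength
    if size >= 0.8:
        return "strong"
    if size >= 0.5:
        return "weak"
    return "very weak"


# Hypothetical study hours and grades.
hours = [1, 2, 3, 4, 5]
grades = [52, 60, 61, 70, 78]

r = pearson_r(hours, grades)
print(f"r = {r:.2f} ({describe_strength(r)})")
```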

  • Answer:

    All this means is that we minimize the sum of the squared differences between the actual y values and the predicted y values.

    min sum_(i=1)^n (y_i - hat y_i)^2

    Explanation:

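    It just means taking the minimum of the sum of all the squared residuals, where each residual hat u_i = y_i - hat y_i is the gap between an actual y value and the value the line predicts for it:

    min sum_(i=1)^n hat u_i^2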

    By minimizing the squared error between the predicted and actual values in this way, you get the best-fitting regression line.
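
    As a rough numerical check of this idea, the sketch below fits a line to a few made-up points and confirms that nudging the slope or intercept away from the fitted values only increases the sum of squared residuals. The data and the helper names (`fit_line`, `sum_squared_residuals`) are illustrative assumptions; `fit_line` uses the mean-centered form of the slope formula, which is algebraically equivalent to the version written with sums.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept (mean-centered form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def sum_squared_residuals(m, b, xs, ys):
    """Sum of (actual y - predicted y)^2 for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]

m, b = fit_line(xs, ys)
best = sum_squared_residuals(m, b, xs, ys)

# Try a few nearby lines; each one should fit worse than the least-squares line.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1), (0.05, -0.05)]
nearby = [sum_squared_residuals(m + dm, b + db, xs, ys) for dm, db in nudges]

print(f"least-squares line: SSR = {best:.4f}")
for (dm, db), ssr in zip(nudges, nearby):
    print(f"slope {dm:+.2f}, intercept {db:+.2f}: SSR = {ssr:.4f}")
```

    Trying a handful of nearby lines is not a proof, but it shows what "least squares" is asking for: no other choice of m and b gives a smaller total.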

Questions