Regression and Correlation – Regression
Here we’ll look at the main point of regression and correlation: once we find out that there is a relationship, we
want to come up with an equation for that relationship and use it to make predictions. We’ll be looking at
several components. Just as with any linear equation, we need to know the slope and y-intercept; that will also
be the case for our regression line.
Remember that the usual form of a linear equation is $y = mx + b$, where $m$ is the slope and $b$ is the
y-intercept. There is a formula for the slope, and we can find the y-intercept in several different ways.
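The equation of the regression line is $y = b_0 + b_1 x$, where $b_1$ is the slope and $b_0$ is the y-intercept.
The formula for the slope of the regression line is
$$b_1 = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2},$$
where $n$ is the number of ordered pairs.
The formula for the y-intercept of the regression line is $b_0 = \bar{y} - b_1 \bar{x}$.
$\bar{y}$ is the average of the y-values and $\bar{x}$ is the average of the x-values.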
Round the slope and y-intercept to 3 decimal places.
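The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `slope_and_intercept` and its parameter names are made up for this sketch, and it assumes you already have the column totals. It rounds the slope first and then uses the rounded slope to find the intercept.

```python
def slope_and_intercept(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (b1, b0), each rounded to 3 decimal places.

    Hypothetical helper: the inputs are the number of ordered pairs and
    the totals of the x, y, xy and x^2 columns.
    """
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every x-value is the same; the slope is undefined.
        raise ValueError("all x-values are equal, so the slope is undefined")
    # Slope: b1 = (n*Σxy - Σx*Σy) / (n*Σx² - (Σx)²)
    b1 = round((n * sum_xy - sum_x * sum_y) / denominator, 3)
    # Intercept: b0 = y-bar - b1 * x-bar, using the rounded slope
    b0 = round(sum_y / n - b1 * (sum_x / n), 3)
    return b1, b0
```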
Let’s find the regression equation for the following ordered pairs: (1, 2), (2, 3), (3, 4). These were used in the
second example for correlation. We can use the summary table from that example as well.
            x      y      xy     x²     y²
            1      2       2      1      4
            2      3       6      4      9
            3      4      12      9     16
Totals      6      9      20     14     29
           Σx     Σy     Σxy    Σx²    Σy²
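If you would rather not add up the columns by hand, the totals row can be sketched like this. The function name `summary_totals` and the dictionary keys are just illustrative choices, not anything standard.

```python
def summary_totals(pairs):
    """Return n and the totals of the x, y, xy, x^2 and y^2 columns."""
    return {
        "n": len(pairs),
        "sum_x": sum(x for x, _ in pairs),
        "sum_y": sum(y for _, y in pairs),
        "sum_xy": sum(x * y for x, y in pairs),
        "sum_x2": sum(x * x for x, _ in pairs),
        "sum_y2": sum(y * y for _, y in pairs),
    }

totals = summary_totals([(1, 2), (2, 3), (3, 4)])
print(totals)
# {'n': 3, 'sum_x': 6, 'sum_y': 9, 'sum_xy': 20, 'sum_x2': 14, 'sum_y2': 29}
```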
And now we can find the slope, then the intercept, then the equation.
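$$\text{Slope} = b_1 = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \frac{3(20) - (6)(9)}{3(14) - (6)^2} = \frac{6}{6} = 1$$
$$\bar{y} = \frac{2 + 3 + 4}{3} = 3 \quad\text{and}\quad \bar{x} = \frac{1 + 2 + 3}{3} = 2$$
$$\text{y-intercept} = b_0 = \bar{y} - b_1 \bar{x} = 3 - 1(2) = 3 - 2 = 1$$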
Using all these values, we now have enough for the equation. The general form of the regression line is
$y = b_0 + b_1 x$, so our equation will be $y = 1 + 1x$.
What if I came up with this equation, and then I wanted to know the best predicted value for y if x = 10? That’s
simple. You just take the x-value and put it in the equation. The best predicted y-value is given by putting 10
into the equation: $y = 1 + 1(10) = 1 + 10 = 11$.
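Putting an x-value into the equation is a one-line calculation. Here is a minimal sketch, where `predict` is just an illustrative name:

```python
def predict(b0, b1, x):
    """Best predicted y-value from the regression line y = b0 + b1*x."""
    return b0 + b1 * x

# Regression line y = 1 + 1x from the example, evaluated at x = 10
print(predict(1, 1, 10))  # 11
```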
Let’s try this again, with the data from the first example in the correlation section. Suppose we have the ordered
pairs (1, 2), (1, 8), (3, 6), (5, 4). I want to know the equation for the regression line, and I also want to know the
predicted y-value when x = 20. I can use the table that we constructed in the previous section.
            x      y      xy     x²     y²
            1      2       2      1      4
            1      8       8      1     64
            3      6      18      9     36
            5      4      20     25     16
Totals     10     20      48     36    120
           Σx     Σy     Σxy    Σx²    Σy²
And now we can find the slope, then the intercept, then the equation.
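$$\text{Slope} = b_1 = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \frac{4(48) - (10)(20)}{4(36) - (10)^2} = \frac{-8}{44} \approx -0.182$$
$$\bar{y} = \frac{2 + 8 + 6 + 4}{4} = 5 \quad\text{and}\quad \bar{x} = \frac{1 + 1 + 3 + 5}{4} = 2.5$$
$$\text{y-intercept} = b_0 = \bar{y} - b_1 \bar{x} = 5 - (-0.182)(2.5) = 5 + 0.455 = 5.455$$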
Using all these values, we now have enough for the equation. The general form of the regression line is
$y = b_0 + b_1 x$, so our equation will be $y = 5.455 - 0.182x$.
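One way to double-check a slope and intercept like these is to recompute them with the equivalent "deviations from the mean" form of the slope, $b_1 = \sum (x - \bar{x})(y - \bar{y}) / \sum (x - \bar{x})^2$. That form is not the one used in this section; it is swapped in here only as an independent check, and the function name is made up for the sketch.

```python
def least_squares_deviation_form(pairs):
    """Return unrounded (b1, b0) using deviations from the means."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    # Sum of products of deviations, and sum of squared x-deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b1, b0

b1, b0 = least_squares_deviation_form([(1, 2), (1, 8), (3, 6), (5, 4)])
print(round(b1, 3), round(b0, 3))  # -0.182 5.455
```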
Now, what if x = 20? That’s simple. You just take the x-value and put it in the equation. The best predicted
y-value is given by putting 20 into the equation: $y = 5.455 - 0.182(20) = 5.455 - 3.64 = 1.815$.
The only problem with this last example is that the relationship was weak to begin with, so the predicted value
of 1.815 doesn’t really tell us much.
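To put a number on how weak that relationship is, here is a sketch that computes the correlation coefficient r from the same totals row. It assumes the usual computational formula for r, which is not shown in this section, so treat the formula and the function name as assumptions rather than as the method from the correlation section.

```python
import math

def correlation_from_totals(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Correlation coefficient r from the totals row (assumed formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Totals for the pairs (1, 2), (1, 8), (3, 6), (5, 4)
r = correlation_from_totals(4, 10, 20, 48, 36, 120)
print(round(r, 3))  # -0.135
```

A value this close to 0 fits the description of a weak relationship, while the same calculation on (1, 2), (2, 3), (3, 4) gives exactly 1.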