12.9 Sequential Methods for Model Selection
Stepwise Regression
One standard procedure for searching for the “optimum subset” of variables in the
absence of orthogonality is a technique called stepwise regression. It is based
on the procedure of sequentially introducing the variables into the model one at
a time. For a predetermined significance level α, the stepwise routine
will be better understood if the methods of forward selection and backward
elimination are described first.
Forward selection is based on the notion that variables should be inserted
one at a time until a satisfactory regression equation is found. The procedure is as
follows:
STEP 1. Choose the variable that gives the largest regression sum of squares
when performing a simple linear regression with y or, equivalently, that which
gives the largest value of R2 . We shall call this initial variable x1 . If x1 is
insignificant, the procedure is terminated.
STEP 2. Choose the variable that, when inserted in the model, gives the
largest increase in R2 , in the presence of x1 , over the R2 found in step 1.
This, of course, is the variable xj for which
R(βj | β1) = R(β1, βj) − R(β1)
is largest. Let us call this variable x2 . The regression model with x1 and
x2 is then fitted and R2 observed. If x2 is insignificant, the procedure is
terminated.
STEP 3. Choose the variable xj that gives the largest value of
R(βj | β1 , β2 ) = R(β1 , β2 , βj ) − R(β1 , β2 ),
again resulting in the largest increase of R2 over that given in step 2. Calling
this variable x3 , we now have a regression model involving x1 , x2 , and x3 . If
x3 is insignificant, the procedure is terminated.
This process is continued until the most recent variable inserted fails to induce a
significant increase in the explained regression. Such an increase can be determined
at each step by using the appropriate partial F -test or t-test. For example, in step
2 the value
f = R(β2 | β1)/s2
can be determined to test the appropriateness of x2 in the model. Here the value
of s2 is the mean square error for the model containing the variables x1 and x2 .
Similarly, in step 3 the ratio
f = R(β3 | β1, β2)/s2
tests the appropriateness of x3 in the model. Now, however, the value for s2 is the
mean square error for the model that contains the three variables x1 , x2 , and x3 .
If f < fα (1, n − 3) at step 2, for a prechosen significance level, x2 is not included
and the process is terminated, resulting in a simple linear equation relating y and
x1 . However, if f > fα (1, n − 3), we proceed to step 3. Again, if f < fα (1, n − 4)
at step 3, x3 is not included and the process is terminated with the appropriate
regression equation containing the variables x1 and x2 .
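The forward selection loop just described can be sketched in code. The sketch below is illustrative only: the function names, the use of NumPy least squares, and a single fixed critical value f_crit (standing in for the tabled values fα(1, n − k) at each step) are assumptions for the example, not part of the text's procedure.

```python
import numpy as np

def sse(X, y):
    """Error sum of squares for the least-squares fit of y on the
    columns of X, with an intercept term included."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_select(X, y, f_crit):
    """Forward selection: at each step, enter the candidate x_j with the
    largest partial F statistic f = R(beta_j | current model)/s^2, where
    s^2 is the mean square error of the enlarged model; stop as soon as
    the best candidate's f falls below the critical value f_crit."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    while remaining:
        best_j, best_f = None, -np.inf
        for j in remaining:
            full = selected + [j]
            sse_full = sse(X[:, full], y)
            sse_reduced = sse(X[:, selected], y)  # model without x_j
            s2 = sse_full / (n - len(full) - 1)   # MSE of the larger model
            f = (sse_reduced - sse_full) / s2     # partial F for x_j
            if f > best_f:
                best_j, best_f = j, f
        if best_f < f_crit:
            break                                 # no significant entry; stop
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

In practice the comparison at each step would use fα(1, n − k) for the current error degrees of freedom rather than one fixed threshold.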
Backward elimination involves the same concepts as forward selection except
that one begins with all the variables in the model. Suppose, for example, that
there are five variables under consideration. The steps are as follows:
STEP 1. Fit a regression equation with all five variables included in the
model. Choose the variable that gives the smallest value of the regression
sum of squares adjusted for the others. Suppose that this variable is x2 .
Remove x2 from the model if
f = R(β2 | β1, β3, β4, β5)/s2
is insignificant.
STEP 2. Fit a regression equation using the remaining variables x1 , x3 , x4 ,
and x5 , and repeat step 1. Suppose that variable x5 is chosen this time. Once
again, if
f = R(β5 | β1, β3, β4)/s2
is insignificant, the variable x5 is removed from the model. At each step, the
s2 used in the F-test is the mean square error for the regression model at that
stage.
This process is repeated until at some step the variable with the smallest
adjusted regression sum of squares results in a significant f-value for some
predetermined significance level.
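Backward elimination can be sketched along the same lines. As before, the helper names and the fixed critical value f_crit are assumptions for illustration; a table lookup of fα(1, n − k) would replace f_crit in an actual analysis.

```python
import numpy as np

def sse(X, y):
    # Error sum of squares for the least-squares fit of y on X (intercept added).
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def backward_eliminate(X, y, f_crit):
    """Backward elimination: start with every column in the model.  At each
    step, find the variable with the smallest adjusted regression sum of
    squares R(beta_j | others), i.e., the smallest partial F, and remove it
    if f < f_crit; stop once the weakest remaining variable is significant."""
    n = X.shape[0]
    selected = list(range(X.shape[1]))
    while selected:
        sse_full = sse(X[:, selected], y)
        s2 = sse_full / (n - len(selected) - 1)   # MSE at this stage
        # Partial F for each variable, given all the others still in the model.
        partial_f = {
            j: (sse(X[:, [k for k in selected if k != j]], y) - sse_full) / s2
            for j in selected
        }
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] >= f_crit:
            break                                 # weakest variable is significant
        selected.remove(weakest)
    return selected
```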
Stepwise regression is accomplished with a slight but important modification
of the forward selection procedure. The modification involves further testing at
each stage to ensure the continued effectiveness of variables that had been inserted
into the model at an earlier stage. This represents an improvement over forward
selection, since it is quite possible that a variable entering the regression equation
at an early stage might have been rendered unimportant or redundant because
of relationships that exist between it and other variables entering at later stages.
Therefore, at a stage in which a new variable has been entered into the regression
equation through a significant increase in R2 as determined by the F-test, all the
variables already in the model are subjected to F-tests (or, equivalently, to t-tests)
in light of this new variable and are deleted if they do not display a significant
f-value. The procedure is continued until a stage is reached where no additional
variables can be inserted or deleted. We illustrate the stepwise procedure in the
following example.
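The stepwise modification, combining the forward entry step with the re-testing of previously entered variables, can be sketched as follows. The helper names, the entry and removal thresholds f_in and f_out, and the use of NumPy least squares are all assumptions for this sketch; requiring f_in ≥ f_out is a common safeguard against a variable cycling in and out.

```python
import numpy as np

def sse(X, y):
    # Error sum of squares for the least-squares fit of y on X (intercept added).
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def stepwise(X, y, f_in, f_out):
    """Forward selection with the stepwise modification: after each new
    variable enters, every variable already in the model is re-tested and
    deleted if its partial F has dropped below f_out."""
    n, p = X.shape
    selected, remaining = [], list(range(p))

    def partial_f(j, model):
        # f = R(beta_j | rest of model)/s^2, with s^2 the MSE of the full model.
        rest = [k for k in model if k != j]
        sse_full = sse(X[:, model], y)
        s2 = sse_full / (n - len(model) - 1)
        return (sse(X[:, rest], y) - sse_full) / s2

    while remaining:
        # Forward step: candidate with the largest partial F given the model.
        best = max(remaining, key=lambda j: partial_f(j, selected + [j]))
        if partial_f(best, selected + [best]) < f_in:
            break                                 # no variable can enter; done
        selected.append(best)
        remaining.remove(best)
        # Backward check: re-test variables entered at earlier stages.
        for j in [k for k in selected if k != best]:
            if partial_f(j, selected) < f_out:
                selected.remove(j)
                remaining.append(j)
    return selected
```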
Example 12.11: Using the techniques of stepwise regression, find an appropriate linear regression
model for predicting the length of infants for the data of Table 12.8.
Solution : STEP 1. Considering each variable separately, four individual simple linear
regression equations are fitted. The following pertinent regression sums of