We can also use the regsubsets() function to perform forward stepwise or
backward stepwise selection, using the argument method="forward" or
method="backward".
regfit.fwd <- regsubsets(Salary ~ ., data = Hitters, nvmax = 19, method = "forward")
summary(regfit.fwd)
regfit.bwd <- regsubsets(Salary ~ ., data = Hitters, nvmax = 19, method = "backward")
summary(regfit.bwd)
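The summary output can be long, so it may help to extract just the order in which forward selection adds variables. The following is a minimal sketch, not part of the original lab: it assumes the ISLR package (for Hitters) and leaps are installed, removes rows with missing values first, and uses the illustrative names which.fwd and entry.order. It relies on the fact that forward stepwise models are nested, so each model size adds exactly one new variable.

```r
library(ISLR)   # assumed source of the Hitters data
library(leaps)  # provides regsubsets()

# Drop players with a missing Salary before fitting
Hitters <- na.omit(Hitters)

regfit.fwd <- regsubsets(Salary ~ ., data = Hitters, nvmax = 19,
                         method = "forward")

# Logical matrix: row k marks the variables in the k-variable model.
# The first column is the intercept, which is dropped here.
which.fwd <- summary(regfit.fwd)$which[, -1]

# Forward models are nested, so row k differs from row k - 1
# by exactly one newly added variable.
entry.order <- character(nrow(which.fwd))
previous <- rep(FALSE, ncol(which.fwd))
for (k in seq_len(nrow(which.fwd))) {
  entry.order[k] <- colnames(which.fwd)[which.fwd[k, ] & !previous]
  previous <- which.fwd[k, ]
}

entry.order
```

The same loop does not apply directly to best subset selection, whose models of successive sizes need not be nested.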
For instance, we see that using forward stepwise selection, the best
one-variable model contains only CRBI, and the best two-variable model
additionally includes Hits. For this data, the best one-variable through
six-variable models are each identical for best subset and forward selection.
However, the best seven-variable models identified by forward stepwise
selection, backward stepwise selection, and best subset selection are different.
> coef(regfit.full, 7)
 (Intercept)         Hits        Walks       CAtBat
  79.4509472    1.2833513    3.2274264   -0.3752350
       CHits       CHmRun    DivisionW      PutOuts
   1.4957073    1.4420538 -129.9866432    0.2366813
> coef(regfit.fwd, 7)
 (Intercept)        AtBat         Hits        Walks
 109.7873062   -1.9588851    7.4498772    4.9131401
        CRBI       CWalks    DivisionW      PutOuts
   0.8537622   -0.3053070 -127.1223928    0.2533404
> coef(regfit.bwd, 7)
 (Intercept)        AtBat         Hits        Walks
 105.6487488   -1.9762838    6.7574914    6.0558691
       CRuns       CWalks    DivisionW      PutOuts
   1.1293095   -0.7163346 -116.1692169    0.3028847
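Rather than comparing coefficient listings by eye, the agreement between the three methods can be checked for each model size. This is a hedged sketch under the same assumptions as the lab (Hitters from the ISLR package, with missing values removed); the helper same.vars and the data frame agree are illustrative names, not part of leaps.

```r
library(ISLR)   # assumed source of the Hitters data
library(leaps)  # provides regsubsets()

Hitters <- na.omit(Hitters)

regfit.full <- regsubsets(Salary ~ ., data = Hitters, nvmax = 19)
regfit.fwd  <- regsubsets(Salary ~ ., data = Hitters, nvmax = 19,
                          method = "forward")
regfit.bwd  <- regsubsets(Salary ~ ., data = Hitters, nvmax = 19,
                          method = "backward")

# TRUE when two fits select the same set of variables at size k
# (hypothetical helper written for this example)
same.vars <- function(fit.a, fit.b, k) {
  setequal(names(coef(fit.a, k)), names(coef(fit.b, k)))
}

sizes <- 1:7
agree <- data.frame(
  size        = sizes,
  full.vs.fwd = sapply(sizes, function(k) same.vars(regfit.full, regfit.fwd, k)),
  full.vs.bwd = sapply(sizes, function(k) same.vars(regfit.full, regfit.bwd, k)),
  fwd.vs.bwd  = sapply(sizes, function(k) same.vars(regfit.fwd, regfit.bwd, k))
)

print(agree)
```

Each row of agree reports whether a pair of methods selects the same variables at that model size, which makes it easy to see where the stepwise paths depart from best subset selection.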
Try applying forward and backward selection to the Boston dataset with medv as
the response. Store the output in regfit.fwd and regfit.bwd respectively. Set
the nvmax parameter to 13.
Assume that the MASS and leaps libraries have been loaded, and that the Boston
dataset has been loaded and attached.