Below, we discuss a situation in which the two numbers differ.
Our cross-validation estimate for the test error is approximately 24.23.
We can repeat this procedure for increasingly complex polynomial fits.
To automate the process, we use the for() function to initiate a for loop which iteratively fits polynomial regressions for polynomials of order \(i = 1\) to \(i = 5\), computes the associated cross-validation error, and stores it in the \(i\)th element of the vector cv.error. We begin by initializing the vector. This command will likely take a couple of minutes to run.
> cv.error <- rep(0,5)
> for (i in 1:5) {
+ glm.fit <- glm(mpg ~ poly(horsepower, i), data = Auto)
+ cv.error[i] <- cv.glm(Auto, glm.fit)$delta[1]
+ }
>
> cv.error
[1] 24.23151 19.24821 19.33498 19.42443 19.03321
We see a sharp drop in the estimated test \(MSE\) between the linear and quadratic fits, but then no clear improvement from using higher-order polynomials.
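One way to see this pattern at a glance is to plot the stored errors against the degree of the polynomial. This is a minimal sketch using base R's plot(), so no additional packages are needed:
> plot(1:5, cv.error, type = "b",
+      xlab = "Degree of polynomial",
+      ylab = "LOOCV estimate of test MSE")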
Try to perform the same steps but with weight as the predictor instead of horsepower:
Assume that:
- the ISLR2 and boot libraries have been loaded
- the Auto dataset has been loaded and attached
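If you want to check your work, here is a minimal sketch that follows the same pattern as above; the vector name cv.error.weight is our own choice, not part of the original lab:
> cv.error.weight <- rep(0, 5)
> for (i in 1:5) {
+   glm.fit <- glm(mpg ~ poly(weight, i), data = Auto)
+   cv.error.weight[i] <- cv.glm(Auto, glm.fit)$delta[1]
+ }
> cv.error.weight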