Below, we discuss a situation in which the two numbers differ. Our cross-validation estimate for the test error is approximately 24.23. We can repeat this procedure for increasingly complex polynomial fits. To automate the process, we use the for() function to initiate a for loop that iteratively fits polynomial regressions of order \(i = 1\) to \(i = 5\), computes the associated cross-validation error, and stores it in the \(i\)th element of the vector cv.error. We begin by initializing this vector. The command will likely take a couple of minutes to run.

> cv.error <- rep(0,5)
> for (i in 1:5) {
+  glm.fit <- glm(mpg ~ poly(horsepower, i), data = Auto)
+  cv.error[i] <- cv.glm(Auto, glm.fit)$delta[1]
+ }
> 
> cv.error
[1] 24.23151 19.24821 19.33498 19.42443 19.03321

We see a sharp drop in the estimated test \(MSE\) between the linear and quadratic fits, but then no clear improvement from using higher-order polynomials.
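To pick out the degree with the lowest estimated error programmatically, base R's which.min() can be applied to the vector computed above (a sketch, assuming cv.error is still in the workspace):

```r
# Index of the smallest LOOCV estimate; since cv.error[i] holds the
# error for the degree-i polynomial, the index is the degree itself.
best.degree <- which.min(cv.error)
best.degree
```

With the values printed above, this would point to the degree-5 fit, although the differences among the degree-2 through degree-5 fits are small.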

Try to perform the same steps but with weight as the predictor instead of horsepower:
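One possible sketch of that exercise, with weight substituted for horsepower (the vector name cv.error.weight is chosen here for illustration; the boot library supplies cv.glm()):

```r
library(boot)   # for cv.glm()

# LOOCV error for polynomial fits of weight, orders 1 through 5
cv.error.weight <- rep(0, 5)
for (i in 1:5) {
  glm.fit <- glm(mpg ~ poly(weight, i), data = Auto)
  cv.error.weight[i] <- cv.glm(Auto, glm.fit)$delta[1]
}
cv.error.weight
```

The resulting estimates can be compared to those for horsepower in the same way, by looking for the degree beyond which the estimated test error stops improving.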


Assume that: