The LOOCV estimate can be automatically computed for any generalized linear model using the glm() and cv.glm() functions. In the lab for Chapter 4, we used the glm() function to perform logistic regression by passing in the family="binomial" argument. But if we use glm() to fit a model without passing in the family argument, then it performs linear regression, just like the lm() function. So for instance,
> glm.fit <- glm(mpg ~ horsepower, data = Auto)
> coef(glm.fit)
(Intercept) horsepower
39.9358610 -0.1578447
and
> lm.fit <- lm(mpg ~ horsepower, data = Auto)
> coef(lm.fit)
(Intercept) horsepower
39.9358610 -0.1578447
yield identical linear regression models. In this lab, we will perform linear regression using the glm() function rather than the lm() function because the former can be used together with cv.glm(). The cv.glm() function is part of the boot library.
> library(boot)
> glm.fit <- glm(mpg ~ horsepower, data = Auto)
> cv.err <- cv.glm(Auto, glm.fit)
> cv.err$delta
[1] 24.23151 24.23114
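The cv.glm() function produces a list with several components. The two numbers in the delta vector contain the cross-validation results. In this case the two numbers agree to two decimal places and correspond to the LOOCV statistic, given by this formula:

$$\mathrm{CV}_{(n)} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MSE}_i$$

where $\mathrm{MSE}_i = (y_i - \hat{y}_i)^2$ is the squared prediction error for observation $i$ when the model is fit to the other $n - 1$ observations.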
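To make the first of these numbers concrete, the sketch below recomputes it by hand: each observation is left out in turn, the model is refit on the rest, and the squared prediction errors are averaged. This is only an illustration under stated assumptions, not part of the lab. It uses the built-in mtcars data (with hp standing in for horsepower) so that it runs without the Auto data set, and the names dat, sq.err, cv.manual, and cv.auto are hypothetical.

```r
library(boot)  # provides cv.glm()

# Assumption: mtcars (shipped with R) is a stand-in for the Auto data set.
dat <- mtcars
n <- nrow(dat)

# Manual LOOCV: refit without observation i, then predict observation i.
sq.err <- numeric(n)
for (i in seq_len(n)) {
  fit.i <- glm(mpg ~ hp, data = dat[-i, ])
  pred.i <- predict(fit.i, newdata = dat[i, , drop = FALSE])
  sq.err[i] <- (dat$mpg[i] - pred.i)^2
}
cv.manual <- mean(sq.err)

# The same quantity from cv.glm(); delta[1] is the raw LOOCV estimate.
cv.auto <- cv.glm(dat, glm(mpg ~ hp, data = dat))$delta[1]

# The two values should agree up to floating-point error.
print(c(manual = cv.manual, cv.glm = cv.auto))
```

The explicit loop refits the model n times, which is fine for a small data set but is shown here only to expose what cv.glm() is averaging.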
Try performing the same LOOCV steps for the following model, and store the delta vector in cv.delta2: