The bootstrap approach can be used to assess the variability of the coefficient
estimates and predictions from a statistical learning method. Here
we use the bootstrap approach in order to assess the variability of the
estimates for \(\hat\beta_0\) and \(\hat\beta_1\),
the intercept and slope terms for the linear regression model that uses
horsepower to predict mpg in the Auto data set. We
will compare the estimates obtained using the bootstrap to those obtained
using the formulas for \(SE(\hat\beta_0)\) and \(SE(\hat\beta_1)\) described in Section 3.1.2.
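As a reminder of what the formula-based standard errors look like in practice, the sketch below computes them by hand from the closed-form expressions for simple linear regression and checks them against summary(). It runs on simulated stand-in data rather than Auto (an assumption, so that no packages are needed), and analytic.se() is a hypothetical helper name, not a function from ISLR2 or boot.

```r
# Hypothetical helper: standard errors of the intercept and slope in simple
# linear regression, computed from the closed-form formulas.
analytic.se <- function(x, y) {
  n <- length(x)
  fit <- lm(y ~ x)
  # RSS / (n - 2) is the usual estimate of the error variance sigma^2
  sigma2 <- sum(residuals(fit)^2) / (n - 2)
  sxx <- sum((x - mean(x))^2)
  c(intercept = sqrt(sigma2 * (1 / n + mean(x)^2 / sxx)),
    slope     = sqrt(sigma2 / sxx))
}

# Simulated stand-in data (an assumption for illustration; not the Auto data)
set.seed(1)
x <- runif(100, 50, 200)
y <- 40 - 0.15 * x + rnorm(100, sd = 5)

se.by.hand <- analytic.se(x, y)
se.from.lm <- summary(lm(y ~ x))$coefficients[, "Std. Error"]
print(se.by.hand)
print(se.from.lm)
```

With the Auto data loaded, analytic.se(Auto$horsepower, Auto$mpg) would give the formula-based values to set beside the bootstrap estimates.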
We first create a simple function, boot.fn(), which takes in the Auto data
set as well as a set of indices for the observations, and returns the intercept
and slope estimates for the linear regression model. We then apply this
function to the full set of 392 observations in order to compute the estimates of
\(\hat\beta_0\) and \(\hat\beta_1\) on the entire data set using the usual linear regression
coefficient estimate formulas from Chapter 3. Note that we do not need the
{ and } at the beginning and end of the function because it is only one line
long.
> boot.fn <- function(data, index)
+ return(coef(lm(mpg ~ horsepower, data = data, subset = index)))
>
> boot.fn(Auto, 1:392)
(Intercept) horsepower
39.9358610 -0.1578447
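boot.fn() relies on the subset argument of lm() accepting a vector of row indices, including repeated ones. As a small check of that behavior, the sketch below compares it with indexing the data frame directly. It uses a simulated data frame, d, as a stand-in for Auto (an assumption, so that the example needs no packages); the column names simply mirror the ones in the text.

```r
# Simulated stand-in for the Auto data (illustrative assumption)
set.seed(1)
d <- data.frame(horsepower = runif(50, 50, 200))
d$mpg <- 40 - 0.15 * d$horsepower + rnorm(50, sd = 5)

# Same one-line function as in the text
boot.fn <- function(data, index)
  return(coef(lm(mpg ~ horsepower, data = data, subset = index)))

# An index vector with repeats, as a bootstrap sample would produce
index <- c(1, 1, 2, 5, 5, 5, 10:30)

via.subset   <- boot.fn(d, index)
via.indexing <- coef(lm(mpg ~ horsepower, data = d[index, ]))

# Both routes fit the model on the same (repeated) rows
print(all.equal(via.subset, via.indexing))
```

Because repeated indices are kept as repeated rows, passing a resampled index vector to boot.fn() is enough to fit the model on a bootstrap sample.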
The boot.fn() function can also be used in order to create bootstrap estimates
for the intercept and slope terms by randomly sampling from among
the observations with replacement. Here we give two examples.
> set.seed(1)
> boot.fn(Auto, sample(392, 392, replace = TRUE))
(Intercept) horsepower
40.3404517 -0.1634868
> boot.fn(Auto, sample(392, 392, replace = TRUE))
(Intercept) horsepower
40.1186906 -0.1577063
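Repeating this resampling step many times and taking the standard deviation of the resulting estimates gives bootstrap standard errors. The loop below is a minimal base-R sketch of that idea on simulated stand-in data (an assumption; with ISLR2 loaded one would substitute Auto and 392). The names d, n, B, boot.est and boot.se are illustrative. The boot() function from the boot package automates the same procedure.

```r
# Simulated stand-in for the Auto data (illustrative assumption)
set.seed(1)
n <- 100
d <- data.frame(horsepower = runif(n, 50, 200))
d$mpg <- 40 - 0.15 * d$horsepower + rnorm(n, sd = 5)

# Same one-line function as in the text
boot.fn <- function(data, index)
  return(coef(lm(mpg ~ horsepower, data = data, subset = index)))

B <- 1000  # number of bootstrap resamples (illustrative choice)

# Each column of boot.est holds the (intercept, slope) from one resample
boot.est <- replicate(B, boot.fn(d, sample(n, n, replace = TRUE)))

# Bootstrap standard errors: SD of each coefficient across the resamples
boot.se <- apply(boot.est, 1, sd)
print(boot.se)
```

The choice of B trades computing time against the stability of the standard error estimates; 1,000 is a common starting point rather than a requirement.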
Try to estimate the coefficients of the model below by randomly sampling from
among the observations with replacement, and store the output in boot.coef.
Assume that the ISLR2 and boot libraries have been loaded, and that the Auto
data set has been loaded and attached.