The bootstrap approach can be used to assess the variability of the coefficient estimates and predictions from a statistical learning method. Here we use the bootstrap approach in order to assess the variability of the estimates for \(\beta_0\) and \(\beta_1\), the intercept and slope terms for the linear regression model that uses horsepower to predict mpg in the Auto data set. We will compare the estimates obtained using the bootstrap to those obtained using the formulas for \(SE(\hat\beta_0)\) and \(SE(\hat\beta_1)\) described in Section 3.1.2. We first create a simple function, boot.fn(), which takes in the Auto data set as well as a set of indices for the observations, and returns the intercept and slope estimates for the linear regression model. We then apply this function to the full set of 392 observations in order to compute the estimates of \(\beta_0\) and \(\beta_1\) on the entire data set using the usual linear regression coefficient estimate formulas from Chapter 3. Note that we do not need the { and } at the beginning and end of the function because it is only one line long.

> boot.fn <- function(data, index) 
+    return(coef(lm(mpg ~ horsepower, data = data, subset = index)))
> 
> boot.fn(Auto, 1:392)
(Intercept)  horsepower 
 39.9358610  -0.1578447 

The boot.fn() function can also be used in order to create bootstrap estimates for the intercept and slope terms by randomly sampling from among the observations with replacement. Here we give two examples.

> set.seed(1)
> boot.fn(Auto, sample(392, 392, replace = T))
(Intercept)  horsepower 
 40.3404517  -0.1634868 
> boot.fn(Auto, sample(392, 392, replace = T))
(Intercept)  horsepower 
 40.1186906  -0.1577063 
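
Rather than repeating this by hand, the comparison promised above can be carried out with the boot() function from the boot package, which automates the resampling and reports bootstrap standard errors for the intercept and slope. A minimal sketch, assuming the Auto data frame is available (e.g. from the ISLR2 package) and using R = 1000 replicates:

```r
library(boot)

# Same one-line statistic function as above
boot.fn <- function(data, index)
  return(coef(lm(mpg ~ horsepower, data = data, subset = index)))

# 1,000 bootstrap replicates of the intercept and slope estimates;
# the "std. error" column gives the bootstrap SEs
set.seed(1)
boot(Auto, boot.fn, R = 1000)

# Formula-based standard errors from Section 3.1.2, for comparison
summary(lm(mpg ~ horsepower, data = Auto))$coef
```

The bootstrap standard errors can then be set against the formula-based ones in the summary() output; any discrepancy reflects the assumptions (e.g. a fixed error variance) underlying the standard formulas rather than a problem with the bootstrap.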

Now try to estimate the coefficients of this model yourself by randomly sampling from among the observations with replacement, and store the output in boot.coef:


Assume that: