Questions

Some of the exercises are not tested by Dodona (for example the plots), but it is still useful to try them.

  1. Use the bs() function to fit a regression spline to predict nox using dis. Place knots at the values 4, 7, and 11. Store the model in fit.bs.

    Try to replicate the following figure:

    1. Create a scatterplot of nox vs dis using all the data.
    2. Create a sequence dis.grid of values ranging from the lowest dis value in the data to the highest dis value observed, in steps of 0.1.
    3. Using the model fit.bs, predict nox for the entire sequence. Store the result in preds.
    4. Add the predictions preds on the plot.

    [Figure: scatterplot of nox vs dis with the fit.bs predictions overlaid]
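The four steps above could be sketched roughly as follows. This is only one possible approach, and it assumes the data are the Boston data set from the MASS package (the exercise text does not name the data set here) and that bs() comes from the splines package. The plot colours and line width are arbitrary choices.

```r
library(MASS)     # assumption: nox and dis come from the Boston data set
library(splines)  # provides bs()

# Cubic regression spline with knots at dis = 4, 7 and 11
fit.bs <- lm(nox ~ bs(dis, knots = c(4, 7, 11)), data = Boston)

# 1. Scatterplot of nox vs dis using all the data
plot(Boston$dis, Boston$nox, xlab = "dis", ylab = "nox", col = "darkgrey")

# 2. Grid from the lowest to the highest observed dis, in steps of 0.1
dis.grid <- seq(min(Boston$dis), max(Boston$dis), by = 0.1)

# 3. Predictions of nox over the whole grid
preds <- predict(fit.bs, newdata = list(dis = dis.grid))

# 4. Add the fitted curve to the existing plot
lines(dis.grid, preds, col = "red", lwd = 2)
```

Because bs() is called inside the model formula, predict() rebuilds the spline basis for the new dis values with the same knots, so no manual basis construction is needed.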

  2. Now fit a regression spline for a range of degrees of freedom (3 to 16), and report the resulting RSS in a vector rss. Don’t overwrite your solution fit.bs from question 1.
    1. Store the 14 RSSs in the variable rss.

      Initialize the vector rss as a vector of length 16 with values NA. If you would like to plot the RSS, exclude the first 2 NA values with rss[-c(1, 2)].
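A minimal sketch of the RSS loop, again assuming the Boston data from MASS. The loop variable i and the temporary model name fit are illustrative choices. Using a separate name keeps fit.bs untouched.

```r
library(MASS)     # assumption: Boston data set
library(splines)  # provides bs()

# Length-16 vector so that the index equals the degrees of freedom;
# positions 1 and 2 stay NA because bs() is only fitted for df 3 to 16
rss <- rep(NA, 16)

for (i in 3:16) {
  fit <- lm(nox ~ bs(dis, df = i), data = Boston)  # temporary model, not fit.bs
  rss[i] <- sum(residuals(fit)^2)                  # residual sum of squares
}

# Optional plot, dropping the two leading NA values
plot(3:16, rss[-c(1, 2)], type = "l",
     xlab = "Degrees of freedom", ylab = "RSS")
```

When df is given instead of knots, bs() places the interior knots at quantiles of dis, so the models for different df values are not nested and the RSS need not decrease strictly.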

  3. Perform 10-fold cross-validation in order to select the best degrees of freedom on this data.
    1. Set a seed of 1. Store the 14 test errors in the variable deltas.bs. Don’t overwrite your solution fit.bs from question 1. You can ignore the warnings.

      Initialize the vector deltas.bs as a vector of length 16 with values NA. If you would like to plot the CV test error, exclude the first 2 NA values with deltas.bs[-c(1, 2)].

    2. What is the optimal degrees of freedom? Store the answer in df.min.bs.
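The cross-validation step might look like the sketch below. It assumes the Boston data from MASS and uses cv.glm() from the boot package, which is one common way to run K-fold cross-validation in R but is not prescribed by the exercise. The models are fitted with glm() because cv.glm() expects a glm object.

```r
library(MASS)     # assumption: Boston data set
library(splines)  # provides bs()
library(boot)     # provides cv.glm()

set.seed(1)  # the fold assignment is random, so fix the seed first

# Index equals the degrees of freedom; positions 1 and 2 stay NA
deltas.bs <- rep(NA, 16)

for (i in 3:16) {
  fit <- glm(nox ~ bs(dis, df = i), data = Boston)  # gaussian glm, same fit as lm
  # delta[1] is the raw 10-fold cross-validation estimate of the test MSE
  deltas.bs[i] <- cv.glm(Boston, fit, K = 10)$delta[1]
}

# which.min() skips NA values, so the returned index is the best df
df.min.bs <- which.min(deltas.bs)
```

The warnings mentioned in the exercise arise because some held-out dis values fall outside the boundary knots computed on the training folds. They do not stop the loop.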

Assume that: