This question uses the Boston dataset, available in the MASS library. We will use the variables dis (the weighted mean of distances to five Boston employment centers) and nox (nitrogen oxides concentration in parts per 10 million). We will treat dis as the predictor and nox as the response.

Questions

Some of the exercises are not tested by Dodona (for example, the plots), but it is still useful to try them.

  1. Use the poly() function to fit a cubic polynomial regression to predict nox using dis. Store the model in fit.poly.

    • MC1:
      Interpret the regression output. Which polynomial terms are significant?
      • 1: Only order-1
      • 2: Only order-1 and order-2
      • 3: Order-1, order-2, and order-3

  2. Plot the polynomial fits for a range of different polynomial degrees (from 1 to 10), and report the associated residual sum of squares in a vector rss.
    1. Write a for loop of 10 iterations. In each iteration i, fit an order-i polynomial of dis to predict nox.
    2. Store the RSS in the i-th element of the vector rss. Don’t forget to initialize the vector.

      Hint: the residuals are stored as a component of the fitted model object.

    3. Plot the RSS against the polynomial degree.

  3. Perform 10-fold cross-validation to select the optimal polynomial degree \(d\).
    1. Set the random seed to 1. Store the 10 test errors in the variable deltas.poly.
    2. Plot the CV test errors against the polynomial degree.
    3. Which polynomial degree is optimal? Store the answer in d.min.poly.
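
The three questions above could be approached roughly as in the sketch below. It is one possible workflow, not the reference solution. Two choices are assumptions that the exercise text does not state: cv.glm() from the boot package is used for the cross-validation, and the seed is set once before the loop rather than inside it. The loop variable i and the temporary object fit are illustrative names.

```r
library(MASS)   # provides the Boston data frame
library(boot)   # provides cv.glm(); using this package is an assumption

# Question 1: cubic polynomial regression of nox on dis
fit.poly <- lm(nox ~ poly(dis, 3), data = Boston)
summary(fit.poly)   # inspect the p-values of the three polynomial terms

# Question 2: residual sum of squares for degrees 1 to 10
rss <- rep(NA, 10)                       # initialize the result vector
for (i in 1:10) {
  fit <- lm(nox ~ poly(dis, i), data = Boston)
  rss[i] <- sum(fit$residuals^2)         # residuals are stored in the model object
}
plot(1:10, rss, type = "b", xlab = "Polynomial degree", ylab = "RSS")

# Question 3: 10-fold cross-validation for degrees 1 to 10
set.seed(1)                              # assumption: seed set once, before the loop
deltas.poly <- rep(NA, 10)
for (i in 1:10) {
  fit <- glm(nox ~ poly(dis, i), data = Boston)   # cv.glm() expects a glm object
  deltas.poly[i] <- cv.glm(Boston, fit, K = 10)$delta[1]   # raw CV estimate
}
plot(1:10, deltas.poly, type = "b",
     xlab = "Polynomial degree", ylab = "10-fold CV error")
d.min.poly <- which.min(deltas.poly)     # degree with the lowest CV error
```

The training RSS can only decrease as the degree grows, because each model nests the previous one, so it cannot be used to choose the degree. The cross-validated error penalizes overfitting, which is why the third question uses it. Setting the seed at a different point changes the fold assignment, so the values in deltas.poly may then differ.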

Assume that: