In this exercise, we will predict the number of applications (Apps) received using the other variables in the College data set.

Questions

  1. Split the data set into a training set and a test set. The code for this step is provided below. (By now this step should be ‘too easy’ for you, so we save you the hassle of typing.)

  2. Fit a linear model using least squares on the training set, and store the test error (MSE) in lm.error.

  3. Fit a ridge regression model on the training set, with \(\lambda\) chosen by cross-validation. Use the following grid: grid <- 10 ^ seq(4, -2, length = 100). Apart from settings that are given explicitly, such as this grid, use the default settings of the functions. Store the test error in ridge.error.

  4. Fit a lasso model on the training set, with \(\lambda\) chosen by cross-validation. Store the test error in lasso.error and the non-zero coefficient estimates in coef.lasso.
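
The workflow in questions 2 to 4 can be sketched on simulated data as follows. This is a minimal illustration under stated assumptions, not the intended solution: it assumes the glmnet package is installed, uses a made-up data set instead of College, picks lambda.min as the cross-validated \(\lambda\), and reuses the ridge grid for the lasso. All names starting with demo. are hypothetical and chosen so they do not clash with the variables the exercise asks for.

```r
library(glmnet)  # assumption: glmnet is installed; provides cv.glmnet()

set.seed(1)

# Simulated stand-in data (hypothetical; NOT the College data set)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p)
beta <- c(3, -2, 1.5, rep(0, p - 3))   # only three predictors matter
y <- as.vector(x %*% beta + rnorm(n))

# Simple 50/50 train/test split (the exercise provides its own split)
train <- sample(n, n / 2)
test <- setdiff(seq_len(n), train)

# Grid of candidate lambda values, as given in question 3
grid <- 10 ^ seq(4, -2, length = 100)

# Least squares: fit on the training rows, compute the test MSE
demo.lm.fit <- lm(y ~ ., data = data.frame(y = y[train], x[train, ]))
demo.lm.pred <- predict(demo.lm.fit, newdata = data.frame(x[test, ]))
demo.lm.error <- mean((y[test] - demo.lm.pred)^2)

# Ridge regression (alpha = 0), lambda chosen by cross-validation on the grid
demo.cv.ridge <- cv.glmnet(x[train, ], y[train], alpha = 0, lambda = grid)
demo.ridge.pred <- predict(demo.cv.ridge, s = "lambda.min", newx = x[test, ])
demo.ridge.error <- mean((y[test] - demo.ridge.pred)^2)

# Lasso (alpha = 1); reusing the same grid here is an assumption
demo.cv.lasso <- cv.glmnet(x[train, ], y[train], alpha = 1, lambda = grid)
demo.lasso.pred <- predict(demo.cv.lasso, s = "lambda.min", newx = x[test, ])
demo.lasso.error <- mean((y[test] - demo.lasso.pred)^2)

# Keep only the coefficient estimates that are not exactly zero
demo.lasso.all <- coef(demo.cv.lasso, s = "lambda.min")[, 1]
demo.coef.lasso <- demo.lasso.all[demo.lasso.all != 0]
```

Note that glmnet expects a numeric predictor matrix. For a data frame that contains factor columns, one common approach is to build the matrix with model.matrix() and drop its intercept column before fitting.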


Assume that: