In this exercise, we fit a neural network to the Default data, available in the ISLR2 library. Have a look at labs 10.9.1-10.9.2 for guidance. Then compare the classification performance of the neural network with that of logistic regression.

Questions

Data preprocessing:

  1. Convert the dependent variable default from its "Yes"/"No" coding to numeric (1 and 0).
  2. Split the dataset into a train and a test set with a 60/40 split. Set a seed of 42 and store the indices of the training set in train.idx.
  3. Create a matrix x that can be processed by a NN. Use the model.matrix() function and scale the input variables. Use income and balance as independent variables. Also store the dependent variable default in a vector y.
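One way to carry out these preprocessing steps (a sketch, assuming the Default data loads with default coded "Yes"/"No" and that dropping the intercept column of model.matrix is intended) is:

```r
library(ISLR2)

# Recode the outcome: "Yes" -> 1, "No" -> 0
Default$default <- ifelse(Default$default == "Yes", 1, 0)

# 60/40 train/test split with a fixed seed
set.seed(42)
train.idx <- sample(nrow(Default), 0.6 * nrow(Default))

# Design matrix of scaled inputs (intercept column dropped),
# plus the outcome vector
x <- scale(model.matrix(default ~ income + balance, data = Default)[, -1])
y <- Default$default
```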

Model building:

  1. Create a neural network nn.model with 2 hidden layers of 10 hidden units each, and a relu activation function.
  2. Add a dropout layer after each hidden layer, with a dropout rate of 30%.
  3. Add an output layer with just one unit and the sigmoid activation function.
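The architecture described above could be sketched with the keras R interface as follows (two dense layers of 10 relu units, each followed by 30% dropout, and a one-unit sigmoid output; x is the input matrix from the preprocessing step):

```r
library(keras)

nn.model <- keras_model_sequential() %>%
  layer_dense(units = 10, activation = "relu",
              input_shape = ncol(x)) %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 1, activation = "sigmoid")
```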

Model compiling:

  1. Compile the nn.model model from the previous exercise as follows:
    • Use the binary_crossentropy as the loss function
    • Use optimizer_rmsprop() as your optimizer
    • Also track the tf$keras$metrics$AUC() as a metric.
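Put together, the compile step might look like this (a sketch; tf is the TensorFlow handle exposed by the tensorflow R package):

```r
library(tensorflow)

nn.model %>% compile(
  loss      = "binary_crossentropy",
  optimizer = optimizer_rmsprop(),
  metrics   = tf$keras$metrics$AUC()   # track AUC during training
)
```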

Model fitting:

  1. Fit the neural network on the training data for 30 epochs and a batch size of 32. Use 40% of the training data for validation. Store the result in history.
  2. Visualize the learning curve.
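A possible fitting call, assuming x, y, and train.idx from the preprocessing step, could be:

```r
history <- nn.model %>% fit(
  x[train.idx, ], y[train.idx],
  epochs = 30,
  batch_size = 32,
  validation_split = 0.4   # 40% of the training data held out for validation
)

# Learning curve (loss and AUC per epoch, training vs. validation)
plot(history)
```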

Model evaluation:

  1. Store the predictions on the test data in nn.pred.
  2. Compute the AUC of the neural network on the test data. Use the function pROC::auc(pROC::roc(drop(response), as.numeric(drop(predictor)))), with response and predictor obtained in the previous steps.
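With the conventions above, the evaluation could be sketched as (response is the test outcome, predictor the predicted probabilities):

```r
# Predicted probabilities on the test set
nn.pred <- predict(nn.model, x[-train.idx, ])

# Test-set AUC
pROC::auc(pROC::roc(drop(y[-train.idx]), as.numeric(drop(nn.pred))))
```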

Logistic regression:

  1. Now repeat the same procedure with logistic regression. Store the model in lr.model and the predictions in lr.pred, and compute the test-set AUC in the same way, so that the two models can be compared.
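A minimal sketch of the logistic-regression counterpart, fit with glm() on the same training indices (here predicting directly from the raw income and balance columns rather than the scaled matrix x; either choice gives the same AUC since AUC is rank-based):

```r
# Logistic regression on the training set
lr.model <- glm(default ~ income + balance,
                data = Default, subset = train.idx,
                family = binomial)

# Predicted probabilities on the test set
lr.pred <- predict(lr.model, Default[-train.idx, ], type = "response")

# Test-set AUC, computed the same way as for the neural network
pROC::auc(pROC::roc(Default$default[-train.idx], as.numeric(lr.pred)))
```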