In this exercise, we fit a neural network to the Default data, available in the ISLR2 library. Have a look at labs 10.9.1-10.9.2 for guidance. Then compare the classification performance of the neural network with that of logistic regression.

Questions

Data preprocessing:

  1. Convert the dependent variable default from its "Yes"/"No" coding to numeric (1 and 0).
  2. Split the dataset into a train and a test set with a 60/40 split. Set a seed of 42 and store the indices of the training set in train.idx.
  3. Create a matrix x that can be processed by a NN. Use the model.matrix() function and scale the input variables. Use income and balance as independent variables. Also store the dependent variable default in a vector y.
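One way to carry out these preprocessing steps (a sketch, assuming the Default data loads with default coded "Yes"/"No" and that dropping the intercept column of model.matrix is intended) is:

```r
library(ISLR2)

# Recode the outcome: "Yes" -> 1, "No" -> 0
Default$default <- ifelse(Default$default == "Yes", 1, 0)

# 60/40 train/test split with a fixed seed
set.seed(42)
train.idx <- sample(nrow(Default), 0.6 * nrow(Default))

# Design matrix of scaled inputs (intercept column dropped),
# plus the outcome vector
x <- scale(model.matrix(default ~ income + balance, data = Default)[, -1])
y <- Default$default
```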

Model building:

  1. Create a neural network nn.model with 2 hidden layers of 10 hidden units each, and a relu activation function.
  2. Add a dropout layer after each hidden layer, with a dropout rate of 30%.
  3. Add an output layer with just one unit and the sigmoid activation function.
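The architecture described above could be sketched with the keras R interface as follows (two dense layers of 10 relu units, each followed by 30% dropout, and a one-unit sigmoid output; x is the input matrix from the preprocessing step):

```r
library(keras)

nn.model <- keras_model_sequential() %>%
  layer_dense(units = 10, activation = "relu",
              input_shape = ncol(x)) %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 1, activation = "sigmoid")
```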

Model compiling:

  1. Compile the nn.model model from the previous exercise as follows:
    • Use the binary_crossentropy as the loss function
    • Use optimizer_rmsprop() as your optimizer
    • Also track the tf$keras$metrics$AUC() as a metric.
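Put together, the compile step might look like this (a sketch; tf is the TensorFlow handle exposed by the tensorflow R package):

```r
library(tensorflow)

nn.model %>% compile(
  loss      = "binary_crossentropy",
  optimizer = optimizer_rmsprop(),
  metrics   = tf$keras$metrics$AUC()   # track AUC during training
)
```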

Model fitting:

  1. Fit the neural network on the training data for 30 epochs and a batch size of 32. Use 40% of the training data for validation. Store the result in history.
  2. Visualize the learning curve.
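A possible fitting call, assuming x, y, and train.idx from the preprocessing step, could be:

```r
history <- nn.model %>% fit(
  x[train.idx, ], y[train.idx],
  epochs = 30,
  batch_size = 32,
  validation_split = 0.4   # 40% of the training data held out for validation
)

# Learning curve (loss and AUC per epoch, training vs. validation)
plot(history)
```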

Model evaluation:

  1. Store the predictions on the test data in nn.pred.
  2. Compute the AUC of the neural network on the test data. Use the function pROC::auc(pROC::roc(drop(response), as.numeric(drop(predictor)))), with response and predictor obtained in the previous steps.
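With the conventions above, the evaluation could be sketched as (response is the test outcome, predictor the predicted probabilities):

```r
# Predicted probabilities on the test set
nn.pred <- predict(nn.model, x[-train.idx, ])

# Test-set AUC
pROC::auc(pROC::roc(drop(y[-train.idx]), as.numeric(drop(nn.pred))))
```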

Logistic regression:

  1. Now repeat the same procedure with logistic regression. Store the model in lr.model and the predictions in lr.pred, and compute the test-set AUC in the same way, so that the two models can be compared.
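A minimal sketch of the logistic-regression counterpart, fit with glm() on the same training indices (here predicting directly from the raw income and balance columns rather than the scaled matrix x; either choice gives the same AUC since AUC is rank-based):

```r
# Logistic regression on the training set
lr.model <- glm(default ~ income + balance,
                data = Default, subset = train.idx,
                family = binomial)

# Predicted probabilities on the test set
lr.pred <- predict(lr.model, Default[-train.idx, ], type = "response")

# Test-set AUC, computed the same way as for the neural network
pROC::auc(pROC::roc(Default$default[-train.idx], as.numeric(lr.pred)))
```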