Questions

Some of the exercises are not tested by Dodona (for example the plots), but it is still useful to try them.

Repeat question 4 using naive Bayes. Fit a naive Bayes model on the training data to predict crime01 using the variables nox, rad, and dis. What is the test error (NOT the accuracy), i.e. the fraction of misclassified test observations, of the model on the scaled test data? Store the predictions in nb.pred and the test error in nb.error.
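
The naive Bayes question can be sketched roughly as follows. This is only an illustration under stated assumptions: the exercise already provides data.train, data.test, and crime01.test, so the setup lines here (the median cut-off for crime01, the scaling, and the random 50/50 split stored in the hypothetical train.idx) are stand-ins and will not match the split from question 4. The helper name nb.fit is also an assumption.

```r
library(MASS)    # Boston data set
library(e1071)   # naiveBayes()

# --- Assumed setup (the exercise supplies these objects already) ---
# crime01 is 1 when the crime rate is above its median, 0 otherwise (assumption).
crime01 <- factor(as.numeric(Boston$crim > median(Boston$crim)))
# Scale the three predictors and combine them with crime01.
data <- data.frame(scale(Boston[, c("nox", "rad", "dis")]), crime01 = crime01)

set.seed(1)
train.idx <- sample(nrow(data), nrow(data) / 2)  # hypothetical 50/50 split
data.train <- data[train.idx, ]
data.test <- data[-train.idx, ]
crime01.test <- data.test$crime01

# --- Naive Bayes: fit on the training data, predict on the test data ---
nb.fit <- naiveBayes(crime01 ~ nox + rad + dis, data = data.train)
nb.pred <- predict(nb.fit, data.test)

# Test error = fraction of misclassified test observations (1 - accuracy).
nb.error <- mean(nb.pred != crime01.test)
```

Calling table(nb.pred, crime01.test) gives the confusion matrix, which makes it easier to see where the misclassifications come from.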
Repeat question 4 using KNN. Apply KNN to the training data to predict crime01 using the variables nox, rad, and dis. What is the test error (NOT the accuracy) of the model on the scaled test data? Use \(K = 1\) and store the predictions in knn.pred1 and the test error in knn.error1. Set the seed to 1.

Recall that the KNN function knn() requires four inputs: the training predictors train.X, the test predictors test.X, the training class labels train.crime01, and a value for \(K\). Again, first scale the independent variables with scale() before computing train.X and test.X.
Now, use \(K = 10\) and store the predictions in knn.pred10 and the test error in knn.error10. Again set the seed to 1.
Compare the test errors of both KNN models.
- MC2:
  Which of the KNN models performs best on the test data?
  - 1: \(K = 1\)
  - 2: \(K = 10\)
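
The KNN steps above can be sketched roughly as follows. As before, this is a minimal illustration under assumptions, not the reference solution: the median cut-off for crime01 and the random 50/50 split (the hypothetical train.idx) stand in for the objects that the exercise provides, so the resulting errors will generally differ from those of the graded exercise.

```r
library(MASS)    # Boston data set
library(class)   # knn()

# --- Assumed setup (the exercise supplies the train/test split already) ---
crime01 <- factor(as.numeric(Boston$crim > median(Boston$crim)))
# Scale the predictors first, so that no variable dominates the distances.
X <- scale(Boston[, c("nox", "rad", "dis")])

set.seed(1)
train.idx <- sample(nrow(Boston), nrow(Boston) / 2)  # hypothetical 50/50 split
train.X <- X[train.idx, ]
test.X <- X[-train.idx, ]
train.crime01 <- crime01[train.idx]
crime01.test <- crime01[-train.idx]

# --- KNN with K = 1 ---
set.seed(1)  # knn() breaks ties at random, so fix the seed before each call
knn.pred1 <- knn(train.X, test.X, train.crime01, k = 1)
knn.error1 <- mean(knn.pred1 != crime01.test)

# --- KNN with K = 10 ---
set.seed(1)
knn.pred10 <- knn(train.X, test.X, train.crime01, k = 10)
knn.error10 <- mean(knn.pred10 != crime01.test)

# Put both test errors side by side for comparison.
c(K1 = knn.error1, K10 = knn.error10)
```

The model with the lower test error is the one that performs better on the test data.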

Assume that:

- The MASS, e1071, and class libraries have been loaded.
- The Boston data set has been loaded and attached.
- The variable crime01 has been created and combined with the Boston data set in data. This data set has been split into a training set data.train and a test set data.test, and the test observations of the dependent variable crime01 are stored in crime01.test.