Using the Boston data set,
you will develop a model to predict whether a given suburb has a crime rate above or below the median.

Some of the exercises are not tested by Dodona (for example the plots), but it is still useful to try them.
Do the following preprocessing steps:
Create a binary variable, crime01, that contains a 1 if crim (the per capita
crime rate in the Boston data set) contains a value above its median, and a 0
if crim contains a value below its median. You can compute the median using the
median() function.
Use the data.frame() function to create a single data set containing both crime01 and
the other Boston variables. Add crime01 as the last column in the new dataset. Store the result in the variable data.
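The two preprocessing steps could be sketched as follows. This is a minimal sketch, not a reference solution: it assumes the crime rate column is the crim variable from MASS::Boston, and it refers to that column as Boston$crim so that it also runs when the data set is not attached.

```r
library(MASS)  # provides the Boston data frame

# 1 where the per capita crime rate exceeds its median, 0 otherwise
crime01 <- as.numeric(Boston$crim > median(Boston$crim))

# All Boston variables, with crime01 appended as the last column
data <- data.frame(Boston, crime01)
```

Because crime01 is passed as the last argument to data.frame(), it ends up as the last column of data.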
Explore the data graphically in order to investigate the association between crime01 and the other features.
Which of the other features seem most likely to be useful in predicting crime01?
For example, you can make pairwise scatterplots with pairs().
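One possible way to explore the association is sketched below. It rebuilds data so the block runs on its own. The correlation ranking and the boxplot are illustrative additions, not something the exercise asks for, and the features nox, rad, and dis are plotted only because they are the ones used in the logistic regression step.

```r
library(MASS)
crime01 <- as.numeric(Boston$crim > median(Boston$crim))
data <- data.frame(Boston, crime01)

# Correlation of every feature with crime01, strongest first
cors <- cor(data)[, "crime01"]
cors <- cors[names(cors) != "crime01"]
print(sort(abs(cors), decreasing = TRUE))

# Pairwise scatterplots for a handful of features
pairs(data[, c("crime01", "nox", "rad", "dis")])

# Distribution of one feature within each crime01 class
boxplot(nox ~ crime01, data = data, xlab = "crime01", ylab = "nox")
```

Calling pairs(data) on all columns also works, but the resulting panels are small and hard to read.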
Do a train-test split:
Use the sample() function.
Put 70% of the data (354 rows) in the training set and the other 30% in the test set. Use a seed value of 1.
Store the indices of the training set in train.
Create data.test, which contains only the test observations (dependent + independent variables).
Create crime01.test, which contains only the crime01 values of the test observations.
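A split along these lines might look like the sketch below. The exact form of the sample() call that the grader expects is an assumption; here the indices are drawn from 1:nrow(data) immediately after setting the seed.

```r
library(MASS)
crime01 <- as.numeric(Boston$crim > median(Boston$crim))
data <- data.frame(Boston, crime01)

set.seed(1)                            # seed value given in the exercise
train <- sample(1:nrow(data), 354)     # row indices of the training set

data.test <- data[-train, ]            # test rows, all columns
crime01.test <- data$crime01[-train]   # response values for the test rows
```

Note that the indices drawn for a given seed depend on the R version: R 3.6.0 changed the default sampling method, so older versions select different rows.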
Perform logistic regression on the training data in order to predict crime01
using the variables nox, rad, and dis.
What is the test error (the misclassification rate, NOT the accuracy) of the model on the test data?
Store the model in glm.fit, the predictions in glm.pred and the test error in glm.error.
Use a threshold of 0.5 to classify predicted probabilities as 0 or 1 (numeric vector).
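The fitting and evaluation steps could be sketched as below. The block repeats the preprocessing and the split so that it runs on its own. glm.probs is a hypothetical helper name that does not appear in the exercise, and the test error is taken to be the fraction of misclassified test observations.

```r
library(MASS)
crime01 <- as.numeric(Boston$crim > median(Boston$crim))
data <- data.frame(Boston, crime01)
set.seed(1)
train <- sample(1:nrow(data), 354)
data.test <- data[-train, ]
crime01.test <- data$crime01[-train]

# Logistic regression fitted on the training rows only
glm.fit <- glm(crime01 ~ nox + rad + dis, data = data,
               subset = train, family = binomial)

# Predicted probabilities of crime01 == 1 for the test rows
glm.probs <- predict(glm.fit, data.test, type = "response")

# Classify with a 0.5 threshold; the result is a numeric vector of 0s and 1s
glm.pred <- ifelse(glm.probs > 0.5, 1, 0)

# Test error: share of test observations that are misclassified
glm.error <- mean(glm.pred != crime01.test)
```

The accuracy would be 1 - glm.error, which is why the two must not be confused when storing the result.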
Assume that:
The MASS library has been loaded.
The Boston dataset has been loaded and attached.