Some of the exercises are not tested by Dodona (for example the plots), but it is still useful to try them.
Repeat question 4 using QDA. Perform QDA on the training data in order to predict mpg01 using the variables cylinders, weight, displacement and horsepower.
What is the accuracy of the model obtained on the test data?
Store the model in qda.fit, the predictions in qda.pred and the accuracy in qda.acc.
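One possible way to approach the QDA step is sketched below. The setup lines are assumptions made only so the sketch runs on its own: the median-based definition of mpg01 and the random half/half split (with the hypothetical helper train.idx) stand in for whatever question 4 actually used, so the resulting accuracy need not match the expected answer.

```r
library(ISLR2)  # provides the Auto data set
library(MASS)   # provides qda()

# Assumed setup: the real definition of mpg01 and the real split come from
# the earlier questions and may differ from this stand-in.
mpg01 <- as.numeric(Auto$mpg > median(Auto$mpg))
data <- data.frame(Auto, mpg01)
set.seed(1)
train.idx <- sample(nrow(data), nrow(data) / 2)  # hypothetical split
data.train <- data[train.idx, ]
data.test <- data[-train.idx, ]
mpg01.test <- data.test$mpg01

# Fit QDA on the training data
qda.fit <- qda(mpg01 ~ cylinders + weight + displacement + horsepower,
               data = data.train)

# predict() returns a list; the class labels are in the $class component
qda.pred <- predict(qda.fit, data.test)$class

# Accuracy: the share of test observations that are classified correctly
qda.acc <- mean(qda.pred == mpg01.test)
```

With the objects from the exercise already in place, only the last three statements would be needed.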
Repeat question 4 using logistic regression. Perform logistic regression on the training data in order to predict mpg01 using the variables cylinders, weight, displacement and horsepower.
What is the accuracy of the model obtained on the test data?
Store the model in glm.fit, the predictions in glm.pred and the accuracy in glm.acc.
Use a threshold of 0.5 to classify predicted probabilities as 0 or 1 (numeric vector).
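The thresholding step can be sketched as follows. As an assumption, the setup lines rebuild data, data.train, data.test and mpg01.test with a median-based mpg01 and a hypothetical random split (train.idx); the split from question 4 may differ. The intermediate name glm.probs is also introduced here purely for illustration.

```r
library(ISLR2)  # provides the Auto data set

# Assumed setup: a stand-in for the objects created in the earlier questions.
mpg01 <- as.numeric(Auto$mpg > median(Auto$mpg))
data <- data.frame(Auto, mpg01)
set.seed(1)
train.idx <- sample(nrow(data), nrow(data) / 2)  # hypothetical split
data.train <- data[train.idx, ]
data.test <- data[-train.idx, ]
mpg01.test <- data.test$mpg01

# Logistic regression is a GLM with a binomial family
glm.fit <- glm(mpg01 ~ cylinders + weight + displacement + horsepower,
               data = data.train, family = binomial)

# type = "response" gives probabilities P(mpg01 = 1) instead of log-odds
glm.probs <- predict(glm.fit, data.test, type = "response")

# Apply the 0.5 threshold and keep the result as a numeric 0/1 vector
glm.pred <- as.numeric(glm.probs > 0.5)

glm.acc <- mean(glm.pred == mpg01.test)
```

Without type = "response", predict() would return values on the log-odds scale, and a cutoff of 0.5 would no longer correspond to a probability of one half.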
Repeat question 4 using naive Bayes. Perform naive Bayes on the training data in order to predict mpg01 using the variables cylinders, weight, displacement and horsepower.
What is the accuracy of the model obtained on the test data?
Store the model in nb.fit, the predictions in nb.pred and the accuracy in nb.acc.
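A minimal sketch of the naive Bayes step, using naiveBayes() from e1071, could look like this. The setup lines are assumptions (median-based mpg01, hypothetical random split via train.idx), and the vector predictors is a helper name introduced only for this example. The sketch uses the x/y interface with a factor response; the formula interface is an alternative.

```r
library(ISLR2)  # provides the Auto data set
library(e1071)  # provides naiveBayes()

# Assumed setup: a stand-in for the objects created in the earlier questions.
mpg01 <- as.numeric(Auto$mpg > median(Auto$mpg))
data <- data.frame(Auto, mpg01)
set.seed(1)
train.idx <- sample(nrow(data), nrow(data) / 2)  # hypothetical split
data.train <- data[train.idx, ]
data.test <- data[-train.idx, ]
mpg01.test <- data.test$mpg01

# Hypothetical helper: the four predictors named in the exercise
predictors <- c("cylinders", "weight", "displacement", "horsepower")

# Pass the response as a factor so the class labels "0" and "1" are explicit
nb.fit <- naiveBayes(x = data.train[, predictors],
                     y = as.factor(data.train$mpg01))

# By default predict() returns the most likely class for each observation
nb.pred <- predict(nb.fit, data.test[, predictors])

nb.acc <- mean(nb.pred == mpg01.test)
```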
Repeat question 4 using KNN.
Perform KNN on the training data in order to predict mpg01 using the variables cylinders, weight, displacement and horsepower.
What is the accuracy of the model obtained on the scaled test data?
Store the predictions in knn.pred and the accuracy in knn.acc.
Use \(K = 5\). Set a seed value of 1.
Recall that the KNN function knn() requires four inputs: train.X, test.X, train.mpg01, and the value for \(K\). However, this time first scale the independent variables with scale() before computing train.X and test.X. For example, a regular car has 4 cylinders and around 100 horsepower. As a result, horsepower would otherwise have a much larger effect on the distance between the observations because its scale is much larger.
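The scaling and KNN steps might be sketched as below. Several things here are assumptions: the median-based mpg01, the hypothetical random split (train.idx), and the helper names predictors and scaled.X. The sketch scales the four predictors on the full data set and only then splits them, which is one reading of "scale before computing train.X and test.X".

```r
library(ISLR2)  # provides the Auto data set
library(class)  # provides knn()

# Assumed setup: a stand-in for the objects created in the earlier questions.
mpg01 <- as.numeric(Auto$mpg > median(Auto$mpg))
data <- data.frame(Auto, mpg01)
set.seed(1)
train.idx <- sample(nrow(data), nrow(data) / 2)  # hypothetical split
mpg01.test <- data$mpg01[-train.idx]

# Standardise each predictor to mean 0 and standard deviation 1, so that
# horsepower (around 100) does not dominate cylinders (around 4)
predictors <- c("cylinders", "weight", "displacement", "horsepower")
scaled.X <- scale(data[, predictors])

# Split the scaled predictors with the same indices as the data split
train.X <- scaled.X[train.idx, ]
test.X <- scaled.X[-train.idx, ]
train.mpg01 <- data$mpg01[train.idx]

# knn() breaks ties at random, so fix the seed right before the call
set.seed(1)
knn.pred <- knn(train.X, test.X, train.mpg01, k = 5)

knn.acc <- mean(knn.pred == mpg01.test)
```

Note that knn() fits and predicts in a single call, which is why the exercise asks for knn.pred and knn.acc but no stored model.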
Assume that:
The ISLR2, MASS, e1071, and class libraries have been loaded.
The Auto dataset has been loaded and attached.
mpg01 has been created and combined with the Auto data set in data.
The dataset has been split into training data data.train and test data data.test, and the test observations for the dependent variable mpg01 are in mpg01.test.