In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.
Create a binary variable that takes the value 1 for cars with gas mileage above the median and 0 for cars with gas mileage below the median. Store the binary variable in the Auto data frame under the name mpglevel. Make sure that mpglevel is encoded as a factor. After you have added mpglevel, drop the mpg attribute by running Auto$mpg <- NULL.
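A minimal sketch of this preparation step, assuming the Auto data frame comes from the ISLR2 package. Using ifelse() is only one way to build the indicator; a car whose mpg equals the median exactly would be coded 0 here, which is an assumption the problem statement does not settle.

```r
library(ISLR2)  # provides the Auto data frame

# 1 if mpg is above the median, 0 otherwise, stored as a factor
Auto$mpglevel <- as.factor(ifelse(Auto$mpg > median(Auto$mpg), 1, 0))

# Drop mpg so it cannot leak into the predictors
Auto$mpg <- NULL

# Quick checks: class of the new column and the class balance
str(Auto$mpglevel)
table(Auto$mpglevel)
```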
Fit a support vector classifier to the data, trying various values of cost (0.01, 0.1, 1) with the tune() function, in order to predict whether a car gets high or low gas mileage. Use all the other variables as predictors. Store the outcome of the cross-validation in tune.out.lin, and the parameters of the best model and its performance in best.param.lin and best.perform.lin, respectively. Make sure to call set.seed(1) before running the cross-validation.
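The linear-kernel tuning could be sketched as follows. The first lines recreate mpglevel so the block runs on its own; tune() from e1071 performs 10-fold cross-validation by default, and the best.parameters and best.performance components of its result hold the values the problem asks for.

```r
library(ISLR2)
library(e1071)

# Setup: binary factor response, then drop mpg
Auto$mpglevel <- as.factor(ifelse(Auto$mpg > median(Auto$mpg), 1, 0))
Auto$mpg <- NULL

# Cross-validate a linear-kernel SVM over the requested cost values
set.seed(1)
tune.out.lin <- tune(svm, mpglevel ~ ., data = Auto, kernel = "linear",
                     ranges = list(cost = c(0.01, 0.1, 1)))

# Best cost and its cross-validated error
best.param.lin <- tune.out.lin$best.parameters
best.perform.lin <- tune.out.lin$best.performance

summary(tune.out.lin)
```

Because the response is a factor, the reported performance is a cross-validated classification error, so lower is better.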
Now repeat the support vector classifier step above, this time using SVMs with a radial basis kernel, with different values of gamma (0.01, 0.1, 1) and cost (0.01, 0.1, 1). Store the outcome of the cross-validation in tune.out.rad, and the parameters of the best model and its performance in best.param.rad and best.perform.rad, respectively. Make sure to call set.seed(1) before running the cross-validation.
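A sketch of the radial-kernel version, again recreating mpglevel so the block is self-contained. Passing both cost and gamma in ranges makes tune() search the full 3 x 3 grid.

```r
library(ISLR2)
library(e1071)

# Setup: binary factor response, then drop mpg
Auto$mpglevel <- as.factor(ifelse(Auto$mpg > median(Auto$mpg), 1, 0))
Auto$mpg <- NULL

# Cross-validate a radial-kernel SVM over every (cost, gamma) pair
set.seed(1)
tune.out.rad <- tune(svm, mpglevel ~ ., data = Auto, kernel = "radial",
                     ranges = list(cost = c(0.01, 0.1, 1),
                                   gamma = c(0.01, 0.1, 1)))

# Best (cost, gamma) pair and its cross-validated error
best.param.rad <- tune.out.rad$best.parameters
best.perform.rad <- tune.out.rad$best.performance

summary(tune.out.rad)
```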
Now repeat the support vector classifier step above, this time using SVMs with a polynomial kernel, with different values of degree (2, 3, 4) and cost (0.01, 0.1, 1). Store the outcome of the cross-validation in tune.out.poly, and the parameters of the best model and its performance in best.param.poly and best.perform.poly, respectively. Make sure to call set.seed(1) before running the cross-validation.
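The polynomial-kernel version might look like this, with mpglevel recreated so the block runs on its own. Only degree and cost are tuned; gamma and coef0 are left at the svm() defaults, which is an assumption since the problem does not mention them.

```r
library(ISLR2)
library(e1071)

# Setup: binary factor response, then drop mpg
Auto$mpglevel <- as.factor(ifelse(Auto$mpg > median(Auto$mpg), 1, 0))
Auto$mpg <- NULL

# Cross-validate a polynomial-kernel SVM over every (cost, degree) pair
set.seed(1)
tune.out.poly <- tune(svm, mpglevel ~ ., data = Auto, kernel = "polynomial",
                      ranges = list(cost = c(0.01, 0.1, 1),
                                    degree = c(2, 3, 4)))

# Best (cost, degree) pair and its cross-validated error
best.param.poly <- tune.out.poly$best.parameters
best.perform.poly <- tune.out.poly$best.performance

summary(tune.out.poly)
```

Since each best.perform value is a cross-validated error computed with the same seed, the three kernels can be compared directly by looking for the smallest one.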
MC1: Which one of the models performed best?
Assume that:
The ISLR2 and e1071 libraries have been loaded.
The Auto dataset has been loaded and attached.