The e1071 library includes a built-in function, tune(), to perform cross-validation. By default, tune() performs ten-fold cross-validation on a set of models of interest. To use this function, we pass in relevant information about the set of models under consideration. The following command indicates that we want to compare SVMs with a linear kernel, using a range of values of the cost parameter.
set.seed(1)
# Ten-fold cross-validation over a grid of cost values, linear kernel
tune.out <- tune(svm, y ~ ., data = dat, kernel = "linear",
                 ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
We can easily access the cross-validation errors for each of these models using the summary() command:
summary(tune.out)
Parameter tuning of ‘svm’:

- sampling method: 10-fold cross validation

- best parameters:
 cost
  0.1

- best performance: 0.05

- Detailed performance results:
   cost error dispersion
1 1e-03  0.55  0.4377975
2 1e-02  0.55  0.4377975
3 1e-01  0.05  0.1581139
4 1e+00  0.15  0.2415229
5 5e+00  0.15  0.2415229
6 1e+01  0.15  0.2415229
7 1e+02  0.15  0.2415229
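The error and dispersion columns can be read as the mean and standard deviation of the per-fold misclassification rates. The sketch below reproduces that calculation by hand for illustration only; it is not the internal code of tune(). Since dat is not defined in this section, a small synthetic two-class data set is generated as a stand-in, and the names folds and cv_table are hypothetical. The fold assignment differs from the one tune() draws, so the numbers will not necessarily match the table above.

```r
library(e1071)

# Assumption: a synthetic stand-in for `dat` (20 observations, two classes).
set.seed(1)
x <- matrix(rnorm(20 * 2), ncol = 2)
y <- c(rep(-1, 10), rep(1, 10))
x[y == 1, ] <- x[y == 1, ] + 1
dat <- data.frame(x = x, y = as.factor(y))

costs <- c(0.001, 0.01, 0.1, 1, 5, 10, 100)
k <- 10

# Randomly assign each observation to one of k folds.
folds <- sample(rep(seq_len(k), length.out = nrow(dat)))

# For each cost value, fit on k - 1 folds and score on the held-out fold.
cv_table <- do.call(rbind, lapply(costs, function(cost) {
  fold_err <- sapply(seq_len(k), function(i) {
    held_out <- folds == i
    fit <- svm(y ~ ., data = dat[!held_out, ], kernel = "linear", cost = cost)
    mean(predict(fit, dat[held_out, ]) != dat$y[held_out])
  })
  data.frame(cost = cost, error = mean(fold_err), dispersion = sd(fold_err))
}))

print(cv_table)
```

With only 20 observations, each held-out fold contains two points, so every per-fold error is 0, 0.5, or 1. This is why the dispersion values are large relative to the errors.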
We see that cost = 0.1 results in the lowest cross-validation error rate. The tune() function stores the best model obtained, which can be accessed as follows:
bestmod <- tune.out$best.model
summary(bestmod)
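Alongside best.model, the object returned by tune() also carries the winning parameter setting and its cross-validation error. The following is a minimal sketch of reading those components; dat is again a synthetic stand-in (an assumption, since the data are not defined here), and best_params, best_perf, and best_cost are hypothetical variable names.

```r
library(e1071)

# Assumption: a synthetic stand-in for `dat` (20 observations, two classes).
set.seed(1)
x <- matrix(rnorm(20 * 2), ncol = 2)
y <- c(rep(-1, 10), rep(1, 10))
x[y == 1, ] <- x[y == 1, ] + 1
dat <- data.frame(x = x, y = as.factor(y))

costs <- c(0.001, 0.01, 0.1, 1, 5, 10, 100)
tune.out <- tune(svm, y ~ ., data = dat, kernel = "linear",
                 ranges = list(cost = costs))

best_params <- tune.out$best.parameters   # one-row data frame of tuned values
best_perf   <- tune.out$best.performance  # lowest cross-validation error
best_cost   <- best_params$cost

# The refitted best model records the cost it was trained with.
bestmod <- tune.out$best.model
print(best_params)
print(best_perf)
print(bestmod$cost)
```

The full grid of results is also available as a data frame in tune.out$performances, which is convenient for plotting or for applying a different selection rule than the minimum error.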
Hint: The parameters of the best model and its performance are stored in tune.out as best.parameters and best.performance, respectively.
The seed is set at the top of the exercise; do not change the seed value or add another seed.
Assume that the e1071 library has been loaded.