We will now perform KNN using the knn() function, which is part of the class library. This function works rather differently from the other model-fitting functions that we have encountered thus far. Rather than a two-step approach in which we first fit the model and then we use the model to make predictions, knn() forms predictions using a single command. The function requires four inputs.

  1. A matrix containing the predictors associated with the training data, labeled train.X below.
  2. A matrix containing the predictors associated with the data for which we wish to make predictions, labeled test.X below.
  3. A vector containing the class labels for the training observations, labeled train.Direction below.
  4. A value for K, the number of nearest neighbors to be used by the classifier.
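As a minimal sketch of how these four inputs fit together, the call below runs knn() on a small made-up data set. The names toy.train.X, toy.test.X, and toy.train.labels are hypothetical stand-ins chosen for illustration; they are not the Lag1/Lag2 data used in this lab.

```r
library(class)  # provides knn()

# Hypothetical toy data: two well-separated clusters in two dimensions
toy.train.X <- rbind(c(0, 0), c(0, 1), c(1, 0),
                     c(5, 5), c(5, 6), c(6, 5))
toy.train.labels <- factor(c("Down", "Down", "Down", "Up", "Up", "Up"))

# Two new points to classify, one near each cluster
toy.test.X <- rbind(c(0.5, 0.5), c(5.5, 5.5))

# knn() breaks ties at random, so fix the seed for reproducibility
set.seed(1)

# Inputs in order: training predictors, test predictors, training labels, K
toy.pred <- knn(toy.train.X, toy.test.X, toy.train.labels, k = 3)
toy.pred
```

Because each test point sits inside one cluster, its three nearest neighbors all carry the same label, so the first point is classified as Down and the second as Up. Note that no separate model object is created: the predictions come back directly from the single call.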

The cbind() function, short for column bind, binds the columns of two objects together. Here, we create a 2x2 matrix x and a 2x1 matrix y and append the column of y to the columns of x.

x <- matrix(1:4, 2, 2)
x
     [,1] [,2]
[1,]    1    3
[2,]    2    4

y <- matrix(5:6, 2, 1)
y
     [,1]
[1,]    5
[2,]    6

cbind(x, y)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

Note that the arguments do not both have to be matrices; we can also append a vector to a matrix:

cbind(x, c(5, 6))
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

We now use cbind() to bind the Lag1 and Lag2 variables together into two matrices, one for the training set and the other for the test set.

> library(class)
> train.X <- cbind(Lag1, Lag2)[train,]
> test.X <- cbind(Lag1, Lag2)[!train,]
> train.Direction <- Direction[train]
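The expression cbind(Lag1, Lag2)[train,] combines two steps: binding the columns, then selecting rows with a logical vector. The sketch below illustrates this on made-up values; lag1.toy, lag2.toy, and train.toy are hypothetical stand-ins for Lag1, Lag2, and train, and are assumed here to behave the same way (two numeric vectors and a logical vector of equal length).

```r
# Hypothetical stand-ins for Lag1, Lag2, and the logical vector train
lag1.toy <- c(0.4, -0.2, 1.1, 0.3)
lag2.toy <- c(-0.5, 0.4, -0.2, 1.1)
train.toy <- c(TRUE, TRUE, FALSE, FALSE)

# Bind the two vectors into a 4x2 matrix; column names come from the variable names
both.toy <- cbind(lag1.toy, lag2.toy)

# A logical row index keeps the rows where the index is TRUE
train.X.toy <- both.toy[train.toy, ]   # rows 1 and 2
test.X.toy  <- both.toy[!train.toy, ]  # rows 3 and 4

train.X.toy
test.X.toy
```

Leaving the column position empty after the comma keeps every column, so both resulting matrices have the same two predictors and differ only in which observations they contain.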

The counterpart of cbind() is rbind(), short for row bind, which binds the rows of two variables together. Try using the rbind() function on the matrices x and y to see the result:
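One thing to watch for when trying this: rbind() stacks rows, so its arguments must have the same number of columns. The sketch below, which reuses the x and y defined earlier, shows that a direct rbind(x, y) stops with an error, and that transposing y with t() is one possible way to make the shapes compatible. The variable names err.msg and stacked are introduced only for illustration.

```r
x <- matrix(1:4, 2, 2)  # 2x2
y <- matrix(5:6, 2, 1)  # 2x1

# x has two columns and y has one, so rbind(x, y) raises an error;
# tryCatch() captures the message instead of halting the script
err.msg <- tryCatch(rbind(x, y), error = function(e) conditionMessage(e))
err.msg

# t(y) is a 1x2 matrix, which can be stacked underneath x
stacked <- rbind(x, t(y))
stacked
```

The final result is a 3x2 matrix whose third row is 5 6, the row-wise analogue of the cbind(x, y) output shown earlier.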