Finally, we perform PLS on the full data set with \(M = 1\), the number of components identified by cross-validation.

> pls.fit <- plsr(Salary ~ ., data = Hitters, scale = TRUE, ncomp = 1)
> summary(pls.fit)
Data: 	X dimension: 263 19 
	Y dimension: 263 1
Fit method: kernelpls
Number of components considered: 1
TRAINING: % variance explained
        1 comps  
X         38.08   
Salary    43.05

Notice that the one-component PLS fit explains 43.05% of the variance in Salary, almost as much as the 44.90% explained by the final five-component PCR fit. This is because PCR only attempts to maximize the amount of variance explained in the predictors, while PLS searches for directions that explain variance in both the predictors and the response.
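The contrast between the two methods can be illustrated with a minimal base-R sketch. It does not use the Hitters data or the pls library: the simulated predictors x1, x2, x3, the response y, and the helper x_explained() are all hypothetical names introduced for this illustration. The sketch builds the first PCR direction (the leading eigenvector of the predictor covariance matrix) and the first PLS direction (proportional to \(X^T y\) for standardized predictors and a single response), then reports the percentage of variance each one explains in the predictors and in the response.

```r
# Hypothetical simulated data (not Hitters): x1 and x2 carry most of the
# predictor variance, but only x3 is related to the response.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)   # strongly correlated with x1
x3 <- rnorm(n)
y  <- 2 * x3 + rnorm(n)

X  <- scale(cbind(x1, x2, x3))  # standardize predictors, like scale = TRUE
yc <- y - mean(y)               # center the response

# First PCR direction: leading eigenvector of the predictor covariance.
phi_pcr <- eigen(cov(X))$vectors[, 1]

# First PLS direction: proportional to X^T y, scaled to unit length.
phi_pls <- drop(crossprod(X, yc))
phi_pls <- phi_pls / sqrt(sum(phi_pls^2))

# Component scores for each method.
z_pcr <- drop(X %*% phi_pcr)
z_pls <- drop(X %*% phi_pls)

# Share of predictor variance captured by the rank-one fit of X on a score z.
x_explained <- function(z) {
  fit <- outer(z, drop(crossprod(X, z))) / sum(z^2)
  sum(fit^2) / sum(X^2)
}

result <- data.frame(
  method = c("PCR", "PLS"),
  X_pct  = round(100 * c(x_explained(z_pcr), x_explained(z_pls)), 2),
  y_pct  = round(100 * c(cor(z_pcr, yc)^2, cor(z_pls, yc)^2), 2)
)
print(result)
```

In this setup the first principal component follows the x1/x2 direction, so it accounts for more predictor variance but little of the response, while the first PLS direction trades some predictor variance for a much better fit to y. Real data such as Hitters will show a less extreme gap.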

Questions

Using the Boston dataset, fit a PLS model on the full dataset and store it in pls.fit (use the ncomp you determined in the previous exercise).

Assume that:

The MASS and pls libraries have been loaded
The Boston dataset has been loaded and attached