Associated with each value of \(\lambda\) is a vector of ridge regression coefficients, stored in a matrix that can be accessed by coef(). In this case, it is a 20×100 matrix, with 20 rows (one for each predictor, plus an intercept) and 100 columns (one for each value of \(\lambda\)).
> dim(coef(ridge.mod))
[1] 20 100
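For reference, ridge.mod here is assumed to come from a glmnet ridge fit over a 100-value grid of \(\lambda\) values, roughly along these lines (the exact preprocessing comes from the earlier part of the lab, not this excerpt):
> library(glmnet)
> Hitters <- na.omit(Hitters)                          # drop rows with missing Salary
> x <- model.matrix(Salary ~ ., Hitters)[, -1]         # predictor matrix, dropping the intercept column
> y <- Hitters$Salary
> grid <- 10^seq(10, -2, length = 100)                 # lambda values from 10^10 down to 10^-2
> ridge.mod <- glmnet(x, y, alpha = 0, lambda = grid)  # alpha = 0 selects the ridge penalty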
We expect the coefficient estimates to be much smaller, in terms of \(\ell_2\) norm, when a large value of \(\lambda\) is used, as compared to when a small value of \(\lambda\) is used. These are the coefficients when \(\lambda = 11,498\), along with their \(\ell_2\) norm:
> ridge.mod$lambda[50]
[1] 11497.57
> coef(ridge.mod)[, 50]
(Intercept) AtBat Hits HmRun
407.356050200 0.036957182 0.138180344 0.524629976
Runs RBI Walks Years
0.230701523 0.239841459 0.289618741 1.107702929
CAtBat CHits CHmRun CRuns
0.003131815 0.011653637 0.087545670 0.023379882
CRBI CWalks LeagueN DivisionW
0.024138320 0.025015421 0.085028114 -6.215440973
PutOuts Assists Errors NewLeagueN
0.016482577 0.002612988 -0.020502690 0.301433531
> sqrt(sum(coef(ridge.mod)[-1, 50]^2))
[1] 6.360612
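(The -1 in coef(ridge.mod)[-1, 50] drops the first row, the intercept, so the \(\ell_2\) norm is computed over the 19 predictor coefficients only.)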
In contrast, here are the coefficients when \(\lambda = 705\), along with their \(\ell_2\) norm. Note the much larger \(\ell_2\) norm of the coefficients associated with this smaller value of \(\lambda\).
> ridge.mod$lambda[60]
[1] 705.4802
> coef(ridge.mod)[, 60]
(Intercept) AtBat Hits HmRun
54.32519950 0.11211115 0.65622409 1.17980910
Runs RBI Walks Years
0.93769713 0.84718546 1.31987948 2.59640425
CAtBat CHits CHmRun CRuns
0.01083413 0.04674557 0.33777318 0.09355528
CRBI CWalks LeagueN DivisionW
0.09780402 0.07189612 13.68370191 -54.65877750
PutOuts Assists Errors NewLeagueN
0.11852289 0.01606037 -0.70358655 8.61181213
> sqrt(sum(coef(ridge.mod)[-1, 60]^2))
[1] 57.11001
At index 40, \(\lambda = 43\). Try calculating the corresponding \(\ell_2\) norm and storing it in ell2.
Assume that the ISLR2 and glmnet libraries have been loaded, and that the Hitters dataset has been loaded and attached.
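A minimal sketch of one way to do this, following the same pattern as above and assuming ridge.mod is the same fit used earlier (ell2 is the object name requested by the exercise):
> ridge.mod$lambda[40]                            # lambda value at index 40
> ell2 <- sqrt(sum(coef(ridge.mod)[-1, 40]^2))    # l2 norm of the coefficients at index 40, excluding the intercept
> ell2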