To fit a logistic regression GAM, we once again use the I() function to construct the binary response variable, and set family = binomial.

gam.lr <- gam(I(wage > 250) ~ year + s(age, df = 5) + education, family = binomial, data = Wage)
par(mfrow = c(1, 3))
plot(gam.lr, se = TRUE, col = "green")
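
As a minimal sketch of what the I(wage > 250) response does, the following base R snippet uses a small made-up vector (wage_toy is hypothetical and is not taken from the Wage data). It relies only on glm(), so it illustrates the binary response and the binomial family rather than the GAM fit itself.

```r
# Hypothetical wages in thousands of dollars; not taken from the Wage data.
wage_toy <- c(90, 120, 260, 300, 75)

# The comparison produces a logical vector: TRUE marks a "high earner".
high_toy <- wage_toy > 250

# I() evaluates the comparison as is, creating the binary response on the fly.
# With family = binomial, FALSE is treated as a failure and TRUE as a success.
fit_toy <- glm(I(wage_toy > 250) ~ 1, family = binomial)

# For an intercept-only model, the fitted probability is the sample proportion.
p_hat <- unname(plogis(coef(fit_toy)))
p_hat
mean(high_toy)
```

The intercept is on the log-odds scale, so plogis() is used to map it back to a probability.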

It is easy to see that there are no high earners in the < HS Grad category:

table(education, I(wage > 250))

education            FALSE TRUE
  1. < HS Grad         268    0
  2. HS Grad           966    5
  3. Some College      643    7
  4. College Grad      663   22
  5. Advanced Degree   381   45
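
To make the issue with the first category concrete, the sketch below re-enters the counts from the table by hand and computes the share of high earners and the empirical log-odds for each education level. The object names (counts, prop_high, log_odds) are chosen here for illustration only.

```r
# Counts copied from the table above: columns are wage <= 250 and wage > 250.
counts <- matrix(
  c(268, 0, 966, 5, 643, 7, 663, 22, 381, 45),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    education = c("1. < HS Grad", "2. HS Grad", "3. Some College",
                  "4. College Grad", "5. Advanced Degree"),
    high_earner = c("FALSE", "TRUE")
  )
)

# Share of high earners within each education level.
prop_high <- counts[, "TRUE"] / rowSums(counts)
round(prop_high, 4)

# Empirical log-odds of being a high earner.
# qlogis(0) is -Inf, so the first category has no finite log-odds.
log_odds <- qlogis(prop_high)
log_odds
```

A logistic model works on the log-odds scale, so a category with zero successes has no finite maximum likelihood estimate for its effect.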

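With no high earners in this category, the fitted log-odds for it are pushed toward negative infinity and its standard error band becomes extremely wide, which distorts the scale of the education panel. Hence, we refit the logistic regression GAM using all but this category, which provides more sensible results.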

gam.lr.s <- gam(I(wage > 250) ~ year + s(age, df = 5) + education, family = binomial, data = Wage,
                subset = (education != "1. < HS Grad"))
plot(gam.lr.s, se = TRUE, col = "green")

Questions

Assume that: