This exercise involves simple linear regression on the College data set, which is available in the ISLR2 library.

Note: this exercise is primarily based on the lab in Chapter 3 (Linear Regression).

Questions

Some of the exercises are not tested by Dodona (for example, the plots), but it is still useful to try them.
Answer each multiple-choice question by assigning the number of your answer to its variable, e.g. type MC1 <- 1 if you think answer 1 of MC1 is correct.

  1. Use the lm() function to perform a simple linear regression with the number of applications (Apps) as the response and the number of accepted applications (Accept) as the predictor. Store the model in lm.fit1. Use the summary() function to print the results.

  2. What is the predicted value of Apps for an Accept of 2000? Store the answer in pred.

  3. What are the 95% confidence and prediction intervals associated with an Accept of 2000? Store the answers in conf.int.95 and pred.int.95, respectively.

  4. MC1:
    A) Based on the F-statistic we can conclude that there is a relationship between the predictor and the response.
    B) A confidence interval is always wider than the prediction interval.
    • 1) Both statements are true.
    • 2) Both statements are false.
    • 3) A is true and B is false.
    • 4) A is false and B is true.

  5. MC2:
    A) The low p-value of Accept hints at a weak relationship between Accept and Apps.
    B) Since the coefficient of Accept is positive, we can conclude that the more accepted applications (Accept) a university has, the higher its number of applications (Apps) will be, on average.
    • 1) Both statements are true.
    • 2) Both statements are false.
    • 3) A is true and B is false.
    • 4) A is false and B is true.

  6. Plot the response and the predictor with the plot() function. Use the abline() function to display the least squares regression line.

  7. Create two more linear models with polynomial predictors (use the poly() function with its default argument raw = FALSE):
    • lm.fit2: a second-order polynomial of Accept.
    • lm.fit3: a third-order polynomial of Accept.

  8. Perform an ANOVA on the three models and interpret the output. Store the result in college.anova.
    MC3: Which model explains the data best?
    • 1) lm.fit1
    • 2) lm.fit2
    • 3) lm.fit3
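
The workflow behind the questions above can be sketched on a different data set, so that nothing here gives away the answers. The sketch below uses R's built-in cars data (dist as the response, speed as the predictor) as a stand-in for College. The object names (fit1, new.obs, point.pred, conf.band, pred.band, model.cmp) and the predictor value 15 are illustrative assumptions, not the names Dodona expects.

```r
# Illustration on the built-in `cars` data set, NOT the College data.
# All object names below are hypothetical.

# Simple linear regression: response ~ predictor.
fit1 <- lm(dist ~ speed, data = cars)
print(summary(fit1))  # coefficients, p-values, R-squared, F-statistic

# Prediction at a new predictor value (15 is an arbitrary choice).
new.obs <- data.frame(speed = 15)
point.pred <- predict(fit1, newdata = new.obs)

# 95% confidence interval for the mean response at that value.
conf.band <- predict(fit1, newdata = new.obs,
                     interval = "confidence", level = 0.95)

# 95% prediction interval for a single new observation at that value.
pred.band <- predict(fit1, newdata = new.obs,
                     interval = "prediction", level = 0.95)

print(conf.band)
print(pred.band)

# Polynomial fits; poly() uses orthogonal polynomials by default (raw = FALSE).
fit2 <- lm(dist ~ poly(speed, 2), data = cars)
fit3 <- lm(dist ~ poly(speed, 3), data = cars)

# Sequential F-tests comparing the nested models.
model.cmp <- anova(fit1, fit2, fit3)
print(model.cmp)
```

Each row of the anova() table after the first tests whether the extra polynomial term improves on the previous model. The same calls should carry over to the College data by swapping in Apps, Accept and data = College after library(ISLR2).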


Assume that: