This problem makes use of the BrainCancer dataset in the ISLR2 package.

Questions

  1. Plot the Kaplan-Meier survival curve with 95% confidence bands (the default level), using the survfit() function in the survival package. Store the model in the variable fit.km1.
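
A minimal sketch of question 1, not a reference solution. It assumes the ISLR2 and survival packages are installed and that BrainCancer holds the follow-up time in `time` and the event indicator in `status`; the axis labels are illustrative choices.

```r
# Sketch only: assumes the ISLR2 and survival packages are installed.
library(survival)
library(ISLR2)

# An intercept-only formula (~ 1) gives a single overall Kaplan-Meier curve.
# conf.int = 0.95 is already the default; it is spelled out for clarity.
fit.km1 <- survfit(Surv(time, status) ~ 1, data = BrainCancer, conf.int = 0.95)

# For a single curve, plot() draws the confidence bands as dashed lines.
plot(fit.km1, conf.int = TRUE,
     xlab = "Months", ylab = "Estimated probability of survival")
```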

  2. Fit a Cox proportional hazards model that uses all of the predictors to predict survival, and store the model in the variable fit.cox. Summarize the main findings and answer the questions below.

    • MC1 : Which predictors/covariates are significant in predicting survival?
      • 1: sexMale and ki
      • 2: diagnosisLG glioma and diagnosisHG glioma
      • 3: diagnosisHG glioma and ki
      • 4: diagnosisLG glioma and ki

    • MC2 :
      A) The risk associated with being female is 1.20 times the risk for males
      B) The risk associated with HG glioma is 8.62 times the risk for LG glioma
      • 1: Both statements are true
      • 2: Both statements are false
      • 3: A is true, B is false
      • 4: A is false, B is true
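
A hedged sketch of how the model for question 2 might be fitted and read. It assumes the predictors are the six columns sex, diagnosis, loc, ki, gtv and stereo; the helper name `cox.summary` is introduced here for illustration only. Each hazard ratio for a factor compares one level with that factor's reference level (the first level listed by levels()), which is worth checking before judging statements A and B.

```r
# Sketch only: assumes the ISLR2 and survival packages are installed.
library(survival)
library(ISLR2)

# Cox model with all six predictors; rows with missing values are dropped.
fit.cox <- coxph(Surv(time, status) ~ sex + diagnosis + loc + ki + gtv + stereo,
                 data = BrainCancer)
cox.summary <- summary(fit.cox)  # hypothetical helper name

# Coefficient table: the "Pr(>|z|)" column holds the Wald p-values (MC1).
print(cox.summary$coefficients)

# Hazard ratios exp(coef) and their inverses exp(-coef), with 95% limits (MC2).
print(cox.summary$conf.int)

# The first level of each factor is the reference category.
print(levels(BrainCancer$sex))
print(levels(BrainCancer$diagnosis))
```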

  3. Finally, we plot survival curves for each value of ki, adjusting for the other predictors (!!!).
    1. Check unique values of ki and replace the value 40 by the value 60 (i.e. 40 becomes 60). Store the adjusted column ki back in the BrainCancer dataframe. (Hint: you can use the function unique() to check the unique values of ki).
    2. Create a dataframe with the values of the other predictors equal to the mean for quantitative variables, and the mode for factors. Also, sort the values of ki from low to high and order the columns as follows: ki, sex, diagnosis, loc, gtv, stereo. Store the dataframe in modeldata. You can use this function to get the mode of a factor:
      mode <- function(col) {
        return(names(which.max(table(col))))
      }
      
    3. Plot Kaplan-Meier curves for each of the five strata. Store the model in the variable fit.km2.
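
The three steps of question 3 can be sketched as follows. This is one possible reading, not a reference solution: it assumes the Cox model is refitted after ki is recoded, and that the adjusted curves in step 3 are obtained by passing modeldata to survfit() as newdata. The names `ki.values` and `n` are illustrative. Note that the provided mode() helper masks base R's mode() for the rest of the session.

```r
# Sketch only: assumes the ISLR2 and survival packages are installed.
library(survival)
library(ISLR2)

# Helper from the exercise text: most frequent level of a factor.
mode <- function(col) {
  return(names(which.max(table(col))))
}

# Step 1: inspect ki, then recode the value 40 as 60.
print(unique(BrainCancer$ki))
BrainCancer$ki[BrainCancer$ki == 40] <- 60

# Assumption: the Cox model is refitted on the recoded data.
fit.cox <- coxph(Surv(time, status) ~ sex + diagnosis + loc + ki + gtv + stereo,
                 data = BrainCancer)

# Step 2: one row per ki value (sorted low to high); the other predictors
# are fixed at their mean (gtv) or mode (factors), in the requested order.
ki.values <- sort(unique(BrainCancer$ki))
n <- length(ki.values)
modeldata <- data.frame(
  ki        = ki.values,
  sex       = rep(mode(BrainCancer$sex), n),
  diagnosis = rep(mode(BrainCancer$diagnosis), n),
  loc       = rep(mode(BrainCancer$loc), n),
  gtv       = rep(mean(BrainCancer$gtv), n),
  stereo    = rep(mode(BrainCancer$stereo), n)
)

# Step 3: one predicted survival curve per row of modeldata.
fit.km2 <- survfit(fit.cox, newdata = modeldata)
plot(fit.km2, col = seq_len(n),
     xlab = "Months", ylab = "Estimated probability of survival")
legend("bottomleft", legend = paste("ki =", ki.values),
       col = seq_len(n), lty = 1)
```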

Assume that: