This exercise uses the biopsy dataset from the MASS package, which contains breast tumour biopsy results for 699 patients. Each biopsy was scored on nine attributes, V1 through V9, each on a scale of 1 to 10. The attributes include clump thickness, uniformity of cell size and uniformity of cell shape; you can find a full description of them with the ?biopsy command. Additionally, the class variable records whether each patient's tumour was benign or malignant. Our goal in these exercises is to build a classification model that can accurately diagnose a patient's tumour as benign or malignant.

Let us start by dropping the ID column, since we will not need it for our analysis, and omitting the rows with NA values.

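library(MASS)
data(biopsy)
# Keep columns 2 to 11 (V1..V9 and class), which drops the ID column
biopsy <- biopsy[, 2:11]
# Remove every row that contains a missing value
biopsy <- na.omit(biopsy)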
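
Before moving on, it can help to confirm that the cleaning step did what we expect. The following is a minimal sketch, not part of the original exercise: it rebuilds the cleaned data under the hypothetical name cleaned (so that it runs on its own), then checks the dimensions, the absence of missing values and the class balance.

```r
library(MASS)

# Rebuild the cleaned data under a hypothetical name so this check is self-contained.
data(biopsy)
cleaned <- na.omit(biopsy[, 2:11])

# Rows are the complete cases; the 10 columns are V1..V9 plus class.
dim(cleaned)

# Count the remaining missing values; this should be zero after na.omit().
sum(is.na(cleaned))

# Class balance, worth knowing before judging a classifier's accuracy.
table(cleaned$class)
```

If the classes turn out to be unevenly represented, plain accuracy can look better than the model really is, so the class counts are a useful baseline to keep in mind.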

Question


Assume that: