This exercise will use the biopsy dataset from the MASS package.
This dataset contains biopsies of breast tumours for 699 patients.
Each patient was scored on nine attributes, V1 through V9, each on a scale of 1 to 10.
These attributes include clump thickness and the uniformity of cell size and shape.
You can find a full description of the attributes with the ?biopsy command.
Additionally, the class variable records whether each patient's tumour was benign or malignant.
Our goal in these exercises is to build a classification model that accurately diagnoses a patient's tumour as benign or malignant.
Let us start off by omitting the rows with NA values and dropping the ID column, since we will not need it for our analysis.
library(MASS)
data(biopsy)
biopsy <- biopsy[, 2:11]   # keep V1-V9 and class; drops the ID column (column 1)
biopsy <- na.omit(biopsy)  # remove rows that contain NA values
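A quick sanity check of the cleaned data can be sketched as follows. This is only an illustration of how one might confirm that the ID column and the NA rows are gone; the particular checks chosen here are an assumption, not part of the original exercise.

```r
library(MASS)
data(biopsy)

# Same preparation as above: keep V1-V9 and class, then drop incomplete rows
biopsy <- na.omit(biopsy[, 2:11])

dim(biopsy)          # number of rows and columns that remain
anyNA(biopsy)        # FALSE once every row with a missing value is removed
names(biopsy)        # should list V1 through V9 followed by class
table(biopsy$class)  # how many benign and malignant cases are left
```

If names(biopsy) still shows ID, or anyNA(biopsy) returns TRUE, the preparation step was not applied to the current copy of the data.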
Fit a model of class with the attributes V1 through V9 as predictors and store the model in glm.fit. Use the . notation to make your life easier!
Assume that the MASS library has been loaded and that the biopsy dataset has been loaded and attached.
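One possible way to approach this exercise is sketched below. It assumes a logistic regression fitted with glm() and family = binomial, which the name glm.fit suggests but the text does not state outright. The second model, glm.explicit, is a hypothetical name introduced here only to show what the . notation expands to.

```r
library(MASS)
data(biopsy)
biopsy <- na.omit(biopsy[, 2:11])

# "class ~ ." reads: class is the response, every other column in `data`
# is a predictor. With a factor response and family = binomial, the first
# level (benign) is treated as failure and the other (malignant) as success.
glm.fit <- glm(class ~ ., data = biopsy, family = binomial)

# The same model with all nine predictors written out by hand
glm.explicit <- glm(class ~ V1 + V2 + V3 + V4 + V5 + V6 + V7 + V8 + V9,
                    data = biopsy, family = binomial)

# Both formulas describe the same model, so the coefficients agree
all.equal(coef(glm.fit), coef(glm.explicit))

summary(glm.fit)
```

The . shorthand only works as intended because the ID column was dropped beforehand; otherwise ID would be pulled in as a predictor too.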