The MASS library contains the Boston data set, which records medv (median house value) for 506 neighborhoods around Boston. We will seek to predict medv using 13 predictors such as rm (average number of rooms per house), age (average age of houses), and lstat (percent of households with low socioeconomic status).
> head(Boston)
     crim zn indus chas   nox    rm  age
1 0.00632 18  2.31    0 0.538 6.575 65.2
2 0.02731  0  7.07    0 0.469 6.421 78.9
3 0.02729  0  7.07    0 0.469 7.185 61.1
4 0.03237  0  2.18    0 0.458 6.998 45.8
5 0.06905  0  2.18    0 0.458 7.147 54.2
6 0.02985  0  2.18    0 0.458 6.430 58.7
     dis rad tax ptratio  black lstat medv
1 4.0900   1 296    15.3 396.90  4.98 24.0
2 4.9671   2 242    17.8 396.90  9.14 21.6
3 4.9671   2 242    17.8 392.83  4.03 34.7
4 6.0622   3 222    18.7 394.63  2.94 33.4
5 6.0622   3 222    18.7 396.90  5.33 36.2
6 6.0622   3 222    18.7 394.12  5.21 28.7
> names(Boston)
 [1] "crim"    "zn"      "indus"   "chas"    "nox"     "rm"      "age"
 [8] "dis"     "rad"     "tax"     "ptratio" "black"   "lstat"   "medv"
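The counts quoted in the description (506 neighborhoods, 13 predictors plus the response medv) can be checked directly. The following is a minimal sketch, assuming the MASS package is installed; the variable name predictors is introduced here only for illustration.

```r
library(MASS)  # provides the Boston data frame

# Number of rows (neighborhoods) and columns (variables).
dim(Boston)

# Every column except the response medv is a candidate predictor.
# `predictors` is an illustrative name, not part of the exercise.
predictors <- setdiff(names(Boston), "medv")
length(predictors)

# Compact overview of each column's type and first few values.
str(Boston)
```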
To find out more about the data set, we can also type ?Boston.
(The ? lookup works here because Boston is a documented data set that ships with the MASS library. It will not work for a data set that has no help page, such as one you have read in from a file.)
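As a rough illustration of that difference, the sketch below looks up the help page for Boston and then inspects a data frame that has no documentation. It assumes the MASS package is installed; my_data and boston_help are hypothetical names made up for this example.

```r
library(MASS)

# help() is the function form of `?`. Assigning the result looks the
# page up without displaying it; an empty result means no page was found.
boston_help <- help("Boston", package = "MASS")
length(boston_help) > 0

# A data frame built in the session (hypothetical example data) has no
# help page, so inspect it directly instead.
my_data <- data.frame(x = 1:3, y = c(2.5, 3.1, 4.8))
str(my_data)
summary(my_data)
```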
Try running the head() and names() functions for yourself, and store the output in Boston.head and Boston.names respectively.
Note: Here, we ask you to store the output of these functions in variables, so you will not see the output in your console. However, we encourage you to look at the output as well, so you get a feel for what each function does.
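One way to both store and view the results is sketched below. It assumes the MASS library is available and uses the variable names requested in the exercise; it is one possible approach, not the only accepted one.

```r
library(MASS)

# Assignment is silent: nothing is printed to the console here.
Boston.head  <- head(Boston)    # first six rows of the data frame
Boston.names <- names(Boston)   # character vector of column names

# Print the stored objects explicitly to see what they contain.
print(Boston.head)
print(Boston.names)

# Wrapping an assignment in parentheses assigns and prints in one step.
(Boston.names <- names(Boston))
```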
Assume that:
- The MASS library has been loaded
- The Boston dataset has been loaded and attached