Load the dplyr package and the murders dataset.
library(dplyr)
library(dslabs)
data(murders)
You can add columns using the dplyr function mutate. This function is aware of the column names, and inside the function you can call them unquoted:
murders <- mutate(murders, population_in_millions = population / 10^6)
We can write population rather than murders$population. The function mutate knows we are grabbing columns from murders.
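For comparison, here is a minimal sketch on a small made-up dataframe (toy and its values are hypothetical and not part of dslabs). It suggests that referring to a column by its bare name inside mutate gives the same result as the base R approach with $.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(state = c("A", "B", "C"),
                  population = c(2500000, 500000, 10000000))

# dplyr: refer to the column by its bare name inside mutate
with_mutate <- mutate(toy, population_in_millions = population / 10^6)

# Base R equivalent: the dataframe must be named explicitly with $
with_dollar <- toy
with_dollar$population_in_millions <- toy$population / 10^6

# Compare the new column produced by the two approaches
all.equal(with_mutate$population_in_millions,
          with_dollar$population_in_millions)
```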
Pitfall
Like most functions in R, mutate does not modify its arguments (here, the dataframe you pass in); it returns a new dataframe. If you don't store the returned dataframe, or overwrite your existing dataframe as in the example above, the mutated dataframe will be lost.
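A short sketch of this pitfall, using a hypothetical one-column dataframe named toy (an assumption made for illustration): the first call computes the new column but discards it, and only the assigned call keeps it.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(x = c(1, 2, 3))

# The result is printed but not stored, so toy itself is unchanged
mutate(toy, doubled = x * 2)
cols_unassigned <- names(toy)   # still only "x"

# Assigning the result back to toy keeps the new column
toy <- mutate(toy, doubled = x * 2)
cols_assigned <- names(toy)     # now "x" and "doubled"
```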
Use the function mutate to add a column named rate to the murders dataframe. This column should contain the murder rate per 100,000 people. Store your result in murders_rate.
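The arithmetic behind a per 100,000 rate can be sketched on made-up numbers: divide the count by the population, then multiply by 100,000. The dataframe toy and its column events are hypothetical and chosen only for illustration; this is not the exercise solution itself.

```r
library(dplyr)

# Hypothetical counts and populations, used only for illustration
toy <- data.frame(events = c(10, 50),
                  population = c(1000000, 2000000))

# Rate per 100,000 people: count divided by population, scaled by 100,000
toy_rate <- mutate(toy, rate = events / population * 100000)
toy_rate$rate
```

In this toy data, 10 events among 1,000,000 people corresponds to a rate of 1 per 100,000, and 50 events among 2,000,000 people corresponds to 2.5 per 100,000.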
Use the function mutate to add another column named rank to the murders_rate dataframe. This column should contain the rank, from highest to lowest murder rate. Store your result in murders_rate_rank.
Hint
If rank(x) gives you the ranks of x from lowest to highest, rank(-x) gives you the ranks from highest to lowest.
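The hint can be checked on a small vector. The values below are hypothetical and have no ties, which keeps the ranks unambiguous.

```r
# Hypothetical values, used only for illustration
x <- c(31, 4, 15, 92)

ascending <- rank(x)     # 3 1 2 4: the smallest value gets rank 1
descending <- rank(-x)   # 2 4 3 1: the largest value gets rank 1

ascending
descending
```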
Use select to show the state names (state), rate (rate) and rank (rank) variables in murders_rate_rank. Store your result in murders (this will overwrite the original murders dataframe).
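As a rough sketch of how select behaves, here it is applied to a hypothetical dataframe named toy (made up for illustration): only the columns you name are kept, in the order you list them.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(state = c("A", "B"),
                  population = c(100, 200),
                  rate = c(1.5, 0.5))

# Keep only the named columns; population is dropped
toy_small <- select(toy, state, rate)
names(toy_small)
```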