Load the dplyr package and the murders dataset, which is provided by the dslabs package.

library(dplyr)
library(dslabs)
data(murders)

You can add columns with the dplyr function mutate. The function is aware of the dataframe's column names, so inside the call you can refer to them unquoted:

murders <- mutate(murders, population_in_millions = population / 10^6)

We can write population rather than murders$population because mutate knows to look for columns in murders, the dataframe passed as its first argument.
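
To make the difference concrete, here is a small sketch comparing mutate with the equivalent base R assignment. The dataframe toy and its values are made up for illustration; they are not taken from the murders data.

```r
library(dplyr)

# Hypothetical toy data (made-up values, not the real murders dataset)
toy <- data.frame(
  state = c("A", "B", "C"),
  population = c(1500000, 4000000, 250000)
)

# dplyr: inside mutate, the column is referenced by its bare name
with_mutate <- mutate(toy, population_in_millions = population / 10^6)

# Base R: the column has to be qualified with toy$
with_base <- toy
with_base$population_in_millions <- toy$population / 10^6

# Both approaches should produce the same new column
all.equal(with_mutate$population_in_millions,
          with_base$population_in_millions)
```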

Pitfall

Like most functions in R, mutate doesn’t modify its arguments (here, the given dataframe); it only returns a new dataframe. If you don’t store the returned dataframe under a new name, or overwrite your existing dataframe as in the example above, the result with the added column is lost.
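
The pitfall can be checked directly. This is a minimal sketch using a hypothetical dataframe toy with made-up values:

```r
library(dplyr)

# Hypothetical toy data for illustration only
toy <- data.frame(x = c(1, 2, 3))

# The result is printed but not stored, so toy itself is unchanged
mutate(toy, y = x * 2)
names(toy)

# Storing the result keeps the new column
toy2 <- mutate(toy, y = x * 2)
names(toy2)
```

After the first call, names(toy) still contains only x; the column y exists only in toy2.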

Exercise

  1. Use the function mutate to add a column named rate to the murders dataframe. This column should contain the murder rate per 100,000 people. Store your result in murders_rate.

  2. Use the function mutate to add another column named rank to the murders_rate dataframe. This column should contain the rank, from highest to lowest murder rate. Store your result in murders_rate_rank.

    Hint

    If rank(x) gives you the ranks of x from lowest to highest, rank(-x) gives you the ranks from highest to lowest.

  3. Use select to show the state names (state), rate (rate) and rank (rank) variables in murders_rate_rank. Store your result in murders (this will overwrite the original murders dataframe).
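
The hint about rank(-x) and the use of select can be tried out on a small example before tackling the exercise. The sketch below uses a hypothetical dataframe scores with made-up values and column names; it is not a solution to the exercise.

```r
library(dplyr)

# Hypothetical toy data (made-up values)
scores <- data.frame(
  name  = c("a", "b", "c", "d"),
  group = c("g1", "g1", "g2", "g2"),
  score = c(10, 40, 20, 30)
)

# rank(score) ranks from lowest to highest,
# rank(-score) ranks from highest to lowest
ranked <- mutate(scores,
                 rank_low_to_high = rank(score),
                 rank_high_to_low = rank(-score))

# select keeps only the listed columns, in the order given
picked <- select(ranked, name, rank_high_to_low)
picked
```

Here the highest score (40, row b) gets rank 1 in rank_high_to_low and rank 4 in rank_low_to_high.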