In ggplot2, an aesthetic is “something you can see”. Every layer has a specific set of aesthetics, and together they provide the information needed to draw the graph and all its components. A vector passed to an aesthetic must either have the same length as the number of rows in the given dataframe or have length 1. A vector of length 1 defines the aesthetic for all rows in the dataframe.
First, we need to understand that any aesthetic in ggplot2 (such as x, y, colour, size, shape, etc.) can be defined in two distinct ways in your plots: setting and mapping.
In most cases you will set an aesthetic when you want one constant value for ALL data entries (a vector of length one).
An aesthetic is set by passing it as a parameter to the ggplot (layer) functions.
murders %>% ggplot() +
geom_point(x=murders$population/10^6, y=murders$total)
If the plot does not show the expected points and axes, this is because no scales are defined. After defining the scales and axis labels, the expected plot appears:
murders %>% ggplot() +
geom_point(x=murders$population/10^6, y=murders$total) +
# add axis scales and labels
scale_y_continuous(limits= c(0,max(murders$total)), name = "total") +
scale_x_continuous(limits= c(0,max(murders$population/10^6)), name = "population/10^6")
We can also set the colour aesthetic to "blue":
murders %>% ggplot() +
geom_point(x=murders$population/10^6, y=murders$total, colour="blue") +
# add axis scales and labels
scale_y_continuous(limits= c(0,max(murders$total)), name = "total") +
scale_x_continuous(limits= c(0,max(murders$population/10^6)), name = "population/10^6")
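As a rough sketch of the length rule for set aesthetics, the example below uses a small made-up data frame, toy, as a hypothetical stand-in for murders (its values are invented for illustration). Here x and y are supplied through aes() so that the axes get scales automatically, while colour is set directly: once with one value per row and once with a single value. layer_data() is used only to inspect what the layer would draw.

```r
library(ggplot2)

# Hypothetical stand-in for the murders data (values are made up)
toy <- data.frame(
  population = c(1e6, 5e6, 2e7),
  total      = c(10, 120, 800)
)

# One colour per row: the vector length matches nrow(toy)
p_each <- ggplot(toy) +
  geom_point(aes(population / 10^6, total),
             colour = c("red", "green", "blue"))

# A single colour: the length-1 value is applied to every row
p_all <- ggplot(toy) +
  geom_point(aes(population / 10^6, total), colour = "blue")

# layer_data() returns the data frame the layer draws from
each_colours <- as.character(layer_data(p_each)$colour)
all_colours  <- as.character(layer_data(p_all)$colour)
```

A colour vector of any other length (for example two values for three rows) should be rejected by ggplot2 when the plot is built.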
In most cases you will map an aesthetic when you want every data entry to get its own aesthetic value.
To map aesthetics, define them inside the aes() function rather than directly in the ggplot (layer) function. The aes() function returns the aesthetic mapping, which can then be passed to the ggplot (layer) functions.
aes(x = murders$population/10^6, y = murders$total)
#> Aesthetic mapping:
#> * `x` -> `murders$population/10^6`
#> * `y` -> `murders$total`
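To see that aes() only describes a mapping, here is a minimal sketch that builds one without any data at hand. The column names population and total are taken from the text; the variable names mapping and mapped_aesthetics are only illustrative.

```r
library(ggplot2)

# aes() records the expressions; nothing is evaluated until a plot is built,
# so this works even though no `population` or `total` object exists here
mapping <- aes(x = population / 10^6, y = total)

# The mapping behaves like a named list with one entry per aesthetic
mapped_aesthetics <- names(mapping)
print(mapped_aesthetics)
#> [1] "x" "y"
```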
The aes function inside a ggplot (layer) function also uses the variable names from the given dataframe: we can use population and total without having to call them as murders$population and murders$total (same as dplyr functions: filter, mutate, select, etc.).
murders %>% ggplot() +
geom_point(aes(x = population/10^6, y = total))
We can drop the x= and y= since these are the first and second expected arguments; you can find this info in the help page.
murders %>% ggplot() +
geom_point(aes(population/10^6, total))
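The following sketch illustrates how expressions inside aes() are evaluated against the columns of the dataframe. It assumes a small made-up data frame, toy, as a hypothetical stand-in for murders, and uses layer_data() to look at the values the layer ends up with.

```r
library(ggplot2)

# Hypothetical stand-in for the murders data (values are made up)
toy <- data.frame(
  population = c(1e6, 5e6, 2e7),
  total      = c(10, 120, 800)
)

# Positional arguments: the first is x, the second is y
p <- ggplot(toy) +
  geom_point(aes(population / 10^6, total))

# The computed x values are the column divided by one million
built <- layer_data(p)
```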
If we map the colour aesthetic to "blue" inside the aes function, the geom_point layer interprets this as a categorical variable (not as the colour blue) and gives each point a colour depending on its category. The colour is chosen automatically; in this case the default is red. If there were more categories, ggplot would automatically assign a different colour to each category.
murders %>% ggplot() +
geom_point(aes(population/10^6, total, colour="blue"))
To fix this, we can instead set the colour aesthetic to “blue” by specifying it outside the aes function.
murders %>% ggplot() +
geom_point(aes(population/10^6, total), colour="blue")
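The difference between the two calls can be checked without looking at the plots. This is a minimal sketch, again assuming a made-up data frame toy in place of murders; it compares the colours each layer would draw with.

```r
library(ggplot2)

# Hypothetical stand-in for the murders data (values are made up)
toy <- data.frame(
  population = c(1e6, 5e6, 2e7),
  total      = c(10, 120, 800)
)

# Mapped: "blue" is treated as a one-level categorical variable
p_mapped <- ggplot(toy) +
  geom_point(aes(population / 10^6, total, colour = "blue"))

# Set: "blue" is used literally as the point colour
p_set <- ggplot(toy) +
  geom_point(aes(population / 10^6, total), colour = "blue")

mapped_colours <- unique(as.character(layer_data(p_mapped)$colour))
set_colours    <- unique(as.character(layer_data(p_set)$colour))
```

In the mapped case every point shares one colour picked by the default discrete colour scale rather than the literal "blue", and the plot also gains a legend with a single entry labelled blue.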
Although either can be used to define an aesthetic, setting and mapping each have advantages and disadvantages. The key is to combine these methods depending on your goals.
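One possible way to combine the two, sketched under the assumption of a made-up data frame toy with a hypothetical region column: colour is mapped so it varies with the data, while size is set to one constant for all points.

```r
library(ggplot2)

# Hypothetical stand-in for the murders data (values and regions are made up)
toy <- data.frame(
  population = c(1e6, 5e6, 2e7, 8e6),
  total      = c(10, 120, 800, 300),
  region     = c("South", "West", "South", "West")
)

# colour is mapped to a column; size is set to a single value
p <- ggplot(toy) +
  geom_point(aes(population / 10^6, total, colour = region), size = 3)

built <- layer_data(p)
n_colours <- length(unique(built$colour))  # one colour per region
sizes     <- unique(built$size)            # the same size everywhere
```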