In R, the most basic objects available to store data are vectors. As we have seen, complex datasets can usually be broken down into components that are vectors. For example, in a data frame, each column is a vector. Here we learn more about this important class.

Creating vectors

We can create vectors using the function c, which stands for concatenate. We use c to concatenate entries in the following way:

codes <- c(380, 124, 818)
codes
#> [1] 380 124 818
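As a minimal sketch of what "concatenate" means here, c can also join an existing vector with further entries. The names first_two and all_codes are made up for this illustration.

```r
# Build the same three codes in two steps.
first_two <- c(380, 124)        # hypothetical helper vector
all_codes <- c(first_two, 818)  # c() flattens its arguments into one vector

codes <- c(380, 124, 818)
identical(all_codes, codes)
#> [1] TRUE
length(all_codes)
#> [1] 3
```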

We can also create character vectors. We use the quotes to denote that the entries are characters rather than variable names.

country <- c("italy", "canada", "egypt")

In R you can also use single quotes:

country <- c('italy', 'canada', 'egypt')

But be careful not to confuse the single quote ' with the back quote `.
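A quick way to convince yourself that the two quoting styles are interchangeable is to compare the resulting vectors. The variable names double_quoted and single_quoted are assumptions made for this sketch.

```r
double_quoted <- c("italy", "canada", "egypt")
single_quoted <- c('italy', 'canada', 'egypt')

# Both calls produce the same character vector.
identical(double_quoted, single_quoted)
#> [1] TRUE
class(double_quoted)
#> [1] "character"
```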

By now you should know that if you type:

country <- c(italy, canada, egypt)

you receive an error because the variables italy, canada, and egypt are not defined: without the quotes, R looks for variables with those names.
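If you want to see that error without stopping a script, one option is to wrap the call in tryCatch. This is only a sketch: it assumes no variables named italy, canada, or egypt exist in your session, and the name msg is made up for the example.

```r
# Capture the error message instead of halting execution.
msg <- tryCatch(
  c(italy, canada, egypt),                  # unquoted: R looks for variables
  error = function(e) conditionMessage(e)   # return the error text as a string
)

is.character(msg)
#> [1] TRUE
```

Printing msg shows the text of the error, which names the first variable R could not find.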

Names

Sometimes it is useful to name the entries of a vector. For example, when defining a vector of country codes, we can use the names to connect the two:

codes <- c(italy = 380, canada = 124, egypt = 818)
codes
#>  italy canada  egypt
#>    380    124    818

The object codes continues to be a numeric vector:

class(codes)
#> [1] "numeric"

but with names:

names(codes)
#> [1] "italy"  "canada" "egypt"

If the use of strings without quotes looks confusing, know that you can use the quotes as well:

codes <- c("italy" = 380, "canada" = 124, "egypt" = 818)
codes
#>  italy canada  egypt
#>    380    124    818

There is no difference between this function call and the previous one. This is one of the many ways in which R is quirky compared to other languages.

We can also assign names using the names function:

codes <- c(380, 124, 818)
country <- c("italy","canada","egypt")
names(codes) <- country
codes
#>  italy canada  egypt
#>    380    124    818
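The two ways of naming entries can be checked against each other. The sketch below also shows that a vector has no names until we assign them; named_inline is a name introduced only for this comparison.

```r
codes <- c(380, 124, 818)
names(codes)                       # an unnamed vector has no names
#> NULL

country <- c("italy", "canada", "egypt")
names(codes) <- country            # attach names after the fact

named_inline <- c(italy = 380, canada = 124, egypt = 818)
identical(codes, named_inline)     # same values and same names
#> [1] TRUE
```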

Sequences

Another useful function for creating vectors generates sequences:

seq(1, 10)
#>  [1]  1  2  3  4  5  6  7  8  9 10

The first argument defines the start, and the second defines the end, which is included. The default is to go up in increments of 1, but a third argument sets the size of each jump:

seq(1, 10, 2)
#> [1] 1 3 5 7 9

If we want consecutive integers, we can use the following shorthand:

1:10
#>  [1]  1  2  3  4  5  6  7  8  9 10

When we call seq with only a start and an end, or use the colon shorthand, R produces integers, not numerics, because these sequences are typically used to index something:

class(1:10)
#> [1] "integer"

However, if we create a sequence including non-integers, the class changes:

class(seq(1, 10, 0.5))
#> [1] "numeric"
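To compare the cases side by side, here is a small sketch that stores three sequences and inspects them. The names ints, odds, and halves are made up for the example, and by is the name of the third argument of seq.

```r
ints   <- seq(1, 10)            # start and end only
odds   <- seq(1, 10, by = 2)    # same as seq(1, 10, 2)
halves <- seq(1, 10, by = 0.5)  # non-integer increment

identical(ints, 1:10)
#> [1] TRUE
class(ints)
#> [1] "integer"
class(odds)
#> [1] "numeric"
length(halves)
#> [1] 19
```

Note that supplying a numeric increment such as 2 already yields a numeric vector, even though every entry is a whole number.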

Subsetting

We use square brackets to access specific elements of a vector. For the vector codes we defined above, we can access the second element using:

codes[2]
#> canada
#>    124

You can get more than one entry by using a multi-entry vector as an index:

codes[c(1,3)]
#> italy egypt
#>   380   818

The sequences defined above are particularly useful if we want to access, say, the first two elements:

codes[1:2]
#>  italy canada
#>    380    124

If the elements have names, we can also access the entries using these names. Below are two examples.

codes["canada"]
#> canada
#>    124
codes[c("egypt","italy")]
#> egypt italy
#>   818   380
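As a brief sketch tying these together, indexing by position and indexing by name return the same result, and the order of the index vector determines the order of the output. The vector is redefined here so the example runs on its own.

```r
codes <- c(italy = 380, canada = 124, egypt = 818)

# Position 2 and the name "canada" refer to the same entry.
identical(codes[2], codes["canada"])
#> [1] TRUE

# The result follows the order of the index, not the order of the vector.
identical(codes[c(3, 1)], codes[c("egypt", "italy")])
#> [1] TRUE

# A sequence works like any other index vector.
identical(codes[1:2], codes[c(1, 2)])
#> [1] TRUE
```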