As you become more experienced, you will find yourself needing to perform the same operations over and over. A simple example is computing averages. We can compute the average of a vector x using the sum and length functions: sum(x)/length(x). Because we do this repeatedly, it is much more efficient to write a function that performs this operation. This particular operation is so common that someone already wrote the mean function and it is included in base R. However, you will encounter situations in which the function does not already exist, so R permits you to write your own. A simple version of a function that computes the average can be defined like this:

avg <- function(x){
  s <- sum(x)
  n <- length(x)
  return(s/n)
}

Now avg is a function that computes the mean:

x <- 1:100
identical(mean(x), avg(x))
#> [1] TRUE

Notice that variables defined inside a function are not saved in the workspace. So although avg uses s and n internally, these variables are created and modified only for the duration of the call. Here is an illustrative example:

s <- 3
avg(1:10)
#> [1] 5.5
s
#> [1] 3

Note how s is still 3 after we call avg.
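
The same point can be seen from the other direction: a variable created inside a function does not exist in the workspace once the call finishes. The following is a minimal sketch; scope_demo and local_total are hypothetical names introduced only for this illustration, and it assumes no object called local_total has been defined elsewhere in the session.

```r
# Hypothetical function used only to illustrate local variables
scope_demo <- function(x){
  local_total <- sum(x)  # created inside the call only
  local_total            # value of the last expression is returned
}

scope_demo(1:3)
#> [1] 6

# local_total was never created in the workspace
exists("local_total")
#> [1] FALSE
```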

In general, functions are objects, so we assign them to variable names with <-. The function function tells R you are about to define a function. The general form of a function definition looks like this:

my_function <- function(VARIABLE_NAME){
  # perform operations on VARIABLE_NAME and calculate VALUE
  return(VALUE)
}

The value passed to return is the value the function returns. If return is not called, R returns the value of the last expression evaluated in the function body.
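
To make the role of return concrete, here is a small sketch with two hypothetical functions, avg_explicit and avg_implicit (the names are illustrative and do not appear elsewhere in the text). The first calls return; the second simply ends with the expression to be computed. Both give the same result.

```r
# Hypothetical variants of the averaging function
avg_explicit <- function(x){
  return(sum(x)/length(x))  # value handed back through return
}

avg_implicit <- function(x){
  sum(x)/length(x)          # last evaluated expression is handed back
}

identical(avg_explicit(1:10), avg_implicit(1:10))
#> [1] TRUE
```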

The functions you define can have multiple arguments as well as default values. For example, we can define a function that computes either the arithmetic or the geometric average, depending on a user-supplied argument, like this:

avg <- function(x, arithmetic = TRUE){
  n <- length(x)
  # arithmetic is a single TRUE/FALSE, so a plain if/else is enough
  if (arithmetic) sum(x)/n else prod(x)^(1/n)
}
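
As a usage sketch, the calls below show the default being used and then overridden by name. The definition is repeated here, using a plain if/else, so that the block runs on its own. The input vector c(2, 8) is an arbitrary choice made for this illustration.

```r
avg <- function(x, arithmetic = TRUE){
  n <- length(x)
  if (arithmetic) sum(x)/n else prod(x)^(1/n)
}

x <- c(2, 8)

# arithmetic defaults to TRUE: (2 + 8)/2
avg(x)
#> [1] 5

# override the default by name: (2 * 8)^(1/2)
avg(x, arithmetic = FALSE)
#> [1] 4
```

Naming the argument in the call (arithmetic = FALSE) keeps the intent readable, which becomes more useful as a function gains more arguments.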

We will learn more about how to create functions through experience as we face more complex tasks.