As you become more experienced, you will find yourself needing to
perform the same operations over and over. A simple example is computing
averages. We can compute the average of a vector x using the sum and length functions: sum(x)/length(x). Because we do this repeatedly,
it is much more efficient to write a function that performs this
operation. This particular operation is so common that someone already
wrote the mean function and it is included in base R. However, you
will encounter situations in which the function does not already exist,
so R permits you to write your own. A simple version of a function that
computes the average can be defined like this:
avg <- function(x){
  s <- sum(x)
  n <- length(x)
  return(s/n)
}
Now avg is a function that computes the mean:
x <- 1:100
identical(mean(x), avg(x))
#> [1] TRUE
Notice that variables defined inside a function are not saved in the
workspace. So while we use s and n when we call avg, the values
are created and changed only during the call. Here is an illustrative
example:
s <- 3
avg(1:10)
#> [1] 5.5
s
#> [1] 3
Note how s is still 3 after we call avg.
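The same point can be checked from the other direction: n is created inside avg but never in the workspace. The sketch below assumes a fresh session in which no object named n has been defined. It uses the base R function exists, with inherits = FALSE so that only the current environment is searched.

```r
# avg as defined above, repeated so the example runs on its own
avg <- function(x){
  s <- sum(x)
  n <- length(x)
  return(s/n)
}

avg(1:10)
#> [1] 5.5

# n was created only during the call, so it is not in the workspace
# (assumes no object named n was defined earlier in the session)
exists("n", inherits = FALSE)
#> [1] FALSE
```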
In general, functions are objects, so we assign them to variable names
with <-. The keyword function tells R you are about to define a
function. The general form of a function definition looks like this:
my_function <- function(VARIABLE_NAME){
  # perform operations on VARIABLE_NAME and calculate VALUE
  return(VALUE)
}
The value passed to the return function is the value that the function returns.
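As a concrete instance of the general form, here is a minimal sketch. The function names spread and spread_implicit and the input vector are hypothetical, chosen only for illustration. The second version shows that when return is not called, R returns the value of the last evaluated expression, so both definitions give the same result.

```r
# Hypothetical example following the general form:
# the argument is x and the returned value is the spread of x
spread <- function(x){
  lo <- min(x)
  hi <- max(x)
  return(hi - lo)
}

# Same computation without return: the last evaluated expression is returned
spread_implicit <- function(x){
  max(x) - min(x)
}

spread(c(3, 8, 5))
#> [1] 5
identical(spread(c(3, 8, 5)), spread_implicit(c(3, 8, 5)))
#> [1] TRUE
```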
The functions you define can have multiple arguments as well as default values. For example, we can define a function that computes either the arithmetic or the geometric average, depending on a user-defined argument, like this:
avg <- function(x, arithmetic = TRUE){
  n <- length(x)
  ifelse(arithmetic, sum(x)/n, prod(x)^(1/n))
}
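To see the default value at work, the following sketch calls this version of avg both ways. The input vector c(1, 2, 4) is our own choice for illustration. The final check relies on the standard identity that, for positive values, the geometric mean equals exp(mean(log(x))); it is a sanity check and not part of the definition above.

```r
# Two-argument version from above, repeated so the example runs on its own
avg <- function(x, arithmetic = TRUE){
  n <- length(x)
  ifelse(arithmetic, sum(x)/n, prod(x)^(1/n))
}

x <- c(1, 2, 4)   # illustrative input (assumption)

# arithmetic defaults to TRUE, so this is the arithmetic mean: 7/3
avg(x)
#> [1] 2.333333

# Geometric mean: (1 * 2 * 4)^(1/3)
avg(x, arithmetic = FALSE)
#> [1] 2

# For positive values, the geometric mean also equals exp(mean(log(x)))
all.equal(avg(x, arithmetic = FALSE), exp(mean(log(x))))
#> [1] TRUE
```

Because arithmetic has a default, existing calls such as avg(x) keep working unchanged; only callers who want the geometric average need to supply the second argument.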
We will learn more about how to create functions through experience as we face more complex tasks.