Conditional expressions are one of the basic features of programming. They are used for what is called flow control. The most common conditional expression is the if-else statement. In R, we can actually perform quite a bit of data analysis without conditionals. However, they do come up occasionally, and you will need them once you start writing your own functions and packages.

Here is a very simple example showing the general structure of an if-else statement. The basic idea is to print the reciprocal of a unless a is 0:

a <- 0

if(a!=0){
  print(1/a)
} else{
  print("No reciprocal for 0.")
}
#> [1] "No reciprocal for 0."

Let’s look at one more example using the US murders data frame:

library(dslabs)
data(murders)
murder_rate <- murders$total / murders$population*100000

Here is a very simple example that tells us which states, if any, have a murder rate lower than 0.5 per 100,000. The if statement protects us from the case in which no state satisfies the condition.

ind <- which.min(murder_rate)

if(murder_rate[ind] < 0.5){
  print(murders$state[ind]) 
} else{
  print("No state has murder rate that low")
}
#> [1] "Vermont"

If we try it again with a rate of 0.25, we get a different answer:

if(murder_rate[ind] < 0.25){
  print(murders$state[ind]) 
} else{
  print("No state has a murder rate that low.")
}
#> [1] "No state has a murder rate that low."

A related function that is very useful is ifelse. This function takes three arguments: a logical and two possible answers. If the logical is TRUE, the value in the second argument is returned and if FALSE, the value in the third argument is returned. Here is an example:

a <- 0
ifelse(a > 0, 1/a, NA)
#> [1] NA

The function is particularly useful because it works on vectors. It examines each entry of the logical vector and returns elements from the vector provided in the second argument, if the entry is TRUE, or elements from the vector provided in the third argument, if the entry is FALSE.

a <- c(0, 1, 2, -4, 5)
result <- ifelse(a > 0, 1/a, NA)

This table helps us see what happened:

a is\_a\_positive answer1 answer2 result
0 FALSE Inf NA NA
1 TRUE 1.00 NA 1.0
2 TRUE 0.50 NA 0.5
\-4 FALSE \-0.25 NA NA
5 TRUE 0.20 NA 0.2

Here is an example of how this function can be readily used to replace all the missing values in a vector with zeros:

data(na_example)
no_nas <- ifelse(is.na(na_example), 0, na_example) 
sum(is.na(no_nas))
#> [1] 0

Two other useful functions are any and all. The any function takes a vector of logicals and returns TRUE if any of the entries is TRUE. The all function takes a vector of logicals and returns TRUE if all of the entries are TRUE. Here is an example:

z <- c(TRUE, TRUE, FALSE)
any(z)
#> [1] TRUE
all(z)
#> [1] FALSE