# 7 Relational and boolean operations

version
R 4.3.2

You’ve already been exposed to a few examples of relational and boolean operations in earlier chapters. A formal exploration of these techniques follows.

## 7.1 Relational operations

Relational operations play an important role in data manipulation. Any time you subset a dataset based on one or more criteria, you are making use of a relational operation. The relational operators (also known as logical binary operators) include `==`, `!=`, `<`, `<=`, `>` and `>=`. The output of a condition is a logical vector whose elements are `TRUE` or `FALSE`.

| Relational operator | Syntax | Example |
|---|---|---|
| Exact equality | `==` | `3 == 4` -> `FALSE` |
| Exact inequality | `!=` | `3 != 4` -> `TRUE` |
| Less than | `<` | `3 < 4` -> `TRUE` |
| Less than or equal | `<=` | `4 <= 4` -> `TRUE` |
| Greater than | `>` | `3 > 4` -> `FALSE` |
| Greater than or equal | `>=` | `4 >= 4` -> `TRUE` |
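Relational operators are vectorized: comparing a vector to a value returns one logical per element, and that logical vector can then be used to subset. The following is a minimal sketch, assuming a hypothetical numeric vector `temps` that is not part of the original examples:

```r
# Hypothetical vector used only for illustration
temps <- c(12, 25, 31, 8, 25)

# Element-wise comparison: one TRUE/FALSE per element
above20 <- temps > 20
above20
# [1] FALSE  TRUE  TRUE FALSE  TRUE

# Use the logical vector to keep only the matching elements
temps[above20]
# [1] 25 31 25

# Exact equality is evaluated element by element as well
temps == 25
# [1] FALSE  TRUE FALSE FALSE  TRUE
```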

## 7.2 Boolean operations

Boolean operations can be used to piece together multiple evaluations.

R has three boolean operators: the AND operator, `&`; the NOT operator, `!`; and the OR operator, `|`.

The `&` operator requires that the conditions on both sides of the boolean operator be satisfied. You would normally use this operator when addressing a condition along the lines of “`x` must be satisfied AND `y` must be satisfied”.

The `|` operator requires that at least one condition be met on either side of the boolean operator. You would normally use this operator when addressing a condition along the lines of “`x` must be satisfied OR `y` must be satisfied”. Note that the output will also be TRUE if both conditions are met.

The `!` operator is a negation operator. It will reverse the outcome of a condition. So if the outcome of an expression is `TRUE`, preceding that expression with `!` will reverse the outcome to `FALSE` and vice-versa.

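| Boolean operator | Syntax | Example | Outcome |
|---|---|---|---|
| AND | `&` | `4 == 3 & 1 == 1` | `FALSE` |
| | | `4 == 4 & 1 == 1` | `TRUE` |
| OR | `\|` | `4 == 4 \| 1 == 1` | `TRUE` |
| | | `4 == 3 \| 1 == 1` | `TRUE` |
| | | `4 == 3 \| 1 == 2` | `FALSE` |
| NOT | `!` | `!(4 == 3)` | `TRUE` |
| | | `!(4 == 4)` | `FALSE` |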

The following table breaks down all possible Boolean outcomes where `T` = `TRUE` and `F` = `FALSE`:

| Boolean operation | Outcome |
|---|---|
| `T & T` | `TRUE` |
| `T & F` | `FALSE` |
| `F & F` | `FALSE` |
| `T \| T` | `TRUE` |
| `T \| F` | `TRUE` |
| `F \| F` | `FALSE` |
| `!T` | `FALSE` |
| `!F` | `TRUE` |
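The same operators work element by element on vectors, which is how multiple conditions are typically pieced together when subsetting. A short sketch, assuming a hypothetical vector `x` introduced only for illustration:

```r
# Hypothetical vector used only for illustration
x <- c(2, 7, 10, 15, 21)

# AND: elements greater than 5 and less than 20
x[x > 5 & x < 20]
# [1]  7 10 15

# OR: elements smaller than 5 or larger than 20
x[x < 5 | x > 20]
# [1]  2 21

# NOT: reverse the outcome of the AND condition
x[!(x > 5 & x < 20)]
# [1]  2 21
```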

If the input values to a boolean operation are numeric vectors and not logical vectors, the numeric values will be interpreted as `FALSE` if zero and `TRUE` otherwise. For example:

``1 & 2``
``[1] TRUE``
``1 & 0``
``[1] FALSE``
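One way to see how a number will be interpreted by a boolean operator is to pass it to `as.logical()`. The values below are arbitrary and chosen only to illustrate the rule:

```r
# Arbitrary values chosen for illustration
vals <- c(-1, 0, 0.5, 2)

# Only zero is converted to FALSE; negative and fractional values are TRUE
as.logical(vals)
# [1]  TRUE FALSE  TRUE  TRUE

# Negating a numeric vector therefore flags the zeros
!vals
# [1] FALSE  TRUE FALSE FALSE
```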

### 7.2.1 Pecking order in operations

Note that the operation `a == (3 | 4)` is not the same as `(a == 3) | (a == 4)`. If, for example, `a = 3`, the former will return `FALSE` whereas the latter will return `TRUE`.

``````a <- 3
a == (3 | 4)``````
``[1] FALSE``
``(a == 3) | (a == 4)``
``[1] TRUE``

This is because R applies a pecking order to its operations. In the former case, R is first evaluating what is in between the parentheses, `(3 | 4)`.

``(3 | 4)``
``[1] TRUE``

This returns `TRUE` since the numbers on either side of `|` are converted to `TRUE` (only values of `0` are converted to `FALSE`). It then compares `a` to this logical vector `TRUE`.

``a == TRUE``
``[1] FALSE``

Here, the `==` operator requires that both sides of the operation be of the same data type, but `a` is numeric and `TRUE` is logical. Recall from Chapter 3 that R circumvents differences in data types by coercing all values to the highest common mode (see the chapter on data types). Here, the `numeric` type overrides the `logical` type, thus coercing the `TRUE` value to its `numeric` representation, `1`. Hence, the evaluation being performed is:

``a == 1``
``[1] FALSE``

When a vector is evaluated for more than one condition, you need to explicitly break down each condition before combining them with boolean operators.

``(a == 3) | (a == 4)``
``[1] TRUE``

The above is an example of R’s built-in operation precedence rules. For example, comparison operations such as `<=` and `>` are performed before boolean operations such that `a == 3 | 4` will first evaluate `a == 3` before evaluating `... | 4`.

Even boolean operations follow a pecking order such that `!` precedes `&` which precedes `|`. For example:

``! TRUE & FALSE | TRUE``

will first evaluate `! TRUE`, then `... & FALSE`, then `... | TRUE`.

To override R’s built-in precedence, use parentheses. For example:

``! TRUE & (FALSE | TRUE)``

will first evaluate `(FALSE | TRUE)` and `! TRUE` separately, then their output will be combined with `... & ...`.
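Running the two expressions side by side shows that the parentheses change the outcome and not just the order of evaluation. The variable names below are hypothetical and used only to hold the results:

```r
# Default precedence: evaluated as ((!TRUE) & FALSE) | TRUE
default_order <- !TRUE & FALSE | TRUE
default_order
# [1] TRUE

# Parentheses force the OR to be evaluated first: (!TRUE) & (FALSE | TRUE)
forced_order <- !TRUE & (FALSE | TRUE)
forced_order
# [1] FALSE
```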

For a full list of operation precedence, access the help page for `Syntax`.

``?Syntax``

The following lists the pecking order from high to low precedence (i.e. top operation is performed before bottom operation).

| Operator | Description |
|---|---|
| `::` `:::` | access variables in a namespace |
| `$` `@` | component / slot extraction |
| `[` `[[` | indexing |
| `^` | exponentiation (right to left) |
| `-` `+` | unary minus and plus |
| `:` | sequence operator |
| `%any%` `\|>` | special operators (including `%%` and `%/%`) |
| `*` `/` | multiply, divide |
| `+` `-` | (binary) add, subtract |
| `<` `>` `<=` `>=` `==` `!=` | ordering and comparison |
| `!` | negation |
| `&` `&&` | and |
| `\|` `\|\|` | or |
| `~` | as in formulae |
| `->` `->>` | rightwards assignment |
| `<-` `<<-` | assignment (right to left) |
| `=` | assignment (right to left) |
| `?` | help |
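A few one-line expressions can help confirm how this ordering plays out in practice. The expressions below are illustrative only; each comment shows the grouping R applies:

```r
-2^2        # exponentiation before unary minus: -(2^2)
# [1] -4

-1:3        # unary minus before the sequence operator: (-1):3
# [1] -1  0  1  2  3

1:3 * 2     # sequence operator before multiplication: (1:3) * 2
# [1] 2 4 6

!3 == 4     # comparison before negation: !(3 == 4)
# [1] TRUE
```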

## 7.3 Comparing multidimensional objects

The relational operators are used to compare single elements (i.e. one element at a time). If you want to compare two objects as a whole (e.g. multi-element vectors or data frames), use the `identical()` function. For example:

``````a <- c(1, 5, 6, 10)
b <- c(1, 5, 6)
identical(a, a)``````
``[1] TRUE``
``identical(a, b)``
``[1] FALSE``
``identical(mtcars, mtcars)``
``[1] TRUE``

Notice that `identical` returns a single logical value, regardless of the input objects’ dimensions.

Note that the data structure must match as well as its element values. For example, if `d` is a list and `a` is an atomic vector, the output of `identical` will be false even if the internal values match.

``````d <- list( c(1, 5, 6, 10) )
identical(a, d)``````
``[1] FALSE``

If we convert `d` from a list to an atomic vector using the `unlist` function (thus matching data structures), we get:

``identical(a, unlist(d))``
``[1] TRUE``
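To make the contrast with `==` concrete, the sketch below compares the element-wise result with the single value returned by `identical()`. The vector `a2` is hypothetical and differs from `a` only in its last element:

```r
a  <- c(1, 5, 6, 10)
a2 <- c(1, 5, 6, 11)   # hypothetical vector, last element differs

a == a2                # element-wise: one result per element
# [1]  TRUE  TRUE  TRUE FALSE

all(a == a2)           # collapse the element-wise result to a single value
# [1] FALSE

identical(a, a2)       # compares the two objects as a whole
# [1] FALSE

# identical() is also strict about storage type (integer vs. double),
# whereas == coerces both sides before comparing
identical(1L, 1)
# [1] FALSE
1L == 1
# [1] TRUE
```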

## 7.4 The match operator `%in%`

The match operator `%in%` compares two vectors and assesses whether each element on the left-hand side of `%in%` matches any of the elements on the right-hand side of the operator. For each element in the left-hand vector, R returns `TRUE` if the value is present among the right-hand side elements or `FALSE` if not.

For example, given the following vectors:

``````v1 <- c( "a", "b", "cd", "fe")
v2 <- c( "b", "e")``````

find the elements in `v1` that match any of the values in `v2`.

``v1 %in% v2``
``[1] FALSE  TRUE FALSE FALSE``

The function checks whether each element in `v1` has a matching value in `v2`. For example, element `"a"` in `v1` is compared to elements `"b"` and `"e"` in `v2`. No matches are found and a `FALSE` is returned. The next element in `v1`, `"b"`, is compared to both elements in `v2`. This time, there is a match (`v2` has an element `"b"`) and `TRUE` is returned. This process is repeated for all elements in `v1`.

The logical vector output has the same length as the input vector `v1` (four in this example).

If we swap the vector objects, we get a two-element logical vector since we are now comparing each element in `v2` to any matching elements in `v1`.

``v2 %in% v1``
``[1]  TRUE FALSE``
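Because `%in%` returns a logical vector, it can be used directly for subsetting, and it offers a compact alternative to chaining several `==` comparisons with `|`. A brief sketch reusing the vectors above, along with the value `a <- 3` from the precedence section:

```r
v1 <- c("a", "b", "cd", "fe")
v2 <- c("b", "e")

# Keep only the elements of v1 that appear in v2
v1[v1 %in% v2]
# [1] "b"

# Testing a value against several candidates
a <- 3
a %in% c(3, 4)
# [1] TRUE

# Equivalent, but more verbose
(a == 3) | (a == 4)
# [1] TRUE
```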

## 7.5 Checking if a value is `NA`

When assessing if a value is equal to `NA` the following evaluation may behave unexpectedly.

``````a <- c (3, 67, 4, NA, 10)
a == NA``````
``[1] NA NA NA NA NA``

The output is a vector of `NA`s rather than the `TRUE`/`FALSE` values we would expect from an evaluation, because any comparison with `NA` returns `NA`. Instead, you must make use of the `is.na()` function:

``is.na(a)``
``[1] FALSE FALSE FALSE  TRUE FALSE``
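The logical vector returned by `is.na()` can be counted or used for subsetting like any other. A short sketch using the same vector `a`:

```r
a <- c(3, 67, 4, NA, 10)

NA == NA            # even comparing NA to itself returns NA
# [1] NA

sum(is.na(a))       # number of missing values (TRUE counts as 1)
# [1] 1

a[!is.na(a)]        # keep only the non-missing values
# [1]  3 67  4 10
```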

As another example, if we want to keep all rows in dataframe `d` where `z` = `NA`, we would type:

``````d <- data.frame(x = c(1,4,2,5,2,3,NA),
y = c(3,2,5,3,8,1,1),
z = c(NA,NA,4,9,7,8,3))

d[ is.na(d$z), ]``````
``````  x y  z
1 1 3 NA
2 4 2 NA``````

You can, of course, use the `!` operator to reverse the evaluation and omit all rows where `z` = `NA`:

``d[ !is.na(d$z), ]``
``````   x y z
3  2 5 4
4  5 3 9
5  2 8 7
6  3 1 8
7 NA 1 3``````
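Row 7 is retained in the last output because only `z` was checked, even though `x` is missing there. To drop rows with a missing value in more than one column, the `is.na()` checks can be combined with `&`. As a sketch of one alternative, the `complete.cases()` function (from R's `stats` package, which is loaded by default) flags rows that have no `NA` in any column:

```r
d <- data.frame(x = c(1,4,2,5,2,3,NA),
                y = c(3,2,5,3,8,1,1),
                z = c(NA,NA,4,9,7,8,3))

# Keep rows where neither x nor z is missing
d[!is.na(d$x) & !is.na(d$z), ]

# Same result here: keep rows with no NA in any column
d[complete.cases(d), ]
#   x y z
# 3 2 5 4
# 4 5 3 9
# 5 2 8 7
# 6 3 1 8
```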