Flow control
In this section, we will review flow-control statements that you can use when programming with R to simplify repetitive tasks and make your code more legible. Programming with R involves putting together instructions that the computer will execute to fulfill a certain task. As you have noticed thus far, R commands consist mainly of expressions or functions to be evaluated. Most programs are repetitive and depend on user input prior to executing a task. Flow-control statements are particularly important in this process because they allow you to tell the computer how many times an expression is to be repeated or when a statement is to be executed. In the rest of this chapter, we will go through flow-control statements and tips that you can use to write and debug your own programs.
The for() loop
The for(i in vector){commands} statement allows you to repeat the code written in braces {} once for each element (i) of the vector given in parentheses.
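As a minimal sketch of this syntax (the vector name squares is made up for illustration), the following loop visits each value of 1:5 in turn and stores its square:

```r
# Hypothetical example: preallocate a vector, then fill it one element at a time
squares <- numeric(5)
for (i in 1:5) {
  squares[i] <- i^2  # the body runs once for each element i of 1:5
}
print(squares)  # squares now holds 1 4 9 16 25
```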
You can use for() loops to evaluate mathematical expressions. For example, the Fibonacci sequence is defined as a series of numbers in which each number is the sum of the two preceding numbers. We can get the first 15 numbers that make up the Fibonacci sequence starting from (1, 1), using the following code:
> # First we create a numeric vector with 15 elements to store the data generated.
> Fibonacci <- numeric(15)
> Fibonacci
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Next, we need to write down the code that will allow us to generate the Fibonacci sequence. If the first two elements of the sequence are (1, 1) and every subsequent number is the sum of the two preceding numbers, then the third element is 1 + 1 = 2 and the fourth element is 1 + 2 = 3, and so on.
So, let's add the first two elements of the Fibonacci sequence to our Fibonacci vector as shown:
> Fibonacci[1:2] <- c(1,1)
Next, let's create a for() loop that, for each i from i=3 to i=15 (the length of the Fibonacci numeric vector we initially created), stores at position i the sum of the two preceding numbers indexed at i-2 and i-1:
> for(i in 3:length(Fibonacci)){Fibonacci[i] <- Fibonacci[i-2] + Fibonacci[i-1]}
> Fibonacci
 [1]   1   1   2   3   5   8  13  21  34  55  89 144 233 377 610
In this example, the vector evaluated by the for() loop is 3:length(Fibonacci), but we could have also expressed the vector as c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15) or seq(3, 15, by=1). To simplify our code, we can create a separate vector to store the sequence and then write our for() loop as follows:
> Fibonacci_terms <- seq(3, 15, by=1)
> for(i in Fibonacci_terms){Fibonacci[i] <- Fibonacci[i-2] + Fibonacci[i-1]}
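One caveat with an index such as 3:length(Fibonacci) is that, if the vector had fewer than three elements, 3:2 would count down (3, 2) instead of producing an empty sequence. The following is only a sketch of one way to guard against this, wrapped in a helper whose name (fib_sequence) is hypothetical and not part of the original example; it builds the index with seq_len(), which returns an empty vector when its argument is 0:

```r
# Hypothetical helper: returns the first n Fibonacci numbers starting from (1, 1)
fib_sequence <- function(n) {
  fib <- numeric(n)
  fib[seq_len(min(n, 2))] <- 1  # seed the first one or two elements
  # seq_len(max(n - 2, 0)) + 2 gives 3, 4, ..., n, or an empty vector when n < 3
  for (i in seq_len(max(n - 2, 0)) + 2) {
    fib[i] <- fib[i - 2] + fib[i - 1]
  }
  fib
}

fib_sequence(15)  # same 15 values as the Fibonacci vector above
fib_sequence(2)   # the loop body never runs; only the two seed values are returned
```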
You don't always have to use a numeric or integer vector when writing for() loops. For example, you can loop over a character vector to append its strings to another vector as follows:
> fruits <- c("apple", "pear", "grapes")
> other_fruits <- c("banana", "lemon")
> for (i in fruits){other_fruits <- c(other_fruits, i)} #appends fruits to other_fruits vector
> other_fruits
[1] "banana" "lemon"  "apple"  "pear"   "grapes"
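For comparison, here is a short sketch of two related idioms. Because c() accepts whole vectors, the same result can be obtained without a loop, and seq_along() can be used when the position of each element is needed as well as its value. The name all_fruits is introduced here purely for illustration:

```r
fruits <- c("apple", "pear", "grapes")
other_fruits <- c("banana", "lemon")

# c() concatenates whole vectors, so no loop is needed to append fruits
all_fruits <- c(other_fruits, fruits)

# Looping over positions gives access to both the index and the value
for (i in seq_along(fruits)) {
  cat(i, ":", fruits[i], "\n")
}
```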
The apply() function
A good alternative to the for() loop is the apply() function, which allows you to apply a function to a matrix or array by row, column, or both. For example, let's calculate the mean of a matrix by row using the apply() function. First, let's create a matrix as follows:
> m1 <- matrix(1:12, nrow=3)
> m1
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
The second argument of the apply() function is MARGIN, which allows you to specify whether the function should be applied by row with 1, by column with 2, or both with c(1,2). Since we want to calculate the mean by row, we will use 1 for MARGIN, as follows:
> meanByrow <- apply(m1, 1, mean)
> meanByrow
[1] 5.5 6.5 7.5
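To illustrate the other value of MARGIN, the sketch below applies mean() by column to the same matrix. The object name meanByCol_m1 is made up for this illustration. Base R also provides rowMeans() and colMeans() for this specific calculation, and the last line checks that rowMeans() agrees with the apply() call used above:

```r
m1 <- matrix(1:12, nrow = 3)

# MARGIN = 2 applies mean() to each of the four columns
meanByCol_m1 <- apply(m1, 2, mean)
meanByCol_m1  # column means: 2 5 8 11

# rowMeans() computes the same values as apply(m1, 1, mean)
all.equal(apply(m1, 1, mean), rowMeans(m1))
```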
The third argument of the apply() function is FUN, which refers to the function to be applied to the matrix. In our last example, we used the mean() function. However, you can use any function, including those you write yourself. For example, let's apply a function that adds 3 (x+3) to each value in the matrix as follows:
> # Notice there is no comma between function(x) and x+3 when defining the function in apply()
> m1plus3 <- apply(m1, c(1,2), function(x) x+3)
> m1plus3
     [,1] [,2] [,3] [,4]
[1,]    4    7   10   13
[2,]    5    8   11   14
[3,]    6    9   12   15
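The function passed as FUN does not have to be defined inline. As a small sketch, the helper below (spread, a hypothetical name) is defined first and then applied by row; each row of m1 spans the same range, so every result is 9:

```r
m1 <- matrix(1:12, nrow = 3)

# Hypothetical helper: difference between the largest and smallest value
spread <- function(x) max(x) - min(x)

# Apply the named function to each row (MARGIN = 1)
spreadByRow <- apply(m1, 1, spread)
spreadByRow  # 9 9 9
```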
In the event that you want to specify arguments of a function, you just need to add them after the function. For example, let's say you want to apply the mean function by column to a second matrix, but this time specifying the na.rm argument as TRUE instead of the default (FALSE). Let's take a look at that in the following example:
> z <- c(1, 4, 5, NA, 9, 8, 3, NA)
> m2 <- matrix(z, nrow=4)
> m2
     [,1] [,2]
[1,]    1    9
[2,]    4    8
[3,]    5    3
[4,]   NA   NA
> # Notice you need to separate the argument from its function with a comma
> meanByColumn <- apply(m2, 2, mean, na.rm=TRUE)
> meanByColumn
[1] 3.333333 6.666667
The if() statement
The if(condition){commands} statement allows you to evaluate a condition; if it returns TRUE, the code in braces is executed. You can add an else {commands} clause to your if() statement to execute a different block of code when the condition returns FALSE. The following example uses if() on its own:
> x <- 4
> # we indent our code to make it more legible
> if(x < 10) {
    x <- x+4
    print(x)
  }
[1] 8
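As a minimal sketch of the else clause described above, the example below starts from a value for which the condition is FALSE, so only the else block runs. Note that at the top level of a script or the console, else needs to appear on the same line as the closing brace of the if() block; otherwise R treats the if() statement as complete before it reads else:

```r
x <- 12
if (x < 10) {
  x <- x + 4
} else {
  # runs only when the condition x < 10 is FALSE
  x <- x - 4
}
print(x)  # x is now 8
```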
If you have several conditions to test before running an else {} statement, you can use an else if(condition){commands} statement as follows:
> x <- 1
> if(x == 2) {
    x <- x+4
    print("X is equal to 2, so I added 4 to it.")
  } else if (x > 2) {
    print("X is greater than 2, so I did nothing to it.")
  } else {
    x <- x-4
    print("X is not greater than or equal to 2, so I subtracted 4 from it.")
  }
[1] "X is not greater than or equal to 2, so I subtracted 4 from it."
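The conditions in an if / else if / else chain are tested in order, and only the block belonging to the first condition that returns TRUE is executed. The sketch below wraps such a chain in a small helper (describe_number, a hypothetical name) so that the same test can be reused for different values:

```r
# Hypothetical helper that labels a single number
describe_number <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}

describe_number(-3)
describe_number(0)
describe_number(7)
```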
The while() loop
The while(condition){commands} statement allows you to repeat a block of code for as long as the condition in parentheses returns TRUE; the loop stops once the condition returns FALSE. If we look back at our Fibonacci sequence example, we could have written our program using a while() loop instead, as follows:
First, we create two objects to store the first and second number of the Fibonacci sequence:
> num1 <- 1
> num2 <- 1
Then, we create a numeric vector to contain the first two numbers of the Fibonacci sequence:
> Fibonacci <- c(num1, num2)
Next, we create a count object to store the number of elements added to the Fibonacci vector. We start the count at 2 since the first two numbers have already been added to the Fibonacci vector. We then build the rest of the sequence with a while() loop as follows:
> count <- 2 # set count to start from 2
> while(count < 15) {
    # We update the count number so that we can track the number of times the loop is repeated.
    count <- count + 1
    # Next we make sure to store the 2nd number in a new object before it is overwritten.
    oldnum2 <- num2
    # Then we calculate the next number in the Fibonacci sequence.
    num2 <- num1 + num2
    # Then we update the Fibonacci vector with the 2nd number each time the loop is repeated.
    Fibonacci <- c(Fibonacci, num2)
    # Lastly, we assign the 2nd number as the new first number to use in the next iteration of the loop.
    num1 <- oldnum2
  }
> Fibonacci
 [1]   1   1   2   3   5   8  13  21  34  55  89 144 233 377 610
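A while() loop is particularly convenient when the number of iterations is not known in advance. As a sketch under that assumption, the example below keeps extending the sequence until the next term would reach a cutoff; the names limit, fib and n, and the cutoff of 100, are chosen only for illustration:

```r
limit <- 100   # hypothetical cutoff
fib <- c(1, 1)
n <- length(fib)

# Keep going while the next term would still be below the cutoff
while (fib[n - 1] + fib[n] < limit) {
  fib <- c(fib, fib[n - 1] + fib[n])
  n <- n + 1
}
fib  # 1 1 2 3 5 8 13 21 34 55 89
```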
The repeat{} and break statements
The repeat{commands} statement is similar to the while() loop except that you do not need to set a condition to test, and your code is repeated endlessly unless you include a break statement. Typically, a repeat{} statement includes an if(condition) break line, but this is not required. The break statement causes the loop to terminate immediately.
If we go back to our Fibonacci example, we could have written the code as follows:
> num1 <- 1
> num2 <- 1
> Fibonacci <- c(num1, num2)
> count <- 2
> repeat {
    count <- count + 1
    oldnum2 <- num2
    num2 <- num1 + num2
    Fibonacci <- c(Fibonacci, num2)
    num1 <- oldnum2
    if (count >= 15) {
      break
    }
  }
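Because break can appear anywhere in the body, repeat{} also suits cases where the stopping test depends on a value computed inside the loop. The following sketch searches for the first Fibonacci number greater than 1000; the threshold and the name nextnum are assumptions made for this illustration:

```r
num1 <- 1
num2 <- 1
repeat {
  nextnum <- num1 + num2  # next term of the sequence
  num1 <- num2
  num2 <- nextnum
  if (num2 > 1000) {
    break  # leave the loop as soon as the threshold is crossed
  }
}
num2  # 1597, the first Fibonacci number above 1000
```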