General programming and debugging tools
Since this chapter is meant to review R programming, I will not go into too much detail on how to write a program step by step, but I will present some general advice on how to write a successful program.
First, it is essential that you understand the problem, because R will only do what you tell it to do. If you don't have a clear picture of the problem, sit down and work out what you want your program to do, and think about which R tools and packages are available to help you with the task. Once you've explored the functions and packages available, simplify the problem by writing down the general steps and functions you can use to solve it, and then translate those general ideas into a detailed implementation.
A good strategy to adopt when working on a detailed implementation is the "top-down" design approach, which consists of writing the whole program in a couple of steps, as you would an essay outline. Then expand each step with additional key steps, and keep expanding until you have a full program. To save time and make your code more legible, I would suggest breaking each of your key steps into functions, and then running and checking each function as you go. As a general rule of thumb, if a function starts to get really long, that is, dozens of lines, I would suggest thinking of ways to break it down into smaller functions or "subfunctions", in the same way you would break down really long paragraphs into smaller ones when writing an essay.
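As an illustration of the top-down approach, the sketch below first writes a small program as an outline of three steps and then fills in each step as its own function. The task (summarizing a numeric vector) and all of the function names are hypothetical, chosen only to show the shape of the design.

```r
# Top level: the whole program written as an outline of steps.
summarizeMeasurements <- function(values) {
  cleaned <- removeMissing(values)   # Step 1: clean the data
  stats <- computeStats(cleaned)     # Step 2: compute the summaries
  formatReport(stats)                # Step 3: present the result
}

# Step 1 expanded: drop missing values.
removeMissing <- function(values) {
  values[!is.na(values)]
}

# Step 2 expanded: compute the count, mean, and standard deviation.
computeStats <- function(values) {
  list(n = length(values), mean = mean(values), sd = sd(values))
}

# Step 3 expanded: turn the summaries into a single line of text.
formatReport <- function(stats) {
  sprintf("n = %d, mean = %.2f, sd = %.2f", stats$n, stats$mean, stats$sd)
}

# Each small function can be run and checked on its own...
removeMissing(c(2, 6, NA))

# ...before running the whole program.
summarizeMeasurements(c(2, 6, 7, 12, NA, NA))
```

Because each step is a short function, a wrong result can be traced to one step by calling that function directly with a small input.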
The beauty of programming resides in the ability to write and reuse functions in several programs. By writing generic functions that fulfill specific tasks, you can reuse that code in another program by simply executing the following code:
> source("someOtherfunctions.R")
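A self-contained sketch of this workflow follows. It writes a helper function to a temporary file so that the example runs anywhere; in practice the file would be one you maintain yourself. The helper squareAll() and the variable helper.file are hypothetical names used only for illustration.

```r
# Create a stand-in script file that holds one reusable function.
helper.file <- tempfile(fileext = ".R")
writeLines("squareAll <- function(v) v^2", helper.file)

# source() evaluates the file, which makes squareAll() available
# in the current session.
source(helper.file)
squareAll(c(1, 2, 3))
```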
The trickiest part of programming is finding and solving errors (debugging). The following is a list of some generic steps you can take when trying to solve a bug:
- Recognize that your program has a bug. This can be easy when you get an error or warning message but harder when you get an output that is not the output expected or the true answer to your problem.
- Make the bug reproducible. It is easier to fix a bug that you know how to trigger.
- Identify the cause of the bug. For example, this can be a variable not updating the way you wanted it to in a function, or a condition statement that can never return TRUE as written. Other common causes of error for beginners include testing for a match (equality) by writing if(x = 12) instead of if(x == 12), or the inability of your code to deal with missing data (NA values).
- Fix the error in your code and test whether you successfully fixed it.
- Look for similar errors elsewhere in your code.
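The two beginner errors mentioned in the list above can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative, and tryCatch() is used only so that the script keeps running after each error.

```r
x <- 12

# Writing if (x = 12) is a syntax error in R. Parsing the text inside
# tryCatch() demonstrates this without stopping the script.
bad <- tryCatch(parse(text = "if (x = 12) print('match')"),
                error = function(e) "syntax error")

# The equality test uses two equal signs.
if (x == 12) print("match")

# Missing data: comparing NA with a number yields NA, and if() cannot
# decide between TRUE and FALSE, so it signals an error.
y <- NA
na.result <- tryCatch(if (y == 12) "match" else "no match",
                      error = function(e) conditionMessage(e))

# Checking for NA first avoids the error.
safe.result <- if (!is.na(y) && y == 12) "match" else "no match"
```

Here bad holds the string "syntax error", na.result holds the error message raised by if(), and safe.result is "no match".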
Tip
One trick you can use to help you tease out the cause of your error message is the traceback() function. For example, when we tried to run vectorContains(x), we got the error message "This function takes a numeric vector as input." If someone wanted to see where the error message was coming from, they could run traceback() and get the location as follows:
> traceback()
2: stop("This function takes a numeric vector as input.") at #38
1: vectorContains(x)
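traceback() is meant to be run at the console after an uncaught error. Inside a script, a comparable record of the active calls can be captured with sys.calls() in a calling handler. Note that this is a different technique from traceback() itself, and the functions checkNumeric() and outerCall() are hypothetical stand-ins, not the chapter's vectorContains().

```r
# Hypothetical stand-in that fails on non-numeric input.
checkNumeric <- function(v1) {
  if (!is.numeric(v1)) {
    stop("This function takes a numeric vector as input.")
  }
  TRUE
}

# A wrapper, so that the call stack has more than one level.
outerCall <- function(v) checkNumeric(v)

call.stack <- NULL
result <- tryCatch(
  withCallingHandlers(
    outerCall("a"),
    error = function(e) {
      # Record the calls that were active when the error was signaled.
      call.stack <<- sys.calls()
    }
  ),
  # Return the error message instead of stopping the script.
  error = function(e) conditionMessage(e)
)

result       # the error message
call.stack   # the recorded calls, outermost first
```

The recorded stack includes the calls to outerCall() and checkNumeric(), along with the handler machinery that R adds around them.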
Other useful functions include the browser() and debug() functions. The browser() function allows you to pause the execution of your function, examine or change local variables, and even execute other R commands. Let's inspect the vectorContains() function we wrote earlier with the browser() function as follows:
> x <- c(2, 6, 7, 12, NA, NA)
> browser() # We have now entered the Browser mode.
Browse[1]> vectorContains(x)
Error in vectorContains(x) : This function takes a numeric vector without NAs as input.
Browse[1]> x <- c(1, 2, 3)
Browse[1]> vectorContains(x)
[1] TRUE
Browse[1]> Q # To quit browser()
Note
Note that the variable x we changed in browser mode was stored in our workspace. So if we enter x after we quit, the values stored in browser mode will be returned, as follows:
> x
[1] 1 2 3
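browser() can also be placed inside a function body, so that execution pauses at that point and the function's local variables can be inspected. The following is a minimal sketch under stated assumptions: scaleValues() and its debug.mode argument are hypothetical, and the interactive() check is there only so that the pause never happens when the code runs as a script.

```r
# Hypothetical function; debug.mode is an illustrative argument.
scaleValues <- function(v, debug.mode = FALSE) {
  centered <- v - mean(v)
  # Pause here only when requested and only in an interactive session,
  # so that local variables such as 'centered' can be examined.
  if (debug.mode && interactive()) browser()
  centered / sd(v)
}

# Runs straight through, without pausing.
scaleValues(c(1, 2, 3))

# At the console, this call would stop at the Browse prompt:
# scaleValues(c(1, 2, 3), debug.mode = TRUE)
```

When browser() is called inside a function like this, assignments made at the Browse prompt change the function's local variables rather than the workspace.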
When we flag a function with debug(), calling that function also puts us in browser mode. This allows us to execute a single line of code at a time by entering n for next, continue to run the function by entering c, or quit the function by entering Q, as in browser mode. Note that each time you call the function, you will enter browser mode until you run undebug() on it.
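Whether a function is currently flagged can be checked with isdebugged(), which is part of base R. The short sketch below uses a hypothetical function, addOne(), and deliberately avoids calling it while it is flagged, so the code runs without pausing.

```r
# Hypothetical function used only for illustration.
addOne <- function(v) v + 1

# Flag the function: the next call would enter browser mode.
debug(addOne)
flagged.before <- isdebugged(addOne)   # TRUE

# Remove the flag: calls run normally again.
undebug(addOne)
flagged.after <- isdebugged(addOne)    # FALSE

addOne(1)
```

If you only want to step through a single call, base R also provides debugonce(), which removes the flag automatically after that call.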
The following is an example using debug() to inspect our vectorContains() function:
> debug(vectorContains)
> x <- c(1, 2, 3, 9)
> vectorContains(x)
debugging in: vectorContains(x)
debug at #1: {
    if (is.numeric(v1) && !any(is.na(v1))) {
        value.found <- "no"
        for (i in v1) {
            if (i == value.to.check) {
                value.found <- "yes"
                break
            }
        }
        if (value.found == "yes") {
            return(TRUE)
        } else {
            return(FALSE)
        }
    } else {
        stop("This function takes a numeric vector as input.")
    }
}
Browse[2]> c
exiting from: vectorContains(x)
[1] TRUE
> undebug(vectorContains)
> vectorContains(x)
[1] TRUE
Note
Notice that debug() only enters browser mode when you call the vectorContains() function.