Dangerous R functions and operators for programming

Colon operator

The colon operator (:) can be dangerous when used carelessly. For example, you may write 1:length(x) to iterate over x, but what if the length of x is 0?

x <- c()
1:length(x)
## [1] 1 0

As you can see, instead of an empty vector matching the length of x, you get the descending sequence c(1, 0), so a for loop over it would run twice rather than not at all. It is much safer to use the seq_along() function:

x <- c()
seq_along(x)
## integer(0)

The related functions seq_len() and seq() (with its length.out or along.with argument) should likewise be preferred over the colon operator in programmatic code.
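
To make the difference concrete, here is a minimal sketch that counts how many times a loop body runs under each idiom. The counter names n_colon and n_safe are made up for this illustration.

```r
x <- c()

# Count loop iterations with the colon idiom
n_colon <- 0
for (i in 1:length(x)) n_colon <- n_colon + 1

# Count loop iterations with seq_along()
n_safe <- 0
for (i in seq_along(x)) n_safe <- n_safe + 1

n_colon  # 2: the body ran for i = 1 and i = 0
n_safe   # 0: the body never ran

# seq_len() plays the same role when you already have a count
seq_len(0)  # integer(0)
```

With the colon idiom, the loop would typically go on to index x[1] and x[0], neither of which holds a real element.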

sample() function

The sample() function behaves differently depending on whether its first argument has length 1 or length greater than 1. This became a problem for me when I had a variable number of elements to select from at random.

Here we select a random element from a vector of length 3:

set.seed(1)
x <- c(2,5,7)
sample(x, size = 1)
## [1] 2

If x is chosen programmatically, then it may be of length 1.

set.seed(2)
x <- c(7)
sample(x, size = 1)
## [1] 5

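See the problem? Instead of selecting the 7, as the previous example might lead you to expect, it chose a random number from 1 to 7. When x is a single number greater than or equal to 1, sample() samples from 1:x.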
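
As a rough check of this behavior, the sketch below draws from sample(7, size = 1) many times and confirms that every result falls in 1:7. The seed and the number of repetitions are arbitrary choices for this illustration. It also shows that the special case applies only to a single number of at least 1.

```r
set.seed(42)

# A single number >= 1 is treated as the range 1:x
draws <- replicate(1000, sample(7, size = 1))
all(draws %in% 1:7)        # TRUE: every draw lies in 1:7
length(unique(draws)) > 1  # TRUE: the draws are not all 7

# Other length-1 inputs are returned as-is
sample("a", size = 1)  # "a"
sample(0.5, size = 1)  # 0.5
```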

To get around this, I wrote a new function that has the behavior I was looking for:

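sample_safe <- function(x, size, replace = FALSE, prob = NULL) {
  # sample() treats a single number x >= 1 as 1:x, so handle
  # the length-1 case before delegating
  if (length(x) == 1) {
    return(x)
  }
  # For longer vectors, sample() behaves as expected
  sample(x, size, replace, prob)
}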

set.seed(1)
x <- c(2,5,7)
sample_safe(x, size = 1)
## [1] 2
set.seed(2)
x <- c(7)
sample_safe(x, size = 1)
## [1] 7

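Now when x has length 1, that element itself is returned. Note that in this case the function returns x regardless of size and replace, which is fine for the size = 1 use shown here.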
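
If you also need sizes other than 1, one possible generalization is to sample positions with sample.int() and then index into x. The sketch below is an alternative to the function above, not a replacement for it. The name sample_by_index is hypothetical, as is the choice to default size to length(x).

```r
# Hypothetical helper: sample positions, then index into x,
# so a length-1 x is never expanded to 1:x
sample_by_index <- function(x, size = length(x), replace = FALSE, prob = NULL) {
  x[sample.int(length(x), size = size, replace = replace, prob = prob)]
}

sample_by_index(7, size = 1)                  # 7
sample_by_index(7, size = 3, replace = TRUE)  # 7 7 7
sample_by_index(c(2, 5, 7), size = 2)         # two distinct elements of x
```

Because the sampling is done on positions, size, replace and prob keep their usual meaning whatever the length of x.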

Other examples

Do you know any other examples? Let me know in the comments below and I may update this post or include them in a future post.

Dr. Rick Tankard
Freelance Statistician, Bioinformatician and Data Scientist

I am a Bioinformatician and Statistician based in Australia, working at WEHI.
