Dangerous R functions and operators for programming
Colon operator
The colon operator (:) can be very dangerous when not used carefully. For example, you may use 1:length(x), but what if the length of x is 0?
x <- c()
1:length(x)
## [1] 1 0
As you can see, instead of an empty vector matching the length of x, you get the reversed vector c(1, 0). It is much better to use the seq_along() function, as follows:
x <- c()
seq_along(x)
## integer(0)
The related functions seq_len() and seq() (with its length.out or along.with argument) should likewise be used instead of the colon operator in programmatic code.
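To see why this matters in practice, here is a small sketch of a loop over an empty vector. The helper names count_iterations_colon() and count_iterations_safe() are made up for this illustration; they only count how many times the loop body runs.

```r
# Hypothetical helpers (not from any package): count loop iterations.
count_iterations_colon <- function(x) {
  n <- 0
  for (i in 1:length(x)) {   # iterates over c(1, 0) when x is empty
    n <- n + 1
  }
  n
}

count_iterations_safe <- function(x) {
  n <- 0
  for (i in seq_along(x)) {  # iterates zero times when x is empty
    n <- n + 1
  }
  n
}

x <- c()
count_iterations_colon(x)
## [1] 2
count_iterations_safe(x)
## [1] 0

# seq_len() plays the same role when you have a count rather than a vector
seq_len(0)
## integer(0)
```

With the colon version, the loop body runs twice on an empty input, which usually shows up later as an out-of-bounds index or an NA.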
sample() function
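The sample() function behaves differently depending on whether its first input has length 1 or length greater than 1: a single number n of at least 1 is treated as the range 1:n rather than as a one-element vector. This became a problem for me when I had a variable number of elements I needed to randomly select from.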
Here we select a random element from a vector of length 3:
set.seed(1)
x <- c(2,5,7)
sample(x, size = 1)
## [1] 2
If x is chosen programmatically, then it may be of length 1.
set.seed(2)
x <- c(7)
sample(x, size = 1)
## [1] 5
See the problem? Instead of returning 7, the only element of x, it chose a random number from 1 to 7.
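A quick way to confirm this is to compare sample(7, ...) with sample(1:7, ...) under the same seed. This is only a sanity check, and the variable names a and b are arbitrary. Note that a length-1 character vector is not affected, since the special case applies only to a single number.

```r
set.seed(2)
a <- sample(7, size = 1)      # x is a single number

set.seed(2)
b <- sample(1:7, size = 1)    # x is the vector 1, 2, ..., 7

# Both calls draw from the same range, so they agree under the same seed
identical(a, b)
## [1] TRUE

# A single non-numeric element is returned as-is
sample("a", size = 1)
## [1] "a"
```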
To get around this, I wrote a new function that has the behavior I was looking for:
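sample_safe <- function(x, size, replace = FALSE, prob = NULL) {
  # Treat a length-1 x as a single item, not as the range 1:x.
  # Note: this early return ignores size, so it assumes size = 1.
  if (length(x) == 1) {
    return(x)
  }
  sample(x, size = size, replace = replace, prob = prob)
}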
set.seed(1)
x <- c(2,5,7)
sample_safe(x, size = 1)
## [1] 2
set.seed(2)
x <- c(7)
sample_safe(x, size = 1)
## [1] 7
Now when x is of length 1, only that element is returned.
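If you also need size and replace to be honored for a length-1 input, one possible variation is to sample positions with sample.int() and then index into x. This is a sketch rather than a drop-in replacement: the name sample_by_index() is hypothetical, and it assumes x has at least one element. The help page for sample() suggests a similar idiom.

```r
# Hypothetical alternative: sample positions, then index into x.
sample_by_index <- function(x, size, replace = FALSE, prob = NULL) {
  idx <- sample.int(length(x), size = size, replace = replace, prob = prob)
  x[idx]
}

set.seed(2)
x <- c(7)
sample_by_index(x, size = 1)
## [1] 7

# Unlike the early return in sample_safe(), size and replace still apply
sample_by_index(x, size = 3, replace = TRUE)
## [1] 7 7 7
```

Because the sampling is done on positions, the value stored in x never gets interpreted as a range.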
Other examples
Do you know any other examples? Let me know in the comments below and I may update this post or include them in a future post.