Exploring R's function call semantics

After spending my summer working with R, I have become familiar with some of its lesser-known features. One rarely explored area is how flexible function calls can be. This post walks through the various behaviours of function calls in R.

To start off, we shall observe how R function calls can behave much as they do in most statically typed languages, where each argument must be passed in the position in which it is defined.

> f <- function(x, y, z) { print(c(x, y, z)) }
> f(2, 3, 9)
[1] 2 3 9

Alternatively, instead of supplying the arguments in the order in which they appear in the function definition, we can supply them by name.

> f(z = 9, x = 2, y = 3)
[1] 2 3 9

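You can mix these two behaviours by supplying some arguments by position and others by name. However, care must be taken to ensure that the position of each unnamed argument is correct: R matches the named arguments first, then assigns the unnamed ones, in order, to whichever parameters are still unmatched. In the call below, 3 is the only unnamed argument and y is the only parameter left, so y receives 3.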

> f(z = 9, 3, x = 2)
[1] 2 3 9
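
To make the matching order easier to see, here is a small sketch. The function g is a hypothetical helper of my own (it is not part of the examples above) that returns its arguments as a named vector, so you can read off which value landed in which parameter.

```r
# Hypothetical helper for illustration: returns its arguments as a named vector.
g <- function(x, y, z) c(x = x, y = y, z = z)

# z and x are matched by name first; the unnamed 3 then fills the only
# parameter that is still unmatched, which is y.
g(z = 9, 3, x = 2)

# Only z is named here, so the unnamed values fill x and then y, in order.
g(3, z = 9, 2)
```

In the second call x receives 3 and y receives 2, which shows that the unnamed values are assigned in the order they appear in the call, skipping over anything already matched by name.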

Function calls can also make use of default values. This means that we can omit an argument from a call if we are happy for it to take its default value. A default is declared by assigning a value to the argument in the function definition.

> f <- function(x, y, z = 9) { print(c(x, y, z)) }
> f(x = 2, y = 3)
[1] 2 3 9
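
As a rough sketch of how a default interacts with the different ways of supplying an argument, the hypothetical function h below (my own name, used only for illustration) reports when z was left out of the call. It relies on base R's missing(), which is TRUE for an argument the caller did not supply, even when that argument has a default.

```r
# Hypothetical function for illustration; not one of the examples above.
h <- function(x, y, z = 9) {
  # missing(z) is TRUE when the caller did not supply z,
  # even though z still gets its default value.
  if (missing(z)) message("z not supplied, using its default")
  c(x, y, z)
}

h(2, 3)          # z falls back to its default of 9
h(2, 3, z = 1)   # a named value overrides the default
h(2, 3, 1)       # so does a positional value
```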

While all of the previous examples are useful, this behaviour is fairly common (especially in dynamically typed languages), to the point that it is almost expected in any recently developed language. The following function calling features are rarely found in other languages, and where they are present, they are unlikely to be quite as elegant as in R.

During the development of GeneralizedHyperbolic I made use of lazily evaluated arguments in order to ensure distribution parameters were always applied. I did this by creating a vector (analogous to an array or a list) of distribution parameters. The default value of this vector was composed of the other function arguments, all of which had default values of their own. Because a default is not evaluated until the argument is first used, it can refer to the other arguments of the same function. This means that a complete distribution parameter vector could be passed into the function or, if only some parameters differed from their defaults, those could be passed in individually. Applying this to the previous example gives the following:

> f <- function(x = 2, y = 3, z = 9, all = c(x, y, z)) {
+   print(all)
+ }
> f(z = 5)
[1] 2 3 5
> f()
[1] 2 3 9
> f(z = 4, x = 6)
[1] 6 3 4
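
The lazy part is easy to miss, so here is a minimal sketch that makes it visible. The functions k and k2 are hypothetical variations of f that I am introducing purely for illustration. A default expression is evaluated inside the function, at the moment the argument is first used, so it sees whatever values the other arguments hold at that point.

```r
# Hypothetical variation of f, for illustration only.
k <- function(x = 2, y = 3, z = 9, all = c(x, y, z)) {
  x <- 100   # reassign x before `all` has been used
  all        # the default c(x, y, z) is only evaluated here, so it sees x = 100
}

k()                   # the default for `all` picks up the reassigned x
k(all = c(1, 1, 1))   # a supplied vector replaces the default entirely

# force() evaluates the argument straight away, before x is changed.
k2 <- function(x = 2, y = 3, z = 9, all = c(x, y, z)) {
  force(all)
  x <- 100
  all
}

k2()                  # `all` was fixed before the reassignment
```

Here k() returns 100 3 9 while k2() returns 2 3 9. If the timing of evaluation matters in your own function, calling force() at the top is one way to pin it down.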

Variadic functions, which can take any number of arguments, are also supported in R. While many other languages support this, I find R's approach to be among the most intuitive to use. It is accomplished by adding an ellipsis (...) to the function definition, which holds any (named or unnamed) values in the call that are not matched to another argument in the function definition.

> f <- function(x, ...) { print(list(...)) }
> f(2, y = 3, 9)
$y
[1] 3

[[2]]
[1] 9
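
Beyond printing them, the two things I most often do with the extra arguments are inspecting them and passing them straight on to another function. The sketch below shows both. The names describe and total are hypothetical, chosen only for this illustration.

```r
# Hypothetical function: capture the extra arguments in a list and inspect them.
describe <- function(x, ...) {
  extras <- list(...)
  list(first = x,
       n_extra = length(extras),
       extra_names = names(extras))
}

# x takes 2; y = 3 and 9 are collected by the ellipsis.
describe(2, y = 3, 9)

# Hypothetical function: forward the extra arguments unchanged to sum().
total <- function(...) sum(...)

total(1, 2, 3)
```

Note that an unnamed extra argument gets an empty string as its name, so extra_names above is "y" followed by "".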

The last feature of R’s function call semantics to be examined is partial matching. This is where an argument can be assigned a value (by name) using only the first few characters of its name, rather than the full name given in the function definition. This works so long as the number of characters supplied is sufficient to ensure the intended argument is uniquely matched.

> f <- function(alongString, alsoAString, anotherString) {
+   print(c(alongString, alsoAString, anotherString))
+ }
> f(alo = 2, als = 3, an = 9)
[1] 2 3 9
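
Two caveats are worth sketching out. An abbreviation that fits more than one argument is an error, and, as far as I know, arguments that come after an ellipsis in the definition are never partially matched. The functions p and after_dots below are hypothetical names used only for this illustration.

```r
# Hypothetical function with the same argument names as f above.
p <- function(alongString, alsoAString, anotherString) {
  c(alongString, alsoAString, anotherString)
}

# "al" is a prefix of both alongString and alsoAString, so R cannot decide
# between them and signals an error; tryCatch captures it for inspection.
ambiguous <- tryCatch(p(al = 2, als = 3, an = 9),
                      error = function(e) e)
inherits(ambiguous, "error")

# Hypothetical function whose verbose argument comes after the ellipsis.
after_dots <- function(..., verbose = FALSE) {
  list(verbose = verbose, dots = names(list(...)))
}

# "verb" is not expanded to verbose here; it is collected by the ellipsis.
after_dots(verb = TRUE)
```

In the last call verbose keeps its default of FALSE and "verb" shows up among the names of the extra arguments, which is an easy mistake to overlook.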

After all these examples, it’s amazing that such features are present in a language designed for statistical computing. R truly has some gems that are not readily apparent to most users.