The R6 package provides an implementation of encapsulated object-oriented programming for R (also sometimes referred to as classical object-oriented programming). It is similar to R’s reference classes, but it is more efficient and doesn’t depend on S4 classes and the methods package.
R6 classes
R6 classes are similar to R’s reference classes, but are lighter
weight, and avoid some issues that come along with using S4 (R’s
reference classes are based on S4). For more information about speed and
memory footprint, see vignette("Performance")
.
Unlike many objects in R, instances (objects) of R6 classes have reference semantics. R6 classes also support:
- public and private methods
- active bindings
- inheritance (superclasses) which works across packages
Why the name R6? When R’s reference classes were introduced, some users, following the names of R’s existing class systems S3 and S4, called the new class system R5 in jest. Although reference classes are not actually called R5, the name of this package and its classes takes inspiration from that name.
The name R5 was also a code-name used for a different object system started by Simon Urbanek, meant to solve some issues with S4 relating to syntax and performance. However, the R5 branch was shelved after a little development, and it was never released.
Basics
Here’s how to create a simple R6 class. The public
argument is a list of items, which can be functions and fields
(non-functions). Functions will be used as methods.
library(R6)
Person <- R6Class("Person",
public = list(
name = NULL,
hair = NULL,
initialize = function(name = NA, hair = NA) {
self$name <- name
self$hair <- hair
self$greet()
},
set_hair = function(val) {
self$hair <- val
},
greet = function() {
cat(paste0("Hello, my name is ", self$name, ".\n"))
}
)
)
To instantiate an object of this class, use $new()
:
ann <- Person$new("Ann", "black")
#> Hello, my name is Ann.
ann
#> <Person>
#> Public:
#> clone: function (deep = FALSE)
#> greet: function ()
#> hair: black
#> initialize: function (name = NA, hair = NA)
#> name: Ann
#> set_hair: function (val)
The $new()
method creates the object and calls the
initialize()
method, if it exists.
Inside class methods, self
refers to the object. Public
members of the object (all you’ve seen so far) are accessed with
self$x
, and assignment is done with
self$x <- y
.
Once the object is instantiated, you can access public fields and
methods with $
:
ann$hair
#> [1] "black"
ann$greet()
#> Hello, my name is Ann.
ann$set_hair("red")
ann$hair
#> [1] "red"
Implementation note: The external face of an R6 object is basically
an environment with the public members in it. This is also known as the
public environment. An R6 object’s methods have a separate
enclosing environment which, roughly speaking, is the
environment they “run in”. This is where self
binding is
found, and it is simply a reference back to the public environment.
Private members
In the previous example, all the members were public. It’s also possible to add private members:
Queue <- R6Class("Queue",
public = list(
initialize = function(...) {
for (item in list(...)) {
self$add(item)
}
},
add = function(x) {
private$queue <- c(private$queue, list(x))
invisible(self)
},
remove = function() {
if (private$length() == 0) return(NULL)
# Can use private$queue for explicit access
head <- private$queue[[1]]
private$queue <- private$queue[-1]
head
}
),
private = list(
queue = list(),
length = function() base::length(private$queue)
)
)
q <- Queue$new(5, 6, "foo")
Whereas public members are accessed with self
, like
self$add()
, private members are accessed with
private
, like private$queue
.
The public members can be accessed as usual:
# Add and remove items
q$add("something")
q$add("another thing")
q$add(17)
q$remove()
#> [1] 5
q$remove()
#> [1] 6
However, private members can’t be accessed directly:
q$queue
#> NULL
q$length()
#> Error: attempt to apply non-function
A useful design pattern is for methods to return self
(invisibly) when possible, because it makes them chainable. For example,
the add()
method returns self
so you can chain
them together:
q$add(10)$add(11)$add(12)
On the other hand, remove()
returns the value removed,
so it’s not chainable:
q$remove()
#> [1] "foo"
q$remove()
#> [1] "something"
q$remove()
#> [1] "another thing"
q$remove()
#> [1] 17
Active bindings
Active bindings look like fields, but each time they are accessed, they call a function. They are always publicly visible.
Numbers <- R6Class("Numbers",
public = list(
x = 100
),
active = list(
x2 = function(value) {
if (missing(value)) return(self$x * 2)
else self$x <- value/2
},
rand = function() rnorm(1)
)
)
n <- Numbers$new()
n$x
#> [1] 100
When an active binding is accessed as if reading a value, it calls
the function with value
as a missing argument:
n$x2
#> [1] 200
When it’s accessed as if assigning a value, it uses the assignment
value as the value
argument:
n$x2 <- 1000
n$x
#> [1] 500
If the function takes no arguments, it’s not possible to use it with
<-
:
n$rand
#> [1] 0.2648
n$rand
#> [1] 2.171
n$rand <- 3
#> Error: unused argument (quote(3))
Implementation note: Active bindings are bound in the public environment. The enclosing environment for these functions is also the public environment.
Inheritance
One R6 class can inherit from another. In other words, you can have super- and sub-classes.
Subclasses can have additional methods, and they can also have
methods that override the superclass methods. In this example of a queue
that retains its history, we’ll add a show()
method and
override the remove()
method:
# Note that this isn't very efficient - it's just for illustrating inheritance.
HistoryQueue <- R6Class("HistoryQueue",
inherit = Queue,
public = list(
show = function() {
cat("Next item is at index", private$head_idx + 1, "\n")
for (i in seq_along(private$queue)) {
cat(i, ": ", private$queue[[i]], "\n", sep = "")
}
},
remove = function() {
if (private$length() - private$head_idx == 0) return(NULL)
private$head_idx <- private$head_idx + 1
private$queue[[private$head_idx]]
}
),
private = list(
head_idx = 0
)
)
hq <- HistoryQueue$new(5, 6, "foo")
hq$show()
#> Next item is at index 1
#> 1: 5
#> 2: 6
#> 3: foo
hq$remove()
#> [1] 5
hq$show()
#> Next item is at index 2
#> 1: 5
#> 2: 6
#> 3: foo
hq$remove()
#> [1] 6
Superclass methods can be called with super$xx()
. The
CountingQueue
(example below) keeps a count of the total
number of objects that have ever been added to the queue. It does this
by overriding the add()
method – it increments a counter
and then calls the superclass’s add()
method, with
super$add(x)
:
CountingQueue <- R6Class("CountingQueue",
inherit = Queue,
public = list(
add = function(x) {
private$total <- private$total + 1
super$add(x)
},
get_total = function() private$total
),
private = list(
total = 0
)
)
cq <- CountingQueue$new("x", "y")
cq$get_total()
#> [1] 2
cq$add("z")
cq$remove()
#> [1] "x"
cq$remove()
#> [1] "y"
cq$get_total()
#> [1] 3
Note that, unlike classical OOP in other languages (e.g. C++), R6 subclasses also have access to the private methods in superclass.
For instance, in the following example, the Duck
class
has a private method $quack()
, but its subclass
Mallard
can access it using super$quack()
.
Fields containing reference objects
If your R6 class contains any fields that also have reference
semantics (e.g., other R6 objects, environments), those fields should be
populated in the $initialize()
method. If the field is set
to the reference object directly in the class definition, that object
will be shared across all instances of that R6 class.
Here’s an example:
SimpleClass <- R6Class("SimpleClass",
public = list(x = NULL)
)
SharedField <- R6Class("SharedField",
public = list(
e = SimpleClass$new()
)
)
s1 <- SharedField$new()
s1$e$x <- 1
s2 <- SharedField$new()
s2$e$x <- 2
# Changing s2$e$x has changed the value of s1$e$x
s1$e$x
#> [1] 2
To avoid this, populate the field in the initialize
method:
Other topics
Adding members to an existing class
It is sometimes useful to add members to a class after the class has
already been created. This can be done using the $set()
method on the generator object.
Simple <- R6Class("Simple",
public = list(
x = 1,
getx = function() self$x
)
)
Simple$set("public", "getx2", function() self$x*2)
# To replace an existing member, use `overwrite=TRUE`
Simple$set("public", "x", 10, overwrite = TRUE)
s <- Simple$new()
s$x
#> [1] 10
s$getx2()
#> [1] 20
The new members will be present only in instances that are created
after $set()
has been called.
To prevent modification of a class, you can use
lock_class=TRUE
when creating the class. You can also lock
and unlock a class as follows:
# Create a locked class
Simple <- R6Class("Simple",
public = list(
x = 1,
getx = function() self$x
),
lock_class = TRUE
)
# This would result in an error
Simple$set("public", "y", 2)
#> Error in Simple$set("public", "y", 2): Can't modify a locked R6 class.
# Unlock the class
Simple$unlock()
# Now it works
Simple$set("public", "y", 2)
# Lock the class again
Simple$lock()
Cloning objects
By default, R6 objects have a method named $clone()
for
making a copy of the object.
Simple <- R6Class("Simple",
public = list(
x = 1,
getx = function() self$x
)
)
s <- Simple$new()
# Create a clone
s1 <- s$clone()
# Modify it
s1$x <- 2
s1$getx()
#> [1] 2
# Original is unaffected by changes to the clone
s$getx()
#> [1] 1
If you don’t want a clone
method to be added, you can
use cloneable=FALSE
when creating the class. If any loaded
R6 object has a clone
method, that function uses 84.11 kB
bytes, but for each additional object, the clone
method
costs a trivial amount of space (112 B bytes).
Note about cloneable
and inheritance: if a superclass
has cloneable=FALSE
, then subclasses will not be cloneable.
(If a subclass sets cloneable=TRUE
, then a message will be
printed when objects are instantiated and the objects will not be
cloneable.)
Deep cloning
If there are any fields which are objects with reference semantics (environments, R6 objects, reference class objects), the copy will get a reference to the same object. This is sometimes desirable, but often it is not.
For example, we’ll create an object c1
which contains
another R6 object, s
, and then clone it. Because the
original’s and the clone’s s
fields both refer to the same
object, modifying it from one results in a change that is reflected in
the other.
Simple <- R6Class("Simple", public = list(x = 1))
Cloneable <- R6Class("Cloneable",
public = list(
s = NULL,
initialize = function() self$s <- Simple$new()
)
)
c1 <- Cloneable$new()
c2 <- c1$clone()
# Change c1's `s` field
c1$s$x <- 2
# c2's `s` is the same object, so it reflects the change
c2$s$x
#> [1] 2
To make it so the clone receives a copy of s
,
we can use the deep=TRUE
option:
c3 <- c1$clone(deep = TRUE)
# Change c1's `s` field
c1$s$x <- 3
# c2's `s` is different
c3$s$x
#> [1] 2
The default behavior of $clone(deep=TRUE)
is to copy
fields which are R6 objects, but not copy fields which are environments,
reference class objects, or other data structures which contain other
reference-type objects (for example, a list with an R6 object).
If your R6 object contains these types of objects and you want to
make a deep clone of them, you must provide your own function for deep
cloning, in a private method named deep_clone
. Below is an
example of an R6 object with two fields, a
and
b
, both of which which are environments, and both of which
contain a value x
. It also has a field v
which
is a regular (non-reference) value, and a private
deep_clone
method.
The deep_clone
method is called once for each field. It
is passed the name and value of the field, and the value it returns is
used in the clone.
CloneEnv <- R6Class("CloneEnv",
public = list(
a = NULL,
b = NULL,
v = 1,
initialize = function() {
self$a <- new.env(parent = emptyenv())
self$b <- new.env(parent = emptyenv())
self$a$x <- 1
self$b$x <- 1
}
),
private = list(
deep_clone = function(name, value) {
# With x$clone(deep=TRUE) is called, the deep_clone gets invoked once for
# each field, with the name and value.
if (name == "a") {
# `a` is an environment, so use this quick way of copying
list2env(as.list.environment(value, all.names = TRUE),
parent = emptyenv())
} else {
# For all other fields, just return the value
value
}
}
)
)
c1 <- CloneEnv$new()
c2 <- c1$clone(deep = TRUE)
When c1$clone(deep=TRUE)
is called, the
deep_clone
method is called for each field in
c1
, and is passed the name of the field and value. In our
version, the a
environment gets copied, but b
does not, nor does v
(but that doesn’t matter since
v
is not a reference object). We can test out the
clone:
# Modifying c1$a doesn't affect c2$a, because they're separate objects
c1$a$x <- 2
c2$a$x
#> [1] 1
# Modifying c1$b does affect c2$b, because they're the same object
c1$b$x <- 3
c2$b$x
#> [1] 3
# Modifying c1$v doesn't affect c2$v, because they're not reference objects
c1$v <- 4
c2$v
#> [1] 1
In the example deep_clone
method above, we checked the
name of each field to determine what to do with it, but we could also
check the value, by using inherits(value, "R6")
, or
is.environment()
, and so on.
Printing R6 objects to the screen
R6 objects have a default $print()
method that lists all
members of the object. If a class defines a $print()
method, then it overrides the default one.
PrettyCountingQueue <- R6Class("PrettyCountingQueue",
inherit = CountingQueue,
public = list(
print = function(...) {
cat("<PrettyCountingQueue> of ", self$get_total(), " elements\n", sep = "")
}
)
)
pq <- PrettyCountingQueue$new(1, 2, "foobar")
pq
#> <PrettyCountingQueue> of 3 elements
Finalizers
Sometimes it’s useful to run a function when the object is garbage
collected. For example, you may want to make sure a file or database
connection gets closed. To do this, you can define a private
$finalize()
method, which will be called with no arguments
when the object is garbage collected.
A <- R6Class("A", private = list(
finalize = function() {
print("Finalizer has been called!")
}
))
# Instantiate an object:
obj <- A$new()
# Remove the single existing reference to it, and force garbage collection
# (normally garbage collection will happen automatically from time
# to time)
rm(obj); gc()
#> [1] "Finalizer has been called!"
#> used (Mb) gc trigger (Mb) max used (Mb)
#> Ncells 705715 37.7 1366307 73 1366307 73.0
#> Vcells 1320836 10.1 8388608 64 3579414 27.4
NOTE: In R6 version 2.3.0 (released 2018-10-04) and
below, the $finalize()
method could only be public. As of
version 2.4.0, it can be public or private. Private is preferable
because there’s no reason a finalizer needs to be publicly accessible.
If you use a private finalizer in an R package, you should set your R6
version dependency to R6 (>= 2.4.0)
. In 2.6.0, R6 prints
a message if a finalizer is public. In a future version, it will emit
warnings, and finally, it will no longer support public finalizers.
Finalizers are implemented using the reg.finalizer()
function, and they set onexit=TRUE
, so that the finalizer
will also be called when R exits. This is useful in some cases, like
database connections.
Class methods vs. member functions
When an R6 class definition contains functions in the public or
private sections, those functions are class methods: they can
access self
(as well as private
and
super
when available). When an R6 object is cloned, the
resulting object’s methods will have a self
that refers to
the new object. This works by changing the enclosing
environment of the method in the cloned object.
In contrast to class methods, you can also add regular functions as
members of an R6 object. This can be done by assigning the function to a
field in the $initialize()
method, or after the object has
been instantiated. These functions are not class methods, and they will
not have access to self
, private
, or
super
.
Here is a trivial class which has a method get_self()
that simply returns self
, as well as an empty member,
fn
. In this example, we’ll assign a function to
fn
which has the same body as get_self
.
However, since it’s a regular function, self
will refer to
something other than the R6 object:
FunctionWrapper <- R6Class("FunctionWrapper",
public = list(
get_self = function() {
self
},
fn = NULL
)
)
a <- FunctionWrapper$new()
# Create a function that accesses a variable named `self`.
# Note that `self` in this function's scope refers to 100, not to the R6 object.
self <- 100
a$fn <- function() {
self
}
a$get_self()
#> <FunctionWrapper>
#> Public:
#> clone: function (deep = FALSE)
#> fn: function ()
#> get_self: function ()
a$fn()
#> [1] 100
As of R6 2.3.0, if the object is cloned, member (non-method) functions will not have their enclosing environment changed, which is what one would normally expect. It will behave this way:
b <- a$clone()
b$get_self()
#> <FunctionWrapper>
#> Public:
#> clone: function (deep = FALSE)
#> fn: function ()
#> get_self: function ()
b$fn()
#> [1] 100