
Advanced R

13 S3

13.1 Introduction

S3 is R’s first and simplest OO system. S3 is informal and ad hoc, but there is a certain elegance in its minimalism: you can’t take away any part of it and still have a useful OO system. For these reasons, you should use it, unless you have a compelling reason to do otherwise. S3 is the only OO system used in the base and stats packages, and it’s the most commonly used system in CRAN packages.

S3 is very flexible, which means it allows you to do things that are quite ill-advised. If you’re coming from a strict environment like Java this will seem pretty frightening, but it gives R programmers a tremendous amount of freedom. It may be very difficult to prevent people from doing something you don’t want them to do, but your users will never be held back because there is something you haven’t implemented yet. Since S3 has few built-in constraints, the key to its successful use is applying the constraints yourself. This chapter will therefore teach you the conventions you should (almost) always follow.

The goal of this chapter is to show you how the S3 system works, not how to use it effectively to create new classes and generics. I’d recommend coupling the theoretical knowledge from this chapter with the practical knowledge encoded in the vctrs package.

Outline

  • Section 13.2 gives a rapid overview of all the main components of S3: classes, generics, and methods. You’ll also learn about sloop::s3_dispatch(), which we’ll use throughout the chapter to explore how S3 works.

  • Section 13.3 goes into the details of creating a new S3 class, including the three functions that should accompany most classes: a constructor, a helper, and a validator.

  • Section 13.4 describes how S3 generics and methods work, including the basics of method dispatch.

  • Section 13.5 discusses the four main styles of S3 objects: vector, record, data frame, and scalar.

  • Section 13.6 demonstrates how inheritance works in S3, and shows you what you need to make a class “subclassable”.

  • Section 13.7 concludes the chapter with a discussion of the finer details of method dispatch including base types, internal generics, group generics, and double dispatch.

Prerequisites

S3 classes are implemented using attributes, so make sure you’re familiar with the details described in Section 3.3. We’ll use existing base S3 vectors for examples and exploration, so make sure that you’re familiar with the factor, Date, difftime, POSIXct, and POSIXlt classes described in Section 3.4.

We’ll use the sloop package for its interactive helpers.

13.2 Basics

An S3 object is a base type with at least a class attribute (other attributes may be used to store other data). For example, take the factor. Its base type is the integer vector, it has a class attribute of “factor”, and a levels attribute that stores the possible levels:

f <- factor(c("a", "b", "c"))

typeof(f)
#> [1] "integer"
attributes(f)
#> $levels
#> [1] "a" "b" "c"
#>
#> $class
#> [1] "factor"

You can get the underlying base type by unclass()ing it, which strips the class attribute, causing it to lose its special behaviour:

unclass(f)
#> [1] 1 2 3
#> attr(,"levels")
#> [1] "a" "b" "c"
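Because an S3 object is only a base type plus attributes, you can also assemble one by hand. The sketch below is purely illustrative (the name `f2` is made up here, and real code should call factor() instead): it attaches levels and class attributes to an integer vector and ends up with something base R treats as a factor.

```r
# Build a factor by hand: an integer vector plus two attributes.
# `f2` is an illustrative name; real code should use factor().
f2 <- structure(c(1L, 2L, 1L), levels = c("a", "b"), class = "factor")

is.factor(f2)      # TRUE: the check only looks at the class attribute
typeof(f2)         # the base type is still integer
as.character(f2)   # the levels attribute maps integer codes to labels
```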

An S3 object behaves differently from its underlying base type whenever it’s passed to a generic (short for generic function). The easiest way to tell if a function is a generic is to use sloop::ftype() and look for “generic” in the output:

ftype(print)
#> [1] "S3"      "generic"
ftype(str)
#> [1] "S3"      "generic"
ftype(unclass)
#> [1] "primitive"

A generic function defines an interface, which uses a different implementation depending on the class of an argument (almost always the first argument). Many base R functions are generic, including the important print():

print(f)
#> [1] a b c
#> Levels: a b c

# stripping class reverts to integer behaviour
print(unclass(f))
#> [1] 1 2 3
#> attr(,"levels")
#> [1] "a" "b" "c"

Beware that str() is generic, and some S3 classes use that generic to hide the internal details. For example, the POSIXlt class used to represent date-time data is actually built on top of a list, a fact which is hidden by its str() method:

time <- strptime(c("2017-01-01", "2020-05-04 03:21"), "%Y-%m-%d")
str(time)
#>  POSIXlt[1:2], format: "2017-01-01" "2020-05-04"

str(unclass(time))
#> List of 11
#>  $ sec   : num [1:2] 0 0
#>  $ min   : int [1:2] 0 0
#>  $ hour  : int [1:2] 0 0
#>  $ mday  : int [1:2] 1 4
#>  $ mon   : int [1:2] 0 4
#>  $ year  : int [1:2] 117 120
#>  $ wday  : int [1:2] 0 1
#>  $ yday  : int [1:2] 0 124
#>  $ isdst : int [1:2] 0 0
#>  $ zone  : chr [1:2] "UTC" "UTC"
#>  $ gmtoff: int [1:2] 0 0
#>  - attr(*, "tzone")= chr "UTC"
#>  - attr(*, "balanced")= logi TRUE

The generic is a middleman: its job is to define the interface (i.e. the arguments) then find the right implementation for the job. The implementation for a specific class is called a method, and the generic finds that method by performing method dispatch.

You can use sloop::s3_dispatch() to see the process of method dispatch:

s3_dispatch(print(f))
#> => print.factor
#>  * print.default

We’ll come back to the details of dispatch in Section 13.4.1; for now, note that S3 methods are functions with a special naming scheme, generic.class(). For example, the factor method for the print() generic is called print.factor(). You should never call the method directly, but instead rely on the generic to find it for you.

Generally, you can identify a method by the presence of . in the function name, but there are a number of important functions in base R that were written before S3, and hence use . to join words. If you’re unsure, check with sloop::ftype():

ftype(t.test)
#> [1] "S3"      "generic"
ftype(t.data.frame)
#> [1] "S3"     "method"
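If sloop is not installed, a similar check is possible with utils::isS3method(), which ships with R. This is offered as a rough base R stand-in rather than an equivalent of ftype(): it only answers whether a name refers to an S3 method.

```r
# isS3method() lives in utils, which is attached by default.
isS3method("t.data.frame")  # TRUE: a method for generic t(), class data.frame
isS3method("t.test")        # FALSE: t.test() is a generic, not a method of t()
```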

Unlike most functions, you can’t see the source code for most S3 methods just by typing their names. That’s because S3 methods are not usually exported: they live only inside the package, and are not available from the global environment. Instead, you can use sloop::s3_get_method(), which will work regardless of where the method lives:

weighted.mean.Date
#> Error: object 'weighted.mean.Date' not found

s3_get_method(weighted.mean.Date)
#> function (x, w, ...)
#> .Date(weighted.mean(unclass(x), w, ...))
#> <bytecode: 0x556c24d30ab8>
#> <environment: namespace:stats>
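Base R offers a comparable lookup through utils::getS3method(), which takes the generic and class names as strings. The snippet below is a small sketch of that alternative, not a replacement for the sloop helper:

```r
# Look up a registered but unexported method by generic and class name.
m <- getS3method("weighted.mean", "Date")

is.function(m)                    # the method was found
environmentName(environment(m))   # the namespace the method lives in
```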

13.2.1 Exercises

  1. Describe the difference between t.test() and t.data.frame(). When is each function called?

  2. Make a list of commonly used base R functions that contain . in their name but are not S3 methods.

  3. What does the as.data.frame.data.frame() method do? Why is it confusing? How could you avoid this confusion in your own code?

  4. Describe the difference in behaviour in these two calls.

    set.seed(1014)
    some_days <- as.Date("2017-01-31") + sample(10, 5)

    mean(some_days)
    #> [1] "2017-02-06"
    mean(unclass(some_days))
    #> [1] 17203

  5. What class of object does the following code return? What base type is it built on? What attributes does it use?

    x <- ecdf(rpois(100, 10))
    x
    #> Empirical CDF
    #> Call: ecdf(rpois(100, 10))
    #>  x[1:18] =  2,  3,  4,  ..., 2e+01, 2e+01

  6. What class of object does the following code return? What base type is it built on? What attributes does it use?

    x <- table(rpois(100, 5))
    x
    #>
    #>  1  2  3  4  5  6  7  8  9 10
    #>  7  5 18 14 15 15 14  4  5  3

13.3 Classes

If you have done object-oriented programming in other languages, you may be surprised to learn that S3 has no formal definition of a class: to make an object an instance of a class, you simply set the class attribute. You can do that during creation with structure(), or after the fact with class<-():

# Create and assign class in one step
x <- structure(list(), class = "my_class")

# Create, then set class
x <- list()
class(x) <- "my_class"

You can determine the class of an S3 object with class(x), and see if an object is an instance of a class using inherits(x, "classname").

class(x)
#> [1] "my_class"
inherits(x, "my_class")
#> [1] TRUE
inherits(x, "your_class")
#> [1] FALSE

The class name can be any string, but I recommend using only letters and _. Avoid . because (as mentioned earlier) it can be confused with the . separator between a generic name and a class name. When using a class in a package, I recommend including the package name in the class name. That ensures you won’t accidentally clash with a class defined by another package.

S3 has no checks for correctness, which means you can change the class of existing objects:

# Create a linear model
mod <- lm(log(mpg) ~ log(disp), data = mtcars)
class(mod)
#> [1] "lm"
print(mod)
#>
#> Call:
#> lm(formula = log(mpg) ~ log(disp), data = mtcars)
#>
#> Coefficients:
#> (Intercept)    log(disp)
#>       5.381       -0.459

# Turn it into a date (?!)
class(mod) <- "Date"

# Unsurprisingly this doesn't work very well
print(mod)
#> Error in as.POSIXlt(.Internal(Date2POSIXlt(x, tz)), tz = tz): 'list' object
#> cannot be coerced to type 'double'

If you’ve used other OO languages, this might make you feel queasy, but in practice this flexibility causes few problems. R doesn’t stop you from shooting yourself in the foot, but as long as you don’t aim the gun at your toes and pull the trigger, you won’t have a problem.

To avoid foot-bullet intersections when creating your own class, I recommend that you usually provide three functions:

  • A low-level constructor, new_myclass(), that efficiently creates new objects with the correct structure.

  • A validator, validate_myclass(), that performs more computationally expensive checks to ensure that the object has correct values.

  • A user-friendly helper, myclass(), that provides a convenient way for others to create objects of your class.

You don’t need a validator for very simple classes, and you can skip the helper if the class is for internal use only, but you should always provide a constructor.

13.3.1 Constructors

S3 doesn’t provide a formal definition of a class, so it has no built-in way to ensure that all objects of a given class have the same structure (i.e. the same base type and the same attributes with the same types). Instead, you must enforce a consistent structure by using a constructor.

The constructor should follow three principles:

  • Be called new_myclass().

  • Have one argument for the base object, and one for each attribute.

  • Check the type of the base object and the types of each attribute.

I’ll illustrate these ideas by creating constructors for base classes that you’re already familiar with. To start, let’s make a constructor for the simplest S3 class: Date. A Date is just a double with a single attribute: its class is “Date”. This makes for a very simple constructor:

new_Date <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "Date")
}

new_Date(c(-1, 0, 1))
#> [1] "1969-12-31" "1970-01-01" "1970-01-02"

The purpose of constructors is to help you, the developer. That means you can keep them simple, and you don’t need to optimise error messages for public consumption. If you expect users to also create objects, you should create a friendly helper function, called class_name(), which I’ll describe shortly.

A slightly more complicated constructor is that for difftime, which is used to represent time differences. It is again built on a double, but has a units attribute that must take one of a small set of values:

new_difftime <- function(x = double(), units = "secs") {
  stopifnot(is.double(x))
  units <- match.arg(units, c("secs", "mins", "hours", "days", "weeks"))

  structure(x,
    class = "difftime",
    units = units
  )
}

new_difftime(c(1, 10, 3600), "secs")
#> Time differences in secs
#> [1]    1   10 3600
new_difftime(52, "weeks")
#> Time difference of 52 weeks

The constructor is a developer function: it will be called in many places, by an experienced user. That means it’s OK to trade a little safety in return for performance, and you should avoid potentially time-consuming checks in the constructor.

13.3.2 Validators

More complicated classes require more complicated checks for validity. Take factors, for example. A constructor only checks that types are correct, making it possible to create malformed factors:

new_factor <- function(x = integer(), levels = character()) {
  stopifnot(is.integer(x))
  stopifnot(is.character(levels))

  structure(
    x,
    levels = levels,
    class = "factor"
  )
}

new_factor(1:5, "a")
#> Error in as.character.factor(x): malformed factor
new_factor(0:1, "a")
#> Error in as.character.factor(x): malformed factor

Rather than encumbering the constructor with complicated checks, it’s better to put them in a separate function. Doing so allows you to cheaply create new objects when you know that the values are correct, and easily re-use the checks in other places.

validate_factor <- function(x) {
  values <- unclass(x)
  levels <- attr(x, "levels")

  if (!all(!is.na(values) & values > 0)) {
    stop(
      "All `x` values must be non-missing and greater than zero",
      call. = FALSE
    )
  }

  if (length(levels) < max(values)) {
    stop(
      "There must be at least as many `levels` as possible values in `x`",
      call. = FALSE
    )
  }

  x
}

validate_factor(new_factor(1:5, "a"))
#> Error: There must be at least as many `levels` as possible values in `x`
validate_factor(new_factor(0:1, "a"))
#> Error: All `x` values must be non-missing and greater than zero

This validator function is called primarily for its side-effects (throwing an error if the object is invalid) so you’d expect it to invisibly return its primary input (as described in Section 6.7.2). However, it’s useful for validation methods to return visibly, as we’ll see next.

13.3.3 Helpers

If you want users to construct objects from your class, you should also provide a helper method that makes their life as easy as possible. A helper should always:

  • Have the same name as the class, e.g. myclass().

  • Finish by calling the constructor, and the validator, if it exists.

  • Create carefully crafted error messages tailored towards an end-user.

  • Have a thoughtfully crafted user interface with carefully chosen default values and useful conversions.

The last bullet is the trickiest, and it’s hard to give general advice. However, there are three common patterns:

  • Sometimes all the helper needs to do is coerce its inputs to the desired type. For example, new_difftime() is very strict, and violates the usual convention that you can use an integer vector wherever you can use a double vector:

    new_difftime(1:10)
    #> Error in new_difftime(1:10): is.double(x) is not TRUE

    It’s not the job of the constructor to be flexible, so here we create a helper that just coerces the input to a double.

    difftime <- function(x = double(), units = "secs") {
      x <- as.double(x)
      new_difftime(x, units = units)
    }

    difftime(1:10)
    #> Time differences in secs
    #>  [1]  1  2  3  4  5  6  7  8  9 10

  • Often, the most natural representation of a complex object is a string. For example, it’s very convenient to specify factors with a character vector. The code below shows a simple version of factor(): it takes a character vector, and guesses that the levels should be the unique values. This is not always correct (since some levels might not be seen in the data), but it’s a useful default.

    factor <- function(x = character(), levels = unique(x)) {
      ind <- match(x, levels)
      validate_factor(new_factor(ind, levels))
    }

    factor(c("a", "a", "b"))
    #> [1] a a b
    #> Levels: a b

  • Some complex objects are most naturally specified by multiple simple components. For example, I think it’s natural to construct a date-time by supplying the individual components (year, month, day etc). That leads me to this POSIXct() helper that resembles the existing ISOdatetime() function:

    POSIXct <- function(year = integer(),
                        month = integer(),
                        day = integer(),
                        hour = 0L,
                        minute = 0L,
                        sec = 0,
                        tzone = "") {
      ISOdatetime(year, month, day, hour, minute, sec, tz = tzone)
    }

    POSIXct(2020, 1, 1, tzone = "America/New_York")
    #> [1] "2020-01-01 EST"

For more complicated classes, you should feel free to go beyond these patterns to make life as easy as possible for your users.
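To see how the three pieces fit together, here is a minimal sketch for a made-up class. The class name `proportion` and all three function names are hypothetical, chosen only for illustration; the point is the division of labour: the constructor checks types, the validator checks values, and the helper coerces input before calling both.

```r
# Constructor: cheap type check only (developer-facing).
new_proportion <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "proportion")
}

# Validator: value checks. It returns its input visibly so that it can
# be the last call in the helper.
validate_proportion <- function(x) {
  values <- unclass(x)
  if (anyNA(values) || any(values < 0 | values > 1)) {
    stop("All `x` values must be non-missing and between 0 and 1", call. = FALSE)
  }
  x
}

# Helper: coerce friendly inputs, then construct and validate (user-facing).
proportion <- function(x = double()) {
  x <- as.double(x)
  validate_proportion(new_proportion(x))
}

p <- proportion(c(0, 0.25, 1))
class(p)

# An integer is accepted by the helper (it is coerced), and an
# out-of-range value is rejected by the validator.
one <- proportion(1L)
bad <- try(proportion(2), silent = TRUE)
inherits(bad, "try-error")
```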

13.3.4 Exercises

  1. Write a constructor for data.frame objects. What base type is a data frame built on? What attributes does it use? What are the restrictions placed on the individual elements? What about the names?

  2. Enhance my factor() helper to have better behaviour when one or more values is not found in levels. What does base::factor() do in this situation?

  3. Carefully read the source code of factor(). What does it do that my constructor does not?

  4. Factors have an optional “contrasts” attribute. Read the help for C(), and briefly describe the purpose of the attribute. What type should it have? Rewrite the new_factor() constructor to include this attribute.

  5. Read the documentation for utils::as.roman(). How would you write a constructor for this class? Does it need a validator? What might a helper do?

13.4 Generics and methods

The job of an S3 generic is to perform method dispatch, i.e. find the specific implementation for a class. Method dispatch is performed by UseMethod(), which every generic calls. UseMethod() takes two arguments: the name of the generic function (required), and the argument to use for method dispatch (optional). If you omit the second argument, it will dispatch based on the first argument, which is almost always what is desired.

Most generics are very simple, and consist of only a call to UseMethod(). Take mean() for example:

mean
#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x556c203db170>
#> <environment: namespace:base>

Creating your own generic is similarly simple:

my_new_generic <- function(x) {
  UseMethod("my_new_generic")
}

(If you wonder why we have to repeat my_new_generic twice, think back to Section 6.2.3.)

You don’t pass any of the arguments of the generic to UseMethod(); it uses deep magic to pass to the method automatically. The precise process is complicated and frequently surprising, so you should avoid doing any computation in a generic. To learn the full details, carefully read the Technical Details section in ?UseMethod.
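As a small end-to-end sketch, the following defines a generic together with two class-specific methods and a fallback. The generic `area()` and the classes `circle` and `square` are invented for this illustration; note that the generic itself does nothing except call UseMethod().

```r
# Generic: nothing but a call to UseMethod().
area <- function(shape, ...) {
  UseMethod("area")
}

# Methods follow the generic.class naming scheme.
area.circle <- function(shape, ...) pi * shape$radius^2
area.square <- function(shape, ...) shape$side^2

# Fallback, used when no class-specific method exists.
area.default <- function(shape, ...) {
  stop(
    "Don't know how to compute the area of a <", class(shape)[[1]], ">",
    call. = FALSE
  )
}

circle <- structure(list(radius = 1), class = "circle")
square <- structure(list(side = 3), class = "square")

area(circle)
area(square)
```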

13.4.1 Method dispatch

How does UseMethod() work? It basically creates a vector of method names, paste0("generic", ".", c(class(x), "default")), and then looks for each potential method in turn. We can see this in action with sloop::s3_dispatch(). You give it a call to an S3 generic, and it lists all the possible methods. For example, what method is called when you print a Date object?

x <- Sys.Date()
s3_dispatch(print(x))
#> => print.Date
#>  * print.default

The output here is simple:

  • => indicates the method that is called, here print.Date().
  • * indicates a method that is defined, but not called, here print.default().

The “default” class is a special pseudo-class. This is not a real class, but is included to make it possible to define a standard fallback that is found whenever a class-specific method is not available.
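The name-building step can be mimicked in a few lines of base R. This is a deliberately simplified sketch: the helper name `dispatch_candidates()` is hypothetical, and it ignores the complications (implicit classes, internal and group generics) covered later in the chapter.

```r
# Build the candidate method names for a generic and an object.
# Simplified: uses class(x) only, so it is accurate for objects that
# have a class attribute.
dispatch_candidates <- function(generic, x) {
  paste0(generic, ".", c(class(x), "default"))
}

x <- Sys.Date()
candidates <- dispatch_candidates("print", x)
candidates

# Which candidates are actually defined? With optional = TRUE,
# getS3method() returns NULL instead of erroring for a missing method.
defined <- vapply(
  c(class(x), "default"),
  function(cls) !is.null(getS3method("print", cls, optional = TRUE)),
  logical(1)
)
setNames(defined, candidates)
```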

The essence of method dispatch is quite simple, but as the chapter proceeds you’ll see it get progressively more complicated to encompass inheritance, base types, internal generics, and group generics. The code below shows a couple of more complicated cases which we’ll come back to in Sections 14.2.4 and 13.7.

x <- matrix(1:10, nrow = 2)
s3_dispatch(mean(x))
#>    mean.matrix
#>    mean.integer
#>    mean.numeric
#> => mean.default

s3_dispatch(sum(Sys.time()))
#>    sum.POSIXct
#>    sum.POSIXt
#>    sum.default
#> => Summary.POSIXct
#>    Summary.POSIXt
#>    Summary.default
#> -> sum (internal)

13.4.2 Finding methods

sloop::s3_dispatch() lets you find the specific method used for a single call. What if you want to find all methods defined for a generic or associated with a class? That’s the job of sloop::s3_methods_generic() and sloop::s3_methods_class():

s3_methods_generic("mean")
#> # A tibble: 7 × 4
#>   generic class      visible source
#>   <chr>   <chr>      <lgl>   <chr>
#> 1 mean    Date       TRUE    base
#> 2 mean    default    TRUE    base
#> 3 mean    difftime   TRUE    base
#> 4 mean    POSIXct    TRUE    base
#> 5 mean    POSIXlt    TRUE    base
#> 6 mean    quosure    FALSE   registered S3method
#> 7 mean    vctrs_vctr FALSE   registered S3method

s3_methods_class("ordered")
#> # A tibble: 4 × 4
#>   generic       class   visible source
#>   <chr>         <chr>   <lgl>   <chr>
#> 1 as.data.frame ordered TRUE    base
#> 2 Ops           ordered TRUE    base
#> 3 relevel       ordered FALSE   registered S3method
#> 4 Summary       ordered TRUE    base
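Without sloop, utils::methods() gives a comparable listing, either by generic or by class. The results depend on which packages are loaded, so the sketch below only inspects the returned objects and makes no claim about the full list.

```r
# All methods for one generic, and all methods for one class.
mean_methods <- methods("mean")
ordered_methods <- methods(class = "ordered")

# The result is a classed character vector of method names.
as.character(mean_methods)

# Extra details (visibility, source) are stored in the "info" attribute.
head(attr(ordered_methods, "info"))
```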

13.4.3 Creating methods

There are two wrinkles to be aware of when you create a new method:

  • First, you should only ever write a method if you own the generic or the class. R will allow you to define a method even if you don’t, but it is exceedingly bad manners. Instead, work with the author of either the generic or the class to add the method in their code.

  • A method must have the same arguments as its generic. This is enforced in packages by R CMD check, but it’s good practice even if you’re not creating a package.

    There is one exception to this rule: if the generic has ..., the method can contain a superset of the arguments. This allows methods to take arbitrary additional arguments. The downside of using ..., however, is that any misspelled arguments will be silently swallowed, as mentioned in Section 6.6.
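The following sketch illustrates both halves of that exception with an invented generic, `describe()`. The method adds a `digits` argument that the generic does not have, and a misspelling of that argument disappears into ... without any warning.

```r
# Hypothetical generic with `...` so methods may add their own arguments.
describe <- function(x, ...) UseMethod("describe")

describe.default <- function(x, ...) "an object"

# The method takes a superset of the generic's arguments.
describe.numeric <- function(x, digits = 1, ...) {
  paste("mean:", round(mean(x), digits))
}

values <- c(1.234, 2.345)

describe(values, digits = 2)
# Misspelled argument: silently swallowed by `...`, so the default
# digits = 1 is used instead.
describe(values, digts = 2)
```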

13.4.4 Exercises

  1. Read the source code for t() and t.test() and confirm that t.test() is an S3 generic and not an S3 method. What happens if you create an object with class test and call t() with it? Why?

    x <- structure(1:10, class = "test")
    t(x)

  2. What generics does the table class have methods for?

  3. What generics does the ecdf class have methods for?

  4. Which base generic has the greatest number of defined methods?

  5. Carefully read the documentation for UseMethod() and explain why the following code returns the results that it does. What two usual rules of function evaluation does UseMethod() violate?

    g <- function(x) {
      x <- 10
      y <- 10
      UseMethod("g")
    }
    g.default <- function(x) c(x = x, y = y)

    x <- 1
    y <- 1
    g(x)
    #> x y
    #> 1 1

  6. What are the arguments to [? Why is this a hard question to answer?

13.5 Object styles

So far I’ve focussed on vector style classes like Date and factor. These have the key property that length(x) represents the number of observations in the vector. There are three variants that do not have this property:

  • Record style objects use a list of equal-length vectors to represent individual components of the object. The best example of this is POSIXlt, which underneath the hood is a list of 11 date-time components like year, month, and day. Record style classes override length() and subsetting methods to conceal this implementation detail.

    x <- as.POSIXlt(ISOdatetime(2020, 1, 1, 0, 0, 1:3))
    x
    #> [1] "2020-01-01 00:00:01 UTC" "2020-01-01 00:00:02 UTC"
    #> [3] "2020-01-01 00:00:03 UTC"

    length(x)
    #> [1] 3
    length(unclass(x))
    #> [1] 11

    x[[1]] # the first date time
    #> [1] "2020-01-01 00:00:01 UTC"
    unclass(x)[[1]] # the first component, the number of seconds
    #> [1] 1 2 3

  • Data frames are similar to record style objects in that both use lists of equal length vectors. However, data frames are conceptually two dimensional, and the individual components are readily exposed to the user. The number of observations is the number of rows, not the length:

    x <- data.frame(x = 1:100, y = 1:100)
    length(x)
    #> [1] 2
    nrow(x)
    #> [1] 100

  • Scalar objects typically use a list to represent a single thing. For example, an lm object is a list of length 12 but it represents one model.

    mod <- lm(mpg ~ wt, data = mtcars)
    length(mod)
    #> [1] 12

    Scalar objects can also be built on top of functions, calls, and environments. This is less generally useful, but you can see applications in stats::ecdf(), R6 (Chapter 14), and rlang::quo() (Chapter 19).

Unfortunately, describing the appropriate use of each of these object styles is beyond the scope of this book. However, you can learn more from the documentation of the vctrs package (https://vctrs.r-lib.org); the package also provides constructors and helpers that make implementation of the different styles easier.

13.5.1 Exercises

  1. Categorise the objects returned by lm(), factor(), table(), as.Date(), as.POSIXct(), ecdf(), ordered(), I() into the styles described above.

  2. What would a constructor function for lm objects, new_lm(), look like? Use ?lm and experimentation to figure out the required fields and their types.

13.6 Inheritance

S3 classes can share behaviour through a mechanism called inheritance. Inheritance is powered by three ideas:

  • The class can be a character vector. For example, the ordered and POSIXct classes have two components in their class:

    class(ordered("x"))
    #> [1] "ordered" "factor"
    class(Sys.time())
    #> [1] "POSIXct" "POSIXt"

  • If a method is not found for the class in the first element of the vector, R looks for a method for the second class (and so on):

    s3_dispatch(print(ordered("x")))
    #>    print.ordered
    #> => print.factor
    #>  * print.default
    s3_dispatch(print(Sys.time()))
    #> => print.POSIXct
    #>    print.POSIXt
    #>  * print.default

  • A method can delegate work by calling NextMethod(). We’ll come back to that very shortly; for now, note that s3_dispatch() reports delegation with ->.

    s3_dispatch(ordered("x")[1])
    #>    [.ordered
    #> => [.factor
    #>    [.default
    #> -> [ (internal)
    s3_dispatch(Sys.time()[1])
    #> => [.POSIXct
    #>    [.POSIXt
    #>    [.default
    #> -> [ (internal)
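The three ideas above can be tried out in plain base R with a toy hierarchy. Everything named here (`speak()`, and the classes `animal`, `dog`, and `cat`) is invented for the illustration.

```r
speak <- function(x, ...) UseMethod("speak")

# Method for the shared second class.
speak.animal <- function(x, ...) "some generic animal noise"

# Method for the first class that delegates to the next one.
speak.dog <- function(x, ...) paste("Woof, then", NextMethod())

dog_obj <- structure(list(), class = c("dog", "animal"))
cat_obj <- structure(list(), class = c("cat", "animal"))

speak(dog_obj)  # speak.dog runs first, then delegates to speak.animal
speak(cat_obj)  # no speak.cat exists, so dispatch falls through to speak.animal

inherits(cat_obj, "animal")
inherits(cat_obj, "dog")
```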

Before we continue we need a bit of vocabulary to describe the relationship between the classes that appear together in a class vector. We’ll say that ordered is a subclass of factor because it always appears before it in the class vector, and, conversely, we’ll say factor is a superclass of ordered.

S3 imposes no restrictions on the relationship between sub- and superclasses but your life will be easier if you impose some. I recommend that you adhere to two simple principles when creating a subclass:

  • The base type of the subclass should be the same as the superclass.

  • The attributes of the subclass should be a superset of the attributes of the superclass.

POSIXt does not adhere to these principles because POSIXct has type double, and POSIXlt has type list. This means that POSIXt is not a superclass, and illustrates that it’s quite possible to use the S3 inheritance system to implement other styles of code sharing (here POSIXt plays a role more like an interface), but you’ll need to figure out safe conventions yourself.

13.6.1 NextMethod()

NextMethod() is the hardest part of inheritance to understand, so we’ll start with a concrete example for the most common use case: [. We’ll start by creating a simple toy class: a secret class that hides its output when printed:

new_secret <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "secret")
}

print.secret <- function(x, ...) {
  print(strrep("x", nchar(x)))
  invisible(x)
}

x <- new_secret(c(15, 1, 456))
x
#> [1] "xx"  "x"   "xxx"

This works, but the default [ method doesn’t preserve the class:

s3_dispatch(x[1])
#>    [.secret
#>    [.default
#> => [ (internal)
x[1]
#> [1] 15

To fix this, we need to provide a [.secret method. How could we implement this method? The naive approach won’t work because we’ll get stuck in an infinite loop:

`[.secret` <- function(x, i) {
  new_secret(x[i])
}

Instead, we need some way to call the underlying [ code, i.e. the implementation that would get called if we didn’t have a [.secret method. One approach would be to unclass() the object:

`[.secret` <- function(x, i) {
  x <- unclass(x)
  new_secret(x[i])
}
x[1]
#> [1] "xx"

This works, but is inefficient because it creates a copy of x. A better approach is to use NextMethod(), which concisely solves the problem of delegating to the method that would have been called if [.secret didn’t exist:

`[.secret` <- function(x, i) {
  new_secret(NextMethod())
}
x[1]
#> [1] "xx"

We can see what’s going on with sloop::s3_dispatch():

s3_dispatch(x[1])
#> => [.secret
#>    [.default
#> -> [ (internal)

The => indicates that [.secret is called, but that NextMethod() delegates work to the underlying internal [ method, as shown by the ->.

As with UseMethod(), the precise semantics of NextMethod() are complex. In particular, it tracks the list of potential next methods with a special variable, which means that modifying the object that’s being dispatched upon will have no impact on which method gets called next.

13.6.2 Allowing subclassing

When you create a class, you need to decide if you want to allow subclasses, because it requires some changes to the constructor and careful thought in your methods.

To allow subclasses, the parent constructor needs to have ... and class arguments:

new_secret <- function(x, ..., class = character()) {
  stopifnot(is.double(x))

  structure(
    x,
    ...,
    class = c(class, "secret")
  )
}

Then the subclass constructor can just call the parent class constructor with additional arguments as needed. For example, imagine we want to create a supersecret class which also hides the number of characters:

new_supersecret <- function(x) {
  new_secret(x, class = "supersecret")
}

print.supersecret <- function(x, ...) {
  print(rep("xxxxx", length(x)))
  invisible(x)
}

x2 <- new_supersecret(c(15, 1, 456))
x2
#> [1] "xxxxx" "xxxxx" "xxxxx"

To allow inheritance, you also need to think carefully about your methods, as you can no longer use the constructor. If you do, the method will always return the same class, regardless of the input. This forces whoever makes a subclass to do a lot of extra work.

Concretely, this means we need to revise the [.secret method. Currently it always returns a secret(), even when given a supersecret:

`[.secret` <- function(x, ...) {
  new_secret(NextMethod())
}

x2[1:3]
#> [1] "xx"  "x"   "xxx"

We want to make sure that [.secret returns the same class as x even if it’s a subclass. As far as I can tell, there is no way to solve this problem using base R alone. Instead, you’ll need to use the vctrs package, which provides a solution in the form of the vctrs::vec_restore() generic. This generic takes two inputs: an object which has lost subclass information, and a template object to use for restoration.

Typically vec_restore() methods are quite simple: you just call the constructor with appropriate arguments:

vec_restore.secret <- function(x, to, ...) new_secret(x)
vec_restore.supersecret <- function(x, to, ...) new_supersecret(x)

(If your class has attributes, you’ll need to pass them from to into the constructor.)

Now we can use vec_restore() in the [.secret method:

`[.secret` <- function(x, ...) {
  vctrs::vec_restore(NextMethod(), x)
}
x2[1:3]
#> [1] "xxxxx" "xxxxx" "xxxxx"

(I only fully understood this issue quite recently, so at time of writing it is not used in the tidyverse. Hopefully by the time you’re reading this, it will have rolled out, making it much easier to (e.g.) subclass tibbles.)

If you build your class using the tools provided by the vctrs package, [ will gain this behaviour automatically. You will only need to provide your own [ method if you use attributes that depend on the data or want non-standard subsetting behaviour. See ?vctrs::new_vctr for details.

13.6.3 Exercises

  1. How does [.Date support subclasses? How does it fail to support subclasses?

  2. R has two classes for representing date time data, POSIXct and POSIXlt, which both inherit from POSIXt. Which generics have different behaviours for the two classes? Which generics share the same behaviour?

  3. What do you expect this code to return? What does it actually return? Why?

    generic2 <- function(x) UseMethod("generic2")
    generic2.a1 <- function(x) "a1"
    generic2.a2 <- function(x) "a2"
    generic2.b <- function(x) {
      class(x) <- "a1"
      NextMethod()
    }

    generic2(structure(list(), class = c("b", "a2")))

13.7 Dispatch details

This chapter concludes with a few additional details about method dispatch. It is safe to skip these details if you’re new to S3.

13.7.1 S3 and base types

What happens when you call an S3 generic with a base object, i.e. an object with no class? You might think it would dispatch on whatclass() returns:

class(matrix(1:5))#> [1] "matrix" "array"

But unfortunately dispatch actually occurs on theimplicit class, which has three components:

  • The string “array” or “matrix” if the object has dimensions
  • The result oftypeof() with a few minor tweaks
  • The string “numeric” if object is “integer” or “double”

There is no base function that will compute the implicit class, but you can usesloop::s3_class()

s3_class(matrix(1:5))#> [1] "matrix"  "integer" "numeric"

This is used by s3_dispatch():

s3_dispatch(print(matrix(1:5)))
#>    print.matrix
#>    print.integer
#>    print.numeric
#> => print.default

This means that the class() of an object does not uniquely determine its dispatch:

x1 <- 1:5
class(x1)
#> [1] "integer"
s3_dispatch(mean(x1))
#>    mean.integer
#>    mean.numeric
#> => mean.default

x2 <- structure(x1, class = "integer")
class(x2)
#> [1] "integer"
s3_dispatch(mean(x2))
#>    mean.integer
#> => mean.default
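The practical consequence can be seen with a generic of your own. The sketch below uses a hypothetical generic, describe(), with a method for "numeric" but none for "integer"; the names are made up for illustration and are not part of any package:

```r
# Hypothetical generic and methods, for illustration only
describe <- function(x) UseMethod("describe")
describe.numeric <- function(x) "describe.numeric"
describe.default <- function(x) "describe.default"

x1 <- 1:5                               # no class attribute
x2 <- structure(x1, class = "integer")  # explicit class attribute

# class() cannot tell the two objects apart
identical(class(x1), class(x2))
#> [1] TRUE

# x1 dispatches on its implicit class, c("integer", "numeric")
describe(x1)
#> [1] "describe.numeric"

# x2 dispatches only on its class attribute, "integer"
describe(x2)
#> [1] "describe.default"
```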

13.7.2 Internal generics

Some base functions, like [, sum(), and cbind(), are called internal generics because they don’t call UseMethod() but instead call the C functions DispatchGroup() or DispatchOrEval(). s3_dispatch() shows internal generics by including the name of the generic followed by (internal):

s3_dispatch(Sys.time()[1])
#> => [.POSIXct
#>    [.POSIXt
#>    [.default
#> -> [ (internal)

For performance reasons, internal generics do not dispatch to methods unless the class attribute has been set, which means that internal generics do not use the implicit class. Again, if you’re ever confused about method dispatch, you can rely on s3_dispatch().
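To make this concrete, here is a small sketch that defines a method for the internal generic sum(), keyed on the class name "numeric". The method exists only for illustration; you would not normally define it:

```r
# Illustrative method for an internal generic; not something to do in practice
sum.numeric <- function(..., na.rm = FALSE) "sum.numeric"

x1 <- c(1, 2, 3)                        # implicit class includes "numeric"
x2 <- structure(x1, class = "numeric")  # class attribute set explicitly

# No class attribute: the internal code runs and the method is ignored
sum(x1)
#> [1] 6

# Class attribute present: dispatch finds sum.numeric
sum(x2)
#> [1] "sum.numeric"
```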

13.7.3 Group generics

Group generics are the most complicated part of S3 method dispatch because they involve both NextMethod() and internal generics. Like internal generics, they only exist in base R, and you cannot define your own group generic.

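There are four group generics:

  • Math: abs(), sign(), sqrt(), floor(), cos(), sin(), log(), and more (see ?Math for the complete list).
  • Ops: +, -, *, /, ^, %%, %/%, &, |, !, ==, !=, <, <=, >=, and >.
  • Summary: all(), any(), sum(), prod(), min(), max(), and range().
  • Complex: Arg(), Conj(), Im(), Mod(), Re().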

Defining a single group generic for your class overrides the default behaviour for all of the members of the group. Methods for group generics are looked for only if the methods for the specific generic do not exist:

s3_dispatch(sum(Sys.time()))
#>    sum.POSIXct
#>    sum.POSIXt
#>    sum.default
#> => Summary.POSIXct
#>    Summary.POSIXt
#>    Summary.default
#> -> sum (internal)
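The same lookup order can be sketched with a toy class. Here "score" is a hypothetical class with one Summary group method and one specific method for range(); the return values are just labels showing which method ran:

```r
# Hypothetical class, for illustration only
new_score <- function(x = double()) structure(x, class = "score")

# One group method covers every member of the Summary group ...
Summary.score <- function(..., na.rm = FALSE) "Summary.score"

# ... unless a method for the specific generic exists
range.score <- function(..., na.rm = FALSE) "range.score"

x <- new_score(c(1, 5, 3))

max(x)    # no max.score, so the group method is used
#> [1] "Summary.score"
sum(x)    # no sum.score either
#> [1] "Summary.score"
range(x)  # the specific method takes precedence over the group method
#> [1] "range.score"
```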

Most group generics involve a call to NextMethod(). For example, take difftime() objects. If you look at the method dispatch for abs(), you’ll see there’s a Math group generic defined.

y <- as.difftime(10, units = "mins")
s3_dispatch(abs(y))
#>    abs.difftime
#>    abs.default
#> => Math.difftime
#>    Math.default
#> -> abs (internal)

Math.difftime basically looks like this:

Math.difftime <- function(x, ...) {
  new_difftime(NextMethod(), units = attr(x, "units"))
}

It dispatches to the next method, here the internal default, to perform the actual computation, then restores the class and attributes. (To better support subclasses of difftime this would need to call vec_restore(), as described in Section 13.6.2.)

Inside a group generic function a special variable .Generic provides the actual generic function called. This can be useful when producing error messages, and can sometimes be useful if you need to manually re-call the generic with different arguments.
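As a rough sketch of both ideas, the Math method below for a hypothetical "metres" class uses .Generic to reject functions that do not make sense for a length, and otherwise delegates to the internal implementation with NextMethod(). The class, its constructor, and the list of allowed functions are all assumptions made for the example:

```r
# Hypothetical class and constructor, for illustration only
new_metres <- function(x = double()) structure(x, class = "metres")

Math.metres <- function(x, ...) {
  # Functions assumed (for this example) to keep the unit meaningful
  keeps_units <- c("abs", "floor", "ceiling", "round", "trunc")

  if (!.Generic %in% keeps_units) {
    # .Generic holds the name of the function the user actually called
    stop("`", .Generic, "()` is not meaningful for <metres>", call. = FALSE)
  }

  # Let the internal method do the computation, then restore the class
  new_metres(NextMethod())
}

unclass(abs(new_metres(c(-3, 2))))
#> [1] 3 2

tryCatch(sqrt(new_metres(4)), error = function(e) conditionMessage(e))
#> [1] "`sqrt()` is not meaningful for <metres>"
```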

13.7.4 Double dispatch

Generics in the Ops group, which includes the two-argument arithmetic and Boolean operators like - and &, implement a special type of method dispatch. They dispatch on the type of both of the arguments, which is called double dispatch. This is necessary to preserve the commutative property of many operators, i.e. a + b should equal b + a. Take the following simple example:

date <- as.Date("2017-01-01")
integer <- 1L

date + integer
#> [1] "2017-01-02"
integer + date
#> [1] "2017-01-02"

If + dispatched only on the first argument, it would return different values for the two cases. To overcome this problem, generics in the Ops group use a slightly different strategy from usual. Rather than doing a single method dispatch, they do two, one for each input. There are three possible outcomes of this lookup:

  • The methods are the same, so it doesn’t matter which method is used.

  • The methods are different, and R falls back to the internal method with a warning.

  • One method is internal, in which case R calls the other method.

This approach is error-prone, so if you want to implement robust double dispatch for algebraic operators, I recommend using the vctrs package. See ?vctrs::vec_arith for details.
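The outcomes listed above can be explored with two toy classes. In this sketch "metres" and "feet" are hypothetical classes, each with its own + method; nothing here converts between units, which is part of why the hand-rolled approach is fragile:

```r
# Hypothetical classes and constructors, for illustration only
new_metres <- function(x = double()) structure(x, class = "metres")
new_feet <- function(x = double()) structure(x, class = "feet")

`+.metres` <- function(e1, e2) new_metres(unclass(e1) + unclass(e2))
`+.feet` <- function(e1, e2) new_feet(unclass(e1) + unclass(e2))

x <- new_metres(2)

# Only one argument has a method (the other is internal), so +.metres is
# called whichever side the classed object is on
identical(x + 1, 1 + x)
#> [1] TRUE

# Two different methods are found: per the rules above, R is expected to
# warn and fall back to the internal +. The handler records that a warning
# was signalled instead of printing it.
res <- tryCatch(x + new_feet(1), warning = function(w) "warning signalled")
```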

13.7.5 Exercises

  1. Explain the differences in dispatch below:

    length.integer <- function(x) 10

    x1 <- 1:5
    class(x1)
    #> [1] "integer"
    s3_dispatch(length(x1))
    #>  * length.integer
    #>    length.numeric
    #>    length.default
    #> => length (internal)

    x2 <- structure(x1, class = "integer")
    class(x2)
    #> [1] "integer"
    s3_dispatch(length(x2))
    #> => length.integer
    #>    length.default
    #>  * length (internal)

  2. What classes have a method for the Math group generic in base R? Read the source code. How do the methods work?

  3. Math.difftime() is more complicated than I described. Why?
