This vignette explains how developers may incorporate emmeans support in their packages. If you are a user looking for a quick way to obtain results for an unsupported model, you are probably better off trying the qdrg() function.
Suppose you want to use emmeans for some type of model that it doesn't (yet) support. Or, suppose you have developed a new package with a fancy model-fitting function, and you'd like it to work with emmeans. What can you do? Well, there is hope, because emmeans is designed to be extended.
The first thing to do is to look at the help page for extending the package:
help("extending-emmeans", package="emmeans")It gives details about the fact that you need to write two S3methods,recover_data andemm_basis, for theclass of object that your model-fitting function returns. Therecover_data method is needed to recreate the dataset sothat the reference grid can be identified. Theemm_basismethod then determines the linear functions needed to evaluate eachpoint in the reference grid and to obtain associated information—such asthe variance-covariance matrix—needed to do estimation and testing.
These methods must also be exported from your package so that they are available to users. See the section on exporting the methods for details and suggestions.
This vignette presents an example where suitable methods are developed, and discusses a few issues that arise.
The MASS package contains various functions that do robust or outlier-resistant model fitting. We will cobble together some emmeans support for these. But first, let's create a suitable dataset (a simulated two-factor experiment) for testing.
```r
fake = expand.grid(rep = 1:5, A = c("a1", "a2"), B = c("b1", "b2", "b3"))
fake$y = c(11.46, 12.93, 11.87, 11.01, 11.92, 17.80, 13.41, 13.96, 14.27, 15.82,
           23.14, 23.75, -2.09, 28.43, 23.01, 24.11, 25.51, 24.11, 23.95, 30.37,
           17.75, 18.28, 17.82, 18.52, 16.33, 20.58, 20.55, 20.77, 21.21, 20.10)
```
The y values were generated using predetermined means and Cauchy-distributed errors. There are some serious outliers in these data.
### rlm

The MASS package provides an rlm function that fits robust-regression models using M estimation. We'll fit a model using the default settings for all tuning parameters:
```r
library(MASS)
fake.rlm = rlm(y ~ A * B, data = fake)

library(emmeans)
emmeans(fake.rlm, ~ B | A)
## A = a1:
##  B  emmean    SE df asymp.LCL asymp.UCL
##  b1   11.8 0.477 NA      10.9      12.8
##  b2   23.3 0.477 NA      22.4      24.2
##  b3   17.8 0.477 NA      16.9      18.7
## 
## A = a2:
##  B  emmean    SE df asymp.LCL asymp.UCL
##  b1   14.7 0.477 NA      13.7      15.6
##  b2   24.7 0.477 NA      23.8      25.6
##  b3   20.6 0.477 NA      19.7      21.6
## 
## Confidence level used: 0.95
```
The first lesson to learn about extending emmeans is that sometimes, it already works! It works here because rlm objects inherit from lm, which is supported by the emmeans package, and rlm objects aren't different enough to create any problems.
Later, we will talk about how to fully support a model object in emmeans. But it is often very easy to provide partial support via the qdrg() ("quick and dirty reference grid") function. qdrg() creates a reference grid for a model, and that reference grid can subsequently be used in a call to emmeans() or many other functions (but not emtrends()). In the past, qdrg() was just a standalone function; but now, it can dispatch S3 methods based on the class of its object argument. Consider for example a model fitted in the robmixglm package:
```r
require(robmixglm, quietly = TRUE)
## Registered S3 method overwritten by 'robmixglm':
##   method            from
##   print.outlierTest car

fit <- robmixglm(inverse(conc) ~ source + factor(percent), 
                 family = "gamma", data = pigs, cores = 1)
```
Examining the model object fit, it is not at all obvious how to get the needed parameters for qdrg(); but they are buried in there. Here is an S3 method that supports this model class:
```r
qdrg.robmixglm <- function(object, data = eval(object$call$data), ...) {
    coef <- coef(object)
    idx <- seq_along(coef)
    qdrg(formula = formula(object), data = data, coef = coef, 
         vcov = object$fit@vcov[idx, idx, drop = FALSE], ...)
}
```
In this method, the vcov matrix includes covariances for some extra parameters besides the regression coefficients, so we have to use a subset of that matrix. Note that qdrg() works as a standalone function as long as its object argument is missing; a typical qdrg method will just determine what arguments to pass along in a standalone call to qdrg(). Let's construct the reference grid and obtain marginal means on the response scale:
```r
rg <- qdrg(object = fit, link = "log")
emmeans(rg, ~ ., type = "response")
## $`emmeans of source`
##  source response    SE  df asymp.LCL asymp.UCL
##  fish       29.7 0.964 Inf      27.9      31.7
##  soy        39.0 1.290 Inf      36.6      41.7
##  skim       44.1 1.550 Inf      41.1      47.2
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.95 
## Intervals are back-transformed from the inverse[log] scale 
## 
## $`emmeans of percent`
##  percent response   SE  df asymp.LCL asymp.UCL
##  9            31.2 1.13 Inf      29.0      33.4
##  12           37.4 1.27 Inf      35.0      40.0
##  15           38.7 1.50 Inf      35.9      41.7
##  18           42.1 1.98 Inf      38.4      46.2
## 
## Results are averaged over the levels of: source 
## Confidence level used: 0.95 
## Intervals are back-transformed from the inverse[log] scale
```
Here, the link argument got passed via ... to qdrg(); it was needed because I don't immediately see how to get the link function from the object, though I imagine there is a way.
Be careful when using these methods; since object is not the first argument of the generic qdrg() function, it is best to specify object = in the call to this method, though the code tries to work around this.
Package developers may provide minimal emmeans support by providing a qdrg method like this. If you do this in your package, you should export and register the method, and add emmeans (>= 2.0.0) and estimability to the package's Suggests list.
### lqs objects

The MASS resistant-regression functions lqs, lmsreg, and ltsreg are another story, however. They create lqs objects that are not extensions of any other class, and have other issues, including not even having a vcov method. So for these, we really do need to write new methods for lqs objects. First, let's fit a model.
```r
fake.lts = ltsreg(y ~ A * B, data = fake)
```

#### recover_data method

It is usually an easy matter to write a recover_data method. Look at the one for lm objects:
```r
emmeans:::recover_data.lm
## function (object, frame = object$model, ...) 
## {
##     fcall = object$call
##     recover_data(fcall, delete.response(terms(object)), object$na.action, 
##         frame = frame, pwts = weights(object), ...)
## }
## <bytecode: 0x7fb8407b3258>
## <environment: namespace:emmeans>
```
Note that all it does is obtain the call component and call the method for class call, with additional arguments for its terms component and na.action. It happens that we can access these attributes in exactly the same way as for lm objects; so:
```r
recover_data.lqs = emmeans:::recover_data.lm
```
Let's test it:
```r
rec.fake = recover_data(fake.lts)
head(rec.fake)
##    A  B
## 1 a1 b1
## 2 a1 b1
## 3 a1 b1
## 4 a1 b1
## 5 a1 b1
## 6 a2 b1
```
Our recovered data excludes the response variable y (owing to the delete.response call), and this is fine.
By the way, there are two special arguments, data and params, that may be handed to recover_data via ref_grid or emmeans or a related function; and you may need to provide for them if you don't use the recover_data.call function. The data argument is needed to cover a desperate situation that occurs with certain kinds of models where the underlying data information is not saved with the object—e.g., models that are fitted by iteratively modifying the data. In those cases, the only way to recover the data is for the user to give it explicitly, and recover_data just adds a few needed attributes to it.
The params argument is needed when the model formula refers to variables besides predictors. For example, a model may include a spline term, and the knots are saved in the user's environment as a vector and referred to in the call to fit the model. In trying to recover the data, we try to construct a data frame containing all the variables present on the right-hand side of the model, but if some of those are scalars or of different lengths than the number of observations, an error occurs. So you need to exclude any names in params when reconstructing the data.
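For instance, here is a hypothetical illustration (not part of the vignette's examples) using the pigs dataset that ships with emmeans; the knots vector lives in the user's workspace rather than in the data, so its name must be passed via params:

```r
## 'knots' is not a predictor, just a constant referenced in the model formula
knots <- c(11, 14)
mod <- lm(log(conc) ~ source + splines::ns(percent, knots = knots), data = pigs)

# ref_grid(mod)                    # would likely fail: 'knots' is not the length of the data
ref_grid(mod, params = "knots")    # works: 'knots' is excluded from the recovered data
```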
Many model objects contain the model frame as a slot; for example, a model fitted with lm(..., model = TRUE) has a member $model containing the model frame. This can be useful for recovering the data, provided none of the predictors are transformed (when predictors are transformed, the original predictor values are not in the model frame, so it's harder to recover them). Therefore, when the model frame is available in the model object, it should be provided in the frame argument of recover_data.call(); then when data = NULL, a check is made on trms, and if it has no function calls, data is set to frame. Of course, in the rarer case where the original data are available in the model object, specify that as data.
If you check for any error conditions in recover_data, simply have it return a character string with the desired message, rather than invoking stop. This provides a cleaner exit. The reason is that whenever recover_data throws an error, an informative message suggesting that data or params be provided is displayed. But a character return value is tested for and throws a different error with your string as the message.
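As a hypothetical sketch (the class mymod is made up), such a check might look like this:

```r
recover_data.mymod <- function(object, ...) {
    if (is.null(object$call))      # error condition: return a message string, not stop()
        return("Cannot recover the data: the model object has no 'call' component")
    emmeans::recover_data(object$call, delete.response(terms(object)),
                          object$na.action, ...)
}
```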
#### emm_basis method

The emm_basis method has four required arguments:
```r
args(emmeans:::emm_basis.lm)
## function (object, trms, xlev, grid, ...) 
## NULL
```
These are, respectively, the model object, its terms component (at least for the right-hand side of the model), a list of levels of the factors, and the grid of predictor combinations that specify the reference grid.
The function must obtain six things and return them in a named list. They are: the matrix X of linear functions for each point in the reference grid; the regression coefficients bhat; the variance-covariance matrix V; a matrix nbasis for non-estimable functions; a function dffun(k, dfargs) for computing degrees of freedom for the linear function sum(k * bhat); and a list dfargs of arguments to pass to dffun. Optionally, the returned list may include a model.matrix element (the model matrix for the data or a compact version thereof obtained via .cmpMM()), which, if included, enables the submodel option.
To write your own emm_basis function, examining some of the existing methods can help; but the best resource is the predict method for the object in question, looking carefully to see what it does to predict values for a new set of predictors (e.g., newdata in predict.lm). Following this advice, let's take a look at it:
```r
MASS:::predict.lqs
## function (object, newdata, na.action = na.pass, ...) 
## {
##     if (missing(newdata)) 
##         return(fitted(object))
##     Terms <- delete.response(terms(object))
##     m <- model.frame(Terms, newdata, na.action = na.action, xlev = object$xlevels)
##     if (!is.null(cl <- attr(Terms, "dataClasses"))) 
##         .checkMFClasses(cl, m)
##     X <- model.matrix(Terms, m, contrasts.arg = object$contrasts)
##     drop(X %*% object$coefficients)
## }
## <bytecode: 0x7fb823f672b0>
## <environment: namespace:MASS>
```
Based on this, here is a listing of an emm_basis method for lqs objects:
```r
emm_basis.lqs = function(object, trms, xlev, grid, ...) {
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    bhat = coef(object)
    Xmat = model.matrix(trms, data = object$model)                 #  5
    V = rev(object$scale)[1]^2 * solve(t(Xmat) %*% Xmat)
    nbasis = matrix(NA)
    dfargs = list(df = nrow(Xmat) - ncol(Xmat))
    dffun = function(k, dfargs) dfargs$df
    list(X = X, bhat = bhat, nbasis = nbasis, V = V,               # 10
         dffun = dffun, dfargs = dfargs)
}
```
Before explaining it, let's verify that it works:
```r
emmeans(fake.lts, ~ B | A)
## A = a1:
##  B  emmean    SE df lower.CL upper.CL
##  b1   11.9 0.238 24     11.5     12.4
##  b2   23.2 0.238 24     22.7     23.7
##  b3   17.8 0.238 24     17.4     18.3
## 
## A = a2:
##  B  emmean    SE df lower.CL upper.CL
##  b1   14.3 0.238 24     13.8     14.8
##  b2   24.0 0.238 24     23.5     24.5
##  b3   20.8 0.238 24     20.3     21.3
## 
## Confidence level used: 0.95
```
Hooray! Note the results are comparable to those we had for fake.rlm, albeit the standard errors are quite a bit smaller. (In fact, the SEs could be misleading; a better method for estimating covariances should probably be implemented, but that is beyond the scope of this vignette.)
#### emm_basis.lqs

Let's go through the listing of this method, line by line:
Lines 2–3: Construct the linear functions, X. This is a pretty standard two-step process: First obtain a model frame, m, for the grid of predictors, then pass it as data to model.matrix to create the associated design matrix. As promised, this code is essentially identical to what you find in predict.lqs.
Line 4: Obtain the coefficients, bhat. Most model objects have a coef method.
Lines 5–6: Obtain the covariance matrix, V, of bhat. In many models, this can be obtained using the object's vcov method. But not in this case. Instead, I cobbled one together using the inverse of the X'X matrix as in ordinary regression, and the variance estimate found in the last element of the scale element of the object. This probably under-estimates the variances and distorts the covariances, because robust estimators have some efficiency loss.
Line 7: Compute the basis for non-estimable functions. This applies only when there is a possibility of rank deficiency in the model. But lqs methods don't allow rank deficiencies, so if we have fitted such a model, we can be sure that all linear functions are estimable; we signal that by setting nbasis equal to a 1 x 1 matrix of NA. If rank deficiency were possible, the estimability package (which is required by emmeans) provides a nonest.basis function that makes this fairly painless—I would have coded nbasis = estimability::nonest.basis(Xmat).
There are some subtleties you need to know regarding estimability. Suppose the model is rank-deficient, so that the design matrix X has p columns but rank r < p. In that case, bhat should be of length p (not r), and there should be p - r elements equal to NA, corresponding to columns of X that were excluded from the fit. Also, X should have all p columns. In other words, do not alter or throw out columns of X or their corresponding elements of bhat—even those with NA coefficients—as they are essential for assessing estimability. V should be r x r, however—the covariance matrix for the non-excluded predictors. (A short sketch illustrating this appears at the end of this walkthrough.)
Lines 8–9: Obtain dffun and dfargs. This is a little awkward because it is designed to allow support for mixed models, where approximate methods may be used to obtain degrees of freedom. The function dffun is expected to have two arguments: k, the vector of coefficients of bhat, and dfargs, a list containing any additional arguments. In this case (and in many other models), the degrees of freedom are the same regardless of k. We put the required degrees of freedom in dfargs and write dffun so that it simply returns that value. (Note: If asymptotic tests and CIs are desired, return Inf degrees of freedom.)
Line 10: Return these results in a named list.
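To make the estimability discussion above concrete, here is a minimal, hypothetical sketch of those pieces for a possibly rank-deficient lm-like fit, where trms and m are as in the listing above (this is an illustration under those assumptions, not code from the vignette):

```r
bhat <- coef(object)     # length p; aliased coefficients appear as NA -- keep them
X <- model.matrix(trms, m, contrasts.arg = object$contrasts)   # keep all p columns
V <- vcov(object)        # r x r: covers only the non-NA coefficients
nbasis <- if (any(is.na(bhat)))
    estimability::nonest.basis(model.matrix(object))   # basis for non-estimable functions
else
    estimability::all.estble                            # signals that everything is estimable
```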
If you need to pass information obtained in recover_data() to the emm_basis() method, simply incorporate it as attr(data, "misc") where data is the dataset returned by recover_data(). Subsequently, that attribute is available in emm_basis() by adding a misc argument.
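Here is a hypothetical sketch of that pattern (the class mymod and the element special are made up for illustration):

```r
recover_data.mymod <- function(object, ...) {
    data <- emmeans::recover_data(object$call, delete.response(terms(object)),
                                  object$na.action, ...)
    attr(data, "misc") <- list(special = object$special.info)   # stash it for later
    data
}

emm_basis.mymod <- function(object, trms, xlev, grid, misc, ...) {
    special <- misc$special      # retrieve what recover_data() stashed
    # ... construct X, bhat, V, nbasis, dffun, dfargs and return them in a named list ...
}
```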
Most linear models supported by emmeans have straightforward structure: regression coefficients, their covariance matrix, and a set of linear functions that define the reference grid. However, a few are more complex. An example is the clm class in the ordinal package, which allows a scale model in addition to the location model. When a scale model is used, the scale parameters are included in the model matrix, regression coefficients, and covariance matrix, and we can't just use the usual matrix operations to obtain estimates and standard errors. To facilitate using custom routines for these tasks, the emm_basis.clm function provided in emmeans includes, in its misc part, the names (as character constants) of two "hook" functions: misc$estHook has the name of the function to call when computing estimates, standard errors, and degrees of freedom (for the summary method); and misc$vcovHook has the name of the function to call to obtain the covariance matrix of the grid values (used by the vcov method). These functions are called in lieu of the usual built-in routines for these purposes, and return the appropriately sized matrices.
In addition, you may want to apply some form of special post-processing after the reference grid is constructed. To provide for this, give the name of your function to post-process the object in misc$postGridHook. Again, clm objects (as well as polr in the MASS package) serve as an example. They allow a mode specification that, in two cases, calls for post-processing. The "cum.prob" mode uses the regrid function to transform the linear predictor to the cumulative-probability scale. And the "prob" mode performs this, as well as applying the contrasts necessary to convert the cumulative probabilities into the class probabilities.
Sometimes your emm_basis method may essentially create a re-gridded basis, where X and bhat are not actually a model matrix and regression coefficients, but instead, X is the identity, bhat comprises the predictions at each grid point, and V is the covariance matrix of those predictions. In those cases, we recommend also setting misc$regrid.flag = TRUE. Currently, this flag is used only for checking whether the nuisance argument can be used in ref_grid(), and it is not absolutely necessary because we also check to see if X is the identity. But it provides a more efficient and reliable check. The code for nuisance factors relies on the structure of model matrices where columns are associated with model terms. So it is not possible to process nuisance factors with a re-gridded basis.
For package developers' convenience, emmeans exports some of its S3 methods for recover_data and/or emm_basis—use methods("recover_data") and methods("emm_basis") to discover which ones. It may be that all you need is to invoke one of those methods and perhaps make some small changes—especially if your model-fitting algorithm makes heavy use of an existing model type supported by emmeans.
A few additional functions are exported because they may be useful to developers. They are as follows:
emmeans::.all.vars(expr, retain) — Some users of your package may include $ or [[]] operators in their model formulas. If you need to get the variable names, base::all.vars will probably not give you what you need. For example, if form = ~ data$x + data[[5]], then base::all.vars(form) returns the names "data" and "x", whereas emmeans::.all.vars(form) returns the names "data$x" and "data[[5]]". The retain argument may be used to specify regular expressions for patterns to retain as parts of variable names.
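To mirror that example in runnable form (data here is just a name appearing in the formula; no actual data frame is needed):

```r
form <- ~ data$x + data[[5]]
base::all.vars(form)        # "data" "x"
emmeans::.all.vars(form)    # "data$x"  "data[[5]]"
```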
emmeans::.diag(x, nrow, ncol) — The base diag function has a booby trap whereby, for example, diag(57.6) returns a 57 x 57 identity matrix rather than a 1 x 1 matrix with 57.6 as its only element. But emmeans::.diag(57.6) will return the latter. The function works identically to diag except for its end run around the identity-matrix trap.
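For example:

```r
diag(57.6)             # a 57 x 57 identity matrix -- the booby trap described above
emmeans::.diag(57.6)   # a 1 x 1 matrix whose only element is 57.6
```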
emmeans::.aovlist.dffun(k, dfargs) — This function is exported because it is needed for computing degrees of freedom for models fitted using aov, but it may be useful for other cases where Satterthwaite degrees-of-freedom calculations are needed. It requires the dfargs slot to contain analogous contents.
emmeans::.get.offset(terms, grid) — If terms is a model formula containing an offset call, this will compute that offset in the context of grid (a data.frame).
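A small hypothetical illustration (the formula and grid are made up; the result should be the offset evaluated on the grid):

```r
trms <- terms(y ~ treat + offset(log(n)))
grid <- data.frame(treat = c("a", "b"), n = c(10, 100))
emmeans::.get.offset(trms, grid)    # presumably log(c(10, 100))
```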
emmeans::.my.vcov(object, ...) — In a call to ref_grid, emmeans, etc., the user may use vcov. to specify an alternative function or matrix to use as the covariance matrix of the fixed-effects coefficients. This function supports that feature. Calling .my.vcov in place of the vcov method will substitute the user's vcov. when it is specified.
emmeans::.std.link.labels(fam, misc) — This is useful in emm_basis methods for generalized linear models. Call it with fam equal to the family object for your model, and misc either an existing list, or just list() if none. It returns a new misc list containing the link function and, in some cases, extra features that are used for certain types of link functions (e.g., for a log link, the setups for returning ratio comparisons with type = "response").
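For instance, inside an emm_basis method for a GLM-like fit, one might write something along these lines (a sketch, assuming object has a family method):

```r
misc <- emmeans::.std.link.labels(family(object), list())
# ... then include this as the 'misc' element of the list returned by emm_basis
```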
emmeans::.num.key(levs, key) — Returns integer indices of elements of key in levs when key is a character vector; or just returns integer values if already integer. Also throws an error if levels are mismatched or indices exceed the legal range. This is useful in custom contrast functions (.emmc functions).
emmeans::.get.excl(levs, exclude, include) — This is support for the exclude and include arguments of contrast functions. It checks legality and returns an integer vector of exclude indices in levs, given specified integer or character arguments exclude and include. In your .emmc function, exclude should default to integer(0) and include should have no default.
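Here is a hypothetical skeleton of a custom contrast function following those conventions; it compares each retained level with the first retained level (the function name and contrast scheme are made up):

```r
mydiff.emmc <- function(levs, exclude = integer(0), include, ...) {
    exclude <- emmeans::.get.excl(levs, exclude, include)   # validated indices to exclude
    use <- setdiff(seq_along(levs), exclude)                # levels that remain in play
    M <- data.frame(row.names = levs)
    for (i in use[-1]) {
        con <- rep(0, length(levs))
        con[use[1]] <- -1
        con[i] <- 1
        M[[paste(levs[i], "-", levs[use[1]])]] <- con
    }
    M
}
```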
emmeans::.cmpMM(X, weights, assign) — Creates a compact version of the model matrix X (or, preferably, its QR decomposition). This is useful if we want an emm_basis() method to return a model.matrix element. The returned result is just the R portion of the QR decomposition of diag(sqrt(weights)) %*% X, with the assign attribute added. If X is a qr object, we assume the weights are already incorporated, as is true of the qr slot of a lm object.
### rsm objects

As a nontrivial example of how an existing package supports emmeans, we show the support offered by the rsm package. Its rsm function returns an rsm object which is an extension of the lm class. Part of that extension has to do with coded.data structures whereby, as is typical in response-surface analysis, models are fitted to variables that have been linearly transformed (coded) so that the scope of each predictor is represented by plus or minus 1 on the coded scale.
Without any extra support in rsm, emmeans will work just fine with rsm objects; but if the data are coded, it becomes awkward to present results in terms of the original predictors on their original, uncoded scale. The emmeans-related methods in rsm provide a mode argument that may be used to specify whether we want to work with coded or uncoded data. The possible values for mode are "asis" (ignore any codings, if present), "coded" (use the coded scale), and "decoded" (use the decoded scale). The first two are actually the same in that no decoding is done; but it seems clearer to provide separate options because they represent two different situations.
#### recover_data method

Note that coding is a predictor transformation, not a response transformation (we could have that, too, as it's already supported by the emmeans infrastructure). So, to handle the "decoded" mode, we will need to actually decode the predictors used to construct the reference grid. That means we need to make recover_data a lot fancier! Here it is:
```r
recover_data.rsm = function(object, data, mode = c("asis", "coded", "decoded"), ...) {
    mode = match.arg(mode)
    cod = rsm::codings(object)
    fcall = object$call
    if(is.null(data))                                                #  5
        data = emmeans::recover_data(fcall, delete.response(terms(object)), 
                                     object$na.action, weights = weights(object), ...)
    if (!is.null(cod) && (mode == "decoded")) {
        pred = cpred = attr(data, "predictors")
        trms = attr(data, "terms")                                   # 10
        data = rsm::decode.data(rsm::as.coded.data(data, formulas = cod))
        for (form in cod) {
            vn = all.vars(form)
            if (!is.na(idx <- grep(vn[1], pred))) {
                pred[idx] = vn[2]                                    # 15
                cpred = setdiff(cpred, vn[1])
            }
        }
        attr(data, "predictors") = pred
        new.trms = update(trms, reformulate(c("1", cpred)))          # 20
        attr(new.trms, "orig") = trms
        attr(data, "terms") = new.trms
        attr(data, "misc") = cod
    }
    data
}
```
Lines 2–7 ensure that mode is legal, retrieve the codings from the object, and obtain the results we would get from recover_data had it been an lm object. If mode is not "decoded", or if no codings were used, that's all we need. Otherwise, we need to return the decoded data. However, it isn't quite that simple, because the model equation is still defined on the coded scale. Rather than try to translate the model coefficients and covariance matrix to the decoded scale, we elected to remember what we will need to do later to put things back on the coded scale. In lines 9–10, we retrieve the attributes of the recovered data that provide the predictor names and terms object on the coded scale. In line 11, we replace the recovered data with the decoded data.
By the way, the codings comprise a list of formulas with the coded name on the left and the original variable name on the right. It is possible that only some of the predictors are coded (for example, blocking factors will not be). In the for loop in lines 12–18, the coded predictor names are replaced with their decoded names. For technical reasons to be discussed later, we also remove these coded predictor names from a copy, cpred, of the list of all predictors in the coded model. In line 19, the "predictors" attribute of data is replaced with the modified version.
Now, there is a nasty technicality. The ref_grid function in emmeans has a few lines of code after recover_data is called that determine if any terms in the model convert covariates to factors or vice versa; and this code uses the model formula. That formula involves variables on the coded scale, and those variables are no longer present in the data, so an error will occur if it tries to access them. Luckily, if we simply take those terms out of the formula, it won't hurt because those coded predictors would not have been converted in that way. So in line 20, we update trms with a simpler model with the coded variables excluded (the intercept is explicitly included to ensure there will be a right-hand side even if cpred is empty). We save that as the terms attribute, and the original terms as a new "orig" attribute to be retrieved later. The data object, modified or not, is returned. If data have been decoded, ref_grid will construct its grid using decoded variables.
In line 23, we save the codings as the "misc" attribute, to be accessed later by emm_basis().
#### emm_basis method

Now comes the emm_basis method that will be called after the grid is defined. It is listed below:
```r
emm_basis.rsm = function(object, trms, xlev, grid, 
                         mode = c("asis", "coded", "decoded"), misc, ...) {
    mode = match.arg(mode)
    cod = misc
    if(!is.null(cod) && mode == "decoded") {                       #  5
        grid = rsm::coded.data(grid, formulas = cod)
        trms = attr(trms, "orig")
    }

    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)  # 10
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    bhat = as.numeric(object$coefficients)
    V = emmeans::.my.vcov(object, ...)

    if (sum(is.na(bhat)) > 0)                                      # 15
        nbasis = estimability::nonest.basis(object$qr)
    else
        nbasis = estimability::all.estble
    dfargs = list(df = object$df.residual)
    dffun = function(k, dfargs) dfargs$df                          # 20

    list(X = X, bhat = bhat, nbasis = nbasis, V = V, 
         dffun = dffun, dfargs = dfargs, misc = list())
}
```
This is much simpler. The coding formulas are obtained from misc (line 4) so that we don't have to re-obtain them from the object. All we have to do is determine if decoding was done (line 5); and, if so, convert the grid back to the coded scale (line 6) and recover the original terms attribute (line 7). The rest is borrowed directly from the emm_basis.lm method in emmeans. Note that line 13 uses one of the exported functions we described in the preceding section. Lines 15–18 use functions from the estimability package to handle the possibility that the model is rank-deficient.
Here's a demonstration of this rsm support. The standard example for rsm fits a second-order model CR.rs2 to a dataset organized in two blocks and with two coded predictors.
library("rsm")example("rsm") ### (output is not shown) ###First, let’s look at some results on the coded scale—which are thesame as for an ordinarylm object.
```r
emmeans(CR.rs2, ~ x1 * x2, mode = "coded", 
        at = list(x1 = c(-1, 0, 1), x2 = c(-2, 2)))
##  x1 x2 emmean    SE df lower.CL upper.CL
##  -1 -2   75.0 0.298  7     74.3     75.7
##   0 -2   77.0 0.240  7     76.4     77.5
##   1 -2   76.4 0.298  7     75.6     77.1
##  -1  2   76.8 0.298  7     76.1     77.5
##   0  2   79.3 0.240  7     78.7     79.9
##   1  2   79.2 0.298  7     78.5     79.9
## 
## Results are averaged over the levels of: Block 
## Confidence level used: 0.95
```
Now, the coded variables x1 and x2 are derived from these coding formulas for predictors Time and Temp:
```r
codings(CR.rs1)
## $x1
## x1 ~ (Time - 85)/5
## 
## $x2
## x2 ~ (Temp - 175)/5
```
Thus, for example, a coded value of x1 = 1 corresponds to a time of 85 + 1 x 5 = 90. Here are some results working with decoded predictors. Note that the at list must now be given in terms of Time and Temp:
```r
emmeans(CR.rs2, ~ Time * Temp, mode = "decoded", 
        at = list(Time = c(80, 85, 90), Temp = c(165, 185)))
##  Time Temp emmean    SE df lower.CL upper.CL
##    80  165   75.0 0.298  7     74.3     75.7
##    85  165   77.0 0.240  7     76.4     77.5
##    90  165   76.4 0.298  7     75.6     77.1
##    80  185   76.8 0.298  7     76.1     77.5
##    85  185   79.3 0.240  7     78.7     79.9
##    90  185   79.2 0.298  7     78.5     79.9
## 
## Results are averaged over the levels of: Block 
## Confidence level used: 0.95
```
Since the supplied settings are the same on the decoded scale as were used on the coded scale, the EMMs are identical to those in the previous output.
The emmeans package has internal support for a number of model classes. When recover_data() and emm_basis() are dispatched, a search is made for external methods for a given class; and if found, those methods are used instead of the internal ones. However, certain restrictions apply when you aim to override an existing internal method:
- The class you provide methods for must appear in the first or second position of class(object). That is, you may have a base class for which you provide recover_data() and emm_basis() methods, and those will also work for direct descendants thereof; but any class in third place or later in the inheritance is ignored.
- Internal methods for certain basic classes such as "lm", "glm", etc., may not be overridden.
- If there are no existing internal methods for the class(es) you provide methods for, there are no restrictions on them.
To make your methods available to users of your package, the methods must be exported. R and CRAN are evolving in a way that having S3 methods in the registry is increasingly important; so it is a good idea to provide for that. The problem is that not all of your package users will have emmeans installed.
Thus, registering the methods must be done conditionally. We provide a courtesy function .emm_register() to make this simple. Suppose that your package offers two model classes foo and bar, and it includes the corresponding functions recover_data.foo, recover_data.bar, emm_basis.foo, and emm_basis.bar. Then to register these methods, add or modify the .onLoad function in your package (traditionally saved in the source file zzz.R):
```r
.onLoad <- function(libname, pkgname) {
    if (requireNamespace("emmeans", quietly = TRUE))
        emmeans::.emm_register(c("foo", "bar"), pkgname)
}
```
You should also add emmeans (>= 1.4) and estimability (which is required by emmeans) to the Suggests field of your DESCRIPTION file.
When registering a qdrg method, do the same as shown above, but add the argument qdrg = TRUE to the .emm_register() call, and in Suggests, use emmeans (>= 1.12) as well as estimability.
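Following that description, a conditional registration for the hypothetical classes foo and bar that also have qdrg methods might look like this (a sketch based on the paragraph above):

```r
.onLoad <- function(libname, pkgname) {
    if (requireNamespace("emmeans", quietly = TRUE))
        emmeans::.emm_register(c("foo", "bar"), pkgname, qdrg = TRUE)
}
```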
It is relatively simple to write appropriate methods that work with emmeans for model objects it does not support. I hope this vignette is helpful for understanding how. Furthermore, if you are the developer of a package that fits linear models, I encourage you to include recover_data and emm_basis methods for those classes of objects, so that users have access to emmeans support.