| Type: | Package |
| Title: | Targeted Inference |
| Version: | 0.7 |
| Date: | 2025-12-09 |
| Author: | Klaus K. Holst [aut, cre], Benedikt Sommer [aut], Andreas Nordland [aut], Christian B. Pipper [ctb] |
| Maintainer: | Klaus K. Holst <klaus@holst.it> |
| Description: | Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), assumption lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>). |
| Depends: | R (≥ 4.1) |
| Imports: | R6, Rcpp (≥ 1.0.0), abind, cli, future.apply, lava (≥ 1.8.2), methods, mets (≥ 1.3.8), quadprog, progressr, rlang, survival |
| Suggests: | SuperLearner (≥ 2.0-28), MASS, cmprsk, data.table, e1071, earth, glmnet, grf, hal9001, mgcv, nnls, optimx, polle (≥ 1.5), pracma, quarto, randomForestSRC, ranger, riskRegression, scatterplot3d, tinytest, viridisLite, xgboost (≥ 3.1.2.1) |
| BugReports: | https://github.com/kkholst/targeted/issues |
| URL: | https://kkholst.github.io/targeted/ |
| License: | Apache License (== 2.0) |
| LinkingTo: | Rcpp, RcppArmadillo |
| LazyLoad: | yes |
| NeedsCompilation: | yes |
| ByteCompile: | yes |
| RcppModules: | riskregmodel |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| VignetteBuilder: | quarto |
| SystemRequirements: | Quarto command line tools (https://github.com/quarto-dev/quarto-cli). |
| Packaged: | 2025-12-09 17:27:41 UTC; kkzh |
| Repository: | CRAN |
| Date/Publication: | 2025-12-10 06:20:08 UTC |
targeted: Targeted Inference
Description

Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), assumption lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).
Author(s)
Maintainer: Klaus K. Holst <klaus@holst.it>
Authors:
Benedikt Sommer <benediktsommer92@gmail.com>
Andreas Nordland <andreasnordland@gmail.com>
Other contributors:
Christian B. Pipper <cbpipper@gmail.com> [contributor]
References
Bang & Robins (2005) Doubly Robust Estimation in Missing Data and Causal Inference Models, Biometrics.
Vansteelandt & Dukes (2022) Assumption-lean Inference for Generalised Linear Model Parameters, Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Thomas S. Richardson, James M. Robins & Linbo Wang (2017) On Modeling and Estimation for the Relative Risk and Risk Difference, Journal of the American Statistical Association.
Mark J. van der Laan (2006) Statistical Inference for Variable Importance, The International Journal of Biostatistics.
See Also
Useful links:
https://kkholst.github.io/targeted/
Report bugs at https://github.com/kkholst/targeted/issues
Examples
## Not run:
example(riskreg)
example(cate)
example(ate)
example(calibration)
## End(Not run)
ML model
Description
Wrapper for ml_model
Usage
ML(formula, model = "glm", ...)
Arguments
formula | formula |
model | model (sl, rf, pf, glm, ...) |
... | additional arguments to model object |
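A minimal usage sketch (hypothetical; ML() is a thin wrapper for the deprecated ml_model constructor, so new code should prefer the learner_* constructors):
lr <- ML(Sepal.Length ~ Sepal.Width + Species, model = "glm") # sketch: glm wrapper
lr$estimate(iris) # fit the model
head(lr$predict(iris)) # predictions on the training data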
Responder Average Treatment Effect
Description
Estimation of the Average Treatment Effect among Responders
Usage
RATE(
  response,
  post.treatment,
  treatment,
  data,
  family = gaussian(),
  M = 5,
  pr.treatment,
  treatment.level,
  SL.args.response = list(family = gaussian(), SL.library = c("SL.mean", "SL.glm")),
  SL.args.post.treatment = list(family = binomial(), SL.library = c("SL.mean", "SL.glm")),
  preprocess = NULL,
  efficient = TRUE,
  ...
)
Arguments
response | Response formula (e.g., Y ~ D*A) |
post.treatment | Post treatment marker formula (e.g., D ~ W) |
treatment | Treatment formula (e.g., A ~ 1) |
data | data.frame |
family | Exponential family for response (default gaussian) |
M | Number of folds in cross-fitting (M=1 is no cross-fitting) |
pr.treatment | (optional) Randomization probability of treatment. |
treatment.level | Treatment level in binary treatment (default 1) |
SL.args.response | Arguments to SuperLearner for the response model |
SL.args.post.treatment | Arguments to SuperLearner for the post-treatment indicator |
preprocess | (optional) Data preprocessing function |
efficient | If TRUE, the estimate will be efficient. If FALSE, the estimate will be a simple plug-in estimate. |
... | Additional arguments to lower level functions |
Value
estimate object
Author(s)
Andreas Nordland, Klaus K. Holst
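No example is given for RATE in this manual; the following is a hedged sketch on simulated data (variable names y, d, a, w are illustrative, and the SuperLearner package is assumed to be installed):
n <- 500
w <- rnorm(n) # baseline covariate
a <- rbinom(n, 1, 0.5) # randomized treatment
d <- rbinom(n, 1, plogis(w)) # post-treatment response marker
y <- d * a + w + rnorm(n) # clinical outcome
dat <- data.frame(y, d, a, w)
RATE(response = y ~ d * a, post.treatment = d ~ w, treatment = a ~ 1,
  data = dat, M = 2, pr.treatment = 0.5)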
Responder Average Treatment Effect
Description
Estimation of the Average Treatment Effect among Responders for Survival Outcomes
Usage
RATE.surv(
  response,
  post.treatment,
  treatment,
  censoring,
  tau,
  data,
  M = 5,
  pr.treatment,
  call.response,
  args.response = list(),
  SL.args.post.treatment = list(family = binomial(), SL.library = c("SL.mean", "SL.glm")),
  call.censoring,
  args.censoring = list(),
  preprocess = NULL,
  ...
)
Arguments
response | Response formula (e.g., Surv(time, event) ~ D + W). |
post.treatment | Post treatment marker formula (e.g., D ~ W) |
treatment | Treatment formula (e.g., A ~ 1) |
censoring | Censoring formula (e.g., Surv(time, event == 0) ~ D + A + W). |
tau | Time-point of interest, see Details. |
data | data.frame |
M | Number of folds in cross-fitting (M=1 is no cross-fitting) |
pr.treatment | (optional) Randomization probability of treatment. |
call.response | Model call for the response model (e.g. "mets::phreg"). |
args.response | Additional arguments to the response model. |
SL.args.post.treatment | Arguments to SuperLearner for the post-treatment indicator |
call.censoring | Similar to call.response. |
args.censoring | Similar to args.response. |
preprocess | (optional) Data preprocessing function |
... | Additional arguments to lower level functions |
Details
Estimation of
\frac{P(T \leq \tau \mid A=1) - P(T \leq \tau \mid A=0)}{E[D \mid A=1]}
under right censoring based on plug-in estimates of P(T \leq \tau \mid A=a) and E[D \mid A=1].
An efficient one-step estimator of P(T \leq \tau \mid A=a) is constructed using the efficient influence function
\frac{I\{A=a\}}{P(A = a)} \Big(\frac{\Delta}{S^c_{0}(\tilde T \mid X)} I\{\tilde T \leq \tau\} + \int_0^\tau \frac{S_0(u \mid X)-S_0(\tau \mid X)}{S_0(u \mid X) S^c_0(u \mid X)} \, dM^c_0(u \mid X)\Big) + \Big(1 - \frac{I\{A=a\}}{P(A = a)}\Big) F_0(\tau \mid A=a, W) - P(T \leq \tau \mid A=a).
An efficient one-step estimator of E[D \mid A=1] is constructed using the efficient influence function
\frac{A}{P(A = 1)}\left(D - E[D \mid A=1, W]\right) + E[D \mid A=1, W] - E[D \mid A=1].
Value
estimate object
Author(s)
Andreas Nordland, Klaus K. Holst
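No example is given for RATE.surv; a hedged sketch on simulated right-censored data (all variable names are illustrative; mets is assumed installed for the phreg response and censoring models):
library(survival)
n <- 500
w <- rnorm(n)
a <- rbinom(n, 1, 0.5)
d <- rbinom(n, 1, plogis(w))
t0 <- rexp(n, rate = exp(-1 + 0.5 * a * d)) # event time
cens <- rexp(n, rate = 0.2) # censoring time
time <- pmin(t0, cens)
status <- as.numeric(t0 <= cens)
dat <- data.frame(time, status, a, d, w)
RATE.surv(response = Surv(time, status) ~ d + w,
  post.treatment = d ~ w, treatment = a ~ 1,
  censoring = Surv(time, status == 0) ~ d + a + w,
  tau = 2, data = dat, M = 2, pr.treatment = 0.5,
  call.response = "mets::phreg", call.censoring = "mets::phreg")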
SuperLearner wrapper for learner
Description
SuperLearner wrapper for learner
Usage
SL(
  formula = ~.,
  ...,
  SL.library = c("SL.mean", "SL.glm"),
  binomial = FALSE,
  data = NULL,
  info = "SuperLearner"
)
Arguments
formula | Model design |
... | Additional arguments for SuperLearner::SuperLearner |
SL.library | character vector of prediction algorithms |
binomial | boolean specifying binomial or gaussian family (default FALSE) |
data | Optional data.frame |
info | model information (optional) |
Value
learner object
Author(s)
Klaus Kähler Holst
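A brief sketch (assuming the SuperLearner package is installed):
lr <- SL(Sepal.Length ~ Sepal.Width + Petal.Length,
  SL.library = c("SL.mean", "SL.glm"))
lr$estimate(iris) # fit the ensemble
head(lr$predict(iris)) # ensemble predictions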
AIPW estimator
Description
AIPW for the mean (and linear projections of the EIF) with missing observations
Usage
aipw(response_model, propensity_model, formula = ~1, data, ...)
Arguments
response_model | Model for the response given covariates (learner or formula) |
propensity_model | Optional missing data mechanism model (propensity model) (learner or formula) |
formula | design specifying the OLS estimator with outcome given by the EIF |
data | data.frame |
... | additional arguments (see |
Examples
m <- lava::lvm(y ~ x+z, r ~ x) |>
  lava::distribution(~ r, value = lava::binomial.lvm()) |>
  transform(y0 ~ r+y, value = \(x) { x[x[,1]==0, 2] <- NA; x[,2] })
d <- lava::sim(m, 1e3, seed=1)
aipw(y0 ~ x, data=d)
Assumption Lean inference for generalized linear model parameters
Description
Assumption lean inference via cross-fitting (Double ML). See <doi:10.1111/rssb.12504>.
Usage
alean(
  response_model,
  exposure_model,
  data,
  link = "identity",
  g_model,
  nfolds = 1,
  silent = FALSE,
  mc.cores,
  ...
)
Arguments
response_model | formula or learner object (formula => glm) |
exposure_model | model for the exposure |
data | data.frame |
link | Link function (g) |
g_model | Model for |
nfolds | Number of folds |
silent | suppress all messages and progress bars |
mc.cores | Optional number of cores. parallel::mcmapply used instead of future |
... | additional arguments to future.apply::future_mapply |
Details
Let Y be the response variable, A the exposure, and W covariates. The target parameter is:
\Psi(P) = \frac{E(\operatorname{Cov}[A, g\{E(Y\mid A,W)\}\mid W])}{E\{\operatorname{Var}(A\mid W)\}}
The response_model is the model for E(Y|A,W), and exposure_model is the model for E(A|W). link specifies g.
Value
alean.targeted object
Author(s)
Klaus Kähler Holst
Examples
sim1 <- function(n, family=gaussian(), ...) {
  m <- lava::lvm() |>
    lava::distribution(~y, value=lava::binomial.lvm()) |>
    lava::regression('a', value=function(l) l) |>
    lava::regression('y', value=function(a,l) a + l)
  if (family$family=="binomial")
    lava::distribution(m, ~a) <- lava::binomial.lvm()
  lava::sim(m, n)
}
library(splines)
f <- binomial()
d <- sim1(1e4, family=f)
e <- alean(
  response_model=learner_glm(y ~ a + bs(l, df=3), family=binomial),
  exposure_model=learner_glm(a ~ bs(l, df=3), family=f),
  data=d,
  link = "logit", mc.cores=1, nfolds=1)
e
e <- alean(response_model=learner_glm(y ~ a + l, family=binomial),
  exposure_model=learner_glm(a ~ l),
  data=d,
  link = "logit", mc.cores=1, nfolds=1)
e
AIPW (doubly-robust) estimator for Average Treatment Effect
Description
Augmented Inverse Probability Weighting estimator for the Average (Causal) Treatment Effect. All nuisance models are here parametric (glm). For a more general approach see the cate implementation. In this implementation the standard errors are correct even when the nuisance models are mis-specified (the influence curve is calculated including the term coming from the parametric nuisance models). The estimate is consistent if either the propensity model or the outcome model / Q-model is correctly specified.
Usage
ate(
  formula,
  data = parent.frame(),
  weights,
  offset,
  family = stats::gaussian(identity),
  nuisance = NULL,
  propensity = nuisance,
  all,
  labels = NULL,
  adjust.nuisance = TRUE,
  adjust.propensity = TRUE,
  ...
)
Arguments
formula | formula (see details below) |
data | data.frame |
weights | optional frequency weights |
offset | optional offset (character or vector). Can also be specified in the formula. |
family | Exponential family argument for outcome model |
nuisance | outcome regression formula (Q-model) |
propensity | propensity model formula |
all | when TRUE all standard errors are calculated (default TRUE when exposure only has two levels) |
labels | optional treatment labels |
adjust.nuisance | adjust for uncertainty due to estimation of outcome regression model parameters |
adjust.propensity | adjust for uncertainty due to estimation of propensity regression model parameters |
... | additional arguments to lower level functions |
Details
The formula may either be specified as: response ~ treatment | nuisance-formula | propensity-formula
For example: ate(y~a | x+z+a | x*z, data=...)
Alternatively, as a list: ate(list(y~a, ~x+z, ~x*z), data=...)
Or using the nuisance (and propensity) arguments: ate(y~a, nuisance=~x+z, ...)
Value
An object of class 'ate.targeted' is returned. See targeted-class for more details about this class and its generic functions.
Author(s)
Klaus K. Holst
See Also
cate
Examples
m <- lava::lvm(y ~ a+x, a~x) |>
  lava::distribution(~y, value = lava::binomial.lvm()) |>
  lava::ordinal(K=4, ~a) |>
  transform(~a, value = factor)
d <- lava::sim(m, 1e3, seed=1)
# (a <- ate(y~a|a*x|x, data=d))
(a <- ate(y~a, nuisance=~a*x, propensity=~x, data = d))
# Comparison with randomized experiment
m0 <- lava::cancel(m, a~x)
lm(y~a-1, lava::sim(m0,2e4))
# Choosing a different contrast for the association measures
summary(a, contrast=c(2,4))
Calibration (training)
Description
Calibration for multi-classification methods
Usage
calibration(
  pr,
  cl,
  weights = NULL,
  threshold = 10,
  method = "bin",
  breaks = nclass.Sturges,
  df = 3,
  ...
)
Arguments
pr | matrix with probabilities for each class |
cl | class variable |
weights | counts |
threshold | do not calibrate if there are fewer than 'threshold' events |
method | either 'isotonic' (pava), 'logistic', 'mspline' (monotone spline), 'bin' (local constant) |
breaks | optional number of bins (only for method 'bin') |
df | degrees of freedom (only for spline methods) |
... | additional arguments to lower level functions |
Details
...
Value
An object of class 'calibration' is returned. See calibration-class for more details about this class and its generic functions.
Author(s)
Klaus K. Holst
Examples
sim1 <- function(n, beta=c(-3, rep(.5,10)), rho=.5) {
  p <- length(beta)-1
  xx <- lava::rmvn0(n, sigma=diag(nrow=p)*(1-rho)+rho)
  y <- rbinom(n, 1, lava::expit(cbind(1,xx)%*%beta))
  d <- data.frame(y=y, xx)
  names(d) <- c("y", paste0("x",1:p))
  return(d)
}
set.seed(1)
beta <- c(-2, rep(1,10))
d <- sim1(1e4, beta=beta)
a1 <- naivebayes(y ~ ., data=d)
a2 <- glm(y ~ ., data=d, family=binomial)
## a3 <- randomForest(factor(y) ~ ., data=d, family=binomial)
d0 <- sim1(1e4, beta=beta)
p1 <- predict(a1, newdata=d0)
p2 <- predict(a2, newdata=d0, type="response")
## p3 <- predict(a3, newdata=d0, type="prob")
c2 <- calibration(p2, d0$y, method="isotonic")
c1 <- calibration(p1, d0$y, breaks=100)
if (interactive()) {
  plot(c1)
  plot(c2, col="red", add=TRUE)
  abline(a=0, b=1)
  with(c1$xy[[1]], points(pred, freq, type="b", col="red"))
}
set.seed(1)
beta <- c(-2, rep(1,10))
dd <- lava::csplit(sim1(1e4, beta=beta), k=3)
mod <- naivebayes(y ~ ., data=dd[[1]])
p1 <- predict(mod, newdata=dd[[2]])
cal <- calibration(p1, dd[[2]]$y)
p2 <- predict(mod, newdata=dd[[3]])
pp <- predict(c1, p2)
cc <- calibration(pp, dd[[3]]$y)
if (interactive()) {
  plot(cal)
  plot(cc, add=TRUE, col="blue")
}
calibration class object
Description
The function calibration returns an object of the class calibration.
An object of class 'calibration' is a list with at least the following components:
- stepfun
estimated step-functions (see stepfun) for each class
- classes
the unique classes
- model
model/method type (string)
- xy
list of data.frame's with predictions (pr) and estimated probabilities of success (only for 'bin' method)
Value
objects of the S3 class 'calibration'
S3 generics
The following S3 generic functions are available for an object of class calibration:
predict: Apply calibration to new data.
plot: Plot the calibration curves (reliability plot).
print: Basic print method.
See Also
Examples
## See example(calibration) for examples
Conditional Average Treatment Effect estimation
Description
Conditional Average Treatment Effect estimation with cross-fitting.
Usage
cate(
  response.model,
  propensity.model,
  cate.model = ~1,
  calibration.model = NULL,
  data,
  contrast,
  nfolds = 1,
  rep = 1,
  silent = FALSE,
  stratify = FALSE,
  mc.cores = NULL,
  rep.type = c("nuisance", "average"),
  var.type = "IC",
  second.order = TRUE,
  response_model = deprecated,
  cate_model = deprecated,
  propensity_model = deprecated,
  treatment = deprecated,
  ...
)
Arguments
response.model | formula or learner object (formula => learner_glm) |
propensity.model | formula or learner object (formula => learner_glm) |
cate.model | formula specifying regression design for conditional average treatment effects |
calibration.model | linear calibration model. Specify covariates in addition to predicted potential outcomes to include in the calibration. |
data | data.frame |
contrast | treatment contrast (default 1 vs 0) |
nfolds | number of folds |
rep | number of replications of cross-fitting procedure |
silent | suppress all messages and progressbars |
stratify | if TRUE the response.model will be stratified by treatment |
mc.cores | (optional) number of cores. parallel::mcmapply used instead of future |
rep.type | repeated cross-fitting applied by averaging nuisance models( |
var.type | when equal to "IC" the asymptotic variance is derived from the influence function. Otherwise, based on expressions in Bannick et al. (2025) valid under different covariate-adaptive randomization schemes (only available for ATE and when |
second.order | add second order term to IF to handle misspecification of outcome models |
response_model | Deprecated. Use response.model instead. |
cate_model | Deprecated. Use cate.model instead. |
propensity_model | Deprecated. Use propensity.model instead. |
treatment | Deprecated. Use cate.model instead. |
... | additional arguments to future.apply::future_mapply |
Details
We have observed data (Y,A,W) where Y is the response variable, A the binary treatment, and W covariates. We further let V be a subset of the covariates. Define the conditional potential mean outcome
\psi_{a}(P)(V) = E_{P}[E_{P}(Y\mid A=a, W)\mid V]
and let m(V;\beta) denote a parametric working model. The target parameter is the minimizer of the mean squared error
\beta(P) = \operatorname{argmin}_{\beta} E_{P}\left[\{\psi_{1}(P)(V)-\psi_{0}(P)(V)\} - m(V; \beta)\right]^{2}
Value
cate.targeted object
Author(s)
Klaus Kähler Holst, Andreas Nordland
References
Mark J. van der Laan (2006) Statistical Inference for Variable Importance, The International Journal of Biostatistics.
Examples
sim1 <- function(n=1000, ...) {
  w1 <- rnorm(n)
  w2 <- rnorm(n)
  a <- rbinom(n, 1, plogis(-1 + w1))
  y <- cos(w1) + w2*a + 0.2*w2^2 + a + rnorm(n)
  data.frame(y, a, w1, w2)
}
d <- sim1(5000)
## ATE
cate(cate.model=~1,
  response.model=y~a*(w1+w2),
  propensity.model=a~w1+w2,
  data=d)
## CATE
cate(cate.model=~1+w2,
  response.model=y~a*(w1+w2),
  propensity.model=a~w1+w2,
  data=d)
## Not run:
## superlearner example
mod1 <- list(
  glm = learner_glm(y~w1+w2),
  gam = learner_gam(y~s(w1) + s(w2)))
s1 <- learner_sl(mod1, nfolds=5)
cate(cate.model=~1,
  response.model=s1,
  propensity.model=learner_glm(a~w1+w2, family=binomial),
  data=d, stratify=TRUE)
## End(Not run)
Conditional Relative Risk estimation
Description
Conditional average treatment effect estimation via Double Machine Learning
Usage
cate_link(
  treatment,
  link = "identity",
  response_model,
  propensity_model,
  importance_model,
  contrast = c(1, 0),
  data,
  nfolds = 5,
  type = "dml1",
  ...
)
Arguments
treatment | formula specifying treatment and variables to condition on |
link | Link function |
response_model | SL object |
propensity_model | SL object |
importance_model | SL object |
contrast | treatment contrast (default 1 vs 0) |
data | data.frame |
nfolds | Number of folds |
type | 'dml1' or 'dml2' |
... | additional arguments to SuperLearner |
Value
cate.targeted object
Author(s)
Klaus Kähler Holst & Andreas Nordland
Examples
# Example 1:
sim1 <- function(n=1e4, seed=NULL, return_model=FALSE, ...){
  suppressPackageStartupMessages(require("lava"))
  if (!is.null(seed)) set.seed(seed)
  m <- lava::lvm()
  distribution(m, ~x) <- gaussian.lvm()
  distribution(m, ~v) <- gaussian.lvm(mean = 10)
  distribution(m, ~a) <- binomial.lvm("logit")
  regression(m, "a") <- function(v, x){.1*v + x}
  distribution(m, "y") <- gaussian.lvm()
  regression(m, "y") <- function(a, v, x){v+x+a*x+a*v*v}
  if (return_model) return(m)
  lava::sim(m, n = n)
}
if (require("SuperLearner", quietly=TRUE)) {
  d <- sim1(n = 1e3, seed = 1)
  e <- cate_link(data=d,
    type = "dml2",
    treatment = a ~ v,
    response_model = y ~ a*(x + v + I(v^2)),
    importance_model = SL(D_ ~ v + I(v^2)),
    nfolds = 10)
  summary(e) # the true parameters are c(1,1)
}
Construct a learner
Description
Construct a learner
Arguments
info | (character) Optional information to describe the instantiated learner object. |
formula | (formula) Formula specifying response and design matrix. |
learner.args | (list) Additional arguments to learner$new(). |
Value
learner object.
cross_validated class object
Description
The function cv returns an object of the type cross_validated.
An object of class 'cross_validated' is a list with at least the following components:
- cv
An array with the model score(s) evaluated for each fold, repetition, and model estimates (see estimate.default)
- names
Names (character vector) of the models
- rep
number of repetitions of the CV
- folds
Number of folds of the CV
Value
objects of the S3 class 'cross_validated'
S3 generics
The following S3 generic functions are available for an object of class cross_validated:
coef: Extract average model scores from the cross-validation procedure.
print: Basic print method.
summary: Summary of the cross-validation procedure.
See Also
Examples
# See example(cv) for examples
Conditional Relative Risk estimation
Description
Conditional Relative Risk estimation via Double Machine Learning
Usage
crr(
  treatment,
  response_model,
  propensity_model,
  importance_model,
  contrast = c(1, 0),
  data,
  nfolds = 5,
  type = "dml1",
  ...
)
Arguments
treatment | formula specifying treatment and variables to condition on |
response_model | SL object |
propensity_model | SL object |
importance_model | SL object |
contrast | treatment contrast (default 1 vs 0) |
data | data.frame |
nfolds | Number of folds |
type | 'dml1' or 'dml2' |
... | additional arguments to SuperLearner |
Value
cate.targeted object
Author(s)
Klaus Kähler Holst & Andreas Nordland
Examples
sim1 <- function(n=1e4, seed=NULL, return_model=FALSE, ...){
  suppressPackageStartupMessages(require("lava"))
  if (!is.null(seed)) set.seed(seed)
  m <- lava::lvm()
  distribution(m, ~x) <- gaussian.lvm()
  distribution(m, ~v) <- gaussian.lvm(mean = 10)
  distribution(m, ~a) <- binomial.lvm("logit")
  regression(m, "a") <- function(v, x){.1*v + x}
  distribution(m, "y") <- gaussian.lvm()
  regression(m, "y") <- function(a, v, x){v+x+a*x+a*v*v}
  if (return_model) return(m)
  lava::sim(m, n = n)
}
d <- sim1(n = 2e3, seed = 1)
if (require("SuperLearner", quietly=TRUE)) {
  e <- crr(data=d,
    type = "dml2",
    treatment = a ~ v,
    response_model = learner_glm(y ~ a*(x + v + I(v^2))),
    importance_model = learner_glm(D_ ~ v + I(v^2)),
    propensity_model = learner_glm(a ~ x + v + I(v^2), family=binomial),
    nfolds = 2)
  summary(e) # the true parameters are c(1,1)
}
Predict the cumulative hazard/survival function for a survival model
Description
Predict the cumulative hazard/survival function for a survival model
Usage
cumhaz(
  object,
  newdata,
  times = NULL,
  individual.time = FALSE,
  extend = FALSE,
  ...
)
Arguments
object | Survival model object: phreg, coxph, rfsrc, ranger |
newdata | data.frame |
times | numeric vector: Time points at which the survival model is evaluated. If NULL, the time points associated with the survival model are used. |
individual.time | logical: If TRUE the survival object is evaluated at different time points for each row in newdata. The number of rows in newdata and the length of times must be the same. |
extend | if TRUE, prints information for all specified 'times', even if there are no subjects left at the end of the specified 'times' (see survival::summary.survfit). |
... | Additional arguments. |
Value
List with elements:
time: numeric vector
chf: cumulative hazard function. If individual.time = FALSE, matrix with dimension (nrow(newdata), length(times)). If individual.time = TRUE, vector of length length(times).
surv: survival function, exp(-chf).
dchf: t(diff(rbind(0, t(chf))))
Author(s)
Klaus K. Holst, Andreas Nordland
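A minimal sketch, assuming a coxph fit (using survival::lung, where status == 2 codes death):
library(survival)
fit <- coxph(Surv(time, status == 2) ~ age + sex, data = lung)
ch <- cumhaz(fit, newdata = head(lung), times = c(100, 300, 500))
dim(ch$chf) # 6 x 3 matrix: subjects by time points
ch$surv # survival probabilities, exp(-chf)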
Cross-validation
Description
Generic cross-validation function
Usage
## Default S3 method:
cv(
  object,
  data,
  response = NULL,
  nfolds = 5,
  rep = 1,
  weights = NULL,
  model.score = scoring,
  seed = NULL,
  shared = NULL,
  args.pred = NULL,
  args.future = list(),
  mc.cores,
  silent = FALSE,
  ...
)
Arguments
object | List of learner objects |
data | data.frame or matrix |
response | Response variable (vector or name of column in |
nfolds | Number of folds (nfolds=0 simple test/train split into two folds 1:([n]/2), ([n]+1/2):n with last part used for testing) |
rep | Number of repetitions (default 1) |
weights | Optional frequency weights |
model.score | Model scoring metric (default: MSE / Brier score). Must be a function with arguments response and prediction, and may optionally include weights, object and newdata arguments |
seed | Random seed (argument passed to future.apply::future_lapply) |
shared | Function applied to each fold with results sent to each model |
args.pred | Optional arguments to prediction function (see details below) |
args.future | Arguments to future.apply::future_mapply |
mc.cores | Optional number of cores. parallel::mcmapply used instead of future |
silent | suppress all messages and progress bars |
... | Additional arguments passed to elements in |
Details
object should be a list of objects of class learner. Alternatively, each element of models should be a list with a fitting function and a prediction function.
The response argument can optionally be a named list where the name is then used as the name of the response argument in models. Similarly, if data is a named list with a single data.frame/matrix then this name will be used as the name of the data/design matrix argument in models.
Value
An object of class 'cross_validated' is returned. See cross_validated-class for more details about this class and its generic functions.
Author(s)
Klaus K. Holst
See Also
Examples
m <- list(learner_glm(Sepal.Length~1),
  learner_glm(Sepal.Length~Species),
  learner_glm(Sepal.Length~Species + Petal.Length))
x <- cv(m, rep=10, data=iris)
x
Cross-validation for learner_sl
Description
Cross-validation estimation of the generalization error of the super learner and each of the separate models in the ensemble. Both the chosen model scoring metrics and the model weights of the stacked ensemble are reported.
Usage
## S3 method for class 'learner_sl'
cv(object, data, nfolds = 5, rep = 1, model.score = scoring, ...)
Arguments
object | (learner_sl) Instantiated learner_sl object. |
data | data.frame or matrix |
nfolds | Number of folds (nfolds=0 simple test/train split into two folds 1:([n]/2), ([n]+1/2):n with last part used for testing) |
rep | Number of repetitions (default 1) |
model.score | Model scoring metric (default: MSE / Brier score). Must be a function with arguments response and prediction, and may optionally include weights, object and newdata arguments |
... | Additional arguments passed to elements in |
Examples
sim1 <- function(n = 5e2) {
  x1 <- rnorm(n, sd = 2)
  x2 <- rnorm(n)
  y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
  data.frame(y, x1, x2)
}
sl <- learner_sl(list(
  "mean" = learner_glm(y ~ 1),
  "glm" = learner_glm(y ~ x1),
  "glm2" = learner_glm(y ~ x1 + x2)
))
cv(sl, data = sim1(), rep = 2)
Cast warning for deprecated function argument names
Description
Cast warning for deprecated function argument names
Usage
deprecate_arg_warn(old, new, fun, vers)
Arguments
old | deprecated argument name |
new | argument that should be used instead |
fun | function name where arguments are deprecated |
vers | version when argument is deprecated |
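A hypothetical sketch of the intended call pattern inside a function that renamed old_arg to new.arg (the reassignment of the argument value is an assumption; deprecate_arg_warn itself only casts the warning):
f <- function(new.arg, old_arg) {
  if (!missing(old_arg)) {
    deprecate_arg_warn("old_arg", "new.arg", fun = "f", vers = "0.6")
    new.arg <- old_arg # assumption: caller reassigns the value
  }
  new.arg
}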
Deprecated argument names
Description
Deprecated argument names
Arguments
response_model | Deprecated. Use response.model instead. |
propensity_model | Deprecated. Use propensity.model instead. |
cate_model | Deprecated. Use cate.model instead. |
treatment | Deprecated. Use cate.model instead. |
Extract design matrix
Description
Extract design matrix from data.frame and formula
Usage
design(
  formula,
  data,
  ...,
  intercept = FALSE,
  response = TRUE,
  rm_envir = FALSE,
  specials = NULL,
  specials.call = NULL,
  levels = NULL,
  design.matrix = TRUE
)
Arguments
formula | formula |
data | data.frame |
... | additional arguments (e.g., specials such as weights, offsets, ...) |
intercept | (logical) If FALSE an intercept is not included in the design matrix |
response | (logical) if FALSE the response variable is dropped |
rm_envir | Remove environment |
specials | character vector specifying functions in the formula that should be marked as special in the terms object |
specials.call | (call) specials optionally defined as a call-type |
levels | a named list of character vectors giving the full set of levels to be assumed for each factor |
design.matrix | (logical) if FALSE then only response and specials are returned. Otherwise, the design matrix is also returned |
Value
An object of class 'design'
Author(s)
Klaus Kähler Holst
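A minimal sketch (the fields accessed below, x and y, are assumptions about the returned 'design' object):
des <- design(Sepal.Length ~ Species + Petal.Length, data = iris)
class(des) # "design"
head(des$x) # design matrix with factor coding applied (assumed field name)
head(des$y) # response vector (assumed field name)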
Estimation of mean clinical outcome truncated by event process
Description
Let Y denote the clinical outcome, A the binary treatment variable, X baseline covariates, T the failure time, and \epsilon=1,2 the cause of failure. The following are our two target parameters:
E(Y \mid T>t, A=1) - E(Y \mid T>t, A=0)
P(T<t, \epsilon=1 \mid A=1) - P(T<t, \epsilon=1 \mid A=0)
Usage
estimate_truncatedscore(
  data,
  mod.y,
  mod.r,
  mod.a,
  mod.event,
  time,
  cause = NULL,
  cens.code = 0,
  naive = FALSE,
  control = list(),
  ...
)
Arguments
data | (data.frame) |
mod.y | (formula or learner) Model for clinical outcome given T>time. Using a formula specifies a glm with an identity link (see example). |
mod.r | (formula or learner) Model for missing data mechanism for clinical outcome at T=time. Using a formula specifies a glm with a log link. |
mod.a | (formula or learner) Treatment model (in RCT should just be 'a ~ 1'). Using a formula specifies a glm with a log link. |
mod.event | (formula) Model for time-to-event process ('Event(time, status) ~ x'). |
time | (numeric) Landmark time. |
cause | (integer) Primary event (in the 'status' variable of the 'Event' statement). |
cens.code | (integer) Censoring code. |
naive | (logical) If TRUE, the unadjusted estimates ignoring baseline covariates are returned as the attribute 'naive'. |
control | (list) optimization routine parameters. |
... | Additional arguments passed to mets::binregATE. |
Value
lava::estimate.default object
Author(s)
Klaus Kähler Holst
Examples
data(truncatedscore)
mod1 <- learner_glm(y ~ a * (x1 + x2))
mod2 <- learner_glm(r ~ a * (x1 + x2), family = binomial)
a <- estimate_truncatedscore(
  data = truncatedscore,
  mod.y = mod1,
  mod.r = mod2,
  mod.a = a ~ 1,
  mod.event = mets::Event(time, status) ~ x1 + x2,
  time = 2
)
s <- summary(a, noninf.t = -0.1)
print(s)
parameter(s)
# the above is equivalent to
# a <- estimate_truncatedscore(
#   data = truncatedscore,
#   mod.y = y ~ a * (x1 + x2),
#   mod.r = r ~ a * (x1 + x2),
#   mod.a = a ~ 1,
#   mod.event = mets::Event(time, status) ~ x1 + x2,
#   time = 2
# )
Create a list from all combinations of input variables
Description
Similar to the expand.grid function, this function creates all combinations of the input arguments but returns the result as a list.
Usage
expand.list(..., INPUT = NULL, envir = NULL)
Arguments
... | input variables |
INPUT | optional list of variables |
envir | environment to evaluate formulas in |
Value
list
Author(s)
Klaus Kähler Holst
Examples
expand.list(x = 2:4, z = c("a", "b"))
Integral approximation of a time dependent function. Computes an approximation of \int_{start}^{stop} S(t) dt, where S(t) is a survival function, for a selection of start and stop time points.
Description
Integral approximation of a time dependent function. Computes an approximation of \int_{start}^{stop} S(t) dt, where S(t) is a survival function, for a selection of start and stop time points.
Usage
int_surv(times, surv, start = 0, stop = max(times), extend = FALSE)
Arguments
times | Numeric vector, sorted time points. |
surv | Numeric vector, values of a survival function evaluated at time points given by |
start | Numeric vector, start of the integral. |
stop | Numeric vector, end of the integral. |
extend | (logical) If TRUE, integral is extended beyond the last observed time point |
Value
Numeric vector, value of the integral.
Author(s)
Andreas Nordland
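A minimal sketch with a closed-form check: for the rate-1 exponential survival curve S(t) = exp(-t), the integral from 0 to 2 is 1 - exp(-2), approximately 0.865:
tt <- seq(0, 5, length.out = 501)
ss <- exp(-tt) # S(t) for a rate-1 exponential
int_surv(tt, ss, start = 0, stop = 2) # should be close to 0.865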
R6 class for prediction models
Description
Interface for statistical and machine learning models to be used for nuisance model estimation in targeted learning.
The following list provides an overview of constructors for many commonly used models.
Regression and classification: learner_glm, learner_gam, learner_grf, learner_hal, learner_glmnet_cv, learner_svm, learner_xgboost, learner_mars
Regression: learner_isoreg
Classification: learner_naivebayes
Ensemble (super learner): learner_sl
Public fields
info: Optional information/name of the model
Active bindings
clear: Remove fitted model from the learner object
fit: Return estimated model object.
formula: Return model formula. Use learner$update() to update the formula.
Methods
Public methods
Method new()
Create a new prediction model object
Usage
learner$new(
  formula = NULL,
  estimate,
  predict = stats::predict,
  predict.args = NULL,
  estimate.args = NULL,
  info = NULL,
  specials = c(),
  formula.keep.specials = FALSE,
  intercept = FALSE
)
Arguments
formula: formula specifying outcome and design matrix
estimate: function for fitting the model. This must be a function with response, 'y', and design matrix, 'x'. Alternatively, a function with a formula and data argument. See the examples section.
predict: prediction function (must be a function of model object, 'object', and new design matrix, 'newdata')
predict.args: optional arguments to prediction function
estimate.args: optional arguments to estimate function
info: optional description of the model
specials: optional specials terms (weights, offset, id, subset, ...) passed on to design
formula.keep.specials: if TRUE then special terms defined by specials will be removed from the formula before it is being passed to the estimate function
intercept: (logical) include intercept in design matrix
Method estimate()
Estimation method
Usage
learner$estimate(data, ..., store = TRUE)
Arguments
data: data.frame
...: Additional arguments to estimation method
store: Logical determining if estimated model should be stored inside the class.
Method predict()
Prediction method
Usage
learner$predict(newdata, ..., object = NULL)
Arguments
newdata: data.frame
...: Additional arguments to prediction method
object: Optional model fit object
Method update()
Update formula
Usage
learner$update(formula)
Arguments
formula: formula or character which defines the new response
Method print()
Print method
Usage
learner$print()
Method summary()
Summary method to provide more extensive information than learner$print().
Usage
learner$summary()
Returns
summarized_learner object, which is a list with the following elements:
- info
description of the learner
- formula
formula specifying outcome and design matrix
- estimate
function for fitting the model
- estimate.args
arguments to estimate function
- predict
function for making predictions from fitted model
- predict.args
arguments to predict function
- specials
provided special terms
- intercept
include intercept in design matrix
Examples
lr <- learner_glm(y ~ x, family = "nb")
lr$summary()
lr_sum <- lr$summary() # store returned summary in new object
names(lr_sum)
print(lr_sum)
Method response()
Extract response from data
Usage
learner$response(data, eval = TRUE, ...)
Arguments
data: data.frame
eval: when FALSE return the untransformed outcome (i.e., return 'a' if formula defined as I(a==1) ~ ...)
...: additional arguments to design
Method design()
Generate design object (design matrix and response) from data
Usage
learner$design(data, ...)
Arguments
data: data.frame
...: additional arguments to design
Method opt()
Get options
Usage
learner$opt(arg)
Arguments
arg: name of option to get value of
Method clone()
The objects of this class are cloneable with this method.
Usage
learner$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
Author(s)
Klaus Kähler Holst, Benedikt Sommer
Examples
data(iris)
rf <- function(formula, ...) {
  learner$new(formula,
    info = "grf::probability_forest",
    estimate = function(x, y, ...) {
      grf::probability_forest(X = x, Y = y, ...)
    },
    predict = function(object, newdata) {
      predict(object, newdata)$predictions
    },
    estimate.args = list(...)
  )
}
args <- expand.list(
  num.trees = c(100, 200), mtry = 1:3,
  formula = c(Species ~ ., Species ~ Sepal.Length + Sepal.Width)
)
models <- lapply(args, function(par) do.call(rf, par))
x <- models[[1]]$clone()
x$estimate(iris)
predict(x, newdata = head(iris))
# Reduce Ex. timing
a <- targeted::cv(models, data = iris)
cbind(coef(a), attr(args, "table"))
# defining learner via function with arguments y (response)
# and x (design matrix)
f1 <- learner$new(
  estimate = function(y, x) lm.fit(x = x, y = y),
  predict = function(object, newdata) newdata %*% object$coefficients
)
# defining the learner via arguments formula and data
f2 <- learner$new(
  estimate = function(formula, data, ...) glm(formula, data, ...)
)
# generic learner defined from function (predict method derived per default
# from stats::predict)
f3 <- learner$new(
  estimate = function(dt, ...) {
    lm(y ~ x, data = dt)
  }
)
## ------------------------------------------------
## Method `learner$summary`
## ------------------------------------------------
lr <- learner_glm(y ~ x, family = "nb")
lr$summary()
lr_sum <- lr$summary() # store returned summary in new object
names(lr_sum)
print(lr_sum)
Construct learners from a grid of parameters
Description
Construct learners from a grid of parameters
Usage
learner_expand_grid(fun, args, names = TRUE, params = FALSE)
Arguments
fun | (function) A function that returns alearner. |
args | (list) Parameters that generate a grid of parameters with expand.list, where the set of parameters are then passed on to |
names | (logical or character) If FALSE, then return a list without names. If TRUE, a named list is returned (see details). |
params | (logical) If FALSE, then no information about the parameters defined by |
Value
list
Examples
lrs <- learner_expand_grid(
  learner_xgboost,
  list(formula = Sepal.Length ~ ., eta = c(0.2, 0.5, 0.3))
)
lrs # use info of constructed learner as names
lrs <- learner_expand_grid(
  learner_xgboost,
  list(formula = Sepal.Length ~ ., eta = c(0.2, 0.5, 0.3)),
  names = "xgboost"
)
names(lrs) # use xgboost instead of info field for names
lrs <- learner_expand_grid(
  learner_xgboost,
  list(formula = Sepal.Length ~ ., eta = c(0.2, 0.5, 0.3)),
  names = "xgboost",
  params = TRUE
)
names(lrs) # also add parameters to names
lrs <- learner_expand_grid(
  learner_xgboost,
  list(formula = Sepal.Length ~ ., eta = c(0.2, 0.5, 0.3)),
  names = FALSE
)
names(lrs) # unnamed list since names = FALSE
Construct a learner
Description
Constructs a learner class object for fitting generalized additive models with mgcv::gam.
Usage
learner_gam(
  formula,
  info = "mgcv::gam",
  family = gaussian(),
  select = FALSE,
  gamma = 1,
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
family | This is a family object specifying the distribution and link to use in fitting etc (see |
select | If this is |
gamma | Increase this beyond 1 to produce smoother models. |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to mgcv::gam. |
Value
learner object.
Examples
n <- 5e2
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
d0 <- data.frame(y, x1, x2)
lr <- learner_gam(y ~ s(x1) + x2)
lr$estimate(d0)
if (interactive()) {
  plot(lr$fit)
}
Construct a learner
Description
Constructs a learner class object for fitting generalized linear models with stats::glm and MASS::glm.nb. Negative binomial regression is supported with family = "nb" (or alternatively family = "negbin").
Usage
learner_glm(
  formula,
  info = "glm",
  family = gaussian(),
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
family | a description of the error distribution and link function to be used in the model. For |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to stats::glm or MASS::glm.nb. |
Value
learner object.
Examples
n <- 5e2
x <- rnorm(n)
w <- 50 + rexp(n, rate = 1 / 5)
y <- rpois(n, exp(2 + 0.5 * x + log(w)) * rgamma(n, 1 / 2, 1 / 2))
d0 <- data.frame(y, x, w)
lr <- learner_glm(y ~ x) # linear Gaussian model
lr$estimate(d0)
coef(lr$fit)
# negative binomial regression model with offset (using MASS::glm.nb)
lr <- learner_glm(y ~ x + offset(log(w)), family = "nb")
lr$estimate(d0)
coef(lr$fit)
lr$predict(data.frame(x = 1, w = c(1, 5))) # response scale
lr$predict(data.frame(x = 1, w = c(1, 5)), type = "link") # link scale
Construct a learner
Description
Constructs a learner class object for fitting entire lasso or elastic-net regularization paths for various linear and non-linear regression models with glmnet::cv.glmnet. Predictions are returned for the value of lambda that gives minimum cvm. That is, glmnet::predict.cv.glmnet is called with s = "lambda.min".
Usage
learner_glmnet_cv(
  formula,
  info = "glmnet::cv.glmnet",
  family = gaussian(),
  lambda = NULL,
  alpha = 1,
  nfolds = 10,
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
family | Either a character string representing one of the built-in families, or else a |
lambda | Optional user-supplied lambda sequence; default is |
alpha | The elasticnet mixing parameter, with
|
nfolds | number of folds - default is 10. Although |
learner.args | (list) Additional arguments to learner$new(). |
... | Other arguments that can be passed to glmnet, for example |
Value
learner object.
Examples
# continuous outcome
n <- 5e2
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
lp <- x1 + x2*x1 + cos(x1)
y <- rnorm(n, lp, sd = 2)
d0 <- data.frame(y, x1, x2)
lr <- learner_glmnet_cv(y ~ x1 + x2)
lr$estimate(d0, nfolds = 3)
lr$predict(data.frame(x1 = c(0, 1), x2 = 1))
# count outcome with different exposure time
w <- 50 + rexp(n, rate = 1 / 5)
y <- rpois(n, exp(0.5 * x1 - 1 * x2 + log(w)) * rgamma(n, 1 / 2, 1 / 2))
d0 <- data.frame(y, x1, x2, w)
lr <- learner_glmnet_cv(y ~ x1 + x2 + offset(log(w)), family = "poisson")
lr$estimate(d0, nfolds = 3)
lr$predict(data.frame(x1 = 1, x2 = 1, w = c(1, 5)))
Construct a learner
Description
Constructs a learner class object for fitting generalized random forest models with grf::regression_forest or grf::probability_forest. As shown in the examples, the constructed learner returns predicted class probabilities of class 2 in case of binary classification. An n times p matrix, with n being the number of observations and p the number of classes, is returned for multi-class classification.
Usage
learner_grf(
  formula,
  num.trees = 2000,
  min.node.size = 5,
  alpha = 0.05,
  sample.fraction = 0.5,
  num.threads = 1,
  model = "grf::regression_forest",
  info = model,
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
num.trees | Number of trees grown in the forest. Note: Getting accurate confidence intervals generally requires more trees than getting accurate predictions. Default is 2000. |
min.node.size | A target for the minimum number of observations in each tree leaf. Note that nodes with size smaller than min.node.size can occur, as in the original randomForest package. Default is 5. |
alpha | A tuning parameter that controls the maximum imbalance of a split. Default is 0.05. |
sample.fraction | Fraction of the data used to build each tree. Note: If honesty = TRUE, these subsamples will further be cut by a factor of honesty.fraction. Default is 0.5. |
num.threads | Number of threads used in training. By default, the number of threads is set to the maximum hardware concurrency. |
model | (character) grf model to estimate. Usually regression_forest (grf::regression_forest) or probability_forest (grf::probability_forest). |
info | (character) Optional information to describe the instantiated learner object. |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to |
Value
learner object.
Examples
n <- 5e2
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
lp <- x2*x1 + cos(x1)
yb <- rbinom(n, 1, lava::expit(lp))
y <- lp + rnorm(n, sd = 0.5**.5)
d <- data.frame(y, yb, x1, x2)
# regression
lr <- learner_grf(y ~ x1 + x2)
lr$estimate(d)
lr$predict(head(d))
# binary classification
lr <- learner_grf(as.factor(yb) ~ x1 + x2, model = "probability_forest")
lr$estimate(d)
lr$predict(head(d)) # predict class probabilities of class 2
# multi-class classification
lr <- learner_grf(Species ~ ., model = "probability_forest")
lr$estimate(iris)
lr$predict(head(iris))
Construct a learner
Description
Constructs a learner class object for fitting a highly adaptive lasso model with hal9001::fit_hal.
Usage
learner_hal(
  formula,
  info = "hal9001::fit_hal",
  smoothness_orders = 0,
  reduce_basis = NULL,
  family = "gaussian",
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
smoothness_orders | An |
reduce_basis | An optional |
family | A |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to hal9001::fit_hal. |
Value
learner object.
Examples
## Not run:
n <- 5e2
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
d <- data.frame(y, x1, x2)
lr <- learner_hal(y ~ x1 + x2, smoothness_orders = 0.5, reduce_basis = 1)
lr$estimate(d)
lr$predict(data.frame(x1 = 0, x2 = c(-1, 1)))
## End(Not run)
Construct a learner
Description
Constructs a learner class object for isotonic regression with isoregw.
Usage
learner_isoreg(formula, info = "targeted::isoregw", learner.args = NULL, ...)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to isoregw. |
Value
learner object.
Examples
x <- runif(5e3, -5, 5)
pr <- lava::expit(-1 + x)
y <- rbinom(length(pr), 1, pr)
d <- data.frame(y, x)
lr <- learner_isoreg(y ~ x)
lr$estimate(d)
pr_iso <- lr$predict(d)
if (interactive()) {
  plot(pr ~ x, cex=0.3)
  lines(sort(x), pr_iso[order(x)], col="red", type="s")
}
Construct a learner
Description
Constructs a learner class object for fitting multivariate adaptive regression splines with earth::earth.
Usage
learner_mars(
  formula,
  info = "earth::earth",
  degree = 1,
  nprune = NULL,
  glm = NULL,
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
degree | Maximum degree of interaction (Friedman's |
nprune | Maximum number of terms (including intercept) in the pruned model. Default is NULL, meaning all terms created by the forward pass (but typically not all terms will remain after pruning). Use this to enforce an upper bound on the model size (that is less than |
glm | NULL (default) or a list of arguments to pass on to |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to earth::earth. |
Value
learner object.
Examples
# poisson regression
n <- 5e2
x <- rnorm(n)
w <- 50 + rexp(n, rate = 1 / 5)
y <- rpois(n, exp(2 + 0.5 * x + log(w)) * rgamma(n, 1 / 2, 1 / 2))
d0 <- data.frame(y, x, w)
lr <- learner_mars(y ~ x + offset(log(w)), degree = 2, glm = list(family = poisson()))
lr$estimate(d0)
lr$predict(data.frame(x = 0, w = c(1, 2)))
Construct a learner
Description
Constructs a learner class object for fitting a naive Bayes classifier with naivebayes. As shown in the examples, the constructed learner returns predicted class probabilities of class 2 in case of binary classification. An n times p matrix, with n being the number of observations and p the number of classes, is returned for multi-class classification.
Usage
learner_naivebayes(
  formula,
  info = "Naive Bayes",
  laplace.smooth = 0,
  kernel = FALSE,
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
laplace.smooth | Laplace smoothing |
kernel | If TRUE a kernel estimator is used for numeric predictors (otherwise a gaussian model is used) |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to naivebayes. |
Value
learner object.
Examples
n <- 5e2
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
y <- rbinom(n, 1, lava::expit(x2*x1 + cos(x1)))
d <- data.frame(y, x1, x2)
# binary classification
lr <- learner_naivebayes(y ~ x1 + x2)
lr$estimate(d)
lr$predict(head(d))
# multi-class classification
lr <- learner_naivebayes(Species ~ .)
lr$estimate(iris)
lr$predict(head(iris))
Construct a learner
Description
Constructs a learner class object for fitting a superlearner.
Usage
learner_sl(
  learners,
  info = NULL,
  nfolds = 5L,
  meta.learner = metalearner_nnls,
  model.score = mse,
  learner.args = NULL,
  ...
)
Arguments
learners | (list) List of learner objects (e.g., learner_glm) |
info | (character) Optional information to describe the instantiated learner object. |
nfolds | (integer) Number of folds to use in cross-validation to estimate the ensemble weights. |
meta.learner | (function) Algorithm to learn the ensemble weights (default non-negative least squares). Must be a function of the response (nx1 vector), |
model.score | (function) Model scoring method (see learner) |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to superlearner |
Value
learner object.
See Also
Examples
sim1 <- function(n = 5e2) {
  x1 <- rnorm(n, sd = 2)
  x2 <- rnorm(n)
  y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
  data.frame(y, x1, x2)
}
d <- sim1()
m <- list(
  "mean" = learner_glm(y ~ 1),
  "glm" = learner_glm(y ~ x1 + x2),
  "iso" = learner_isoreg(y ~ x1)
)
s <- learner_sl(m, nfolds = 10)
s$estimate(d)
pr <- s$predict(d)
if (interactive()) {
  plot(y ~ x1, data = d)
  points(d$x1, pr, col = 2, cex = 0.5)
  lines(cos(x1) + x1 ~ x1, data = d[order(d$x1), ],
    lwd = 4, col = lava::Col("darkblue", 0.3))
}
print(s)
# weights(s$fit)
# score(s$fit)
cvres <- cv(s, data = d, nfolds = 3, rep = 2)
cvres
# coef(cvres)
# score(cvres)
Construct stratified learner
Description
This function creates a stratified learner from an existing learner wrapper function such as learner_glm or learner_xgboost. The stratification variable can be specified either using the stratify argument (which can be given as a string "a" or a formula, for example ~I(a==0)), or it can be defined as a special term directly in the formula, y ~ ... + stratify(a). The formula will subsequently be passed to the learner_ wrapper without the stratify special term.
Usage
learner_stratify(
  formula,
  learner,
  stratify = NULL,
  info = NULL,
  learner.args = list(),
  ...
)
Arguments
formula | formula specifying outcome and design matrix |
learner | (learner) learner object |
stratify | (character, formula) variables to stratify by |
info | optional description of the model |
learner.args | (list) optional arguments to the learner constructor |
... | additional arguments passed to the learner constructor |
Value
learner object
Examples
simdata <- function(n=1000) {
  a <- rbinom(n, 1, 0.5)
  x <- rnorm(n)
  y <- rbinom(n, 1, plogis(-1 + a + a * x))
  data.frame(y, a, x)
}
d <- simdata()
lr <- learner_stratify(
  y ~ x + stratify(a),
  learner_glm,
  family=binomial()
)
lr$estimate(d)
lr$predict(head(d))
Construct a learner
Description
Constructs a learner class object for fitting support vector machines with e1071::svm. As shown in the examples, the constructed learner returns predicted class probabilities of class 2 in case of binary classification. An n times p matrix, with n being the number of observations and p the number of classes, is returned for multi-class classification.
Usage
learner_svm(
  formula,
  info = "e1071::svm",
  cost = 1,
  epsilon = 0.1,
  kernel = "radial",
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
info | (character) Optional information to describe the instantiated learner object. |
cost | cost of constraints violation (default: 1); it is the 'C'-constant of the regularization term in the Lagrange formulation. |
epsilon | epsilon in the insensitive-loss function (default: 0.1) |
kernel | the kernel used in training and predicting. You might consider changing some of the following parameters, depending on the kernel type.
|
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to e1071::svm. |
Value
learner object.
Examples
n <- 5e2
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
lp <- x2*x1 + cos(x1)
yb <- rbinom(n, 1, lava::expit(lp))
y <- lp + rnorm(n, sd = 0.5**.5)
d <- data.frame(y, yb, x1, x2)
# regression
lr <- learner_svm(y ~ x1 + x2)
lr$estimate(d)
lr$predict(head(d))
# binary classification
lr <- learner_svm(as.factor(yb) ~ x1 + x2)
# alternative to transforming response variable to factor
# lr <- learner_svm(yb ~ x1 + x2, type = "C-classification")
lr$estimate(d)
lr$predict(head(d)) # predict class probabilities of class 2
lr$predict(head(d), probability = FALSE) # predict labels
# multi-class classification
lr <- learner_svm(Species ~ .)
lr$estimate(iris)
lr$predict(head(iris))
Construct a learner
Description
Constructs a learner class object for xgboost::xgboost.
Usage
learner_xgboost(
  formula,
  max_depth = 2L,
  learning_rate = 1,
  nrounds = 2L,
  subsample = 1,
  reg_lambda = 1,
  objective = "reg:squarederror",
  info = paste("xgboost", objective),
  learner.args = NULL,
  ...
)
Arguments
formula | (formula) Formula specifying response and design matrix. |
max_depth | (integer) Maximum depth of a tree. |
learning_rate | (numeric) Learning rate. |
nrounds | Number of boosting iterations / rounds. Note that the number of default boosting rounds here is not automatically tuned, and different problems will have vastly different optimal numbers of boosting rounds. |
subsample | (numeric) Subsample ratio of the training instance. |
reg_lambda | (numeric) L2 regularization term on weights. |
objective | (character) Specify the learning task and the corresponding learning objective. See xgboost::xgboost for all available options. |
info | (character) Optional information to describe the instantiated learner object. |
learner.args | (list) Additional arguments to learner$new(). |
... | Additional arguments to xgboost::xgboost. |
Value
learner object.
Examples
n <- 1e3
x1 <- rnorm(n, sd = 2)
x2 <- rnorm(n)
lp <- x2*x1 + cos(x1)
yb <- rbinom(n, 1, lava::expit(lp))
y <- lp + rnorm(n, sd = 0.5**.5)
d0 <- data.frame(y, yb, x1, x2)
# regression
lr <- learner_xgboost(y ~ x1 + x2, nrounds = 5)
lr$estimate(d0)
lr$predict(head(d0))
# binary classification
lr <- learner_xgboost(yb ~ x1 + x2, nrounds = 5, objective = "binary:logistic")
lr$estimate(d0)
lr$predict(head(d0))
# multi-class classification
d0 <- iris
d0$y <- as.numeric(d0$Species) - 1
lr <- learner_xgboost(y ~ ., objective = "multi:softprob", num_class = 3)
lr$estimate(d0)
lr$predict(head(d0))
R6 class for prediction models
Description
Replaced by learner
Super class
targeted::learner -> ml_model
Methods
Public methods
Inherited methods
Method new()
Create a new prediction model object
Usage
ml_model$new(...)
Arguments
...: deprecated
Method clone()
The objects of this class are cloneable with this method.
Usage
ml_model$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
Naive Bayes classifier
Description
Naive Bayes Classifier
Usage
naivebayes(
  formula,
  data,
  weights = NULL,
  kernel = FALSE,
  laplace.smooth = 0,
  prior = NULL,
  ...
)
Arguments
formula | Formula with syntax: response ~ predictors | weights |
data | data.frame |
weights | optional frequency weights |
kernel | If TRUE a kernel estimator is used for numeric predictors (otherwise a gaussian model is used) |
laplace.smooth | Laplace smoothing |
prior | optional prior probabilities (default estimated from data) |
... | additional arguments to lower level functions |
Value
An object of class 'naivebayes' is returned. See naivebayes-class for more details about this class and its generic functions.
Author(s)
Klaus K. Holst
Examples
library(data.table)
data(iris)
m <- naivebayes(Species ~ Sepal.Width + Petal.Length, data = iris)
pr <- predict(m, newdata = iris)
# using weights to reduce the size of the dataset
n <- 5e2
x <- rnorm(n, sd = 2) > 0
y <- rbinom(n, 1, lava::expit(x))
# full data set
d1 <- data.frame(y, x = as.factor(x > 0))
m1 <- naivebayes(y ~ x, data = d1)
# reduced data set
d2 <- data.table(d1)[, .(.N), by = .(y, x)]
m2 <- naivebayes(y ~ x, data = d2, weights = d2$N)
all(predict(m1, d1) == predict(m2, d1))
naivebayes class object
Description
The function naivebayes returns an object of the type naivebayes.
An object of class 'naivebayes' is a list with at least the following components:
- prior
Matrix with prior probabilities, i.e. marginal class probabilities Pr(class)
- pcond
list of matrices with conditional probabilities of the features given the classes (one list element per class), Pr(x|class)
- classes
Names (character vector) of the classes
- xvar
Names of predictors
- xmodel
Conditional model for each predictor
- design
Model design object
- call
The function call which instantiated the object
Value
objects of the S3 class 'naivebayes'
S3 generics
The following S3 generic functions are available for an object of class naivebayes:
predict Predict class probabilities for new feature data.
print Basic print method.
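For illustration, a minimal sketch of inspecting these components on a fitted model (component names as listed above):
m <- naivebayes(Species ~ Sepal.Width + Petal.Length, data = iris)
m$prior    # marginal class probabilities Pr(class)
m$classes  # names of the classes
m$xvar     # names of the predictors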
See Also
Examples
## See example(naivebayes) for examples
Find non-dominated points of a set
Description
Find the non-dominated points of a set (minima of a point set).
Usage
nondom(x, ...)
Arguments
x | matrix |
... | additional arguments to lower level functions |
Details
A point x dominates y if it is never worse and in at least one case strictly better. Formally, let f_i denote the i-th coordinate of the objective function; then x dominates y if for all i: f_i(x) \leq f_i(y), and there exists j: f_j(x) < f_j(y).
Based on the algorithm of Kung et al. (1975).
Value
matrix
Author(s)
Klaus Kähler Holst
Examples
rbind(
  c(1.0, 0.5),
  c(0.0, 1.0),
  c(1.0, 0.0),
  c(0.5, 1.0),
  c(1.0, 1.0),
  c(0.8, 0.8)
) |> nondom()
Pooled Adjacent Violators Algorithm
Description
Pooled Adjacent Violators Algorithm
Usage
pava(y, x = numeric(0), weights = numeric(0))Arguments
y | response variable |
x | (optional) predictor vector (otherwise y is assumed to be a priori sorted according to the relevant predictor) |
weights | (optional) weights |
Value
List with index (index) of jump points and values (value) at each jump point.
Author(s)
Klaus K. Holst
Examples
x <- runif(5e3, -5, 5)
pr <- lava::expit(-1 + x)
y <- rbinom(length(pr), 1, pr)
pv <- pava(y, x)
plot(pr ~ x, cex = 0.3)
with(pv, lines(sort(x)[index], value, col = "red", type = "s"))
Prediction for kernel density estimates
Description
Kernel density estimator predictions
Usage
## S3 method for class 'density'
predict(object, xnew, ...)
Arguments
object | density object |
xnew | New data on which to make predictions |
... | additional arguments to lower level functions |
Author(s)
Klaus K. Holst
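The original entry lists no examples; a minimal sketch, assuming xnew simply holds the points at which the density estimate is evaluated:
d <- density(rnorm(1e3))        # kernel density estimate (stats::density)
predict(d, xnew = c(-1, 0, 1))  # density predictions at the new points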
Predictions for Naive Bayes Classifier
Description
Naive Bayes Classifier predictions
Usage
## S3 method for class 'naivebayes'
predict(object, newdata, expectation = NULL, threshold = c(0.001, 0.001), ...)
Arguments
object | naivebayes object |
newdata | new data on which to make predictions |
expectation | Variable to calculate conditional expectation with respect to probabilities from the naivebayes classifier |
threshold | Threshold parameters. First element defines the threshold on the probabilities and the second element the value to set those truncated probabilities to. |
... | Additional arguments to lower level functions |
Author(s)
Klaus K. Holst
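A minimal sketch, reusing the model from example(naivebayes):
m <- naivebayes(Species ~ Sepal.Width + Petal.Length, data = iris)
head(predict(m, newdata = iris))  # class probabilities for the first observations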
Predict Method for superlearner Fits
Description
Obtains predictions for ensemble model or individual learners.
Usage
## S3 method for class 'superlearner'
predict(object, newdata, all.learners = FALSE, ...)
Arguments
object | (superlearner) Fitted superlearner object. |
newdata | (data.frame) Data in which to look for variables with which to predict. |
all.learners | (logical) If FALSE (default), then return the predictions from the ensemble model. Otherwise, return the predictions from all individual learners. |
... | Not used. |
Value
numeric (all.learners = FALSE) or matrix (all.learners = TRUE)
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
Risk regression
Description
Risk regression with binary exposure and a nuisance model for the odds-product.
Let A be the binary exposure, V the set of covariates, and Y the binary response variable, and define p_a(v) = P(Y=1 \mid A=a, V=v), a \in \{0,1\}.
The target parameter is either the relative risk
\mathrm{RR}(v) = \frac{p_1(v)}{p_0(v)}
or the risk difference
\mathrm{RD}(v) = p_1(v)-p_0(v)
We assume a target parameter model given by either
\log\{\mathrm{RR}(v)\} = \alpha^t v
or
\mathrm{arctanh}\{\mathrm{RD}(v)\} = \alpha^t v
and similarly a working linear nuisance model for the odds-product
\phi(v) = \log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right) = \beta^t v.
A propensity model for E(A=1 \mid V) is also fitted using a logistic regression working model
\mathrm{logit}\{E(A=1 \mid V=v)\} = \gamma^t v.
If both the odds-product model and the propensity model are correct the estimator is efficient. Further, the estimator is consistent in the union model, i.e., the estimator is double-robust in the sense that only one of the two models needs to be correctly specified to get a consistent estimate.
Usage
riskreg(
  formula,
  nuisance = ~1,
  propensity = ~1,
  target = ~1,
  data,
  weights,
  type = "rr",
  optimal = TRUE,
  std.err = TRUE,
  start = NULL,
  mle = FALSE,
  ...
)
Arguments
formula | formula (see details below) |
nuisance | nuisance model (formula) |
propensity | propensity model (formula) |
target | (optional) target model (formula) |
data | data.frame |
weights | optional weights |
type | type of association measure (rd or rr) |
optimal | If TRUE optimal weights are calculated |
std.err | If TRUE standard errors are calculated |
start | optional starting values |
mle | Semi-parametric (double-robust) estimate or MLE (TRUE gives MLE) |
... | additional arguments to unconstrained optimization routine (nlminb) |
Details
The 'formula' argument should be given as response ~ exposure | target-formula | nuisance-formula or response ~ exposure | target | nuisance | propensity
E.g., riskreg(y ~ a | 1 | x+z | x+z, data=...)
Alternatively, the model can be specified using the target, nuisance and propensity arguments: riskreg(y ~ a, target=~1, nuisance=~x+z, ...)
The riskreg_fit function can be used with matrix inputs rather than formulas.
Value
An object of class 'riskreg.targeted' is returned. See targeted-class for more details about this class and its generic functions.
Author(s)
Klaus K. Holst
References
Richardson, T. S., Robins, J. M., & Wang, L. (2017). On modeling and estimation for the relative risk and risk difference. Journal of the American Statistical Association, 112(519), 1121–1130. http://dx.doi.org/10.1080/01621459.2016.1192546
Examples
m <- lava::lvm(a[-2] ~ x,
               z ~ 1,
               lp.target[1] ~ 1,
               lp.nuisance[-1] ~ 2*x) |>
  lava::distribution(~a, value = lava::binomial.lvm("logit")) |>
  lava::binomial.rr("y", "a", "lp.target", "lp.nuisance")
d <- sim(m, 5e2, seed = 1)

I <- model.matrix(~1, d)
X <- model.matrix(~1 + x, d)
with(d, riskreg_mle(y, a, I, X, type = "rr"))
with(d, riskreg_fit(y, a, nuisance = X, propensity = I, type = "rr"))
riskreg(y ~ a | 1, nuisance = ~x, data = d, type = "rr")

## Model with same design matrix for nuisance and propensity model:
with(d, riskreg_fit(y, a, nuisance = X, type = "rr"))

## a <- riskreg(y ~ a, target = ~z, nuisance = ~x,
##              propensity = ~x, data = d, type = "rr")
a <- riskreg(y ~ a | z, nuisance = ~x, propensity = ~x, data = d, type = "rr")
a
predict(a, d[1:5, ])

riskreg(y ~ a, nuisance = ~x, data = d, type = "rr", mle = TRUE)
Binary regression models with right censored outcomes
Description
Binary regression models with right censored outcomes
Usage
riskreg_cens(
  response,
  censoring,
  treatment = NULL,
  prediction = NULL,
  data,
  newdata,
  tau,
  type = "risk",
  M = 1,
  call.response = "phreg",
  args.response = list(),
  call.censoring = "phreg",
  args.censoring = list(),
  preprocess = NULL,
  efficient = TRUE,
  control = list(),
  ...
)
Arguments
response | Response formula (e.g., Surv(time, event) ~ D + W). |
censoring | Censoring formula (e.g., Surv(time, event == 0) ~ D + A + W). |
treatment | Optional treatment model (learner) |
prediction | Optional prediction model (learner) |
data | data.frame. |
newdata | Optional data.frame. In this case the uncentered influence function evaluated in 'newdata' is returned with nuisance parameters obtained from 'data'. |
tau | Time-point of interest, see Details. |
type | "risk", "treatment", "brier" |
M | Number of folds in cross-fitting (M=1 is no cross-fitting). |
call.response | Model call for the response model (e.g. "mets::phreg"). |
args.response | Additional arguments to the response model. |
call.censoring | Similar to call.response. |
args.censoring | Similar to args.response. |
preprocess | (optional) Data pre-processing function. |
efficient | If FALSE an IPCW estimator is returned |
control | See details |
... | Additional arguments to lower level data pre-processing functions. |
Details
The one-step estimator depends on the calculation of an integral with respect to the martingale process corresponding to the counting process N(t) = I(C > \min(T, \tau)). This can be decomposed into an integral with respect to the counting process dN_c(t) and the compensator d\Lambda_c(t), where the latter term can be computationally intensive to calculate. Rather than calculating this integral at all observed time points, a coarser evaluation can be requested by setting control=list(sample=N). With N=0 the (computationally intensive) standard evaluation is used.
Value
estimate object
Author(s)
Klaus K. Holst, Andreas Nordland
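The original entry lists no examples; the following is a minimal sketch with simulated right-censored data (the data-generating mechanism and variable names are hypothetical, chosen only to match the formula syntax described above):
set.seed(1)
n <- 500
a <- rbinom(n, 1, 0.5)                   # binary treatment
w <- rnorm(n)                            # baseline covariate
t0 <- rexp(n, rate = exp(-1 + 0.5 * a))  # event times
c0 <- rexp(n, rate = 0.2)                # censoring times
d <- data.frame(time = pmin(t0, c0), event = as.numeric(t0 <= c0), a, w)
## risk of an event before tau = 1
fit <- riskreg_cens(
  response = survival::Surv(time, event) ~ a + w,
  censoring = survival::Surv(time, event == 0) ~ a + w,
  data = d, tau = 1
)
fit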
Extract average cross-validated score of individual learners
Description
Extract average cross-validated score of individual learners
Usage
## S3 method for class 'superlearner'
score(x, ...)
Arguments
x | (superlearner) Fitted model. |
... | Not used. |
Predictive model scoring
Description
Predictive model scoring
Usage
scoring(
  response,
  ...,
  type = "quantitative",
  levels = NULL,
  metrics = NULL,
  weights = NULL,
  names = NULL,
  object = NULL,
  newdata = NULL,
  messages = 1
)
Arguments
response | Observed response |
... | model predictions (continuous predictions or class probabilities (matrices)) |
type | continuous or categorical response (the latter is automatically chosen if response is a factor, otherwise a continuous response is assumed) |
levels | (optional) unique levels in response variable |
metrics | which metrics to report |
weights | optional frequency weights |
names | (optional) character vector of the model names in the output. If omitted these will be taken from the names of the ellipsis argument (...) |
object | optional model object |
newdata | (optional) data.frame on which to evaluate the model performance |
messages | controls amount of messages/warnings (0: none) |
Value
Numeric matrix of dimension m x p, where m is the number of different models and p is the number of model metrics
Examples
data(iris)
set.seed(1)
dat <- lava::csplit(iris, 2)
g1 <- naivebayes(Species ~ Sepal.Width + Petal.Length, data = dat[[1]])
g2 <- naivebayes(Species ~ Sepal.Width, data = dat[[1]])
pr1 <- predict(g1, newdata = dat[[2]], wide = TRUE)
pr2 <- predict(g2, newdata = dat[[2]], wide = TRUE)
table(colnames(pr1)[apply(pr1, 1, which.max)], dat[[2]]$Species)
table(colnames(pr2)[apply(pr2, 1, which.max)], dat[[2]]$Species)
scoring(dat[[2]]$Species, pr1 = pr1, pr2 = pr2)
## quantitative response:
scoring(response = 1:10, prediction = rnorm(1:10))
Softmax transformation
Description
Softmax transformation
Usage
softmax(x, log = FALSE, ref = TRUE, ...)
Arguments
x | Input matrix (e.g., linear predictors of multinomial logistic model) |
log | Return on log-scale (default FALSE) |
ref | Add reference level (add 0 column to x) |
... | Additional arguments to lower level functions |
Value
Numeric matrix of dimension n x p, where n = nrow(x) and p = ncol(x) + (ref == TRUE)
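For illustration, a minimal sketch of the dimension convention:
lp <- matrix(rnorm(10), ncol = 2)  # linear predictors for two non-reference classes
pr <- softmax(lp)                  # ref = TRUE adds a reference (zero) column
dim(pr)                            # 5 x 3
rowSums(pr)                        # each row sums to one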
Solve ODE
Description
Solve ODE with Runge-Kutta method (RK4)
Usage
solve_ode(ode_ptr, input, init, par = 0)Arguments
ode_ptr | pointer (externalptr) to C++ function or an R function |
input | Input matrix. 1st column specifies the time points |
init | Initial conditions |
par | Parameters defining the ODE (passed to ode_ptr) |
Details
The external pointer should be created with the function targeted::specify_ode.
Value
Matrix with solution
Author(s)
Klaus Kähler Holst
See Also
specify_ode
Examples
example(specify_ode)
Specify Ordinary Differential Equation (ODE)
Description
Define compiled code for ordinary differential equation.
Usage
specify_ode(code, fname = NULL, pname = c("dy", "x", "y", "p"))Arguments
code | string with the body of the function definition (see details) |
fname | Optional name of the exported C++ function |
pname | Vector of variable names (results, inputs, states, parameters) |
Details
The model (code) should be specified as the body of a C++ function. The following variables are defined by default (see the pname argument):
- dy
Vector with derivatives, i.e. the rhs of the ODE (the result).
- x
Vector with the first element being the time, and the following elements additional exogenous input variables,
- y
Vector with the dependent variable
- p
Parameter vector
y'(t) = f_{p}(x(t), y(t))
All variables are treated as Armadillo (http://arma.sourceforge.net/) vectors/matrices.
As an example consider the Lorenz Equations:
\frac{dx_{t}}{dt} = \sigma(y_{t}-x_{t})
\frac{dy_{t}}{dt} = x_{t}(\rho-z_{t})-y_{t}
\frac{dz_{t}}{dt} = x_{t}y_{t}-\beta z_{t}
We can specify this model as
ode <- 'dy(0) = p(0)*(y(1)-y(0));
        dy(1) = y(0)*(p(1)-y(2));
        dy(2) = y(0)*y(1)-p(2)*y(2);'
dy <- specify_ode(ode)
As an example of a model with exogenous inputs consider the following ODE:
y'(t) = \beta_{0} + \beta_{1}y(t) + \beta_{2}y(t)x(t) + \beta_{3}x(t)\cdot t
This could be specified as
mod <- 'double t = x(0);
        dy = p(0) + p(1)*y + p(2)*x(1)*y + p(3)*x(1)*t;'
dy <- specify_ode(mod)
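A sketch of solving this model with solve_ode (the constant exogenous input x(t) = 1, the initial condition, and the parameter values are illustrative only):
tt <- seq(0, 10, length.out = 1e3)
input <- cbind(tt, 1)  # 1st column: time points; 2nd column: exogenous input x(t)
yy <- solve_ode(dy, input = input, init = 1, par = c(0.1, -0.5, 0.1, 0.05))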
Value
pointer (externalptr) to C++ function
Author(s)
Klaus Kähler Holst
See Also
solve_ode
Examples
ode <- paste0(
  "dy(0) = p(0)*(y(1)-y(0));",
  "dy(1) = y(0)*(p(1)-y(2));",
  "dy(2) = y(0)*y(1)-p(2)*y(2);",
  collapse = "\n"
)  # Reduce test time
dy <- specify_ode(ode)
tt <- seq(0, 100, length.out = 2e4)
yy <- solve_ode(dy, input = tt, init = c(1, 1, 1), par = c(10, 28, 8/3))
Identify Stratification Variables
Description
This is a special function that identifies stratification variables when they appear on the right hand side of a formula.
Usage
stratify(..., na.group = FALSE, shortlabel, sep = ", ")Arguments
... | any number of variables. All must be the same length. |
na.group | a logical variable; if TRUE, missing values are treated as a distinct level |
shortlabel | if TRUE, omit variable names from the resulting factor labels |
sep | the character used to separate groups, in the created label |
Details
When used outside of a coxph formula the result of the function is essentially identical to the interaction function, though the labels from strata are often more verbose.
Value
a new factor, whose levels are all possible combinations of the factors supplied as arguments.
See Also
survival::strata, learner_stratify, interaction
Examples
a <- factor(rep(1:3, 4), labels = c("low", "medium", "high"))
b <- factor(rep(1:4, 3))
levels(stratify(b))
levels(stratify(a, b, shortlabel = TRUE))
Superlearner (stacked/ensemble learner)
Description
This function creates a predictor object (class learner) from a list of existing learner objects. When estimating this model a stacked prediction will be created by weighting together the predictions of each of the initial learners. The weights are learned using cross-validation.
Usage
superlearner(
  learners,
  data,
  nfolds = 10,
  meta.learner = metalearner_nnls,
  model.score = mse,
  mc.cores = NULL,
  future.seed = TRUE,
  silent = TRUE,
  name.prefix = NULL,
  ...
)
Arguments
learners | (list) List of learner objects (e.g., learner_glm) |
data | (data.frame) Data containing the response variable and covariates. |
nfolds | (integer) Number of folds to use in cross-validation to estimate the ensemble weights. |
meta.learner | (function) Algorithm to learn the ensemble weights (default non-negative least squares). Must be a function of the response (nx1 vector) and the predictions (nxp matrix). |
model.score | (function) Model scoring method (see learner) |
mc.cores | (integer) If not NULL, then parallel::mcmapply is used with mc.cores number of cores |
future.seed | (logical or integer) Argument passed on to future.apply::future_lapply. If TRUE, then .Random.seed is used if it holds a L'Ecuyer-CMRG RNG seed, otherwise one is created randomly. |
silent | (logical) Suppress all messages and progress bars |
name.prefix | (character) Prefix used to name the learner objects in the output |
... | Additional arguments to parallel::mclapply or future.apply::future_lapply. |
References
Luedtke & van der Laan (2016) Super-Learning of an Optimal Dynamic Treatment Rule, The International Journal of Biostatistics.
See Also
predict.superlearner, weights.superlearner, score.superlearner
Examples
sim1 <- function(n = 5e2) {
  x1 <- rnorm(n, sd = 2)
  x2 <- rnorm(n)
  y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
  data.frame(y, x1, x2)
}
m <- list(
  "mean" = learner_glm(y ~ 1),
  "glm" = learner_glm(y ~ x1 + x2)
)
sl <- superlearner(m, data = sim1(), nfolds = 2)
predict(sl, newdata = sim1(n = 5))
predict(sl, newdata = sim1(n = 5), all.learners = TRUE)
targeted class object
Description
The functions riskreg and ate return an object of type targeted.
An object of class 'targeted' is a list with at least the following components:
- estimate
An estimate object with the target parameter estimates (see estimate.default)
- opt
Object returned from the applied optimization routine
- npar
Number of parameters of the model (target and nuisance)
- type
String describing the model
Value
objects of the S3 class 'targeted'
S3 generics
The following S3 generic functions are available for an object of class targeted:
coef Extract target coefficients of the estimated model.
vcov Extract the variance-covariance matrix of the target parameters.
IC Extract the estimated influence function.
print Print estimates of the target parameters.
summary Extract information on both target parameters and the estimated nuisance model.
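For illustration, a minimal sketch (with d simulated as in example(riskreg)):
a <- riskreg(y ~ a | 1, nuisance = ~x, data = d, type = "rr")
coef(a)      # target parameter estimates
vcov(a)      # variance-covariance matrix of the target parameters
head(IC(a))  # estimated influence function
summary(a)   # target parameters and estimated nuisance models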
See Also
Examples
## See example(riskreg) for examples
Extract model component from design object
Description
Extract model component from design object
Usage
## S3 method for class 'design'
terms(x, specials, ...)
Arguments
x | design object |
specials | extract variables marked as special (e.g., "offset", "weights", ...) |
... | Additional arguments to lower level functions |
Signed Wald intersection test
Description
Calculating test statistics and p-values for the signed Wald intersection test given by
SW = \inf_{\theta \in \cap_{i=1}^n H_i} \{(\widehat{\theta}-\theta)^\top W\widehat{\Sigma}W (\widehat{\theta}-\theta)\}
with individual hypotheses for each coordinate of \theta given by H_j: \theta_j < \delta_j for some non-inferiority margin \delta_j, j=1,\ldots,n.
Usage
test_intersection_sw(
  par,
  vcov,
  noninf = 0,
  weights = 1,
  nsim.null = 10000,
  index = NULL,
  control = list(),
  par.name = "theta"
)
Arguments
par | (numeric) parameter estimates or an estimate object |
vcov | (matrix) asymptotic variance estimate |
noninf | (numeric) non-inferiority margins |
weights | (numeric) optional weights |
nsim.null | (integer) number of samples used in the Monte Carlo simulation |
index | (integer) subset of parameters to test |
control | (list) arguments to the alternating projection algorithm. See details section. |
par.name | (character) parameter names in output |
Details
The constrained least squares problem is solved using Dykstra's algorithm. The following parameters for the optimization can be controlled via the control list argument: dykstra_niter sets the maximum number of iterations (default 500), dykstra_tol the convergence tolerance of the alternating projection algorithm (default 1e-7), and pinv_tol the tolerance for calculating the pseudo-inverse matrix (default length(par) * .Machine$double.eps * max(eigenvalue)).
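For illustration, a sketch of adjusting these settings (parameter estimates as in the Examples section below):
S <- matrix(c(1, 0.5, 0.5, 2), 2, 2)
thetahat <- c(0.5, -0.2)
test_intersection_sw(
  thetahat, S,
  control = list(dykstra_niter = 1000, dykstra_tol = 1e-9)
)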
Value
htest object
Author(s)
Klaus Kähler Holst, Christian Bressen Pipper
References
Christian Bressen Pipper, Andreas Nordland & Klaus Kähler Holst (2025) A general approach to construct powerful tests for intersections of one-sided null-hypotheses based on influence functions. arXiv: https://arxiv.org/abs/2511.07096.
See Also
test_zmax_onesided, lava::test_wald, lava::closed_testing
Examples
S <- matrix(c(1, 0.5, 0.5, 2), 2, 2)
thetahat <- c(0.5, -0.2)
test_intersection_sw(thetahat, S, nsim.null = 1e5)
test_intersection_sw(thetahat, S, weights = NULL)

## Not run: # only on 'lava' >= 1.8.2
e <- estimate(coef = thetahat, vcov = S, labels = c("p1", "p2"))
lava::closed_testing(e, test_intersection_sw, noninf = c(-0.1, -0.1)) |> summary()
## End(Not run)
One-sided Zmax test
Description
Calculating test statistics and p-values for the one-sided Zmax / minP test.
Given parameter estimates (\widehat{\theta}_1, \ldots, \widehat{\theta}_p)^\top with approximate asymptotic covariance matrix \widehat{S}, let Z_i = \frac{\widehat{\theta}_i - \delta_i}{\operatorname{SE}(\widehat{\theta}_i)}, where \operatorname{SE}(\widehat{\theta}_i) = \widehat{S}_{ii}^{0.5}. The Zmax test statistic is then Z_{max} = \max\{Z_1, \ldots, Z_p\}, and the null hypothesis is H_0: \theta_i \leq \delta_i, i=1,\ldots,p, with non-inferiority margins \delta_i, i=1,\ldots,p, for which the p-value is calculated as 1 - \Phi_R(Z_{max}), where \Phi_R is the CDF of the multivariate normal distribution with mean zero and correlation matrix R = \operatorname{diag}(S_{11}^{-0.5}, \ldots, S_{pp}^{-0.5}) S \operatorname{diag}(S_{11}^{-0.5}, \ldots, S_{pp}^{-0.5}).
Usage
test_zmax_onesided(par, vcov, noninf = 0, index = NULL, par.name = "theta")Arguments
par | (numeric) parameter estimates or an estimate object |
vcov | (matrix) asymptotic variance estimate |
noninf | (numeric) non-inferiority margins |
index | (integer) subset of parameters to test |
par.name | (character) parameter names in output |
Value
htest object
Author(s)
Christian Bressen Pipper, Klaus Kähler Holst
See Also
test_intersection_sw(), lava::test_wald(), lava::closed_testing()
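The original entry lists no examples; a minimal sketch reusing the setup from the test_intersection_sw examples:
S <- matrix(c(1, 0.5, 0.5, 2), 2, 2)
thetahat <- c(0.5, -0.2)
test_zmax_onesided(thetahat, S, noninf = c(-0.1, -0.1))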
Scores truncated by death
Description
Simulated data inspired by the FLOW study (Perkovic 2024). The following variables are considered in this simulated dataset:
- time
Time of first event in years (first major irreversible kidney event or non-related death)
- status
Event type at first major irreversible kidney event (=1), non-related death (=2), or right censoring (=0)
- y
Clinical outcome measurement (eGFR) at landmark time (:=2)
- r
Missing indicator for y (1 if observed, 0 if either t<2 or if the outcome was not measured for other reasons)
- a
Binary treatment (1 := active, 0 := placebo)
- x1
Covariate, clinical outcome at baseline (eGFR)
- x2
Covariate, binary treatment usage indicator (1: SGLT2 treatment, 0: none)
The actual failure times and censoring times are also included (failure.time, cens.time), and the full-data outcome (y0) given t>2.
Source
Simulated data
References
Perkovic, V., Tuttle, K. R., Rossing, P., Mahaffey, K. W., Mann, J. F., Bakris, G., Baeres, F. M., Idorn, T., Bosch-Traberg, H., Lausvig, N. L., and Pratley, R. (2024). Effects of semaglutide on chronic kidney disease in patients with type 2 diabetes. New England Journal of Medicine, 391(2):109–121.
Examples
data(truncatedscore)
Extract ensemble weights
Description
Extract ensemble weights
Usage
## S3 method for class 'superlearner'
weights(object, ...)
Arguments
object | (superlearner) Fitted model. |
... | Not used. |
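The original entry lists no examples; a minimal sketch, with sl fitted as in example(superlearner):
m <- list("mean" = learner_glm(y ~ 1), "glm" = learner_glm(y ~ x1 + x2))
sl <- superlearner(m, data = sim1(), nfolds = 2)  # sim1() as defined in example(superlearner)
weights(sl)  # estimated ensemble weights of the individual learners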