Movatterモバイル変換

Type:

Package

Title:

Methods for Evaluating Principal Surrogates of TreatmentResponse

Version:

1.3.3

Date:

2025-04-08

Maintainer:

Michael C Sachs <sachsmc@gmail.com>

Description:

Contains the core methods for the evaluation of principal surrogates in a single clinical trial. Provides a flexible interface for defining models for the risk given treatment and the surrogate, the models for integration over the missing counterfactual surrogate responses, and the estimation methods. Estimated maximum likelihood and pseudo-score can be used for estimation, and the bootstrap for inference. A variety of post-estimation summary methods are provided, including print, summary, plot, and testing.

License:

MIT + file LICENSE

URL:

https://sachsmc.github.io/pseval/

BugReports:

https://github.com/sachsmc/pseval/issues/

Suggests:

ggplot2, testthat, knitr, printr, rmarkdown

Imports:

survival

VignetteBuilder:

knitr

RoxygenNote:

7.3.2

Encoding:

UTF-8

NeedsCompilation:

Packaged:

2025-04-08 07:59:28 UTC; micsac

Author:

Michael C Sachs [aut, cre], Erin E Gabriel [aut]

Repository:

CRAN

Date/Publication:

2025-04-08 10:20:02 UTC

Modify a psdesign object by adding on new components.

Description

This operator allows you to add objects to a psdesign object, such as integration models and risk models

Usage

## S3 method for class 'ps'p1 + p2

Arguments

p1

An object of classpsdesign

p2

Another object to be added top1, see list below for possible options

If the first object is an object of classpsdesign, you can addthe following types of objects, and it will return a modified psdesignobject. Users will generally add them in the order that they appear.

integration: Add or replace integration model
riskmodel: Add or replace risk model
estimate: Estimate parameters
bootstrap: Bootstrap estimates

Treatment efficacy contrast functions

Description

Treatment efficacy contrast functions

Usage

TE(R0, R1)

Arguments

R0

A vector of risks in the control arm

R1

A vector of risks in the treatment arm

Details

These functions take the risk in the two treatment arms, andcomputes a one-dimensional summary of those risks. Built-in choices are"TE" for treatment efficacy = 1 - risk_1(s)/risk_0(s),"RR" forrelative risk = risk_0(s)/risk_1(s),"logRR" for log of the relativerisk, and"RD" for the risk difference = risk_0(s) - risk_1(s).

Value

A vector the same length as R0 and R1.

Bootstrap resampling parameters

Description

Bootstrap resampling parameters

Usage

add_bootstrap(psdesign, bootstrap)

Arguments

psdesign

A psdesign object, it must have risk model and integration model components

bootstrap

A bootstrap object created byps_bootstrap

Examples

## Not run: test <- psdesign(generate_example_data(n = 100), Z = Z, Y = Y.obs, S = S.obs, BIP = BIP)est1 <- test + integrate_parametric(S.1 ~ BIP) + risk_binary() + ps_estimate(method = "BFGS")est1 + ps_bootstrap(method = "BFGS", start = est1$estimates$par, n.boots = 50)## End(Not run)

Estimate parameters

Description

Estimate parameters

Usage

add_estimate(psdesign, estimate)

Arguments

psdesign

A psdesign object, it must have risk model and integration model components

estimate

An estimate object created byps_estimate

Examples

test <- psdesign(generate_example_data(n = 100), Z = Z, Y = Y.obs, S = S.obs, BIP = BIP)test + integrate_parametric(S.1 ~ BIP) + risk_binary(D = 50) + ps_estimate(method = "BFGS")

Integration models

Description

Add integration model to a psdesign object

Usage

add_integration(psdesign, integration)

Arguments

psdesign

A psdesign object

integration

An integration object

Details

This is a list of the available integration models. The fundamental problem in surrogate evaluation is that there are unobserved values of the counterfactual surrogate responses S(1). In the estimated maximum likelihood framework, for subjects missing the S(1) values, we use an auxiliary pre-treatment variable or set of variables W that is observed for every subject to estimate the distribution of S(1) | W. Typically, this W is a BIP. Then for each missing S(1), we integrate likelihood contributions over each non-missing S(1) given their value of W, and average over the contributions.

integrate_parametric This is a parametric integration model that fits a linear model for the mean of S(1) | W and assumes a Gaussian distribution.
integrate_bivnorm This is another parametric integration model that assumes that S(1) and W are jointly normally distributed. The user must specify their mean, variances and correlation.
integrate_nonparametric This is a non-parametric integration model that is only valid for categorical S(1) and W. It uses the observed proportions to estimate the joint distribution of S(1), W.
integrate_semiparametric This is a semi-parametric model that uses the semi-parametric location scale model of Heagerty and Pepe (1999). Models are specified for the location of S(1) | W and the scale of S(1) | W. Then integrations are drawn from the empirical distribution of the residuals from that model, which are then transformed to the appropriate location and scale.

Examples

test <- psdesign(generate_example_data(n = 100), Z = Z, Y = Y.obs, S = S.obs, BIP = BIP)add_integration(test, integrate_parametric(S.1 ~ BIP))test + integrate_parametric(S.1 ~ BIP)  # same as above

Add risk model to a psdesign object

Description

Add risk model to a psdesign object

Usage

add_riskmodel(psdesign, riskmodel)

Arguments

psdesign

A psdesign object

riskmodel

A risk model object, from the list above

Details

The risk model component specifies the likelihood for the data. This involves specifying the distribution of the outcome variable, whether it is binary or time-to-event, and specifying how the surrogate S(1) and the treatment Z interact and affect the outcome. We use the formula notation to be consistent with other regression type models in R. Below is a list of available risk models.

risk_binary This is a generic risk model for binary outcomes. The user can specify the formula, and link function using eitherrisk.logit for the logistic link, orrisk.probit for the probit link. Custom link functions may also be specified, which take a single numeric vector argument, and returns a vector of corresponding probabilities.
risk_weibull This is a parameterization of the Weibull model for time-to-event outcomes that is consistent with that ofrweibull. The user specifies the formula for the linear predictor of the scale parameter.
risk_exponential This is a simple exponential model for a time-to-event outcome.
risk_poisson This is a Poisson model for count outcomes. It allows for offsets in the formula.
risk_continuous This is a Gaussian model for continuous outcomes. It assumes that larger values of the outcome are harmful (e.g. blood pressure)

Examples

test <- psdesign(generate_example_data(n = 100), Z = Z, Y = Y.obs, S = S.obs, BIP = BIP) +     integrate_parametric(S.1 ~ BIP)add_riskmodel(test, risk_binary())test + risk_binary() # same as above

Calculate the Standardized total gain

Description

Computes the standardized total gain for the risk difference. Optionally produces bootstrap standard errors and confidence intervals. The standardized total gain is the area between the risk difference curve and the horizontal line at the marginal risk difference. If the outcome is time to event then the STG is time-dependent, and a time point for evaluation is needed. If one is not provided then the restricted mean survival is estimated from the data and used.

Usage

calc_STG(  psdesign,  t,  sig.level = 0.05,  n.samps = 5000,  bootstraps = TRUE,  permute = TRUE,  permute.times = 2000,  progress.bar = TRUE)

Arguments

psdesign

A psdesign object. It must contain a risk model, anintegration model, and estimated parameters. Bootstrapped parameters areoptional

t

For time to event outcomes, a fixed timet may be provided tocompute the cumulative distribution function. If not, the restricted meansurvival time is used. Omit for binary outcomes.

sig.level

Significance level for bootstrap confidence intervals

n.samps

The number of samples to take over the range of S.1 at whichthe VE is calculated

bootstraps

If true, and bootstrapped estimates are present, willcalculate bootstrap standard errors and confidence interval.

permute

Not used, included for backwards compatibility

permute.times

Not used, included for backwards compatibility

progress.bar

Not used, included for backwards compatibility

Calculate the risk and functions of the risk

Description

Computes the treatment efficacy (TE) and other functions of the risk in eachtreatment arm over the range of surrogate values observed in the data. TE(s)is defined as 1 - risk(s, z = 1)/risk(s, z = 0), where z is the treatmentindicator. If any other variables are present in the risk model, then therisk is computed at their median value.

Usage

calc_risk(  psdesign,  contrast = "TE",  t,  sig.level = 0.05,  CI.type = "band",  n.samps = 5000,  bootstraps = TRUE,  newdata = NULL)

Arguments

psdesign

A psdesign object. It must contain a risk model, anintegration model, and estimated parameters. Bootstrapped parameters areoptional

contrast

The contrast function, or the name of the contrast function.See details.

t

For time to event outcomes, a fixed timet may be provided tocompute the cumulative distribution function. If not, the restricted meansurvival time is used. Omit for binary outcomes.

sig.level

Significance level for bootstrap confidence intervals

CI.type

Character string, "pointwise" for pointwise confidenceintervals, and "band" for simultaneous confidence band.

n.samps

The number of samples to take over the range of S.1 at whichthe contrast is calculated

bootstraps

If true, and bootstrapped estimates are present, willcalculate bootstrap standard errors and confidence bands.

newdata

Vector of S values. If present, will calculate the contrastfunction at values of newdata instead of the observed S.1

Details

The contrast function is a function that takes 2 inputs, the risk_0and risk_1, and returns some one dimensional function of those two inputs.It must be vectorized. Some built-in functions are"TE" for treatmentefficacy = 1 - risk_1(s)/risk_0(s),"RR" for relative risk =risk_1(s)/risk_0(s),"logRR" for log of the relative risk, and"RD" for the risk difference = risk_1(s) - risk_0(s).

Value

A data frame containing columns for the S values, the computedcontrast function at S, R0, and R1 at those S values, and optionallystandard errors and confidence intervals computed using bootstrappedestimates.

Examples

## Not run: # same result passing function name or functioncalc_risk(binary.boot, contrast = "TE", n.samps = 20)calc_risk(binary.boot, contrast = function(R0, R1) 1 - R1/R0, n.samps = 20)## End(Not run)

Compute the empirical Treatment Efficacy

Description

Compute the empirical Treatment Efficacy

Usage

empirical_TE(psdesign, t)

Arguments

psdesign

An object of classpsdesign

t

Fixed time for time to event outcomes to compute TE. If missing, uses restricted mean survival.

Compute the empirical Treatment Efficacy

Description

Included for backwards compatibility

Usage

empirical_VE(psdesign, t)

Arguments

psdesign

An object of classpsdesign

t

Fixed time for time to event outcomes to compute TE. If missing, uses restricted mean survival.

Expand augmented data using the integration function

Description

Expand augmented data using the integration function

Usage

expand_augdata(model, psdesign, D = 500)

Arguments

model

Formula defining the risk model

psdesign

An object of classpsdesign, that contains at least 1 integration model

D

The number of samples to take for the simulated annealing

Generate sample data used for testing

Description

Generate sample data used for testing

Usage

generate_example_data(n)

Arguments

n

Integer, the sample size

Bivariate normal integration models for the missing S(1)

Description

This model assumes that the pair [S(1), W] is bivariate normal, where W isthe BIP. The means, standard deviations, and correlation are estimated orfixed before calling this function. Then the conditional normal formula isapplied in order to get the distribution of S(1) | W. That distribution isused to integrate over the missing S(1) values. This method requires a BIP in thedesign.

Usage

integrate_bivnorm(x = "S.1", mu = c(0, 0), sd = c(1, 1), rho = 0.2)

Arguments

x

expression identifying the variable to be integrated. Typically this is S.1 or S.0

mu

means of the pair of surrogates, missing one first

sd

standard deviations of the pair, missing one first

rho

the correlation between X1 and X2

Nonparametric integration model for the missing S(1)

Description

Both S(1) and the BIP or set of BIPs must be categorical. This model integrates over the estimated distribution of S(1) | BIP

Usage

integrate_nonparametric(formula, ...)

Arguments

formula

Formula specifying the integration model for the surrogateunder treatment. Generally the candidate surrogate will be on the left sidein the formula, and the BIP or BIPs will be on the right side. In this casethe BIP and the S(1) must be categorical.

...

Not currently used

Parametric integration model for the missing S(1)

Description

Parametric integration model for the missing S(1)

Usage

integrate_parametric(formula, family = gaussian, ...)

Arguments

formula

Formula specifying the integration model for the surrogateunder treatment. Generally the candidate surrogate will be on the left sidein the formula, and the BIP or BIPs will be on the right side

family

Assumed distribution for the integration model. Must becompatible with thefamily argument ofglm. Currently onlyGaussian models are supported

...

Arguments passed toglm

Semiparametric integration model using the location-scale model

Description

Semiparametric integration model using the location-scale model

Usage

integrate_semiparametric(formula.location, formula.scale, ...)

Arguments

formula.location

Formula specifying the integration model for the location component of the surrogateunder treatment. Generally the candidate surrogate will be on the left sidein the formula, and the BIP or BIPs will be on the right side

formula.scale

Formula specifying the integration model for the scale component of the surrogateunder treatment. Generally the candidate surrogate will be on the left sidein the formula, and the BIP or BIPs will be on the right side

...

Other parameters passed tosp_locscale

Plot summary statistics for a psdesign object

Description

Plot the treatment efficacy or another contrast of risk versus S.1 for anestimated psdesign object

Usage

## S3 method for class 'psdesign'plot(  x,  t,  contrast = "TE",  sig.level = 0.05,  CI.type = "band",  n.samps = 500,  xlab = "S.1",  ylab = contrast,  col = 1,  lty = 1,  lwd = 1,  ...)

Arguments

x

Apsdesign object that contains a risk model, integrationmodel, and valid estimates

t

For time to event outcomes, a fixed timet may be provided tocompute the cumulative distribution function. If not, the restricted meansurvival time is used. Omit for binary outcomes.

contrast

Name of contrast function to plot."TE" or"VE" for treatment (vaccine)efficacy = 1 - risk_1(s)/risk_0(s),"RR" for relative risk =risk_1(s)/risk_0(s),"logRR" for log of the relative risk,"risk" for the risk in each treatment arm, and"RD" for therisk difference = risk_1(s) - risk_0(s). You can also pass a customfunction directly as long as it takes 2 vectors as input (risk0 and risk1)and returns 1 vector of the same length.

sig.level

Significance level used for confidence bands on the contrastcurve. This is only used if bootstrapped estimates are available.

CI.type

Character string, "pointwise" for pointwise confidenceintervals, and "band" for simultaneous confidence band.

n.samps

Number of samples to use over the range of S.1 for plottingthe curve

xlab

X-axis label

ylab

Y-axis label

col

Vector of integers specifying colors for each curve.

lty

Vector of integers specifying linetypes for each curve.

lwd

Vector of numeric values for line widths.

...

Other arguments passed toplot

Concisely print information about a psdesign object

Description

Concisely print information about a psdesign object

Usage

## S3 method for class 'psdesign'print(x, digits = 3, sig.level = 0.05, ...)

Arguments

x

An object of classpsdesign

digits

Number of significant digits to display

sig.level

Significance level to use for computing bootstrapped confidence intervals

...

Currently unused

Estimate parameters from a specified model using bootstrap resampling and estimated maximum likelihood

Description

Estimate parameters from a specified model using bootstrap resampling and estimated maximum likelihood

Usage

ps_bootstrap(  n.boots = 200,  progress.bar = TRUE,  start = NULL,  method = "BFGS",  control = list(),  ...)

Arguments

n.boots

Number of bootstrap replicates

progress.bar

Logical, if true will display a progress bar in the console

start

Vector of starting values, if NULL, will come up with starting values

method

Method to use for optimization, can be "pseudo-score" forcategorical S with nonparametric integration, or any of the methodsavailable inoptim. Defaults to "BFGS"

control

List of control parameters for passed tooptim

...

Arguments passed tooptim

Estimate parameters from a specified model using estimated maximum likelihood

Description

Estimate parameters from a specified model using estimated maximum likelihood

Usage

ps_estimate(start = NULL, method = "BFGS", control = list(), ...)

Arguments

start

Vector of starting values, if NULL, will come up with startingvalues

method

Method to use for optimization, can be "pseudo-score" forcategorical BIP, or any of the methodsavailable inoptim. Defaults to "BFGS"

control

List of control parameters for passed tooptim

...

Arguments passed tooptim orpseudo_score.

Specify a design for a principal surrogate evaluation

Description

Generate mappings that describe how variables in the data are mapped tocomponents of the principal surrogate analysis. Other thandata, thisis a list of key-value pairs describing the common elements of a ps analysis.The required keys are Z, Y, and S. Optional keys are BIP, CPV, BSM, andweights. These elements are described in details below. Additional keys-valuepairs can be included in.... This function generates an augmenteddataset and additional information for subsequent steps in the analysis. Inthe subsequent steps, refer to the variables by the keys. Seeadd_integration andadd_riskmodel for information on how toproceed in the analysis.

Usage

psdesign(  data,  Z,  Y,  S,  BIP = NULL,  CPV = NULL,  BSM = NULL,  weights = NULL,  tau,  ...)

Arguments

data

Data frame containing data to be analyzed

Z

Expression defining the treatment variable which has 2 levels

Y

Expression defining the outcome variable. For binary events thisshould be coded as 0/1 or a factor with 2 levels. For censoredtime-to-event outcomes this can be a call toSurv

S

Expression defining the candidate surrogate

BIP

Optional expression defining the baseline irrelevant predictor

CPV

Optional expression defining the closeout placebo vaccinationmeasurement

BSM

Optional expression defining the baseline surrogate measurement

weights

optional expression defining weights to accommodate nonrandomsubsampling, such as case control or two phase

tau

numeric, When the outcome Y is a survival time, it is possiblethat the surrogate was measured at some time tau after enrollment. Use theargument tau to specify the time when the surrogate was measured, in studytime. Not required for binary Y.

...

Other key-value pairs that will be included in the augmented data,e.g. additional candidate surrogates, covariates for adjustment, variablesused for integration

Estimate parameters from a specified model using pseudo-score

Description

Estimate parameters from a specified model using pseudo-score

Usage

pseudo_score(psdesign, start = NULL, epsilon = 1e-05, maxit = 50)

Arguments

psdesign

An object of class psdesign

start

Vector of starting values, if NULL, will come up with starting values

epsilon

Convergence criteria

maxit

Maximum number of iterations

Logit link function

Description

Logit link function

Usage

risk.logit(x)

Arguments

x

A vector of linear predictors

Value

A vector of probabilities

Probit link function

Description

Probit link function

Usage

risk.probit(x)

Arguments

x

A vector of linear predictors

Value

A vector of probabilities

Risk model for binary outcome

Description

Risk model for binary outcome

Usage

risk_binary(model = Y ~ S.1 * Z, D = 5000, risk = risk.logit)

Arguments

model

Formula specifying the risk model

D

number of samples for the simulated annealing integration

risk

Function for transforming a linear predictor into a probability.E.g., risk.logit for the logistic model, risk.probit for the probit model

Risk model for continuous outcome

Description

This model assumes that the outcome Y is normally distributed conditional onS.1 and Z, with mean determined by the model formula. It also assumes thatlarger values of Y are more indicative of poor outcomes, e.g., bloodpressure.

Usage

risk_continuous(model = Y ~ S.1 * Z, D = 5000)

Arguments

model

Formula specifying the risk model for the mean

D

number of samples for the simulated annealing integration

Exponential risk model for time to event outcome

Description

Exponential risk model for time to event outcome

Usage

risk_exponential(model = Y ~ S.1 * Z, D = 5000)

Arguments

model

Formula specifying the risk model. The outcome should be aSurv object specifying right censoring

D

number of samples for simulated annealing

Poisson risk model for count outcomes

Description

Poisson risk model for count outcomes

Usage

risk_poisson(model = Y ~ S.1 * Z, D = 5000)

Arguments

model

Formula specifying the risk model. The outcome should be an integer of counts. This right side of the formula may contain anoffset term.

D

number of samples for simulated annealing

Weibull risk model for time to event outcome

Description

Weibull risk model for time to event outcome

Usage

risk_weibull(model = Y ~ S.1 * Z, D = 5000)

Arguments

model

Formula specifying the risk model. The outcome should be aSurv object specifying right censoring

D

number of samples for simulated annealing

Calculate risks with handlers for survival data

Description

Calculate risks with handlers for survival data

Usage

riskcalc(risk.function, Y, par, t, dat0, dat1)

Arguments

risk.function

Function taking three arguments, a data.frame, parameters, and time. It should return a vector the same number of rows as the data frame

Y

The outcome variable

par

the vector of parameter values

t

Time for a survival outcome, may be missing

dat0

Data frame containing S and Z = 1

dat1

Data frame containing S and Z = 0

Fit the semi-parametric location-scale model

Description

This estimates the location-scale model as described in Heagerty and Pepe (1999) using the Newton-Raphson method. The location and scale formulas must have the same outcome, but they may have different predictors.

Usage

sp_locscale(  formula.location,  formula.scale,  data,  weights,  tol = 1e-06,  maxit = 100)

Arguments

formula.location

Formula specifying the model for the location

formula.scale

Formula specifying the model for the scale

data

Data used to estimate the model

weights

Weights applied to the estimating equations

tol

Convergence tolerance

maxit

Maximum number of iterations

Value

A list containing the parameter estimates, the convergence indicator, and residuals

Compute the standardized total gain

Description

Compute the standardized total gain

Usage

stg(R1, R0, stand = TRUE)

Arguments

R1

Risk in the treatment group

R0

Risk in the control group

stand

Standardize?

Summarize bootstrap samples

Description

Summarize bootstrap samples

Usage

summarize_bs(bootdf, estdf = NULL, sig.level = 0.05, CI.type = "band")

Arguments

bootdf

Data frame containing bootstrapped estimates, with a column containing a convergence indicator

estdf

Data frame containing full sample estimate

sig.level

Significance level to use for confidence intervals

CI.type

Character string, "pointwise" for pointwise confidence intervals, and "band" for simultaneous confidence band.

Summary method for psdesign objects

Description

Summary method for psdesign objects

Usage

## S3 method for class 'psdesign'summary(object, digits = 3, sig.level = 0.05, ...)

Arguments

object

An object of classpsdesign

digits

Number of significant digits to display

sig.level

Significance level to use for computing bootstrapped confidence intervals

...

Currently unused

Value

Invisibly returns the printed table, along with the three estimates of vaccine efficacy. The empirical TE is 1 minus the relative risk comparing the treatment arm to the control arm. The risk is estimated as the proportion in the binary outcome case, or with the Kaplan-Meier estimate at the restricted mean survival in the time-to-event case. The marginal TE estimate is the TE estimate under the specified parametric risk model, ignoring the effect of S.1. The model based average TE is the TE estimate from the specified risk model, averaged over the distribution of S.1. The point of displaying these three is to assess the validity of the parametric model, and to assess the validity of the model estimation. Wild differences among these estimates may indicate problems with the model or convergence.

Check that a variable is suitable for using as binary treatment indicator

Description

Checks for two classes and gives a warning message indicating which level is assumed to be 0/1

Usage

verify_trt(D)

Arguments

D

Vector that will be checked for 2-class labels

Test for wide effect modification

Description

This runs a multivariate Wald test on the interaction terms of the model, using the bootstrap covariance

Usage

wem_test(x)

Arguments

x

An object of classpsdesign with bootstrap replicates